CN117348977A - Method, device, equipment and medium for controlling transaction concurrency in database - Google Patents


Info

Publication number
CN117348977A
Authority
CN
China
Prior art keywords
transaction
lock
conflict
transactions
write
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311109428.5A
Other languages
Chinese (zh)
Inventor
王文学
有伟东
徐冠军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd filed Critical China Telecom Corp Ltd
Priority to CN202311109428.5A priority Critical patent/CN117348977A/en
Publication of CN117348977A publication Critical patent/CN117348977A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/466 Transaction processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/52 Program synchronisation; Mutual exclusion, e.g. by means of semaphores
    • G06F9/524 Deadlock detection or avoidance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5021 Priority

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide a method, apparatus, device, and medium for concurrency control of transactions in a database, where the database processes transactions based on a two-phase locking protocol. The method comprises: in the read phase, a transaction acquires a read lock or a write lock by storing corresponding state information in the lock manager, so that the priority between conflicting transactions can be confirmed when they commit; and conflict detection is delayed from the read phase to the commit phase, where conflicting transactions are resolved by serializing them according to their state information so as to guarantee their priority. The embodiments take the impact of conflicts on latency into account in the transaction concurrency control of the database: delaying conflict detection and enforcing commit priority avoids deadlock and starvation and improves transaction processing performance in the database.

Description

Method, device, equipment and medium for controlling transaction concurrency in database
Technical Field
The present invention relates to the field of database technologies, and in particular, to a method, an apparatus, a device, and a medium for controlling concurrency of transactions in a database.
Background
With the development of database technology, in-memory databases achieve better performance by storing all data in DRAM (dynamic random access memory) and combining this with a lightweight concurrency control protocol. However, latency-sensitive applications further require data services to provide low and predictable latency. This is important for large systems that serve Web search, email, and many other interactive services, where a single request may fan out data queries to thousands of data servers.
Currently, much of the effort to reduce latency focuses on various layers of the operating system, including kernel scheduling, queue management, and caching mechanisms. These methods reduce latency spikes by separating large requests from small ones, avoid queuing delays caused by head-of-line blocking, or reduce latency by minimizing the overhead of background operations such as garbage collection and data compression. However, these schemes do not address the impact of request conflicts on latency, even though such conflicts are very common in modern high-contention transactional workloads. There has been extensive research on mitigating the overhead of request conflicts, such as transaction scheduling and program analysis, but these techniques focus primarily on throughput rather than latency.
Disclosure of Invention
In view of the foregoing, a method, apparatus, device, and medium for concurrency control of transactions in a database are proposed that overcome or at least partially solve the foregoing problems, including:
a method of concurrency control of transactions in a database, the database processing transactions based on a two-phase locking protocol, the method comprising:
in the read phase, a transaction acquires a read lock or a write lock by storing corresponding state information in the lock manager, so that the priority between conflicting transactions can be confirmed when they commit;
delaying conflict detection from the read phase to the commit phase, and, when conflicting transactions are resolved in the commit phase, serializing them according to their state information so as to guarantee the priority of the conflicting transactions.
Optionally, the method further comprises:
a private buffer is created for each worker thread, and during the read phase write operations apply their modifications in the private buffer, to prevent incomplete or uncommitted data from being read.
Optionally, the state information includes time stamp information, and the serialization process is performed according to a time stamp order.
Optionally, serializing the conflicting transactions according to their state information when they are resolved in the commit phase, so as to guarantee their priority, includes:
determining whether the timestamp of the current transaction is greater than the timestamps of the other transactions;
if the timestamp of the current transaction is greater than the timestamps of the other transactions, aborting the current transaction;
if the timestamp of the current transaction is less than or equal to the timestamps of the other transactions, making the committing transaction wait until the conflicting transaction commits.
Optionally, the method further comprises:
for write-write conflicts, the transaction directly delays acquiring the write lock;
for a blind write operation, the transaction acquires a write lock in the commit phase;
for read-modify-write operations, the transaction acquires the shared lock during the read phase and upgrades the shared lock to exclusive mode during the commit phase.
Optionally, the method further comprises:
by using lock-free locking, locks are acquired and conflicts are detected in different phases.
Optionally, the method further comprises:
realizing lock-free locking by using a lock-free list built on an atomic word; wherein one bit in the atomic word is allocated to each worker thread and the offset of the bit is determined by the ID of the worker thread; when a read lock is acquired, the corresponding bit is set to 1 and the exclusive entry is checked; if the exclusive entry is set, the corresponding bit is cleared and the worker thread waits.
An apparatus for concurrency control of transactions in a database, the database processing transactions based on a two-phase locking protocol, the apparatus being configured such that:
in the read phase, a transaction acquires a read lock or a write lock by storing corresponding state information in the lock manager, so that the priority between conflicting transactions can be confirmed when they commit;
conflict detection is delayed from the read phase to the commit phase, and, when conflicting transactions are resolved in the commit phase, they are serialized according to their state information so as to guarantee the priority of the conflicting transactions.
An electronic device comprising a processor, a memory and a computer program stored on the memory and capable of running on the processor, which when executed by the processor implements a method of concurrency control of transactions in a database as described above.
A computer readable storage medium having stored thereon a computer program which when executed by a processor implements a method of concurrency control of transactions in a database as described above.
The embodiment of the invention has the following advantages:
In the embodiments of the invention, a transaction acquires a read lock or a write lock in the read phase by storing corresponding state information in the lock manager, so that the priority between conflicting transactions can be confirmed when they commit. Conflict detection is delayed from the read phase to the commit phase, and conflicting transactions are serialized according to their state information when they are resolved in the commit phase, which guarantees their priority. In this way the impact of conflicts on latency is taken into account in the transaction concurrency control of the database; delaying conflict detection and enforcing commit priority avoids deadlock and starvation and improves transaction processing performance in the database.
Drawings
In order to more clearly illustrate the technical solutions of the present invention, the drawings that are needed in the description of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort to a person skilled in the art.
FIG. 1 is a flowchart illustrating a method for concurrency control of transactions in a database according to one embodiment of the present invention;
FIG. 2 is a flowchart illustrating steps of another method for concurrency control of transactions in a database according to one embodiment of the present invention.
Detailed Description
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description. It will be apparent that the described embodiments are some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Concurrency control protocols include optimistic concurrency control (OCC) and pessimistic two-phase locking (2PL), each of which has advantages in terms of throughput or latency. OCC assumes that conflicts between transactions are rare, so it detects conflicts only during the commit phase, which reduces locking overhead and scales well on multi-core servers. In addition, in-memory transactions are typically short-lived, so OCC remains efficient even at high abort rates because re-running an aborted transaction is cheap. Indeed, much recent work explores OCC variants to provide higher performance.
By contrast, 2PL requires transactions to acquire locks before accessing records. 2PL holds locks for longer and imposes special requirements on read locks, so it is rarely adopted in the latest in-memory databases. However, a 2PL variant such as the wound_wait protocol performs well in terms of latency, 12-20 times better than the OCC protocol. Regarding latency: transactions with high latency have typically been aborted multiple times, and wound_wait resolves conflicts by committing transactions with older timestamps first. Since aborted transactions keep their old timestamps, they typically have higher commit priority. Thus, wound_wait provides low latency by preventing aborted transactions from being aborted again. Regarding throughput: 2PL is inefficient because it incurs unnecessary blocking and locking overhead when transactions conflict.
To improve the processing efficiency of concurrent transactions, the concurrency control algorithm of the present embodiments delays conflict detection and enforces commit priority, which avoids deadlock and starvation. The algorithm also designs a phased lock primitive that realizes lock-free locking and minimizes locking overhead. Moreover, it combines pessimistic locking with optimistic reading. In particular, a transaction must acquire a read or write lock before accessing a record in the database (pessimistic locking), but it may ignore lock conflicts and access the record directly without being blocked (optimistic reading). In this way, high throughput and low latency can be achieved at the same time. The method is flexible, operable, and scalable; it improves the transaction processing performance of the data development module of the data center, reduces system cost, and gives users a better experience.
As shown in FIG. 1, multiple worker threads are set up, each with its own thread context. Lock-free locking (a lock-free list built on atomic words) is used to avoid blocking phenomena such as deadlock and starvation and to reduce overhead, and delayed write-lock acquisition together with private-buffer writes in the commit protocol provides a serializable isolation level with high throughput and low latency.
The following details the respective parts:
Two-phase locking:
2PL requires a transaction to acquire a lock on a record, in exclusive or shared mode, before writing or reading that record. To ensure correctness, 2PL enforces two rules: first, different transactions cannot hold conflicting locks at the same time; second, once a transaction relinquishes ownership of a lock, it cannot acquire additional locks. The second rule divides 2PL into two phases (a growing phase and a shrinking phase). When a transaction fails to acquire a lock because doing so would violate the first rule, 2PL puts the requesting transaction in a wait queue until the lock becomes available. Many 2PL variants exist to avoid deadlock during the waiting period.
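The second rule can be illustrated with a minimal Python sketch. The class `TwoPhaseTxn` and its method names are hypothetical and not part of the described system; the sketch only tracks which phase a transaction is in and ignores lock modes and waiting.

```python
class TwoPhaseTxn:
    """Tracks the growing and shrinking phases of one transaction (illustrative)."""

    def __init__(self):
        self.held = set()        # keys of the locks currently held
        self.shrinking = False   # becomes True after the first release

    def acquire(self, key):
        # Rule 2: no new lock may be acquired once any lock has been released.
        if self.shrinking:
            raise RuntimeError("2PL violation: acquire after release")
        self.held.add(key)

    def release(self, key):
        self.shrinking = True    # the transaction enters the shrinking phase
        self.held.discard(key)
```

Calling `acquire` after any `release` raises an error, which is the point at which a real lock manager would reject the request.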
Wound_wait protocol:
When a transaction requests a lock that is currently held by other owners, the owners whose timestamps are greater than the requester's are aborted (WOUND). The requester then either becomes a new owner or waits for the lock, depending on whether all owners have been aborted (WAIT).
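As a minimal sketch of this decision, assuming that a smaller timestamp means an older, higher-priority transaction, the following Python function returns which owners are wounded and whether the requester has to wait. The function name and return shape are illustrative choices, not taken from the source.

```python
from typing import List, Tuple


def wound_wait(requester_ts: int, owner_ts: List[int]) -> Tuple[List[int], bool]:
    """Return (owners to abort, whether the requester must wait)."""
    # WOUND: every current owner younger than the requester is aborted.
    wounded = [ts for ts in owner_ts if ts > requester_ts]
    # WAIT: the requester waits only if at least one (older) owner survives.
    must_wait = len(wounded) < len(owner_ts)
    return wounded, must_wait


print(wound_wait(5, [7, 9]))  # both owners are younger than the requester
print(wound_wait(5, [3, 9]))  # the owner with timestamp 3 is older and survives
```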
Data structure:
assume that the system initiates multiple worker threads to run a transaction issued by a client. Each worker thread has a context (ctx) that contains the following information when the transaction is run.
ctx.wid (16 bits, non-zero) is the ID of the worker thread.
ctx.ts (47 bits) is the timestamp of the currently running transaction, which determines its order in the commit phase.
ctx.status (1 bit) tracks the state of the worker thread, either running or aborted. Under the wound_wait protocol, a worker thread terminates a conflicting transaction by flipping this field in the context of the thread that runs that transaction.
In one example, wid, ts, and status together form a 64-bit integer that uniquely identifies the currently running transaction. The contexts of all worker threads make up an array, ctx_arr[], which is indexed by worker ID and globally accessible.
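The 64-bit layout can be sketched in Python as follows. The field widths come from the text above; the bit order (wid in the top bits, status in the lowest bit) and the numeric encoding of the status are assumptions made only for this illustration.

```python
WID_BITS, TS_BITS, STATUS_BITS = 16, 47, 1
RUNNING, ABORTED = 0, 1  # assumed encoding of the 1-bit status field


def pack_ctx(wid: int, ts: int, status: int) -> int:
    """Pack (wid, ts, status) into one 64-bit integer."""
    assert 0 < wid < (1 << WID_BITS)   # wid is non-zero
    assert 0 <= ts < (1 << TS_BITS)
    assert status in (RUNNING, ABORTED)
    return (wid << (TS_BITS + STATUS_BITS)) | (ts << STATUS_BITS) | status


def unpack_ctx(word: int):
    """Inverse of pack_ctx: return (wid, ts, status)."""
    status = word & 1
    ts = (word >> STATUS_BITS) & ((1 << TS_BITS) - 1)
    wid = word >> (TS_BITS + STATUS_BITS)
    return wid, ts, status
```

Because all three fields live in one word, a single atomic operation can read or change the status together with the timestamp it belongs to.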
Each worker thread also maintains the read set and write set of the current transaction. The read set contains pointers to the records that the transaction reads, and the write set tracks the modified records. Records that are both read and modified appear in both the read set and the write set. Each entry in the write set also has a private buffer that holds the transaction's updates to that record.
Reading:
the worker thread initializes its context before running the new transaction and always acquires the corresponding lock before actually accessing the record, as follows:
Data: ctx_arr[], read set, write set
EXECUTE(wid):
    ctx_arr[wid].ts = new_ts();
    ctx_arr[wid].status = running;
    ...
    LOCKRD(ctx_arr[wid]);
    ...
    LOCKWR(ctx_arr[wid]);
    COMMIT(wid);
To achieve higher concurrency, a lock manager is assigned to each record. The lock manager maintains the following information: first, the current writer w, which owns the lock exclusively; second, a waiting list of writers arranged in ascending order of timestamp; third, a reader list arranged in order of arrival. The lock manager handles shared and exclusive lock requests as follows:
Specifically, to obtain a read lock, the worker thread simply ignores the current writer and inserts itself directly into the reader list. When writer w is committing, the reader should be blocked to avoid reading incomplete data. When a write lock is acquired, the concurrency control algorithm checks for write-write conflicts but ignores read-write conflicts. To achieve this, the requester first adds itself to the "waiting list" and then tries to acquire the lock by competing on the variable w with an atomic CAS operation. If a writer already holds the lock (i.e., w ≠ 0), the wound_wait protocol is used to resolve the write-write conflict: if the timestamp of the requesting transaction is less than that of the lock owner, the requester kills the owner by switching its state to aborted. The lock requester then waits until it acquires the lock. The atomicity of lock acquisition will be discussed later. After obtaining the lock, the worker accesses the record and runs the transaction logic. Note that all modifications made by the transaction are buffered in the private buffer prior to the commit phase.
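The read-phase behaviour of the per-record lock manager can be sketched as below. This is a single-threaded Python illustration: the class `RecordLock`, the `aborted` set, and the plain comparison standing in for the atomic CAS on w are all assumptions of the sketch, and blocking while a writer commits is not modelled.

```python
import bisect


class RecordLock:
    """Per-record lock state: writer w, writer waiting list, reader list."""

    def __init__(self):
        self.w = 0          # timestamp of the exclusive owner, 0 means free
        self.waiters = []   # writer timestamps, kept in ascending order
        self.readers = []   # reader timestamps, in arrival order

    def lock_read(self, ts):
        # Readers ignore the current writer and simply register themselves.
        self.readers.append(ts)

    def lock_write(self, ts, aborted):
        """Try to take the write lock; return True if ts now owns it."""
        bisect.insort(self.waiters, ts)   # join the waiting list first
        if self.w == 0:                   # stands in for CAS(w, 0, ts)
            self.w = ts
            return True
        if ts < self.w:                   # wound the younger owner
            aborted.add(self.w)
        return False                      # the caller waits and retries
```

For example, if a writer with timestamp 8 owns the lock and a writer with timestamp 5 then requests it, the owner 8 ends up in `aborted` and the requester keeps waiting until the lock is handed over.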
Commit phase:
upon completion of the transaction, the worker commits the transaction in three steps, as follows:
Specifically, in step 1, the worker thread detects read-write conflicts for all records in the write set of the transaction. It first upgrades the locks in the write set to exclusive mode. An exclusive lock blocks all readers that later attempt to acquire a read lock, which prevents them from reading incomplete data. When a reader observes that the lock is in exclusive mode, it aborts the committing transaction if the reader has the smaller timestamp, and then waits until the exclusive lock is removed. The implementation of the exclusive lock mode will be discussed later. Once the lock is in exclusive mode, the worker detects conflicts by scanning the readers in the lock: it kills all younger readers (those with larger timestamps) and waits for older readers to release the lock.
After step 1, the worker can safely commit the transaction. In step 2, the worker releases all read locks in its read set by deleting itself from each reader list. In step 3, the worker commits the modified records to the database and releases the write locks. Releasing a write lock involves three actions: the committed transaction is removed from the "waiting list", the exclusive locking mode is disabled, and the oldest waiter in the "waiting list" is found and ownership of the lock is handed over to it.
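The reader scan and the hand-over on release can be sketched in Python as follows. Both helper functions are hypothetical. The sketch assumes that the waiter to receive the lock is the one with the smallest timestamp and uses 0 to mean that the lock is free.

```python
def scan_readers(committer_ts, readers):
    """Split a record's readers into (readers to kill, readers to wait for)."""
    to_kill = [ts for ts in readers if ts > committer_ts]      # younger readers
    to_wait_for = [ts for ts in readers if ts < committer_ts]  # older readers
    return to_kill, to_wait_for


def release_write(waiters, committer_ts):
    """Remove the committer from the waiting list and return the next owner."""
    waiters.remove(committer_ts)
    return min(waiters) if waiters else 0  # 0 means nobody is waiting
```

The committer's own entry in the reader list, which exists for read-modify-write records, is skipped because its timestamp is neither larger nor smaller than itself.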
Liveness:
As the lifecycle of transactions in the concurrency control algorithm shows, a transaction that holds conflicting locks can be killed by other conflicting transactions at any stage. Therefore, a transaction that has been killed needs to actively abort itself, which avoids cyclic dependencies and thereby ensures system liveness. This is achieved by calling the poll() function while the transaction is waiting for a lock. Note that once step 1 of the commit phase is completed, the transaction commits without checking its state; at this point it will neither acquire more locks nor wait on lock dependencies. However, other transactions are unaware of this and may still kill it. When the worker thread then runs its next transaction, it eventually sees the aborted state and aborts that transaction, causing an unnecessary abort. This problem is solved by placing status and ts in the same 64-bit word ctx: a thread aborts or activates a transaction by atomically altering the entire word, and the change takes effect only if the target thread is still using the original timestamp.
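The guarded abort can be sketched with a compare-and-swap on the whole word. In this Python illustration the class `AtomicWord` emulates a hardware CAS with a mutex, the word holds only `(ts << 1) | status` (the wid field is omitted for brevity), and `try_kill` is a hypothetical helper name.

```python
import threading

ABORTED = 1  # assumed value of the status bit for an aborted transaction


class AtomicWord:
    """64-bit word with compare-and-swap, emulated here with a mutex."""

    def __init__(self, value=0):
        self._value = value
        self._mutex = threading.Lock()

    def load(self):
        with self._mutex:
            return self._value

    def compare_and_swap(self, expected, new):
        with self._mutex:
            if self._value != expected:
                return False
            self._value = new
            return True


def try_kill(ctx, victim_ts):
    """Set the abort bit only if the victim still runs transaction victim_ts."""
    old = ctx.load()
    if (old >> 1) != victim_ts:   # the worker already moved on to a new transaction
        return False
    # Fails harmlessly if the word changed between the load and the swap.
    return ctx.compare_and_swap(old, old | ABORTED)
```

If the target worker has already started a transaction with a new timestamp, the kill attempt has no effect, so a stale abort cannot leak into the next transaction.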
The preemptive abort property of the concurrency control algorithm (i.e., a transaction with a smaller timestamp may abort a conflicting transaction with a larger timestamp) ensures that the oldest transaction can always commit, because no other conflicting transaction can block or abort it. Furthermore, newly generated timestamps are monotonic, and an aborted transaction still uses its original timestamp when it is rerun. Thus, an aborted transaction eventually becomes the oldest transaction after a certain number of retries and then commits, which also guarantees the liveness of the protocol. Given this analysis, the concurrency control algorithm is free of starvation, and if the number of worker threads is smaller than the number of physical CPU cores, the system as a whole does not deadlock even if busy waiting is used in the lock manager.
Insertion operation:
Conflict detection for insert operations is quite different, because the conflict occurs on a record that does not yet exist and there is nothing to lock. This problem is solved by pre-inserting a new record for the insert request during the read phase. An insert operation for key k is processed as follows: if k already exists, the transaction aborts; otherwise, the worker initializes a new record r and acquires its write lock in advance. Lock acquisition always succeeds because the record is still invisible to other concurrent transactions. The record is then published to the database by inserting a mapping from k to r into the index structure. If the transaction aborts, the newly inserted mapping and record are deleted.
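A minimal Python sketch of this pre-insert path is shown below, using a plain dictionary as a stand-in for the index structure. The `Record` class and the functions `pre_insert` and `undo_insert` are hypothetical names introduced for illustration.

```python
class Record:
    def __init__(self, key):
        self.key = key
        self.value = None
        self.writer = 0  # timestamp of the write-lock owner, 0 means free


def pre_insert(index, key, ts):
    """Return the new pre-locked record, or None if the key already exists."""
    rec = Record(key)
    rec.writer = ts  # cannot fail: no other transaction can see rec yet
    # Publish the key -> record mapping only if the key is still absent.
    if index.setdefault(key, rec) is not rec:
        return None  # duplicate key: the transaction aborts
    return rec


def undo_insert(index, rec):
    """On abort, remove the mapping (and with it the record)."""
    if index.get(rec.key) is rec:
        del index[rec.key]
```

The lock is taken before the record is published, so by the time other transactions can find the record it is already write-locked by the inserting transaction.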
Delayed write lock acquisition:
A blind-write record (i.e., a record that a transaction writes without reading it in advance) may be arbitrarily modified by other transactions before commit. Thus, the write locks of such records only need to be acquired during the commit phase. A read-modify-write record appears in both the read set and the write set. Thus, only its read lock is acquired during the read phase, and the lock is upgraded to exclusive mode before the transaction commits. Delaying write lock acquisition also opens up the following optimization: when write locks are acquired during the commit phase, the transaction already has its complete write set, so the write set can be sorted and the write locks acquired in a fixed global order to avoid deadlock. With delayed write lock acquisition enabled, the concurrency control algorithm can complete the read phase without any lock blocking.
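The commit-time locking order can be sketched as follows. The function `commit_lock_plan` and the action labels are hypothetical; the sketch assumes record keys are comparable so that sorting them yields one global order shared by all workers.

```python
def commit_lock_plan(read_set, write_set):
    """Return the ordered list of (key, action) pairs to run at commit time."""
    plan = []
    for key in sorted(write_set):          # the same global order for every worker
        if key in read_set:
            plan.append((key, "upgrade"))  # read-modify-write: shared to exclusive
        else:
            plan.append((key, "acquire"))  # blind write: lock is taken only now
    return plan
```

For a read set {a, c} and a write set {b, c}, the plan is to acquire the write lock on b first and then upgrade the lock on c.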
However, being too optimistic during the read phase is not always beneficial. For example, during storage, if a concurrent transaction reaches the commit phase too quickly, the present concurrency control algorithm typically incurs a higher abort rate when it uses the wound_wait protocol to detect conflicts.
Lock-free implementation:
By introducing a latch-free lock, read-write conflicts and write-write conflicts are handled in a lock-free manner. The concurrency control algorithm described above separates lock acquisition and conflict detection into different phases, and this property greatly simplifies the implementation of lock primitives with lock-free data structures.
For a read lock, it is necessary to ensure that when multiple readers insert items into the reader list at the same time, each insertion into the reader list is atomic. In addition, read-write conflicts should also be detected atomically: once the lock is upgraded to exclusive mode, all subsequent readers must be blocked and all existing readers must be visible to the committing transaction. To achieve these semantics, the concurrency control algorithm introduces a predefined 64-bit value excl_sig in the lock's reader list to indicate exclusive mode. Specifically, the reader list is ordered by the arrival time of each insertion; the committing transaction upgrades the lock by appending excl_sig to the end of the reader list, and then scans the reader entries that appear before excl_sig to detect read-write conflicts. When a transaction acquires a read lock, it first inserts itself into the reader list, then scans backward to check whether there is an excl_sig entry, and aborts if necessary.
For write locks, only one worker may own the write lock at a time, which can be achieved with a CAS instruction on the atomic word w. Detecting write-write conflicts requires operating on w and the "waiting list" together. However, pushing an item onto the "waiting list" and changing the value of w cannot be performed atomically; likewise, releasing the lock and handing it to the oldest waiter cannot be done atomically. An inconsistency may arise when a lock requester T_i sees a non-zero w while, at the same time, the lock owner releases the lock and, before T_i is visible in the list (i.e., T_i's entry has not yet been submitted at this time), grants the lock to the oldest waiter in the "waiting list". T_i may then end up waiting for a transaction with a larger timestamp. To solve this problem, the concurrency control algorithm requires each waiter in the "waiting list" to repeatedly compare its timestamp with that of w and, if an inconsistency occurs, to terminate the lock owner.
Lockless list using atomic words:
The implementation of the lock-free list can be further simplified by manipulating bits in an 8-byte atomic word. Specifically, one bit in the atomic word is allocated to each worker thread, and the offset of the bit is determined by the ID of the worker thread, i.e., wid. To insert an item into the list, the worker thread only needs to set the corresponding bit to 1 with fetch_and_add. To support exclusive mode, the last bit of the atomic word is reserved as the excl_sig entry. When a worker acquires a read lock, it uses fetch_and_add to set its bit to 1 and checks whether the excl_sig entry is already set. If so, the reader clears its bit and waits. Thus, an 8-byte atomic word can support up to 63 worker threads; when more worker threads are needed, the atomic word can be extended or a lock-free queue can be used instead.
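The bit-level reader list can be sketched in Python as below. Several details are assumptions made for illustration: `AtomicWord` emulates fetch_and_add with a mutex, worker IDs 1 to 63 are mapped to bits 0 to 62, bit 63 serves as excl_sig, each worker registers at most once per lock, and only the single write-lock owner calls `upgrade_to_exclusive`.

```python
import threading

EXCL_SIG = 1 << 63  # last bit of the word marks exclusive mode


class AtomicWord:
    """64-bit word with fetch_and_add, emulated here with a mutex."""

    def __init__(self):
        self._value = 0
        self._mutex = threading.Lock()

    def fetch_add(self, delta):
        """Add delta modulo 2**64 and return the previous value."""
        with self._mutex:
            old = self._value
            self._value = (old + delta) % (1 << 64)
            return old

    def load(self):
        with self._mutex:
            return self._value


def reader_bit(wid):
    assert 1 <= wid <= 63  # assumed mapping: wid 1..63 uses bits 0..62
    return 1 << (wid - 1)


def try_read_lock(word, wid):
    """Set this worker's bit; back off if the lock is in exclusive mode."""
    bit = reader_bit(wid)
    old = word.fetch_add(bit)   # sets the bit because it was 0 before
    if old & EXCL_SIG:
        word.fetch_add(-bit)    # clear our bit; the caller waits and retries
        return False
    return True


def upgrade_to_exclusive(word):
    """Set excl_sig and return the wids of readers registered before it."""
    old = word.fetch_add(EXCL_SIG)
    return [w for w in range(1, 64) if old & reader_bit(w)]
```

Because fetch_and_add returns the previous value, the upgrading writer learns exactly which readers registered before excl_sig was set, and any later reader sees the flag in the value it gets back.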
Referring to FIG. 2, a flowchart of the steps of a method for concurrency control of transactions in a database according to one embodiment of the present invention is shown, where the database processes transactions based on a two-phase locking protocol.
Specifically, the method comprises the following steps:
Step 201: in the read phase, a transaction acquires a read lock or a write lock by storing corresponding state information in the lock manager, so that the priority between conflicting transactions can be confirmed when they commit.
Step 202: conflict detection is delayed from the read phase to the commit phase, and when conflicting transactions are resolved in the commit phase, they are serialized according to their state information so as to guarantee the priority of the conflicting transactions.
In an embodiment of the invention, transactions are handled using a standard read phase and commit phase. During the read phase, a transaction acquires a read or write lock by storing its state (including a timestamp) in the lock manager, which helps to confirm commit priority between conflicting transactions when they commit. The algorithm does not check for lock conflicts during the read phase and thus the locking process ends immediately. The algorithm delays conflict detection to the commit stage, where the algorithm can ensure that conflicts are safely delayed so as not to cause inconsistencies and not to violate isolation.
Moreover, because conflicts are ignored safely during the read phase, transactions never read incomplete or uncommitted data, which keeps the data safe; and because conflicts are resolved during the commit phase, transactions can be serialized in timestamp order, which guarantees the priority of transactions.
In the embodiment of the invention, high throughput and low tail latency can be achieved at the same time. The algorithm implements both pessimistic locking and optimistic reading. First, compared with traditional concurrency control algorithms, optimistic reading improves throughput by avoiding conflict detection during the read phase. Second, pessimistic locking requires transactions to save their timestamps in the lock, which enables the algorithm to commit transactions in timestamp order and thus provides low tail latency.
An embodiment of the present invention further includes:
a private buffer is created for each worker thread, and during the read phase write operations apply their modifications in the private buffer, to prevent incomplete or uncommitted data from being read.
In the embodiment of the invention, to avoid detecting read-write conflicts in the read phase, the algorithm introduces a private buffer for each worker thread, and a writing transaction applies its updates to this local buffer during the read phase. During this period, readers can read the record directly without being blocked; likewise, writes can bypass reads, since a read operation does not modify the database in any way. Because a write operation always modifies the local buffer first, readers that ignore the write lock are prevented from reading incomplete or uncommitted data.
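The effect of the private buffer can be sketched in Python as follows. The `Worker` class is hypothetical, a dictionary stands in for the database, and letting a transaction read its own pending writes is an assumption of this sketch rather than something the text states. Locking and conflict detection are left out.

```python
class Worker:
    """Holds the private write buffer of one worker thread (illustrative)."""

    def __init__(self, db):
        self.db = db       # shared committed data
        self.buffer = {}   # key -> pending value, invisible to other workers

    def write(self, key, value):
        self.buffer[key] = value  # read phase: the database is not touched

    def read(self, key):
        # Assumption: a transaction sees its own pending writes first.
        return self.buffer[key] if key in self.buffer else self.db.get(key)

    def commit(self):
        self.db.update(self.buffer)  # commit phase: install all updates
        self.buffer.clear()
```

Until `commit` runs, another worker reading the same key still gets the committed value, so a reader that ignores the write lock never observes a half-finished update.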
In an embodiment of the present invention, the status information includes time stamp information, and the serialization process is a process of serializing according to a time stamp order.
In an embodiment of the present invention, when the conflict transaction is resolved in the commit phase, the serializing processing is performed according to the status information of the conflict transaction to ensure the priority of the conflict transaction, including:
judging whether the time stamp of the current transaction is larger than that of other transactions; if the timestamp of the current transaction is greater than the timestamps of other transactions, aborting the current transaction; if the timestamp of the current transaction is less than or equal to the timestamp of the other transactions, the commit transaction is controlled to wait until a conflicting transaction commits.
In the commit phase, potential conflicts are resolved by scanning the state of all transactions that have acquired the lock. If the timestamp of the committing transaction is larger than that of a conflicting transaction, the algorithm aborts the committing transaction; otherwise, the committing transaction waits until the conflicting transaction commits.
With this embodiment of the invention, records are read optimistically in the read phase, which eliminates unnecessary blocking overhead and yields high throughput; in the commit phase, the algorithm ensures that transactions always commit in timestamp order, which yields low latency.
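The timestamp rule for the commit phase can be sketched as a small decision function. The function name `resolve_at_commit` and the return values are hypothetical; the sketch also assumes that a single conflicting transaction with a smaller timestamp is enough to abort the committing transaction, which the text does not state explicitly.

```python
def resolve_at_commit(current_ts, conflicting_ts):
    """Decide what the committing transaction does: 'commit', 'abort', or 'wait'."""
    if not conflicting_ts:
        return "commit"  # no conflicting lock holders were found
    if any(current_ts > other for other in conflicting_ts):
        # Assumption: any older conflicting transaction has priority.
        return "abort"
    # The committing transaction is not younger than any conflicting one,
    # so it waits until the conflicting transactions have committed.
    return "wait"


decisions = [
    resolve_at_commit(5, [3]),
    resolve_at_commit(3, [5, 7]),
    resolve_at_commit(3, []),
]
```

A real implementation would loop on the `"wait"` outcome and re-scan the lock state; this sketch only captures the comparison itself.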
An embodiment of the present invention further includes:
for write-write conflicts, the transaction delays acquiring the write lock; for a blind write operation, the transaction acquires the write lock in the commit phase; for a read-modify-write operation, the transaction acquires a shared lock in the read phase and upgrades it to exclusive mode in the commit phase.
In embodiments of the present invention, a reader does not have to block a writer, because the reader does not modify the record. For write-write conflicts, acquisition of the write lock is simply delayed. For a blind write, that is, an update of a record that has not been read beforehand, the transaction only needs to acquire the write lock in the commit phase. For a read-modify-write operation, the transaction acquires a shared lock in the read phase and upgrades it to exclusive mode in the commit phase.
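The three cases can be summarized as a small lookup from operation type to the lock taken in each phase. The enum `Op` and the function `lock_plan` are hypothetical names introduced only for this sketch.

```python
from enum import Enum


class Op(Enum):
    READ = "read"
    BLIND_WRITE = "blind write"
    READ_MODIFY_WRITE = "read-modify-write"


def lock_plan(op):
    """Return (lock taken in the read phase, lock taken in the commit phase)."""
    if op is Op.READ:
        return ("shared", None)
    if op is Op.BLIND_WRITE:
        return (None, "exclusive")      # write lock acquired only at commit
    if op is Op.READ_MODIFY_WRITE:
        return ("shared", "exclusive")  # shared lock upgraded at commit
    raise ValueError(f"unknown operation: {op!r}")


plans = {op.value: lock_plan(op) for op in Op}
```

In every case the exclusive lock, if one is needed, appears only in the commit phase, which is what keeps writers from blocking readers during the read phase.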
An embodiment of the present invention further includes:
using lock-free locking, in which locks are acquired and conflicts are detected in different phases.
In the embodiment of the invention, lock acquisition and conflict detection are performed lock-free and in separate phases. Traditionally, a lock primitive has to maintain multiple data structures, which makes it difficult to acquire or release a lock atomically with a lock-free data structure. By acquiring locks and detecting conflicts in different phases and using lock-free locking, performance losses caused by busy waiting and the like can be avoided, which improves system efficiency and reduces the cost of using the system.
Unlike OCC, which only needs write locks, the transaction here takes locks for both read and write operations, and conflicts are handled in a lock-free manner to minimize locking overhead. The concurrency control algorithm decomposes the locking process into lock acquisition and conflict detection, which makes it easier to design lightweight lock primitives on top of lock-free data structures. The algorithm provides serializable isolation while achieving both high throughput and low latency.
An embodiment of the present invention further includes:
implementing lock-free locking with a lock-free list held in an atomic word, wherein each worker thread is assigned one bit in the atomic word, the offset of the bit is determined by the ID of the worker thread, and, when a read lock is acquired, the corresponding bit is set to 1 and the exclusive entry is checked; if the exclusive entry is set, the corresponding bit is cleared and the thread waits.
In the embodiment of the invention, locking is implemented with a lock-free list held in an atomic word. The implementation of a lock-free list can be simplified further by manipulating bits in an 8-byte atomic word. Each worker thread is assigned one bit in the atomic word, and the offset of that bit is determined by the ID of the worker thread. To insert an item into the list, the worker thread only needs to set its bit to 1. To support exclusive mode, the last bit of the atomic word is reserved as the exclusive entry. When a worker acquires the read lock, it sets its bit to 1 and checks whether the exclusive entry has been set. If so, the reader clears its bit and waits. Thus an 8-byte atomic word supports up to 63 worker threads, and the scheme can be extended as more worker threads are added. Using bit operations further improves locking efficiency, and the simplicity of atomic-word operations keeps the method scalable and robust.
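The bit layout described above can be modelled as follows. This is a single-threaded model only: Python integers are not atomic words, so each `|=` and `&=` stands in for an atomic fetch-or or fetch-and in a real implementation. The class name `LockWord`, its methods, and the choice of bit 63 as the "last bit" are assumptions of this sketch.

```python
EXCLUSIVE_BIT = 1 << 63  # assumption: the "last bit" is the most significant bit
MAX_WORKERS = 63         # the remaining 63 bits, one per worker thread


class LockWord:
    def __init__(self):
        self.word = 0  # stands in for a 64-bit atomic word

    def try_read_lock(self, worker_id):
        assert 0 <= worker_id < MAX_WORKERS
        bit = 1 << worker_id
        self.word |= bit           # would be an atomic fetch-or
        if self.word & EXCLUSIVE_BIT:
            self.word &= ~bit      # exclusive entry set: clear own bit
            return False           # the caller must wait and retry
        return True

    def read_unlock(self, worker_id):
        self.word &= ~(1 << worker_id)

    def readers(self):
        # Worker IDs whose bits are currently set.
        return [i for i in range(MAX_WORKERS) if (self.word >> i) & 1]


w = LockWord()
ok = w.try_read_lock(3)       # worker 3 inserts itself into the reader list
w.word |= EXCLUSIVE_BIT       # some writer enters exclusive mode
blocked = w.try_read_lock(5)  # worker 5 sets its bit, sees the entry, backs off
```

Because a reader sets its bit before checking the exclusive entry, a writer that scans the word after setting the exclusive entry sees every reader that might still proceed.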
In the embodiment of the invention, a transaction in the read phase acquires a read lock or a write lock by storing its state information in the lock manager, so that the priority between conflicting transactions can be determined when they commit. Conflict detection is deferred from the read phase to the commit phase, and when conflicting transactions are resolved in the commit phase they are serialized according to their state information, which guarantees their priority. In this way, the effect of conflicts on latency is taken into account in the concurrency control of database transactions; deferring conflict detection and enforcing commit priority avoids deadlock and starvation and improves transaction processing performance in the database.
To show that the concurrency control algorithm correctly implements serializability, it is proved below that every schedule the algorithm produces is conflict serializable. The correctness proof is as follows:
Preliminaries:
Several common symbols are defined first. A transaction $T_i$ is a sequence of operations, each of which is a read $r_i(X)$ of record $X$, a write $w_i(X)$, a lock $l_i(X)$, or an unlock $u_i(X)$. In addition, the forms $l^{r}(X)$, $l^{w}(X)$, and $l^{w+e}(X)$ denote a read lock, a write lock, and an exclusive write lock, respectively, which helps to distinguish the lock types. Similarly, the forms $u^{r}(X)$ and $u^{w}(X)$ are used for unlock operations.
Definition 1 (schedule): a schedule of transactions $T_1, T_2, \ldots, T_n$ is an interleaving of the operations of these transactions that preserves the order of operations within each transaction.
Definition 2 (serial schedule): a schedule is serial if no transaction begins before a previously started transaction has ended.
Conflicts: a write-write (WW) conflict in a schedule $S$ is a pair $(w_i(X), w_j(X))$ with $i \neq j$ such that $w_i(X)$ occurs before $w_j(X)$ in $S$. That is, two operations write the same object in succession and the later operation determines the final value; such a conflict may cause lost updates, because uncommitted data is overwritten. Read-write (RW) conflicts and write-read (WR) conflicts are defined similarly. A read-write conflict can cause non-repeatable reads, and a write-read conflict can cause reads of uncommitted data.
Definition 3 (conflict equivalence): two schedules $S$ and $S'$ over the same set of transactions are conflict-equivalent if they have exactly the same set of conflicts.
Definition 4 (conflict serializability): a schedule is conflict serializable if it is conflict-equivalent to a serial schedule.
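As an illustration of Definition 4, the sketch below tests a small schedule for conflict serializability. It relies on the standard textbook criterion that a schedule is conflict serializable exactly when its precedence graph has no cycle; that criterion is not part of this specification. The tuple encoding of operations and the function names are assumptions of the sketch.

```python
def precedence_edges(schedule):
    """Edges (Ti, Tj): an operation of Ti conflicts with a later operation of Tj."""
    edges = set()
    for a, (ti, oi, xi) in enumerate(schedule):
        for tj, oj, xj in schedule[a + 1:]:
            if ti != tj and xi == xj and "w" in (oi, oj):
                edges.add((ti, tj))
    return edges


def conflict_serializable(schedule):
    edges = precedence_edges(schedule)
    nodes = {t for t, _, _ in schedule}
    # Topological sort: repeatedly remove transactions without incoming edges.
    while nodes:
        free = [n for n in nodes
                if not any(v == n and u in nodes for u, v in edges)]
        if not free:
            return False  # every remaining transaction lies on a cycle
        nodes -= set(free)
    return True


# Operations are (transaction id, 'r' or 'w', record).
serial_like = [(1, "r", "X"), (1, "w", "X"), (2, "r", "X"), (2, "w", "X")]
cyclic = [(1, "r", "X"), (2, "w", "X"), (1, "w", "X")]
results = (conflict_serializable(serial_like), conflict_serializable(cyclic))
```

In `cyclic`, the pair $(r_1(X), w_2(X))$ orders $T_1$ before $T_2$ while $(w_2(X), w_1(X))$ orders $T_2$ before $T_1$, so no serial schedule has the same conflicts.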
For a set of committed transactions $T_1, T_2, \ldots, T_n$, let $S$ be the schedule generated by the concurrency control algorithm above. $S$ has the following properties, which are used in the proofs below.
A. At any point in time in the schedule, at most one transaction may hold a write lock on the same record $X$.
B. Within the same transaction, all lock operations precede all unlock operations. This property is similar to that of the standard 2PL protocol. In fact, in this concurrency control algorithm, a transaction releases all of its locks at commit.
C. A transaction can write a record only after its lock has been upgraded to exclusive mode, regardless of whether delayed write-lock acquisition is enabled; while the lock is in exclusive mode, other readers and writers cannot access the record.
The proof proceeds as follows:
Theorem 1: let $S$ be any schedule generated by the concurrency control algorithm; then $S$ is conflict serializable.
Proof: by induction on the number of transactions in schedule $S$.
Base case: $S$ contains only one transaction $T$. In this case $S$ is itself a serial schedule, so it is trivially conflict serializable.
Induction step: let $S$ be a schedule of $n$ transactions $T_1, T_2, \ldots, T_n$. By the induction hypothesis, every schedule containing $n-1$ or fewer transactions is conflict serializable. It remains to prove that $S$ is also conflict serializable.
Let $u_i(X)$ be the first unlock operation in $S$, issued by transaction $T_i$. That is:
$$S: \ldots, \ldots, u_i(X), \ldots, \ldots$$
Let $S'$ be the schedule constructed by moving all operations of $T_i$ to the beginning of $S$ while preserving the order of the operations within $T_i$.
Assertion 1: $S'$ and $S$ are conflict-equivalent.
Proof of Assertion 1: the assertion is proved for each of the three types of conflict.
Write-write conflict: consider a write operation $w_i(Y)$ of transaction $T_i$ on record $Y$. It is shown that any other write operation $w_j(Y)$ on the same record in $S$ must occur after $w_i(Y)$. To derive a contradiction, assume that $w_j(Y)$ appears before $w_i(Y)$ in schedule $S$:
$$S: \ldots, w_j(Y), \ldots, w_i(Y), \ldots$$
In this concurrency control algorithm, a transaction must hold the write lock on a record before it can write that record. By property A, at most one transaction can hold the lock at any time, so $T_j$ must have released the lock before $T_i$ obtained it to write $Y$. In summary, the schedule must have the form:
$$S: \ldots, w_j(Y), \ldots, u_j(Y), \ldots, l_i(Y), \ldots, w_i(Y), \ldots$$
However, by assumption $u_i(X)$ is the first unlock operation in $S$, so $u_i(X)$ must occur before $u_j(Y)$. This means that $u_i(X)$ occurs before $l_i(Y)$ in the schedule, which contradicts property B.
Read-write conflict: consider a write operation $w_i(Y)$ of transaction $T_i$ on record $Y$. It is shown that any read operation $r_j(Y)$ by another transaction on the same record in $S$ must occur after $w_i(Y)$. To derive a contradiction, assume that $r_j(Y)$ appears before $w_i(Y)$ in schedule $S$:
$$S: \ldots, r_j(Y), \ldots, w_i(Y), \ldots$$
Before the write operation $w_i(Y)$, transaction $T_i$ must acquire the write lock in exclusive mode, $l_i^{w+e}(Y)$. Moreover, $l_i^{w+e}(Y)$ must occur after $r_j(Y)$; otherwise, $T_i$ would have to release its lock on $Y$ before $w_i(Y)$, which is a contradiction. Thus the schedule must have the following form:
$$S: \ldots, r_j(Y), \ldots, l_i^{w+e}(Y), \ldots, w_i(Y), \ldots$$
Consider the reader list of $Y$ at the moment $T_i$ upgrades its write lock to exclusive mode. Since $T_j$ was not aborted (for a read-write conflict the read must have completed, that is, the reading transaction was not terminated), $T_j$ must have released its read lock on $Y$ at some earlier point in time.
Thus the form of the schedule can be refined further to:
$$S: \ldots, r_j(Y), \ldots, u_j^{r}(Y), \ldots, l_i^{w+e}(Y), \ldots, w_i(Y), \ldots$$
By assumption, $u_i(X)$ is the first unlock operation in $S$, so it must occur before $u_j^{r}(Y)$. But this means that $u_i(X)$ occurs before $l_i^{w+e}(Y)$, which contradicts property B.
Write-read conflict: consider a read operation $r_i(Y)$ of transaction $T_i$ on record $Y$. It is shown that any write operation $w_j(Y)$ by another transaction on the same record in $S$ must occur after $r_i(Y)$. To derive a contradiction, assume that $w_j(Y)$ appears before $r_i(Y)$ in schedule $S$:
$$S: \ldots, w_j(Y), \ldots, r_i(Y), \ldots$$
Transaction $T_j$ must acquire the write lock on $Y$ and upgrade it to exclusive mode, $l_j^{w+e}(Y)$, in order to write. Furthermore, before reading, transaction $T_i$ must obtain a read lock $l_i^{r}(Y)$. Two cases are distinguished, depending on whether $l_i^{r}(Y)$ appears before or after $l_j^{w+e}(Y)$.
In the first case, $l_i^{r}(Y)$ appears before $l_j^{w+e}(Y)$, and the schedule has the following form:
$$S: \ldots, l_i^{r}(Y), \ldots, l_j^{w+e}(Y), \ldots, w_j(Y), \ldots, r_i(Y), \ldots$$
Following an argument similar to the read-write conflict case, $T_i$ must release its read lock on $Y$ before $T_j$ upgrades the lock to exclusive mode. The unlock $u_i^{r}(Y)$ would then occur before $T_i$ reads $Y$, which is a contradiction.
In the second case, $l_i^{r}(Y)$ appears after $l_j^{w+e}(Y)$, and the schedule has the form:
$$S: \ldots, l_j^{w+e}(Y), \ldots, w_j(Y), \ldots, l_i^{r}(Y), \ldots, r_i(Y), \ldots$$
However, $T_j$ must unlock $Y$ before $T_i$ obtains the read lock. Therefore, in schedule $S$, $u_j^{w}(Y)$ must appear before $l_i^{r}(Y)$. By assumption, $u_i(X)$ is the first unlock operation in $S$, so it must occur before $u_j^{w}(Y)$. This means that, in schedule $S$, $u_i(X)$ occurs before $l_i^{r}(Y)$, which contradicts property B.
Finally, consider schedule $S'$. It consists of transaction $T_i$ followed by a sub-schedule $S''$ of $n-1$ transactions. By the induction hypothesis, $S''$ is conflict-equivalent to a serial schedule. Thus $S'$ is also conflict-equivalent to a serial schedule. By Assertion 1, $S$ and $S'$ are conflict-equivalent, so $S$ is also conflict serializable.
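The construction of $S'$ used in the proof can be tried out on a small example. The sketch below models only read and write operations (lock and unlock operations are omitted), and the helper names `conflict_set` and `move_to_front` are hypothetical.

```python
def conflict_set(schedule):
    """All ordered pairs of conflicting operations (WW, RW, WR)."""
    pairs = set()
    for a, first in enumerate(schedule):
        for second in schedule[a + 1:]:
            different_txn = first[0] != second[0]
            same_record = first[2] == second[2]
            has_write = "w" in (first[1], second[1])
            if different_txn and same_record and has_write:
                pairs.add((first, second))
    return pairs


def move_to_front(schedule, txn):
    """Build S' by moving txn's operations to the start, keeping their order."""
    mine = [op for op in schedule if op[0] == txn]
    rest = [op for op in schedule if op[0] != txn]
    return mine + rest


# Operations are (transaction id, 'r' or 'w', record).
S = [(2, "r", "Y"), (1, "w", "X"), (2, "r", "X"), (1, "w", "Z"), (2, "w", "Z")]
S_prime = move_to_front(S, 1)
equivalent = conflict_set(S) == conflict_set(S_prime)

# If another transaction had written X before T1 did, moving T1 to the
# front would reverse that conflict and break conflict equivalence.
S_bad = [(2, "w", "X"), (1, "w", "X")]
flipped = conflict_set(S_bad) != conflict_set(move_to_front(S_bad, 1))
```

The second schedule shows what Assertion 1 has to rule out: the three case analyses establish that no conflicting operation of another transaction can precede an operation of $T_i$, so a schedule like `S_bad` cannot arise when $T_i$ issues the first unlock.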
It should be noted that, for simplicity of description, the method embodiments are shown as a series of acts, but it should be understood by those skilled in the art that the embodiments are not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred embodiments, and that the acts are not necessarily required by the embodiments of the invention.
The embodiment of the invention provides a structural schematic diagram of a device for controlling transaction concurrency in a database, wherein the database processes transactions based on a two-stage locking protocol, and the device is used for:
during the read phase, the transaction obtains a read lock or a write lock by storing corresponding state information in the lock manager to confirm priority between conflicting transactions when they commit;
And delaying the conflict detection from the reading stage to the submitting stage, and carrying out serialization processing according to the state information of the conflict transaction when the conflict transaction is resolved in the submitting stage so as to ensure the priority of the conflict transaction.
In one embodiment of the present invention, the device is further used for:
a private buffer is created for each worker thread and write operations are controlled to be modified in the private buffer during the read phase to prevent incomplete or uncommitted data from being read.
In an embodiment of the present invention, the status information includes time stamp information, and the serialization process is a process of serializing according to a time stamp order.
In an embodiment of the present invention, when the conflict transaction is resolved in the commit phase, the serializing processing is performed according to the status information of the conflict transaction to ensure the priority of the conflict transaction, including:
judging whether the time stamp of the current transaction is larger than that of other transactions;
if the timestamp of the current transaction is greater than the timestamps of other transactions, aborting the current transaction;
if the timestamp of the current transaction is less than or equal to the timestamp of the other transactions, the commit transaction is controlled to wait until a conflicting transaction commits.
In one embodiment of the present invention, the device is further used for:
For write-write conflicts, the transaction directly delays acquiring the write lock;
for a blind write operation, the transaction acquires a write lock in the commit phase;
for read-modify-write operations, the transaction acquires the shared lock during the read phase and upgrades the shared lock to exclusive mode during the commit phase.
In one embodiment of the present invention, the device is further used for:
by using a lock-less lock, locks are acquired and conflicts are detected at different stages.
In one embodiment of the present invention, the device is further used for:
realizing lock-free locking by using a lock-free list of atomic words; wherein, a bit is allocated to each working thread in the atomic word, the offset of the bit is determined by the ID of the working thread, when the read lock is obtained, the corresponding bit is set to 1, whether the exclusive entry is set is checked, if the exclusive entry is set, the corresponding bit is cleared and waits.
In the embodiment of the invention, a transaction in the read phase acquires a read lock or a write lock by storing its state information in the lock manager, so that the priority between conflicting transactions can be determined when they commit. Conflict detection is deferred from the read phase to the commit phase, and when conflicting transactions are resolved in the commit phase they are serialized according to their state information, which guarantees their priority. In this way, the effect of conflicts on latency is taken into account in the concurrency control of database transactions; deferring conflict detection and enforcing commit priority avoids deadlock and starvation and improves transaction processing performance in the database.
An embodiment of the present invention further provides an electronic device, which may include a processor, a memory, and a computer program stored on the memory and capable of running on the processor, where the computer program when executed by the processor implements a method for concurrency control of transactions in a database as above.
An embodiment of the present invention further provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor implements a method for concurrency control of transactions in a database as above.
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.
It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or fully authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region, and provide corresponding operation entries for the user to select authorization or rejection.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.
It will be apparent to those skilled in the art that embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the invention may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or terminal device comprising the element.
The method, apparatus, device, and medium for concurrency control of transactions in a database have been described in detail above. Specific examples have been used herein to illustrate the principles and embodiments of the present invention, and the above examples are provided only to assist in understanding the method and core idea of the present invention. Meanwhile, those skilled in the art may, following the idea of the present invention, make changes to the specific embodiments and the scope of application. In summary, the contents of this specification should not be construed as limiting the present invention.

Claims (10)

1. A method of concurrency control of transactions in a database, wherein the database processes transactions based on a two-phase lock protocol, the method comprising:
during the read phase, the transaction obtains a read lock or a write lock by storing corresponding state information in the lock manager to confirm priority between conflicting transactions when they commit;
and delaying the conflict detection from the reading stage to the submitting stage, and carrying out serialization processing according to the state information of the conflict transaction when the conflict transaction is resolved in the submitting stage so as to ensure the priority of the conflict transaction.
2. The method as recited in claim 1, further comprising:
a private buffer is created for each worker thread and write operations are controlled to be modified in the private buffer during the read phase to prevent incomplete or uncommitted data from being read.
3. The method of claim 1, wherein the status information includes time stamp information, and wherein the serialization process is a process of serializing in a time stamp order.
4. A method according to claim 3, wherein when resolving the conflicting transaction at the commit stage, serializing the conflicting transaction according to the status information of the conflicting transaction to ensure the priority of the conflicting transaction, comprising:
Judging whether the time stamp of the current transaction is larger than that of other transactions;
if the timestamp of the current transaction is greater than the timestamps of other transactions, aborting the current transaction;
if the timestamp of the current transaction is less than or equal to the timestamp of the other transactions, the commit transaction is controlled to wait until a conflicting transaction commits.
5. The method as recited in claim 1, further comprising:
for write-write conflicts, the transaction directly delays acquiring the write lock;
for a blind write operation, the transaction acquires a write lock in the commit phase;
for read-modify-write operations, the transaction acquires the shared lock during the read phase and upgrades the shared lock to exclusive mode during the commit phase.
6. The method as recited in claim 1, further comprising:
by using a lock-less lock, locks are acquired and conflicts are detected at different stages.
7. The method as recited in claim 6, further comprising:
realizing lock-free locking by using a lock-free list of atomic words; wherein, a bit is allocated to each working thread in the atomic word, the offset of the bit is determined by the ID of the working thread, when the read lock is obtained, the corresponding bit is set to 1, whether the exclusive entry is set is checked, if the exclusive entry is set, the corresponding bit is cleared and waits.
8. An apparatus for concurrency control of transactions in a database, wherein said database processes transactions based on a two-phase locking protocol, said apparatus for:
during the read phase, the transaction obtains a read lock or a write lock by storing corresponding state information in the lock manager to confirm priority between conflicting transactions when they commit;
and delaying the conflict detection from the reading stage to the submitting stage, and carrying out serialization processing according to the state information of the conflict transaction when the conflict transaction is resolved in the submitting stage so as to ensure the priority of the conflict transaction.
9. An electronic device comprising a processor, a memory and a computer program stored on the memory and capable of running on the processor, which when executed by the processor implements a method of concurrency control of transactions in a database according to any one of claims 1 to 7.
10. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements a method of transaction concurrency control in a database according to any one of claims 1 to 7.
CN202311109428.5A 2023-08-30 2023-08-30 Method, device, equipment and medium for controlling transaction concurrency in database Pending CN117348977A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311109428.5A CN117348977A (en) 2023-08-30 2023-08-30 Method, device, equipment and medium for controlling transaction concurrency in database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311109428.5A CN117348977A (en) 2023-08-30 2023-08-30 Method, device, equipment and medium for controlling transaction concurrency in database

Publications (1)

Publication Number Publication Date
CN117348977A true CN117348977A (en) 2024-01-05

Family

ID=89360187

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311109428.5A Pending CN117348977A (en) 2023-08-30 2023-08-30 Method, device, equipment and medium for controlling transaction concurrency in database

Country Status (1)

Country Link
CN (1) CN117348977A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination