CN117009361A - Two-stage lock-free parallel log playback method and device - Google Patents

Publication number
CN117009361A
Authority
CN
China
Prior art keywords
log
logic
data
logs
playback
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310887412.0A
Other languages
Chinese (zh)
Inventor
应承峻
吴倩倩
王剑英
孙建伶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202310887412.0A priority Critical patent/CN117009361A/en
Publication of CN117009361A publication Critical patent/CN117009361A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2358 Change logging, detection, and notification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/221 Column-oriented storage; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/466 Transaction processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5018 Thread allocation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application discloses a two-stage lock-free parallel log playback method and device. The method comprises the following steps: a read-only node reads at least one redo log from a log buffer; the redo logs are distributed to the corresponding log playback threads according to the data pages they act on; each log playback thread applies the incremental modifications in the redo logs to the data pages of the row store and parses logical logs from the redo logs using a redo log multiplexing method; original transactions are constructed from the logical logs and delivered to a logical log distributor; the logical log distributor distributes the logical logs to the corresponding log playback threads according to the primary keys in the logical logs; and the log playback threads play back the logical logs in parallel and apply the modifications in the logical logs to the column store engine, completing synchronization of the column store data. The application makes full use of the computing power of multi-core processors, reduces the performance overhead of concurrency control, improves log playback performance and system resource utilization, and has practical and economic value.

Description

Two-stage lock-free parallel log playback method and device
Technical Field
The application relates to the technical field of databases, in particular to a two-stage lock-free parallel log playback method and device.
Background
In a PolarDB-IMCI cluster based on a one-writer, multiple-reader shared-storage architecture, a read-only node must read from shared storage and play back the incremental logs written by the master node in order to stay consistent with the master node. To ensure that the incremental modifications in the log are applied to the read-only node in the correct order, log playback is typically performed serially by a single thread, which has the advantages of being simple to implement and less error-prone. Under such a design, however, each transaction must wait for the previous transaction to finish before it can start executing, so the advantages of a multi-core CPU are hard to exploit and playback becomes a performance bottleneck.
From a practical production environment, the performance of log playback is very important for read consistency of the whole system (i.e. the read-only node is able to read the latest version of data that the master node has committed), which is related to read latency and data freshness of the read-only node. Under the condition of higher delay, the data of the read-only node can be far behind the main node, and the data inconsistent with the main node is more easily read on the read-only node.
To make full use of the computing power of multi-core processors, one solution is to play back the log in parallel. Parallel log playback usually has to handle concurrency control between transactions, such as coordinating and synchronizing conflicting transactions so that multiple transactions modify the same data in the same order as on the master node. Several parallel log playback schemes already exist in the industry, but most of them parallelize at the granularity of a session or a transaction and coordinate conflicting transactions through concurrency control mechanisms such as dependency graphs and optimistic locks. In practice, synchronizing playback threads with locks still incurs a significant performance penalty (for example, CPU context switches or busy-waiting), making it difficult to fully utilize the computing power of a multi-core system.
Therefore, we propose a two-stage lock-free parallel log playback method, which can complete the parallel log playback work without relying on concurrent control means such as locks on the basis of a redo log multiplexing method, thereby reducing the performance overhead caused by concurrent control and improving the log playback performance and resource utilization rate of the system.
Disclosure of Invention
The application aims to provide a two-stage lock-free parallel log playback method and device aiming at the defects of the prior art.
The aim of the application is realized by the following technical scheme: the first aspect of the embodiment of the application provides a two-stage lock-free parallel log playback method, which comprises the following steps:
(1) The read-only node reads at least one redo log from the log buffer;
(2) The redo log distributor distributes the redo logs read in step (1) to the corresponding log playback threads according to the data pages the redo logs act on, wherein each data page is obtained from the data page buffer pool according to its data page number;
(3) Each log playback thread applies the incremental modifications in the redo logs to the data pages of the row store to complete synchronization of the row store data, and parses column-store-readable logical logs from the redo logs using a redo log multiplexing method;
(4) All the logical logs parsed in step (3) are screened and filtered to construct original transactions, and the original transactions are delivered to the logical log distributor of the column store;
(5) The logical log distributor merges all logical logs of all received original transactions in order of log sequence number to obtain ordered transactions; the logical log distributor traverses the logical logs in the ordered transactions and distributes them to the corresponding log playback threads according to the primary keys of the row records in the logical logs;
(6) The log playback threads play back the logical logs in parallel and apply the modifications in the logical logs to the column store engine to complete synchronization of the column store data.
Further, the redo log includes a transaction ID, a log sequence number, a data page number, and an incremental data field.
Further, the screening and filtering of all the logical logs parsed in step (3) specifically includes: first obtaining the data page number corresponding to a logical log, then obtaining the corresponding data table number from the data page number, and finally determining whether a column store index has been created on the data table corresponding to that data table number; if a column store index has been created on the data table, the data in the logical log is retained and the original transaction is constructed from the retained logical logs; otherwise, the data in the logical log is filtered out and deleted.
Further, the logical log distributor merging all logical logs of all received original transactions in order of log sequence number to obtain ordered transactions specifically includes: using K-way merge sort to order all the logical logs of all the original transactions by log sequence number from smallest to largest, and merging the ordered logical logs to obtain the ordered transactions.
Further, the logical log distributor traversing the logical logs in the ordered transactions and distributing them to the corresponding log playback threads according to the primary keys of the row records in the logical logs specifically includes: the logical log distributor traverses the logical logs in the ordered transactions, hashes the primary key of the row record in each logical log to obtain the hash value corresponding to that primary key, and distributes the logical log to the corresponding log playback thread according to the hash value of its primary key.
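By way of illustration only, the primary-key-based distribution described above can be sketched in Python as follows. The function names `worker_for_key` and `distribute`, the tuple layout of a logical log, and the choice of CRC32 as the hash function are assumptions made for this sketch; the application does not prescribe a particular hash function.

```python
import zlib
from typing import List, Tuple

# Hypothetical logical log layout: (lsn, op_type, primary_key, value).
LogicalLog = Tuple[int, str, str, str]


def worker_for_key(primary_key: str, num_workers: int) -> int:
    """Map a primary key to a playback thread number via a stable hash."""
    # CRC32 is an assumed stand-in; any deterministic hash would do.
    return zlib.crc32(primary_key.encode("utf-8")) % num_workers


def distribute(ordered_logs: List[LogicalLog],
               num_workers: int) -> List[List[LogicalLog]]:
    """Append each logical log (already in LSN order) to its thread's queue."""
    queues: List[List[LogicalLog]] = [[] for _ in range(num_workers)]
    for log in ordered_logs:
        queues[worker_for_key(log[2], num_workers)].append(log)
    return queues
```

Because all logical logs with the same primary key hash to the same queue, and the input is traversed in LSN order, modifications to one row reach a single thread in their original order, which is what allows the second stage to proceed without locks.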
A second aspect of an embodiment of the present application provides a two-stage lock-free parallel log playback device, including one or more processors and a memory, the memory coupled to the processors; the memory is used for storing program data, and the processor is used for executing the program data to realize the two-stage lock-free parallel log playback method.
A third aspect of the embodiments of the present application provides a computer-readable storage medium having stored thereon a program for implementing the two-stage lock-free parallel log playback method described above when executed by a processor.
The beneficial effects of the application are as follows: multithreaded parallel playback solves the limited performance of the traditional serial log playback method under high concurrency; by choosing different parallel granularities at different stages, the application avoids the high synchronization overhead that parallel log playback based on dependency graphs and optimistic locks incurs by resorting to locks or other synchronization mechanisms; the application introduces two-stage lock-free parallel log playback on the basis of physical replication and the redo log multiplexing method, splitting log playback into two stages that are played back in parallel at the granularity of data pages and rows respectively, so that the correctness of the transaction playback order is guaranteed without additional concurrency control, the cost of coordinating and synchronizing conflicting transactions is greatly reduced, and the performance overhead of concurrency control is lowered; the application makes full use of multi-core processors, improves the parallelism and performance of log playback and the resource utilization of the system, reduces the read latency of read-only nodes, and improves their data freshness; the application has practical and economic value.
Drawings
FIG. 1 is an overall framework diagram of the two-stage lock-free parallel log playback method of the present application;
FIG. 2 is a schematic structural diagram of the two-stage lock-free parallel log playback device of the present application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the application. The word "if" as used herein may be interpreted as "when," "upon," or "in response to a determination," depending on the context. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The two-stage lock-free parallel log playback method can refine the parallel granularity of log playback and is implemented on PolarDB-IMCI. PolarDB-IMCI synchronizes row store data between the master node and the read-only nodes through physical replication, and uses the redo log multiplexing method to parse logical logs from the physical logs for synchronizing the column store data of the read-only nodes. The application is applicable to databases with multiple storage engines (such as a row store engine and a column store engine), in which log playback must not only update data pages to synchronize the row store engine but also synchronize the column store engine.
It should be appreciated that PolarDB is a well-known database system and that IMCI (In-Memory Column Index) is a column store engine available in PolarDB; PolarDB-IMCI can be understood as an extension of PolarDB that adds the column store index feature.
Physical replication is a method used by PolarDB-IMCI to synchronize data between the master node and the read-only nodes. Through physical replication, a read-only node obtains a copy of the data that is consistent with the master node. Compared with the traditional scheme of using Binlog for master-slave synchronization, physical replication operates at the level of data pages, so its granularity is finer and its parallel efficiency is higher; it does not need a two-phase commit protocol (2PC) to keep the engine-layer and storage-layer logs consistent, and it can reduce the delay between the master node and the read-only nodes to the millisecond level. It should be noted that physical replication requires the data of the master node and the read-only node to be isomorphic (i.e., organized in the same way). Synchronization between heterogeneous copies (such as the row store data of the master node and the column store data of a read-only node) must instead rely on logical logs, which may either be written directly by the master node or be parsed out by the redo log multiplexing method.
The redo log multiplexing method (Reuse Redo) is the method used in PolarDB-IMCI to reverse-parse logical logs from physical logs. By reusing the physical replication flow and combining the data pages with the incremental information recorded in the redo log, it recovers the log in logical form. Specifically, when a data page is modified, the row store engine writes a redo log, and it does so according to a fixed format (for example, the 1st byte represents the log type, the 2nd to 5th bytes represent the log size, and so on), so that once a redo log is obtained its contents can be reverse-parsed according to that format. The redo log multiplexing method therefore uses the redo log format to reverse-parse the content of the log and restore the required logical log from it, where the logical log includes the position of the modified record and the content of the modified record.
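For illustration, a format-driven reverse-parsing step of this kind might look like the following Python sketch. It assumes only the example layout mentioned above (1 byte of log type followed by 4 bytes of log size). The big-endian byte order, the reading of the size as the payload length, and the names `RedoHeader` and `split_records` are assumptions of this sketch and do not describe the actual PolarDB redo log format.

```python
import struct
from typing import Iterator, NamedTuple, Tuple


class RedoHeader(NamedTuple):
    log_type: int  # byte 1: log type
    size: int      # bytes 2-5: log size (assumed to be the payload length)


# Assumed layout: 1-byte type, then a 4-byte big-endian unsigned size.
HEADER = struct.Struct(">BI")


def split_records(buf: bytes) -> Iterator[Tuple[RedoHeader, bytes]]:
    """Yield (header, payload) for records stored back to back in buf."""
    pos = 0
    while pos < len(buf):
        log_type, size = HEADER.unpack_from(buf, pos)
        start = pos + HEADER.size
        yield RedoHeader(log_type, size), buf[start:start + size]
        pos = start + size
```

A real parser would go on to interpret each payload according to its log type; the sketch stops at splitting the byte stream, which is the part the fixed header format makes possible.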
Referring to fig. 1, the two-stage lock-free parallel log playback method of the present application specifically includes the following steps:
(1) The read-only node reads at least one redo log from the log buffer.
It should be noted that, in a PolarDB-IMCI cluster based on a one-writer, multiple-reader shared-storage architecture, the master node (Primary), also called the read-write node (RW), maintains the complete row store data and provides read and write services externally. A PolarDB-IMCI cluster contains only one master node.
In the same cluster, a slave node (Replica), also called a read-only node (RO), shares the persistent data with the master node through the shared storage and provides read services externally. A PolarDB-IMCI cluster includes one or more read-only nodes.
Further, a redo log records a modification to a data page and typically includes a transaction ID (TID), a log sequence number (LSN), a data page number (PageID), and an incremental data field (DataField). The data page number (PageID) identifies the data page in which the modified record is located, and the incremental data field (DataField) holds the incremental data of the new record relative to the record before modification.
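As a minimal sketch, the four fields listed above could be modeled as the following Python record. The class name `RedoLog`, the attribute names, and the field types are assumptions made for illustration; the application names the fields but does not fix their encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedoLog:
    """One redo log record with the four fields named in the text."""
    tid: str           # transaction ID (TID)
    lsn: int           # log sequence number (LSN), globally increasing
    page_id: int       # data page number (PageID) of the modified page
    data_field: bytes  # incremental data relative to the old record


# Hypothetical instance: transaction T1 modifies data page 1 at LSN 1.
example = RedoLog(tid="T1", lsn=1, page_id=1, data_field=b"\x01")
```

The record is declared frozen because a redo log, once written, is only read and replayed, never modified.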
It should be noted that a Data Page (Data Page) generally refers to a block of memory space in a cache for storing a Data Page on a disk. The database management system caches frequently accessed data pages in the memory through a caching technology, so that the access speed can be increased, and the performance of the database system is improved. The size of the data pages is usually fixed (e.g. the size of each data page in polar db is 16 KB), but can be configured and adjusted according to practical requirements.
Specifically, when data is modified on the read-write node, the row store engine of the read-write node writes a redo log into the shared storage, which is a disk accessible to both the read-write node and the read-only nodes. The read-write node then notifies the read-only node over the network of the current latest log sequence number (LSN), i.e., the log sequence number of the redo log most recently written into the shared storage. After receiving the latest LSN from the read-write node, the log reading thread of the read-only node reads the redo logs from the shared storage into the memory of the read-only node; in other words, the read-only node reads at least one redo log from the log buffer.
It should be appreciated that the Log Sequence Number (LSN) is a globally incremented number. The database records all operations in the log when executing the transaction, and each record in the log contains an LSN and corresponding operation information, such as update, insertion or deletion. The smaller the LSN of a record, the earlier the record is written to the log, and the earlier the corresponding operation is performed.
(2) The redo log distributor (Log Dispatcher) distributes the redo logs read in step (1) to the corresponding log playback threads according to the data pages the redo logs act on, wherein each data page is obtained from the data page buffer pool (Buffer Pool) according to its data page number.
It should be appreciated that the redo log distributor is the module in PolarDB's physical replication that distributes the redo logs that have been read. The buffer pool caches in memory the data pages read from disk.
Specifically, a redo log includes a data page number, from which the data page the redo log acts on can be obtained. The redo log distributor (Log Dispatcher) distributes the redo logs read in step (1) to the corresponding log playback threads (Apply Workers) by taking a remainder: if there are M log playback threads and the data page number of a redo log is N, the redo log is distributed to log playback thread number N % M, where % denotes the remainder operation. For example, with 5 log playback threads and a redo log with PageID = 7, the redo log is assigned to log playback thread number 7 % 5 = 2, where the log playback threads are numbered from 0, i.e., 0, 1, 2, 3, 4. Each log playback thread plays back its logs in LSN order. The first stage of log playback is thus performed at the granularity of data pages, which ensures that logs acting on the same data page are played back sequentially while logs acting on different data pages can be played back in parallel.
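The remainder-based assignment above can be sketched in a few lines of Python. The names `pick_worker` and `dispatch` and the `(lsn, page_id)` tuple layout are hypothetical, introduced only for this illustration.

```python
from typing import Iterable, List, Tuple


def pick_worker(page_id: int, num_workers: int) -> int:
    """Return the playback thread number N % M for data page number N."""
    return page_id % num_workers


def dispatch(redo_logs: Iterable[Tuple[int, int]],
             num_workers: int) -> List[List[Tuple[int, int]]]:
    """Append each (lsn, page_id) pair to the queue of its playback thread.

    The input is assumed to arrive in LSN order, so every queue is also
    in LSN order and logs for one page stay in a single queue.
    """
    queues: List[List[Tuple[int, int]]] = [[] for _ in range(num_workers)]
    for lsn, page_id in redo_logs:
        queues[pick_worker(page_id, num_workers)].append((lsn, page_id))
    return queues
```

With 5 threads, `pick_worker(7, 5)` gives thread number 2, matching the example in the text.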
(3) Each log playback thread applies the incremental modifications in the redo logs to the data pages of the row store to complete synchronization of the row store data, and parses column-store-readable logical logs from the redo logs using the redo log multiplexing method.
Specifically, each log playback thread applies the incremental modifications in the redo log to the data pages of the row store, i.e., plays back the DataField recorded in the redo log, thereby completing synchronization of the row store data. At the same time, each log playback thread uses the redo log multiplexing method to parse a column-store-readable logical log from the positioning information recorded in the redo log and the corresponding data page, wherein the logical log is a DML (Data Manipulation Language) statement in logical form.
The positioning information consists of the PageID and the PageOffset (i.e., the offset of the modified record within the data page). Typically a data page occupies 16 KB (16 × 1024 bytes). If the positioning information of a redo log is PageID = 4 and PageOffset = 5000, the redo log records a modification to data page No. 4, and the modified data is located at byte 5000 of that page. Parsing data from the positioning information recorded in the redo log and the corresponding data page therefore means that the row of data can be parsed starting from byte 5000 of data page No. 4.
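As a rough sketch of how positioning information selects bytes within a page, the following Python function reads a record given a PageID and a PageOffset. The in-memory `pages` mapping, the zero-based interpretation of the offset, and the explicit `length` argument are assumptions of this sketch; in a real engine the record length would come from the record format itself.

```python
from typing import Mapping

PAGE_SIZE = 16 * 1024  # 16 KB per data page, as in the text


def read_record(pages: Mapping[int, bytes], page_id: int,
                page_offset: int, length: int) -> bytes:
    """Return `length` bytes starting at `page_offset` in page `page_id`."""
    page = pages[page_id]
    if len(page) != PAGE_SIZE:
        raise ValueError("unexpected page size")
    if page_offset < 0 or page_offset + length > PAGE_SIZE:
        raise ValueError("record lies outside the data page")
    return page[page_offset:page_offset + length]
```

For the example in the text, the call would be `read_record(pages, 4, 5000, length)` with whatever record length the row format specifies.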
It should be noted that one log playback thread may process redo logs for several data pages, whereas the redo logs for a given data page can be processed by only one log playback thread. For example, with 5 log playback threads, log playback thread No. 0 may process the redo logs for data pages (PageID) 0, 5, 10, 15, and 20, and log playback thread No. 1 may process the redo logs for PageID 1, 6, 11, and 16, but the redo logs for PageID 7, 12, and 17 must be processed by log playback thread No. 2.
(4) All the logical logs parsed in step (3) are screened and filtered to construct original transactions (Txn Buffer), and the original transactions are delivered to the logical log distributor of the column store.
In this embodiment, the screening and filtering of all the logical logs parsed in step (3) specifically includes: first obtaining the data page number corresponding to a logical log, then obtaining the corresponding data table number from the data page number, and finally determining whether a column store index has been created on the data table corresponding to that data table number; if a column store index has been created on the data table, the data in the logical log is retained and the original transaction is constructed from the retained logical logs; otherwise, the data in the logical log is filtered out and deleted.
Specifically, after all log playback threads complete log playback, all the logical logs are available from step (3), and they are screened and filtered as follows. First, the PageID corresponding to each logical log is obtained by parsing the redo log; since the redo log contains the PageID, this is straightforward. Then the corresponding data table number (TableID) is obtained from the PageID; this lookup is a capability supported by PolarDB, and because a data page can belong to only one data table, the TableID is uniquely determined by the PageID. Finally, it is determined whether a column store index has been created on the data table corresponding to the data table number: if so, the data in the logical log is retained and the original transaction is constructed from the retained logical logs; otherwise, the data in the logical log is filtered out and deleted directly, without being synchronized to the column store index. The constructed original transactions are then delivered to the logical log distributor of the column store, and the row store data of the read-only node reaches a state consistent with the read-write node. It should be understood that the logical log distributor belongs to the column store engine, whereas the redo log distributor belongs to the row store engine.
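The screening step can be illustrated with the short Python sketch below. The `page_to_table` mapping stands in for the PageID-to-TableID lookup that the text attributes to PolarDB, and `indexed_tables` stands in for the set of tables with a column store index; both, along with the tuple layout of a logical log, are hypothetical.

```python
from typing import List, Mapping, Set, Tuple

# Hypothetical logical log layout: (lsn, page_id, tid, payload).
LogicalLog = Tuple[int, int, str, str]


def filter_for_column_store(logical_logs: List[LogicalLog],
                            page_to_table: Mapping[int, int],
                            indexed_tables: Set[int]) -> List[LogicalLog]:
    """Keep only logical logs whose table has a column store index."""
    kept = []
    for log in logical_logs:
        table_id = page_to_table[log[1]]  # PageID -> TableID (unique)
        if table_id in indexed_tables:
            kept.append(log)              # retained for the original transaction
        # otherwise the log is dropped and never reaches the column store
    return kept
```

Filtering before the second stage means that tables without a column store index add no work to the logical log distributor or to the column store playback threads.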
Illustratively, assume that a read-only node reads the 4 redo logs shown in Table 1 from the log buffer (i.e., from the shared storage), where Type is the operation type (INSERT denotes an insertion and UPDATE denotes an update), Key is the primary key, and Value is the value. Assume that there are 2 log playback threads, numbered 0 and 1.
Table 1: Redo log example
LSN   PageID   TID   Logical log (Type/Key/Value)
1     1        T1    M1<INSERT,K1,A>
2     2        T1    M2<INSERT,K2,D>
3     2        T2    M3<UPDATE,K2,B>
4     1        T2    M4<INSERT,K3,C>
The specific process of the first-stage log playback is as follows:
First, the redo log distributor distributes the redo logs to the corresponding log playback threads according to data page number: LSN1 and LSN4 are assigned to the log queue of log playback thread No. 1 (hereinafter W1), and LSN2 and LSN3 are assigned to the log queue of log playback thread No. 0 (hereinafter W0). Specifically, LSN1 and LSN4 have PageID = 1 and 1 % 2 = 1, so they go to log playback thread No. 1; LSN2 and LSN3 have PageID = 2 and 2 % 2 = 0, so they go to log playback thread No. 0.
Then, each thread fetches the redo logs from its log queue in order and plays them back. When W0 plays back its redo logs in log sequence number (LSN) order, it first inserts record K2 into the data page with PageID = 2 and then modifies record K2, and the logical logs M2 and M3 are obtained through the redo log multiplexing method. When W1 plays back its redo logs in LSN order, it first inserts record K1 and then inserts record K3 on the data page with PageID = 1, and the logical logs M1 and M4 are obtained through the redo log multiplexing method. It should be appreciated that the redo logs within each log playback thread are played back in LSN order, while redo logs acting on different data pages may be played back in parallel.
Next, the redo log distributor collects the logical logs M2, M3, M1, and M4 parsed by W0 and W1 and groups them by transaction ID (TID) to assemble the original transactions, i.e., B1 = {M1, M2} and B2 = {M3, M4}.
Finally, the redo log distributor delivers the original transactions B1 = {M1, M2} and B2 = {M3, M4} to the logical log distributor.
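The first-stage routing and assembly in this example can be sketched as below. The `RedoLog` struct and the function names are hypothetical, and the actual page playback and redo log multiplexing are omitted: the sketch only shows the `PageID % thread count` routing and the grouping by TID. Sorting each transaction by LSN is an assumption made here so that the result matches B1 = {M1, M2} and B2 = {M3, M4}.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical redo log record; `logical` names the logical log parsed from it.
struct RedoLog {
    std::uint64_t lsn;
    std::uint64_t page_id;
    std::string tid;
    std::string logical;
};

// Route each redo log to queue[page_id % num_threads]. The input is assumed to
// be in LSN order, so every per-thread queue is in LSN order as well.
std::vector<std::vector<RedoLog>> DispatchByPage(const std::vector<RedoLog>& logs,
                                                 std::size_t num_threads) {
    assert(num_threads > 0);
    std::vector<std::vector<RedoLog>> queues(num_threads);
    for (const RedoLog& log : logs) {
        queues[log.page_id % num_threads].push_back(log);
    }
    return queues;
}

// Collect the per-thread results and group them by transaction ID.
std::map<std::string, std::vector<RedoLog>> AssembleByTid(
        const std::vector<std::vector<RedoLog>>& queues) {
    std::map<std::string, std::vector<RedoLog>> txns;
    for (const auto& queue : queues) {
        for (const RedoLog& log : queue) {
            txns[log.tid].push_back(log);
        }
    }
    // Assumption of this sketch: order each transaction's logs by LSN.
    for (auto& entry : txns) {
        std::sort(entry.second.begin(), entry.second.end(),
                  [](const RedoLog& a, const RedoLog& b) { return a.lsn < b.lsn; });
    }
    return txns;
}
```

With the four records of Table 1 and two threads, queue 0 holds the logs for M2 and M3 and queue 1 holds those for M1 and M4, as in the walkthrough above.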
(5) The logical log distributor merges all logical logs of all received original transactions in log sequence number order to obtain an ordered transaction; the logical log distributor then traverses the logical logs in the ordered transaction and distributes each logical log to the corresponding log playback thread according to the primary key of the row record in that logical log.
It should be noted that when the logical log distributor obtains the original transactions B1 and B2 delivered by the redo log distributor, it cannot play them back in parallel directly. This is because the original transactions B1 and B2 produced by the first stage, although each internally ordered, are not ordered relative to each other, e.g., M4 and M2. If the original transactions B1 and B2 were played back in parallel directly, an error could occur in which M4 is played back before M2. Therefore, the logical log distributor first needs to merge all logical logs in the original transactions B1 and B2 by LSN to obtain the ordered transaction.
In this embodiment, the logical log distributor merging all logical logs of all received original transactions in log sequence number order to obtain an ordered transaction specifically includes: sorting all logical logs of all original transactions in ascending order of log sequence number using K-way merge sort, and merging the sorted logical logs to obtain the ordered transaction.
It should be understood that K-way merge sort is a classical sorting algorithm; it sorts the logical logs quickly and thereby improves merging efficiency.
For example, if the original transactions received by the logical log distributor are Q1 = {M1, M4} and Q2 = {M2, M3}, where the LSNs of the two logical logs M1 and M4 in Q1 are 1 and 4, and the LSNs of the two logical logs M2 and M3 in Q2 are 2 and 3, respectively, then K-way merge sort orders all logical logs by ascending LSN: M1, M2, M3, M4. Thus, given that the logical logs inside the original transactions Q1 and Q2 are already ordered, K-way merge sort can quickly merge Q1 and Q2 into the ordered transaction Q = {M1, M2, M3, M4}.
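A minimal K-way merge over LSN-ordered inputs might look like the sketch below. It uses a min-heap keyed on LSN, which is one common way to implement a K-way merge; the application does not say which variant it uses, and the `LogicalLog` struct and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <queue>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical logical log: only the fields needed for merging.
struct LogicalLog {
    std::uint64_t lsn;
    std::string name;
};

bool LsnLess(const LogicalLog& a, const LogicalLog& b) { return a.lsn < b.lsn; }

// Merge K runs, each already sorted by LSN, into one LSN-ordered sequence.
std::vector<LogicalLog> KWayMergeByLsn(const std::vector<std::vector<LogicalLog>>& runs) {
    // Heap entry: (LSN, run index, position inside that run).
    using Entry = std::tuple<std::uint64_t, std::size_t, std::size_t>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;

    std::size_t total = 0;
    for (std::size_t r = 0; r < runs.size(); ++r) {
        // Precondition of the merge: every input run is internally ordered.
        assert(std::is_sorted(runs[r].begin(), runs[r].end(), LsnLess));
        total += runs[r].size();
        if (!runs[r].empty()) {
            heap.emplace(runs[r][0].lsn, r, std::size_t{0});
        }
    }

    std::vector<LogicalLog> merged;
    merged.reserve(total);
    while (!heap.empty()) {
        const Entry top = heap.top();
        heap.pop();
        const std::size_t r = std::get<1>(top);
        const std::size_t pos = std::get<2>(top);
        merged.push_back(runs[r][pos]);
        if (pos + 1 < runs[r].size()) {
            heap.emplace(runs[r][pos + 1].lsn, r, pos + 1);  // next log of the same run
        }
    }
    return merged;
}
```

Because only the head of each run sits in the heap, the merge costs O(N log K) for N logs in K runs, which is where the benefit over re-sorting everything comes from.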
In this embodiment, the logical log distributor traversing the logical logs in the ordered transaction and distributing them to the corresponding log playback threads according to the primary keys of the row records in the logical logs specifically includes: the logical log distributor traverses the logical logs in the ordered transaction, performs a hash operation on the primary key of the row record in each logical log to obtain the hash value corresponding to that primary key, and distributes the logical log to the corresponding log playback thread according to the hash value of the primary key.
It should be appreciated that the primary key (Key) of a row record in a logical log is actually a string, which is mapped to an integer by hashing it with an existing C++ standard library facility (e.g., std::hash).
For example, the primary keys in the 4 logical logs from the first-stage log playback are K1, K2, and K3, and hashing these three primary keys yields the corresponding hash values Hash("K1") = 735, Hash("K2") = 888, and Hash("K3") = 977, respectively. With 2 log playback threads numbered 0 and 1, the log playback thread for a logical log is determined by Hash(Key) % 2, where % denotes the remainder: Hash("K1") % 2 = 1, Hash("K2") % 2 = 0, and Hash("K3") % 2 = 1. The logical logs corresponding to K1 are therefore assigned to log playback thread No. 1, those corresponding to K2 to log playback thread No. 0, and those corresponding to K3 to log playback thread No. 1.
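The key-to-thread mapping can be sketched as below, assuming `std::hash<std::string>` as the default hash. The function name `ThreadForKey` and the injectable `KeyHash` parameter are hypothetical. The values 735, 888, and 977 in the example are illustrative; the result of `std::hash` is implementation-defined, so a real run will generally produce different numbers and possibly a different thread assignment.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hash function type; injectable so the illustrative values can be reproduced.
using KeyHash = std::function<std::size_t(const std::string&)>;

// Map a primary key to a log playback thread number in [0, num_threads).
std::size_t ThreadForKey(const std::string& key, std::size_t num_threads,
                         KeyHash hash = std::hash<std::string>{}) {
    assert(num_threads > 0);
    return hash(key) % num_threads;
}
```

Since the mapping depends only on the key, every logical log for the same row lands on the same thread regardless of which transaction produced it.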
(6) The log playback threads play back the logical logs in parallel and apply the modifications in the logical logs to the column-store engine to complete the synchronization of the column-store data, finally yielding column-store data consistent with the row store.
In the second stage, the logical logs obtained in the first stage are played back again in the column-store engine to ensure that the column-store data of the read-only node reaches a state consistent with the row-store data.
Specifically, different log playback threads modify the column-store data in parallel: multiple log playback threads play back the logical logs concurrently and apply the modifications in the logical logs to the column-store engine to complete the synchronization of the column-store data. Logical logs are distributed at row granularity: the logical logs of a single transaction may be distributed to multiple log playback threads for parallel playback, while all logical logs acting on the same row are distributed to the same log playback thread in LSN order, even if they belong to different transactions. In summary, the logical log distributor is responsible for processing each log in transaction commit order and ensuring that different modifications to the same row are submitted to the same log playback thread in the correct order, thereby guaranteeing consistency.
Illustratively, taking the original transactions B1 = {M1, M2} and B2 = {M3, M4} obtained by the first-stage log playback as an example, the specific process of the second-stage log playback is as follows:
First, the logical log distributor merges the logical logs in the original transactions B1 and B2 in ascending order of LSN, resulting in the ordered transaction Q = {M1, M2, M3, M4}.
Then, for each logical log in Q, the logical log distributor computes the hash value of the primary key (Key) of the row on which the logical log acts, takes the remainder of that hash value divided by the number of log playback threads, and assigns the logical log to the corresponding log playback thread. For example, the logical logs M2 (Key = K2) and M3 (Key = K2) are assigned in order to log playback thread No. 0 (W0), and the logical logs M1 (Key = K1) and M4 (Key = K3) are assigned in order to log playback thread No. 1 (W1).
Finally, W0 and W1 concurrently play back the logical logs assigned to them, completing the modification of the column-store data without any conflict during the modification.
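The second-stage dispatch and conflict-free parallel playback can be sketched as follows. This is a toy model under stated assumptions: each worker owns a private `Partition` (a hypothetical in-memory stand-in for its share of the column store), `std::hash` is used for routing, and both INSERT and UPDATE are modeled as setting the row's value. Because each key is routed to exactly one worker, no two threads touch the same row and the sketch needs no locks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical logical log record.
struct LogicalLog {
    std::uint64_t lsn;
    std::string type;   // "INSERT" or "UPDATE"
    std::string key;
    std::string value;
};

// Toy stand-in for the part of the column store owned by one worker.
using Partition = std::unordered_map<std::string, std::string>;

// `ordered` must already be in LSN order (the merged ordered transaction).
std::vector<Partition> ReplayInParallel(const std::vector<LogicalLog>& ordered,
                                        std::size_t num_threads) {
    assert(num_threads > 0);

    // Row-granularity dispatch: same key -> same queue, LSN order preserved.
    std::vector<std::vector<LogicalLog>> queues(num_threads);
    const std::hash<std::string> hasher;
    for (const LogicalLog& log : ordered) {
        queues[hasher(log.key) % num_threads].push_back(log);
    }

    // Each worker replays only its own queue into its own partition.
    std::vector<Partition> partitions(num_threads);
    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (std::size_t i = 0; i < num_threads; ++i) {
        workers.emplace_back([&queues, &partitions, i] {
            for (const LogicalLog& log : queues[i]) {
                partitions[i][log.key] = log.value;  // INSERT and UPDATE both set the value
            }
        });
    }
    for (std::thread& worker : workers) {
        worker.join();
    }
    return partitions;
}
```

Replaying Q = {M1, M2, M3, M4} from Table 1 leaves K1 = A, K2 = B, and K3 = C across the partitions, whichever thread each key happens to hash to.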
The application introduces a two-stage lock-free parallel log playback mechanism based on physical replication and redo log multiplexing. It divides log playback into two stages and, according to the characteristics of each stage, uses data pages and rows respectively as the granularity of parallelism. This greatly reduces the cost of coordinating and synchronizing conflicting transactions and thus the performance overhead of concurrency control, improves the log playback performance and resource utilization of the system, reduces the read latency of read-only nodes, and improves the data freshness of read-only nodes. In the present application, both stages of log playback can be executed in parallel and both are conflict-free, so the correctness of the transaction playback order is guaranteed without any additional concurrency control mechanism, and log playback performance is improved.
Corresponding to the embodiment of the two-stage lock-free parallel log playback method, the application also provides an embodiment of the two-stage lock-free parallel log playback device.
Referring to fig. 2, a two-stage lock-free parallel log playback device according to an embodiment of the present application includes one or more processors and a memory coupled to the processors; the memory is used for storing program data, and the processor is used for executing the program data to realize the two-stage lock-free parallel log playback method in the embodiment.
The embodiment of the two-stage lock-free parallel log playback device of the present application can be applied to any device with data processing capability, such as a computer. The device embodiment may be implemented in software, in hardware, or in a combination of hardware and software. Taking a software implementation as an example, the device in the logical sense is formed by the processor of the device with data processing capability on which it resides reading the corresponding computer program instructions from nonvolatile memory into memory. At the hardware level, Fig. 2 shows a hardware structure diagram of the device with data processing capability on which the two-stage lock-free parallel log playback device of the present application resides. In addition to the processor, memory, network interface, and nonvolatile memory shown in Fig. 2, that device may generally include other hardware according to its actual function, which is not described here again.
The implementation process of the functions and roles of each unit in the above device is specifically shown in the implementation process of the corresponding steps in the above method, and will not be described herein again.
Since the device embodiments essentially correspond to the method embodiments, reference is made to the description of the method embodiments for the relevant points. The device embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present application. Those of ordinary skill in the art can understand and implement the present application without creative effort.
The embodiment of the application also provides a computer readable storage medium, on which a program is stored, which when executed by a processor, implements the two-stage lock-free parallel log playback method in the above embodiment.
The computer readable storage medium may be an internal storage unit, such as a hard disk or a memory, of any device with data processing capability described in any of the previous embodiments. The computer readable storage medium may also be an external storage device of such a device, for example, a plug-in hard disk, a Smart Media Card (SMC), an SD card, or a flash memory card (Flash Card) provided on the device. Further, the computer readable storage medium may include both an internal storage unit and an external storage device of any device with data processing capability. The computer readable storage medium is used for storing the computer program and other programs and data required by the device, and may also be used for temporarily storing data that has been output or is to be output.
The above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (7)

1. The two-stage lock-free parallel log playback method is characterized by comprising the following steps of:
(1) The read-only node reads at least one redo log from the log buffer;
(2) The redo log distributor distributes the redo log read in the step (1) to a corresponding log playback thread according to the data page acted by the redo log, wherein the data page is acquired from a data page buffer pool according to the data page number;
(3) The log playback thread applies the increment modification content in the redo log to the data page of the line memory to complete the synchronization of the line memory data, and analyzes the logic log readable by the line memory from the redo log by utilizing a redo log multiplexing method;
(4) Screening and filtering all the logical logs parsed in step (3) to construct an original transaction, and delivering the original transaction to the logical log distributor of the column store;
(5) The logic log distributor merges all logic logs of all received original transactions according to the sequence of log serial numbers so as to obtain ordered transactions; the logic log distributor traverses the logic logs in the ordered transactions and distributes the logic logs to corresponding log playback threads according to the primary keys of the line records in the logic logs;
(6) The log playback thread plays back the logic log in a parallel mode, and applies the modification on the logic log to the column memory engine to complete the synchronization of the column memory data.
2. The two-stage lock-free parallel log playback method of claim 1, wherein the redo log comprises a transaction ID, a log sequence number, a data page number, and an incremental data field.
3. The two-stage lock-free parallel log playback method according to claim 1, wherein the screening and filtering of all the logical logs parsed in step (3) specifically comprises: first acquiring the data page number corresponding to a logical log, then acquiring the corresponding data table number according to the data page number, and finally determining whether a column-store index has been established on the data table corresponding to the data table number; if a column-store index has been established on the data table, retaining the data in the logical log and constructing an original transaction from the retained logical logs; otherwise, filtering out and deleting the data in the logical log.
4. The two-stage lock-free parallel log playback method according to claim 1, wherein the logical log distributor merging all logical logs of all received original transactions in log sequence number order to obtain ordered transactions specifically comprises: sorting all logical logs of all original transactions in ascending order of log sequence number using K-way merge sort, and merging the sorted logical logs to obtain the ordered transactions.
5. The two-stage lock-free parallel log playback method according to claim 1, wherein the logical log allocator traverses the logical logs in the ordered transactions and allocates the logical logs to the corresponding log playback threads according to the primary keys of the row records in the logical logs, and specifically comprises: the logic log distributor traverses the logic logs in the ordered transactions, carries out hash operation on the main keys of the row records in the logic logs to obtain hash values corresponding to the main keys, and distributes the logic logs corresponding to the main keys to corresponding log playback threads according to the hash values of the main keys.
6. A two-stage lock-free parallel log playback device comprising one or more processors and a memory, wherein the memory is coupled to the processors; wherein the memory is for storing program data and the processor is for executing the program data to implement the two-stage lock-free parallel log playback method of any one of claims 1-5.
7. A computer readable storage medium, having stored thereon a program which, when executed by a processor, implements the two-stage lock-free parallel log playback method of any one of claims 1-5.
CN202310887412.0A 2023-07-19 2023-07-19 Two-stage lock-free parallel log playback method and device Pending CN117009361A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310887412.0A CN117009361A (en) 2023-07-19 2023-07-19 Two-stage lock-free parallel log playback method and device

Publications (1)

Publication Number Publication Date
CN117009361A true CN117009361A (en) 2023-11-07

Family

ID=88564741

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310887412.0A Pending CN117009361A (en) 2023-07-19 2023-07-19 Two-stage lock-free parallel log playback method and device

Country Status (1)

Country Link
CN (1) CN117009361A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination