WO2021073571A1 - Fragment-free recovery-based database multi-version concurrency control system

Publication number
WO2021073571A1
Authority
WO
WIPO (PCT)
Prior art keywords
transaction
version
record
file
access
Application number
PCT/CN2020/121149
Other languages
French (fr)
Chinese (zh)
Inventor
陈元熹
许建辉
王涛
Original Assignee
深圳巨杉数据库软件有限公司
Application filed by 深圳巨杉数据库软件有限公司
Priority to CA3130011A (CA3130011A1)
Publication of WO2021073571A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2308 Concurrency control
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/466 Transaction processing

Definitions

  • the invention relates to the technical field of system optimization, and in particular to a database multi-version concurrency control system based on fragment-free recovery.
  • the biggest problem with the multi-version concurrency control mechanism is that it consumes additional storage space. Every database built on multi-version concurrency control needs a space recovery mechanism to release versions that are no longer needed. In practice, these recovery processes incur additional CPU overhead, disk reads and writes, and memory operations. While a recovery operation is running, the system's processing capacity is severely affected and its stability drops sharply. Recovery also fragments the disk, which reduces disk access efficiency.
  • the technical problem to be solved by the embodiments of the present invention is to provide a database multi-version concurrency control system based on fragment-free recovery that adopts a simple and effective storage recovery mechanism with little impact on the system, thereby greatly improving the performance of databases that use multi-version concurrency control.
  • an embodiment of the present invention provides a database multi-version concurrency control system based on fragment-free recovery, including:
  • the transaction information recording module is used to assign a unique identifier to each transaction in the database, and to record the oldest uncommitted transaction in the system together with information on the transactions initiated after the oldest uncommitted transaction;
  • the transaction data management module is used to manage operations on, and optimize access to, transactions and record data using the rollback segment space; wherein the rollback segment is composed of one file or a group of fixed-length files;
  • the data visibility optimization module is used to enforce write-write mutual exclusion on the access operations of transactions using record exclusive locks; to optimize access to a record's version information according to the result of the transaction's read-lock request on the record lock; and to constrain the visibility of old-version records according to the result of comparing the transaction initial log sequence number with the record initial log sequence number;
  • the memory optimization speed-up module is used to cache the most recent old-version record in the system in a non-disk memory structure; wherein the non-disk memory structure includes a hash bucket;
  • the multi-version recovery module is used to distinguish reusable files from non-reusable files in the rollback segment space, clear the non-reusable files, and recover the reusable files.
  • the transaction information includes a transaction ID, commit status information, and a timestamp at commit time.
  • the transaction data management module includes:
  • the transaction access management unit is used to manage transaction operations using the rollback segment space; wherein the operations of a transaction on a record include create, modify, and delete operations;
  • the rollback segment space management unit is used to manage the storage of, and access to, the record versions in the rollback segment.
  • the rollback segment space is composed of a hash table and disk files; wherein there is either one disk file or a group of disk files; when the rollback segment space is composed of a group of disk files, each of the disk files is a fixed-length file.
  • the data visibility optimization module includes:
  • the transaction access operation constraint unit is used to enforce write-write mutual exclusion on the access operations of transactions using record exclusive locks;
  • the record version information access optimization unit is used to optimize access to a record's version information according to the result of the transaction's read-lock request on the record lock; wherein the read-lock result is either a successful read-lock request or a failed read-lock request;
  • the old version record access restriction unit is used to constrain the visibility of old-version records according to the result of comparing the transaction initial log sequence number with the record initial log sequence number.
  • the multi-version recovery module includes:
  • the rollback segment file distinguishing unit is used to compare the age of each rollback segment file with that of the oldest uncommitted transaction based on transaction initial log sequence numbers, and to distinguish reusable files from non-reusable files in the rollback segment space according to the comparison result;
  • the rollback segment file clearing unit is used to clear the non-reusable files in the rollback segment space;
  • the rollback segment file recovery unit is used to recover the reusable files in the rollback segment space for recycling.
  • the multi-version recovery module triggers the clearing and recovery of the rollback segment space at a preset time interval, or according to a preset threshold on the number of times the rollback segment files are used.
  • compared with the prior art, the present invention has the following beneficial effects:
  • recovery of the rollback segment is simple, with no significant system overhead, no system performance jitter, and no disk fragmentation;
  • given enough disk space, the configurable rollback segment places no limit on the number of concurrently running transactions or on the length of a transaction;
  • the combined memory-and-disk design avoids or reduces disk access in the vast majority of cases (short transactions), thereby safeguarding the access performance of the system.
  • FIG. 1 is a schematic structural diagram of a database multi-version concurrency control system based on fragment-free recovery provided by an embodiment of the present invention;
  • FIG. 2 is a schematic diagram of an application example of rollback segment access provided by an embodiment of the present invention.
  • compared with the prior art, the present invention uses fixed-length data segments to store multi-version data records in a chain structure, and keeps the most recently updated data version of each segment in that segment's metadata.
  • when old data is recovered, the metadata of a data segment is used to judge whether all the multi-version records in the segment are older than the range of transactions currently active in the database system, which determines whether the whole segment needs to be recovered.
  • this large-block recovery mechanism keeps recovery efficient, does not cause disk fragmentation, and effectively reduces the overhead imposed on the system.
  • Transaction: a logical unit of program execution composed of a series of operations that access and update data in the system. Transactions have the properties of Atomicity, Consistency, Isolation, and Durability, abbreviated ACID.
  • Transaction Log Record (LR): an additional record in the database system that captures every change to the database, so that data can be restored to its pre-update state when a transaction is rolled back.
  • Isolation level: the standard SQL specification defines four transaction isolation levels, and different isolation levels handle transactions differently. The four isolation levels are: read uncommitted (READ_UNCOMMITTED), read committed (READ_COMMITTED), repeatable read (REPEATABLE_READ), and serializable (SERIALIZABLE).
  • Multi-version concurrency control (MVCC): a concurrency control mechanism used in databases to solve the phantom read problem, so that repeatable reads are achieved without blocking reads.
  • the oldest uncommitted transaction (lowTran): the earliest-started transaction in the system that has not yet committed.
  • Log Sequence Number (LSN): the increasing sequence number assigned in the log when the corresponding event occurs.
  • Transaction ID (TID): the unique identification number of a transaction.
  • TBLSN (Transaction Begin LSN): the initial log sequence number of a transaction.
  • Record ID (RID): the unique identifier of a record, usually containing the information needed to locate the record, such as its logical or physical location on disk.
  • Rollback Segment: a data segment that stores old versions of data; it can be in memory or on disk.
  • an embodiment of the present invention provides a database multi-version concurrency control system based on fragment-free recovery, including:
  • the transaction information recording module is used to assign a unique identifier to each transaction in the database, and to record the oldest uncommitted transaction in the system together with information on the transactions initiated after the oldest uncommitted transaction; further, the transaction information includes a transaction ID, commit status information, and a timestamp at commit time.
  • each transaction in the database has its own unique identifier, which increases logically in the order in which transactions enter the system.
  • an ID that contains a timestamp or a log sequence number (LSN) is usually used as the identifier.
  • the database management system usually records the lowTran of the current system and the status of every transaction that started after the lowTran, including at least the transaction ID, whether it has committed, and the timestamp of its commit. For each record, the system also implicitly marks the ID of the transaction that created the record version, for use in visibility judgments.
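The bookkeeping described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the names `TxnInfo` and `TransactionTable` are hypothetical, a plain counter stands in for an LSN- or timestamp-based identifier, and rollback is omitted for brevity.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TxnInfo:
    tid: int                          # unique, increasing in arrival order
    committed: bool = False
    commit_ts: Optional[int] = None   # timestamp recorded at commit


class TransactionTable:
    """Tracks lowTran and every transaction that started after it."""

    def __init__(self) -> None:
        # Assumption: a counter stands in for an LSN or timestamp source.
        self._next_id = itertools.count(1)
        self._txns: Dict[int, TxnInfo] = {}

    def begin(self) -> int:
        tid = next(self._next_id)
        self._txns[tid] = TxnInfo(tid)
        return tid

    def commit(self, tid: int, commit_ts: int) -> None:
        info = self._txns[tid]
        info.committed = True
        info.commit_ts = commit_ts
        self._prune()

    def low_tran(self) -> Optional[int]:
        """Oldest transaction that has not committed, or None if all have."""
        active = [t.tid for t in self._txns.values() if not t.committed]
        return min(active) if active else None

    def _prune(self) -> None:
        # Only lowTran and the transactions after it need to be remembered.
        low = self.low_tran()
        for tid in [t for t in self._txns if low is None or t < low]:
            del self._txns[tid]
```

Committing the oldest open transaction advances lowTran, at which point the entries that precede the new lowTran are dropped from the table.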
  • the transaction data management module is used to manage operations on, and optimize access to, transactions and record data using the rollback segment space; wherein the rollback segment is composed of one file or a group of fixed-length files;
  • the transaction data management module includes:
  • the transaction access management unit is used to manage transaction operations using the rollback segment space; wherein the operations of a transaction on a record include create, modify, and delete operations;
  • the rollback segment space management unit is used to manage the storage of, and access to, the record versions in the rollback segment.
  • the rollback segment space is composed of a hash table and disk files; wherein there is either one disk file or a group of disk files; when the rollback segment space is composed of a group of disk files, each of the disk files is a fixed-length file.
  • the process by which a transaction modifies data is as follows:
  • the rollback segment contains a hash table in memory and one or a group of disk files.
  • the file consists of multiple fixed-length data segments.
  • the records form a chain in the file, ordered by version. Refer to Figure 2.
  • records are hashed into different hash buckets according to their RID, and each bucket stores the disk address of the most recent old version of the record.
  • each old-version record on disk contains a pointer to an older version, so that the versions form a chain. To add a new record version through the hash bucket, the system only needs to append the record at the end of the rollback segment, point its pointer to the address previously saved in the hash bucket, and then change the address in the hash bucket to point to the newly added record. In this way, each hash bucket corresponds to a chain of old record versions.
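The append-and-repoint step can be illustrated as follows. This is a minimal sketch, not the patent's implementation: an in-memory list stands in for the append-only disk file, list indexes stand in for disk addresses, and the names `OldVersion`, `RollbackSegment`, `push`, and `chain` are hypothetical. Modelling each bucket as a small RID-to-address map is also an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Optional


@dataclass
class OldVersion:
    rid: int
    tblsn: int                  # TBLSN stored in the record header
    payload: bytes
    prev_addr: Optional[int]    # address of the next-older version, if any


class RollbackSegment:
    def __init__(self, n_buckets: int = 16) -> None:
        self._log: List[OldVersion] = []   # stand-in for the disk file
        # Each bucket maps RID -> address of that record's newest old version.
        self._buckets: List[Dict[int, int]] = [{} for _ in range(n_buckets)]

    def _bucket(self, rid: int) -> Dict[int, int]:
        return self._buckets[hash(rid) % len(self._buckets)]

    def push(self, rid: int, tblsn: int, payload: bytes) -> int:
        bucket = self._bucket(rid)
        addr = len(self._log)              # append at the end of the segment
        # The new entry points at whatever the bucket pointed at before.
        self._log.append(OldVersion(rid, tblsn, payload, bucket.get(rid)))
        bucket[rid] = addr                 # bucket now points at the new head
        return addr

    def chain(self, rid: int) -> Iterator[OldVersion]:
        """Yield old versions of a record from newest to oldest."""
        addr = self._bucket(rid).get(rid)
        while addr is not None:
            version = self._log[addr]
            yield version
            addr = version.prev_addr
```

Because a push only appends and rewrites one bucket entry, existing entries in the segment are never modified in place.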
  • the rollback segment is composed of one file or a group of fixed-length files.
  • when it is composed of a group of files, each file has a fixed size. The metadata of the file, which can be placed in the file header or elsewhere, stores the maximum TBLSN (maxTBLSN) of all the transactions recorded in the file. This TBLSN is different from the TBLSN in the record header: it is the TBLSN of the transaction that modifies the version record.
  • the rollback segment automatically switches to the next file.
  • the size of a single file and the total number of files may be limited by configuration or left unlimited. Without a limit, the rollback segment may take up a lot of disk space; with a limit, the number of uncommitted transactions allowed in the database system and the maximum allowed execution time of an uncommitted transaction are limited as well.
  • a file can be composed of multiple fixed-length logical segments, each of which stores the maxTBLSN of that segment.
  • the switching logic for logical segments is similar to that for files. Metadata in the system tracks the file or logical segment currently in use.
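A small sketch of fixed-length files that carry a maxTBLSN in their metadata is given below. It rests on several assumptions: the switch to the next file is taken to happen when the current file cannot hold the next record, file contents are kept in memory, and the names `SegmentFile`, `RollbackFiles`, and `modifier_tblsn` are hypothetical. Reuse of recovered files is not modelled here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SegmentFile:
    capacity: int                 # fixed length, in bytes
    used: int = 0
    max_tblsn: int = 0            # metadata: largest modifier TBLSN in this file
    records: List[bytes] = field(default_factory=list)


class RollbackFiles:
    def __init__(self, file_size: int) -> None:
        self.file_size = file_size
        self.files: List[SegmentFile] = [SegmentFile(file_size)]
        self.current = 0          # metadata tracking the file currently in use

    def append(self, record: bytes, modifier_tblsn: int) -> Tuple[int, int]:
        """Store an old version; return (file index, offset within the file)."""
        if len(record) > self.file_size:
            raise ValueError("record larger than a rollback segment file")
        f = self.files[self.current]
        if f.used + len(record) > f.capacity:
            # Assumed trigger: the current file is full, so switch to the next.
            self.files.append(SegmentFile(self.file_size))
            self.current += 1
            f = self.files[self.current]
        offset = f.used
        f.records.append(record)
        f.used += len(record)
        f.max_tblsn = max(f.max_tblsn, modifier_tblsn)
        return self.current, offset
```

Keeping only one number per file means a later recovery pass can decide the fate of a whole file without reading any of its records.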
  • the data visibility optimization module is used to enforce write-write mutual exclusion on the access operations of transactions using record exclusive locks; to optimize access to a record's version information according to the result of the transaction's read-lock request on the record lock; and to constrain the visibility of old-version records according to the result of comparing the transaction initial log sequence number with the record initial log sequence number; further, the data visibility optimization module includes:
  • the transaction access operation constraint unit is used to enforce write-write mutual exclusion on the access operations of transactions using record exclusive locks;
  • the record version information access optimization unit is used to optimize access to a record's version information according to the result of the transaction's read-lock request on the record lock; wherein the read-lock result is either a successful read-lock request or a failed read-lock request;
  • the old version record access restriction unit is used to constrain the visibility of old-version records according to the result of comparing the transaction initial log sequence number with the record initial log sequence number.
  • insert, modify, and delete operations must hold a record exclusive lock to ensure that writes are mutually exclusive.
  • a read operation first tries to obtain the shared lock on the record. Depending on whether the record lock is obtained, the version information of the record is handled in one of the following three situations:
  • the read lock request succeeds. If the TBLSN of the reading transaction is greater than the TBLSN of the current on-disk record, the reading transaction was initiated after the latest version was generated and can directly use the current on-disk version of the record;
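The read path can be sketched roughly as below. Only the first branch follows a case spelled out in the text; the fallback that walks the old versions is an assumption made for illustration, as are the names `Version` and `read_record` and the choice to pass the lock outcome in as a boolean.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Version:
    tblsn: int        # TBLSN of the transaction that created this version
    payload: bytes


def read_record(reader_tblsn: int,
                current: Version,
                shared_lock_acquired: bool,
                old_versions: Iterable[Version]) -> Optional[bytes]:
    # Case stated in the text: the lock was granted and the reader began
    # after the current on-disk version was produced, so use it directly.
    if shared_lock_acquired and reader_tblsn > current.tblsn:
        return current.payload
    # ASSUMED fallback: walk the old versions newest-first and take the
    # first one created by a transaction that began before the reader.
    for version in old_versions:
        if version.tblsn < reader_tblsn:
            return version.payload
    return None       # no version is visible to this reader
```

In this sketch a reader that fails to get the shared lock never touches the current on-disk record and is served from the old versions instead.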
  • the memory optimization speed-up module is used to cache the most recent old-version record in the system in a non-disk memory structure; wherein the non-disk memory structure includes a hash bucket. In the embodiment of the present invention, it should be noted that disk access is slow and may affect the response time of transactions. To improve response speed, the system can cache the most recent old version in a hash bucket or another memory structure. In short-transaction scenarios, this in-memory old record version satisfies most access requests, thereby avoiding disk access.
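As a rough illustration of this caching idea, the sketch below keeps the newest old version of each record in a dictionary and goes to disk only on a miss. The class name `CachedRollbackReader`, the `read_from_disk` callback, and the `disk_reads` counter are all hypothetical, and a dictionary keyed by RID stands in for the hash buckets.

```python
from typing import Callable, Dict, Optional, Tuple

Cached = Tuple[int, bytes]    # (TBLSN, payload) of the newest old version


class CachedRollbackReader:
    def __init__(self, read_from_disk: Callable[[int], Optional[Cached]]) -> None:
        self._read_from_disk = read_from_disk    # hypothetical slow path
        self._cache: Dict[int, Cached] = {}      # RID -> newest old version
        self.disk_reads = 0

    def on_update(self, rid: int, old_tblsn: int, old_payload: bytes) -> None:
        # The version being replaced becomes the newest old version.
        self._cache[rid] = (old_tblsn, old_payload)

    def newest_old_version(self, rid: int) -> Optional[Cached]:
        hit = self._cache.get(rid)
        if hit is not None:
            return hit                            # served from memory
        self.disk_reads += 1
        return self._read_from_disk(rid)
```

A short transaction that needs only the version just before the latest update is served from the dictionary, which is the case the text expects to dominate.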
  • the multi-version recovery module is used to distinguish reusable files from non-reusable files in the rollback segment space, clear the non-reusable files, and recover the reusable files. Further, the multi-version recovery module includes:
  • the rollback segment file distinguishing unit is used to compare the age of each rollback segment file with that of the oldest uncommitted transaction based on transaction initial log sequence numbers, and to distinguish reusable files from non-reusable files in the rollback segment space according to the comparison result;
  • the rollback segment file clearing unit is used to clear the non-reusable files in the rollback segment space;
  • the rollback segment file recovery unit is used to recover the reusable files in the rollback segment space for recycling.
  • the clearing and recovery of the rollback segment space by the multi-version recovery module is triggered as follows:
  • the system can set up a background task that is responsible for recovering rollback segment space.
  • when a group of rollback segment files is used, a rollback segment file can be reused or deleted if its maxTBLSN is smaller than the TBLSN of the lowTran, that is, if every transaction recorded in the file began before the oldest uncommitted transaction.
  • when logical segments are used, a logical segment can be reused under the same condition on its maxTBLSN. The principle is that all transactions that might access these versions have ended (committed or rolled back); any running or new transaction only needs to access the latest data item currently on disk or a relatively new old-version record in the rollback segment, so the database system as a whole no longer needs the records in this file or logical segment.
  • such a background task is very simple: it only needs to scan the metadata of the files or logical segments to clear large fixed-length spaces that are no longer needed.
  • clearing means deleting the files that are no longer needed.
  • in reuse mode, the system only needs to remember which files or logical segments have been recovered and then recycle them.
  • the background task may be triggered periodically, or when usage of the rollback segment reaches a preset threshold.
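The metadata scan performed by such a background task might look like the sketch below. It assumes that a file is recoverable once every transaction recorded in it began before the oldest uncommitted transaction, i.e. its maxTBLSN is smaller than the TBLSN of the lowTran. The names `FileMeta`, `reclaim`, and `should_trigger` are hypothetical, and protection of the file currently being written is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileMeta:
    name: str
    max_tblsn: int        # largest TBLSN recorded in the file's metadata
    in_use: bool = True


def reclaim(files: List[FileMeta], low_tran_tblsn: int, reuse: bool) -> List[str]:
    """Return the names of files released in this pass (mutates `files`)."""
    released = []
    for meta in list(files):
        # Assumed condition: everything in the file predates lowTran.
        if meta.in_use and meta.max_tblsn < low_tran_tblsn:
            released.append(meta.name)
            if reuse:
                meta.in_use = False     # remembered as free, recycled later
            else:
                files.remove(meta)      # clearing mode: the file is deleted
    return released


def should_trigger(used_files: int, threshold: int,
                   seconds_since_last: float, interval: float) -> bool:
    # Either trigger described in the text: usage threshold or elapsed time.
    return used_files >= threshold or seconds_since_last >= interval
```

The pass reads one integer per file and never inspects individual records, which is why whole fixed-length files are released without leaving fragments behind.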
  • the embodiment of the present invention designs rollback segments using groups of fixed-length files or fixed-length logical segments, provides an MVCC implementation mechanism together with methods for recovering and cleaning rollback segments, and uses memory to improve access response time. Compared with existing MVCC implementations, it has the following advantages:
  • recovery of the rollback segment is simple, with no significant system overhead, no system performance jitter, and no disk fragmentation;
  • given enough disk space, the configurable rollback segment places no limit on the number of concurrently running transactions or on the length of a transaction;
  • the combined memory-and-disk design avoids or reduces disk access in the vast majority of cases (short transactions), thereby safeguarding the access performance of the system.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A fragment-free recovery-based database multi-version concurrency control system, comprising a transaction information recording module, a transaction data management module, a data visibility optimization module, a memory optimization speed-up module, and a multi-version recovery module. In the system, rollback segments are designed using groups of fixed-length files or fixed-length logical segments, an MVCC implementation mechanism and methods for recovering and cleaning rollback segments are provided, and memory access optimization is incorporated to improve access response time, thereby avoiding the significant system overhead otherwise incurred when rollback segments are recovered and avoiding disk fragmentation. Given enough disk space, the configurable rollback segments place no limit on the number of concurrently running transactions or on the length of transactions. In addition, the combined memory-and-disk design avoids or reduces disk access in most cases, thereby safeguarding the access performance of the system.

Description

Database multi-version concurrency control system based on fragment-free recovery
Technical field
The invention relates to the technical field of system optimization, and in particular to a database multi-version concurrency control system based on fragment-free recovery.
Background art
When traditional databases support concurrent transaction access, they usually provide different data access isolation levels to suit how applications use the data. There are currently two main implementation approaches: a mechanism based on transaction locks and a mechanism based on multi-version concurrency control. The biggest drawback of the former is that reads and writes are mutually exclusive, which reduces the concurrency of the system, so the latter has gradually become the mainstream implementation. Its principle is that when data is changed or deleted, a new data version is appended to the end of the data structure while the original version is retained in the system. In addition, an increasing version number is recorded each time data is generated, so that a query can determine which version of the data to use from the identifier assigned when the querying transaction began.
The biggest problem with the multi-version concurrency control mechanism is that it consumes additional storage space. Every database built on multi-version concurrency control needs a space recovery mechanism to release versions that are no longer needed. In practice, these recovery processes incur additional CPU overhead, disk reads and writes, and memory operations. While a recovery operation is running, the system's processing capacity is severely affected and its stability drops sharply. Recovery also fragments the disk, which reduces disk access efficiency.
Summary of the invention
The technical problem to be solved by the embodiments of the present invention is to provide a database multi-version concurrency control system based on fragment-free recovery that adopts a simple and effective storage recovery mechanism with little impact on the system, thereby greatly improving the performance of databases that use multi-version concurrency control.
In order to solve the above technical problem, an embodiment of the present invention provides a database multi-version concurrency control system based on fragment-free recovery, including:
a transaction information recording module, used to assign a unique identifier to each transaction in the database, and to record the oldest uncommitted transaction in the system together with information on the transactions initiated after the oldest uncommitted transaction;
a transaction data management module, used to manage operations on, and optimize access to, transactions and record data using the rollback segment space; wherein the rollback segment is composed of one file or a group of fixed-length files;
a data visibility optimization module, used to enforce write-write mutual exclusion on the access operations of transactions using record exclusive locks; to optimize access to a record's version information according to the result of the transaction's read-lock request on the record lock; and to constrain the visibility of old-version records according to the result of comparing the transaction initial log sequence number with the record initial log sequence number;
a memory optimization speed-up module, used to cache the most recent old-version record in the system in a non-disk memory structure; wherein the non-disk memory structure includes a hash bucket;
a multi-version recovery module, used to distinguish reusable files from non-reusable files in the rollback segment space, clear the non-reusable files, and recover the reusable files.
Further, the transaction information includes a transaction ID, commit status information, and a timestamp at commit time.
Further, the transaction data management module includes:
a transaction access management unit, used to manage transaction operations using the rollback segment space; wherein the operations of a transaction on a record include create, modify, and delete operations;
a rollback segment space management unit, used to manage the storage of, and access to, the record versions in the rollback segment.
Further, the rollback segment space is composed of a hash table and disk files; wherein there is either one disk file or a group of disk files; when the rollback segment space is composed of a group of disk files, each of the disk files is a fixed-length file.
Further, the data visibility optimization module includes:
a transaction access operation constraint unit, used to enforce write-write mutual exclusion on the access operations of transactions using record exclusive locks;
a record version information access optimization unit, used to optimize access to a record's version information according to the result of the transaction's read-lock request on the record lock; wherein the read-lock result is either a successful read-lock request or a failed read-lock request;
an old version record access restriction unit, used to constrain the visibility of old-version records according to the result of comparing the transaction initial log sequence number with the record initial log sequence number.
Further, the multi-version recovery module includes:
a rollback segment file distinguishing unit, used to compare the age of each rollback segment file with that of the oldest uncommitted transaction based on transaction initial log sequence numbers, and to distinguish reusable files from non-reusable files in the rollback segment space according to the comparison result;
a rollback segment file clearing unit, used to clear the non-reusable files in the rollback segment space;
a rollback segment file recovery unit, used to recover the reusable files in the rollback segment space for recycling.
Further, the ways in which the multi-version recovery module triggers the clearing and recovery of the rollback segment space include:
triggering at a preset time interval; or,
triggering according to a preset threshold on the number of times the rollback segment files are used.
Compared with the prior art, the present invention has the following beneficial effects:
1. Recovery of the rollback segment is simple, with no significant system overhead, no system performance jitter, and no disk fragmentation;
2. Given enough disk space, the configurable rollback segment places no limit on the number of concurrently running transactions or on the length of a transaction;
3. The combined memory-and-disk design avoids or reduces disk access in the vast majority of cases (short transactions), thereby safeguarding the access performance of the system.
附图说明Description of the drawings
图1是本发明一实施例提供的基于无碎片回收的数据库多版本并发控制系统的结构示意图;FIG. 1 is a schematic structural diagram of a database multi-version concurrency control system based on non-fragmentation recovery provided by an embodiment of the present invention;
图2是本发明一实施例提供的回滚段访问的应用举例示意图。Fig. 2 is a schematic diagram of an application example of rollback segment access provided by an embodiment of the present invention.
具体实施方式Detailed ways
下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整的描述,显然,所描述的实施例仅仅是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。The following describes the technical solutions in the embodiments of the present invention clearly and completely with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, rather than all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.
Compared with the prior art, the present invention stores multi-version data records, organized as chains, in fixed-length data segments, and keeps in each segment's metadata the most recently updated data version in that segment. When old data is reclaimed, the metadata of a segment is used to determine whether all multi-version records in that segment are older than the range of transactions currently active in the database system, and hence whether the whole segment can be reclaimed. Reclaiming data in large blocks in this way keeps reclamation efficient, causes no disk fragmentation, and reduces system overhead.
The terms used in the embodiments of the present invention are explained below:
Transaction: a logical unit of program execution made up of a series of operations that access and update data in the system. A transaction has the properties of atomicity, consistency, isolation, and durability, abbreviated ACID.
Transaction log record (LR, Log Record): an additional record in the database system that captures every change made to the database, so that the data can be restored to its pre-update state when a transaction is rolled back.
Isolation level: the SQL standard defines four transaction isolation levels, each of which handles transactions differently. The four levels are read uncommitted (READ_UNCOMMITTED), read committed (READ_COMMITTED), repeatable read (REPEATABLE_READ), and serializable (SERIALIZABLE).
Multi-version concurrency control (MVCC): a concurrency control mechanism in databases used to address the phantom read problem, so that repeatable reads are achieved without blocking readers.
Oldest uncommitted transaction (lowTran): the transaction that started earliest in the system and has not yet committed.
Log sequence number (LSN): the monotonically increasing sequence number in the log that corresponds to the moment an event occurs.
Transaction ID (TID): the unique identifier of a transaction.
Transaction begin LSN (TBLSN): the LSN at the moment a transaction begins. Because LSNs are unique, the TBLSN can be used to identify a transaction. In the present invention, the TBLSN is preferably used in place of the TID.
Record ID (RID): the unique identifier of a record, usually containing the information needed to locate the record, such as its logical or physical position on disk.
Rollback segment: a data segment that stores old versions; it may reside in memory or on disk.
Referring to FIG. 1, an embodiment of the present invention provides a database multi-version concurrency control system based on fragment-free reclamation, comprising:
A transaction information recording module, configured to assign a unique identifier to each transaction in the database, and to record the oldest uncommitted transaction in the system together with information on the transactions started after it. Further, the transaction information includes the transaction ID, the commit status, and the timestamp at commit.
In a specific embodiment, every transaction in the database has its own unique identifier. Logically, this identifier increases in the order in which transactions enter the system; database implementations usually use an ID that contains a timestamp or a log sequence number (LSN). A database management system usually records the current lowTran and the state of every transaction that started after lowTran, including at least the transaction ID, whether the transaction has committed, and the timestamp at commit. For each record, the system also implicitly tags the record version with the ID of the transaction that created it, for use in visibility checks.
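The bookkeeping described above can be illustrated with a minimal sketch. The class and method names (`TxnTable`, `TxnInfo`, `begin`, `commit`, `low_tran`) are hypothetical and not taken from the patent, and a simple in-memory counter stands in for the log sequence number; the sketch only assumes that the TBLSN serves as the transaction identifier and that entries older than lowTran no longer need to be tracked.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TxnInfo:
    tblsn: int                       # transaction-begin LSN, used as the transaction ID
    committed: bool = False
    commit_ts: Optional[int] = None  # timestamp (here: an LSN) taken at commit


class TxnTable:
    """Tracks lowTran and every transaction that started at or after it."""

    def __init__(self) -> None:
        self._next_lsn = 1           # stand-in for the real log sequence counter
        self._txns: Dict[int, TxnInfo] = {}

    def begin(self) -> int:
        tblsn = self._next_lsn
        self._next_lsn += 1
        self._txns[tblsn] = TxnInfo(tblsn)
        return tblsn

    def commit(self, tblsn: int) -> None:
        info = self._txns[tblsn]
        info.committed = True
        info.commit_ts = self._next_lsn
        self._next_lsn += 1
        self._prune()

    def low_tran(self) -> Optional[int]:
        """TBLSN of the oldest uncommitted transaction, or None if there is none."""
        active = [t for t, i in self._txns.items() if not i.committed]
        return min(active) if active else None

    def _prune(self) -> None:
        # Entries that began before lowTran are finished and need not be kept.
        low = self.low_tran()
        for t in list(self._txns):
            if low is None or t < low:
                del self._txns[t]
```

A committed transaction stays in the table for as long as an older transaction is still running, which matches the requirement to keep state for everything that started after lowTran.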
A transaction data management module, configured to use the rollback segment space to manage operations on, and optimize access to, transactions and record data; wherein the rollback segment consists of one file or a group of fixed-length files.
Further, the transaction data management module comprises:
A transaction access management unit, configured to use the rollback segment space to manage transaction operations; wherein the operations a transaction performs on records include create, modify, and delete operations;
A rollback segment space management unit, configured to manage the storage of, and access to, the record versions in the rollback segment.
In the embodiments of the present invention, the rollback segment space preferably consists of a hash table and disk files; wherein there is either one disk file or a group of disk files, and when the rollback segment space consists of a group of disk files, every file in the group is a fixed-length file.
In the embodiments of the present invention, a transaction modifies data as follows:
1. When a transaction starts, it obtains a TBLSN from the system and writes a transaction-begin log record.
2. When a transaction creates (insert), modifies (update), or deletes (delete) a record, it first acquires the record lock. Once the lock is held, the original version of the record is placed in the rollback segment (for modify and delete operations only), and the record is then changed or inserted in place. The record header contains the TBLSN of this transaction, that is, the TBLSN of the transaction that produced this version of the data. A delete operation may remove the record directly.
3. The rollback segment comprises an in-memory hash table and one disk file or a group of disk files. A file consists of multiple fixed-length data segments. Within the file, the versions of a record form a chain. Referring to FIG. 2, records are hashed by RID into hash buckets, and each bucket stores the disk address of the most recent old version of the record. Every old version on disk holds a pointer to the next older version, so that the versions form a chain. When a new record version is added through a hash bucket, the system only needs to append that record at the end of the rollback segment, point its pointer at the address previously stored in the bucket, and then update the bucket to point at the newly appended record. Each hash bucket therefore heads a chain of record versions ordered from newest to oldest.
Because the number of hash buckets is limited, collisions still occur, so one chain may contain versions of several records; the system finds the matching record by comparing RIDs.
4. The rollback segment consists of one file or a group of fixed-length files.
When it consists of a group of files, each file has a fixed size. The file's metadata, which may be kept in the file header or elsewhere, stores the maximum TBLSN (maxTBLSN) over all transactions that wrote records into the file. This TBLSN differs from the TBLSN in a record header: it is the TBLSN of the transaction that modified that version of the record. When a file is full, or its remaining space cannot hold the next record, the rollback segment automatically switches to the next file. In a specific embodiment, the size of a single file and the total number of files may be limited or left unlimited. Without limits, the rollback segment may consume a large amount of disk space; with limits, the number of uncommitted transactions the database system allows and the longest permitted running time of an uncommitted transaction are limited as well.
When the rollback segment consists of a single file, the file may be made up of multiple fixed-length logical segments. Each logical segment stores the maxTBLSN of that segment. Switching between logical segments works like switching between files. Metadata in the system tracks which file or logical segment is currently in use.
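The rollback segment layout in steps 3 and 4 can be sketched as follows. This is an illustrative model under stated assumptions, not the patented implementation: in-memory lists stand in for disk files, a segment's fixed length is modeled as a slot count (`slots_per_segment`) rather than a byte size, and all names (`RollbackSegment`, `Segment`, `OldVersion`, `append`, `chain`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple

Addr = Tuple[int, int]  # (segment number, slot within the segment)


@dataclass
class OldVersion:
    rid: int                # record ID
    tblsn: int              # TBLSN of the transaction that created this version
    payload: bytes
    older: Optional[Addr]   # next (older) entry in the same bucket chain


@dataclass
class Segment:
    max_tblsn: int = 0      # largest TBLSN among transactions that wrote here
    slots: List[OldVersion] = field(default_factory=list)


class RollbackSegment:
    def __init__(self, n_buckets: int = 8, slots_per_segment: int = 4) -> None:
        self.buckets: List[Optional[Addr]] = [None] * n_buckets
        self.slots_per_segment = slots_per_segment
        self.segments: Dict[int, Segment] = {0: Segment()}
        self.current = 0    # metadata: segment currently being written

    def append(self, old: bytes, rid: int, old_tblsn: int,
               writer_tblsn: int) -> Addr:
        """Store the version that `writer_tblsn` is about to overwrite."""
        seg = self.segments[self.current]
        if len(seg.slots) >= self.slots_per_segment:  # full: switch segment
            self.current += 1
            seg = self.segments[self.current] = Segment()
        b = rid % len(self.buckets)
        # New entry points at whatever the bucket referenced before.
        seg.slots.append(OldVersion(rid, old_tblsn, old, self.buckets[b]))
        seg.max_tblsn = max(seg.max_tblsn, writer_tblsn)
        addr = (self.current, len(seg.slots) - 1)
        self.buckets[b] = addr                        # bucket now heads the chain
        return addr

    def chain(self, rid: int) -> Iterator[OldVersion]:
        """Yield old versions of `rid`, newest first, skipping collisions."""
        addr = self.buckets[rid % len(self.buckets)]
        while addr is not None:
            v = self.segments[addr[0]].slots[addr[1]]
            if v.rid == rid:
                yield v
            addr = v.older
```

Appending always writes at the tail of the current segment and touches a single bucket pointer, which is what lets whole segments be released later without compaction.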
A data visibility optimization module, configured to use record exclusive locks to enforce write-write mutual exclusion on transaction access operations; to optimize access to record version information according to the outcome of a transaction's read-lock request on the record lock; and to constrain the visibility of old record versions according to a comparison between the transaction's begin LSN and the record's begin LSN. Further, the data visibility optimization module comprises:
A transaction access operation constraint unit, configured to use record exclusive locks to enforce write-write mutual exclusion on transaction access operations;
A record version information access optimization unit, configured to optimize access to record version information according to the outcome of a transaction's read-lock request on the record lock; wherein the outcome is either that the read-lock request succeeds or that it fails;
An old version record access constraint unit, configured to constrain the access visibility of old record versions according to a comparison between the transaction's begin LSN and the record's begin LSN.
In the embodiments of the present invention, data visibility can be determined in several ways, independently of the rollback segment design described above. An optimized implementation that combines transaction TBLSNs with transaction locks is described below as an example:
1. Insert, modify, and delete operations must hold the record exclusive lock, which guarantees write-write mutual exclusion;
2. A read operation first tries to acquire the shared lock on the record. Depending on whether the lock is acquired and on the record's version information, there are three cases:
(1) The read-lock request succeeds and the reading transaction's TBLSN is greater than the TBLSN of the current on-disk record. The reading transaction started after the latest version was produced, so it can use the current on-disk version directly;
(2) The read-lock request succeeds and the reading transaction's TBLSN is smaller than the TBLSN of the on-disk record. The reading transaction must use an old version from the rollback segment. In this case the record lock can be released immediately; that is, no record lock is held while the old version is read.
(3) The read-lock request fails. The reading transaction must use an old version, and proceeds as in case (2). Again, no record lock is held while the old version is read.
3. To use an old version, the system first locates the hash bucket for the RID and reads from it the address (file plus offset) of the record in the rollback segment. It then reads the record and, if the RID matches, compares the transaction's TBLSN with the record's TBLSN. If the record's TBLSN is smaller than the transaction's TBLSN, the record is visible to the transaction; otherwise the system follows the pointer to the next older version, and repeats until a visible version is found.
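The three read cases and the chain walk can be condensed into one function. The sketch below is illustrative only: `Version` and `read_visible` are hypothetical names, the lock outcome is passed in as a boolean instead of being obtained from a lock manager, and the old-version chain is given as a plain newest-to-oldest sequence rather than being read from disk.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Version:
    rid: int
    tblsn: int      # TBLSN of the transaction that produced this version
    payload: str


def read_visible(reader_tblsn: int,
                 current: Version,
                 got_shared_lock: bool,
                 old_chain: Iterable[Version]) -> Optional[Version]:
    """Pick the version a reader may see, following cases (1) to (3)."""
    # Case (1): lock granted and the reader began after the current
    # on-disk version was produced, so that version is used directly.
    if got_shared_lock and reader_tblsn > current.tblsn:
        return current
    # Cases (2) and (3): fall back to the rollback segment. No record
    # lock needs to be held while the chain is walked.
    for v in old_chain:                 # ordered newest to oldest
        if v.rid != current.rid:
            continue                    # bucket collision: another record
        if v.tblsn < reader_tblsn:
            return v                    # first version older than the reader
    return None                         # no visible version exists


# Example: record 7 has a current version (TBLSN 30) and two old versions.
cur = Version(7, 30, "v3")
chain = [Version(7, 20, "v2"), Version(9, 15, "x"), Version(7, 10, "v1")]
print(read_visible(25, cur, True, chain))   # reader predates v3, sees v2
```

A reader that fails to get the shared lock is treated exactly like a reader that is too old for the current version, which is why readers never wait on writers in this scheme.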
A memory optimization and acceleration module, configured to cache the most recent old version of a record in an in-memory (non-disk) structure; wherein the in-memory structure includes the hash bucket. In the embodiments of the present invention, disk access is slow and may lengthen transaction response time. To improve response time, the system can cache the most recent old version in the hash bucket or in another in-memory structure. In short-transaction workloads, this single in-memory old version satisfies most access requests, so disk access is avoided.
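One possible shape for this cache is sketched below, assuming each bucket keeps a copy of the newest old version next to its disk address. The names `Bucket`, `BucketCache`, `publish`, and `lookup` are hypothetical, and a `None` result simply means the caller would have to fall back to the on-disk version chain.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Bucket:
    disk_addr: Optional[int] = None                   # newest old version on disk
    cached: Optional[Tuple[int, int, bytes]] = None   # (rid, tblsn, payload) copy


class BucketCache:
    def __init__(self, n_buckets: int = 8) -> None:
        self.buckets: List[Bucket] = [Bucket() for _ in range(n_buckets)]

    def publish(self, rid: int, tblsn: int, payload: bytes,
                disk_addr: int) -> None:
        """Record a newly appended old version and keep a copy in memory."""
        b = self.buckets[rid % len(self.buckets)]
        b.disk_addr = disk_addr
        b.cached = (rid, tblsn, payload)

    def lookup(self, rid: int, reader_tblsn: int) -> Optional[bytes]:
        """Return the cached version if it is visible, else None (go to disk)."""
        b = self.buckets[rid % len(self.buckets)]
        if b.cached is not None:
            c_rid, c_tblsn, payload = b.cached
            if c_rid == rid and c_tblsn < reader_tblsn:
                return payload
        return None
```

Only one version per bucket is cached, so the memory cost is bounded by the bucket count; readers that need something older than the cached version pay for a disk read.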
A multi-version reclamation module, configured to distinguish reused files from non-reused files in the rollback segment space, to delete the non-reused files, and to recycle the reused files. Further, the multi-version reclamation module comprises:
A rollback segment file distinguishing unit, configured to compare, by transaction begin LSN, the age of each rollback segment file with that of the oldest uncommitted transaction, and to distinguish reused files from non-reused files in the rollback segment space according to the result;
A rollback segment file clearing unit, configured to delete the non-reused files in the rollback segment space;
A rollback segment file recycling unit, configured to recycle the reused files in the rollback segment space so that they can be used again in rotation.
In the embodiments of the present invention, further, the multi-version reclamation module triggers the clearing and recycling of the rollback segment space by:
triggering at a preset time interval; or,
triggering according to a preset threshold on the number of times the rollback segment files have been used.
The system can run a background task that reclaims rollback segment space. When a group of rollback segment files is used, a file can be reused or deleted if its maxTBLSN is smaller than the TBLSN of lowTran, that is, if every record in the file is older than the range of active transactions. When a single rollback segment file is used, a logical segment can be reused under the same condition on its maxTBLSN. The reasoning is that every transaction that could access those versions has already ended (committed or rolled back); any running or newly arriving transaction only needs the latest data item currently on disk or a relatively newer old version in the rollback segment, so the database system no longer needs the records in that file or logical segment.
Such a background task is very simple: scanning the metadata of the files or logical segments is enough to release large fixed-length blocks that are no longer needed. The fixed length and the total number of logical blocks or files can generally be configured according to the actual database workload, together with the choice of whether to reuse space. The single-file mode with multiple logical blocks can be chosen when space is to be reused, and the multi-file mode when it is not. In multi-file (non-reuse) mode, clearing means deleting the files that are no longer used. In reuse mode, the system only needs to remember up to which file or logical block it has reclaimed, and then uses the space in rotation. In a specific embodiment, the background task can be triggered on a timer, or when rollback segment usage reaches a preset threshold.
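The metadata scan performed by the background task might look like the sketch below. It follows the rule that a segment is released only when all of its records are older than the range of active transactions, expressed here as `max_tblsn < low_tran`. The names `SegmentMeta` and `reclaimable`, and the choice to exclude the segment currently being written, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SegmentMeta:
    name: str        # file name or logical segment label
    max_tblsn: int   # largest TBLSN among transactions that wrote into it


def reclaimable(segments: List[SegmentMeta],
                low_tran: Optional[int],
                current: str) -> List[str]:
    """Return the segments that no running or future transaction can need.

    low_tran is the TBLSN of the oldest uncommitted transaction, or None
    when nothing is uncommitted. The segment being written is never freed.
    """
    result = []
    for seg in segments:
        if seg.name == current:
            continue
        # Every writer recorded in the segment began before lowTran, so it
        # has finished, and every active reader already sees newer versions.
        if low_tran is None or seg.max_tblsn < low_tran:
            result.append(seg.name)
    return result


segs = [SegmentMeta("seg0", 40), SegmentMeta("seg1", 75), SegmentMeta("seg2", 90)]
print(reclaimable(segs, low_tran=50, current="seg2"))   # only seg0 qualifies
```

In multi-file mode the returned names would be deleted; in reuse mode they would be returned to the rotation. Either way the task reads only per-segment metadata and never inspects individual records.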
In summary, the embodiments of the present invention design the rollback segment as a group of fixed-length files or as fixed-length logical segments, provide an MVCC implementation mechanism together with a method for reclaiming and cleaning up the rollback segment, and use memory to improve access response time. Compared with existing MVCC implementations, this has the following advantages:
1. Rollback segment reclamation is simple: it incurs no significant system overhead, causes no performance jitter, and does not fragment the disk;
2. Given enough disk space, the configurable rollback segment imposes no limit on the number of concurrently running transactions or on transaction length;
3. The combined memory-and-disk design avoids or reduces disk access in the great majority of cases (short transactions), thereby preserving the access performance of the system.
For simplicity, the method and process embodiments above are described as a series of combined actions. A person skilled in the art should understand, however, that the embodiments of the present invention are not limited by the described order of actions, because according to the embodiments some steps may be performed in another order or simultaneously. A person skilled in the art should also understand that the embodiments described in the specification are optional embodiments, and that the actions involved are not necessarily required by the embodiments of the present invention.
The above are preferred embodiments of the present invention. A person of ordinary skill in the art may make improvements and refinements without departing from the principles of the present invention, and such improvements and refinements also fall within the scope of protection of the present invention.

Claims (7)

  1. A database multi-version concurrency control system based on fragment-free reclamation, characterized by comprising:
    a transaction information recording module, configured to assign a unique identifier to each transaction in the database, and to record the oldest uncommitted transaction in the system together with information on the transactions started after the oldest uncommitted transaction;
    a transaction data management module, configured to use the rollback segment space to manage operations on, and optimize access to, transactions and record data; wherein the rollback segment consists of one file or a group of fixed-length files;
    a data visibility optimization module, configured to use record exclusive locks to enforce write-write mutual exclusion on transaction access operations; to optimize access to record version information according to the outcome of a transaction's read-lock request on the record lock; and to constrain the visibility of old record versions according to a comparison between the transaction's begin LSN and the record's begin LSN;
    a memory optimization and acceleration module, configured to cache the most recent old version of a record in an in-memory (non-disk) structure; wherein the in-memory structure includes a hash bucket;
    a multi-version reclamation module, configured to distinguish reused files from non-reused files in the rollback segment space, to delete the non-reused files, and to recycle the reused files.
  2. The database multi-version concurrency control system based on fragment-free reclamation according to claim 1, characterized in that the transaction information includes the transaction ID, the commit status, and the timestamp at commit.
  3. The database multi-version concurrency control system based on fragment-free reclamation according to claim 1, characterized in that the transaction data management module comprises:
    a transaction access management unit, configured to use the rollback segment space to manage transaction operations; wherein the operations a transaction performs on records include create, modify, and delete operations;
    a rollback segment space management unit, configured to manage the storage of, and access to, the record versions in the rollback segment.
  4. The database multi-version concurrency control system based on fragment-free reclamation according to claim 3, characterized in that the rollback segment space consists of a hash table and disk files; wherein there is either one disk file or a group of disk files; and when the rollback segment space consists of a group of disk files, every file in the group is a fixed-length file.
  5. The database multi-version concurrency control system based on fragment-free reclamation according to claim 1, characterized in that the data visibility optimization module comprises:
    a transaction access operation constraint unit, configured to use record exclusive locks to enforce write-write mutual exclusion on transaction access operations;
    a record version information access optimization unit, configured to optimize access to record version information according to the outcome of a transaction's read-lock request on the record lock; wherein the outcome is either that the read-lock request succeeds or that it fails;
    an old version record access constraint unit, configured to constrain the access visibility of old record versions according to a comparison between the transaction's begin LSN and the record's begin LSN.
  6. The database multi-version concurrency control system based on fragment-free reclamation according to claim 1, characterized in that the multi-version reclamation module comprises:
    a rollback segment file distinguishing unit, configured to compare, by transaction begin LSN, the age of each rollback segment file with that of the oldest uncommitted transaction, and to distinguish reused files from non-reused files in the rollback segment space according to the result;
    a rollback segment file clearing unit, configured to delete the non-reused files in the rollback segment space;
    a rollback segment file recycling unit, configured to recycle the reused files in the rollback segment space so that they can be used again in rotation.
  7. The database multi-version concurrency control system based on fragment-free reclamation according to claim 1, characterized in that the multi-version reclamation module triggers the clearing and recycling of the rollback segment space by:
    triggering at a preset time interval; or,
    triggering according to a preset threshold on the number of times the rollback segment files have been used.
PCT/CN2020/121149 2019-10-16 2020-10-15 Fragment-free recovery-based database multi-version concurrency control system WO2021073571A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CA3130011A CA3130011A1 (en) 2019-10-16 2020-10-15 Fragment-free recycling-based database multiversion concurrence control (mvcc) system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910986945.8A CN110825752B (en) 2019-10-16 2019-10-16 Database multi-version concurrency control system based on fragment-free recovery
CN201910986945.8 2019-10-16

Publications (1)

Publication Number Publication Date
WO2021073571A1 true WO2021073571A1 (en) 2021-04-22

Family

ID=69549621

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/121149 WO2021073571A1 (en) 2019-10-16 2020-10-15 Fragment-free recovery-based database multi-version concurrency control system

Country Status (3)

Country Link
CN (1) CN110825752B (en)
CA (1) CA3130011A1 (en)
WO (1) WO2021073571A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114428774A (en) * 2022-04-02 2022-05-03 北京奥星贝斯科技有限公司 Constraint relation checking method and device for database
CN114722125A (en) * 2022-04-11 2022-07-08 京东科技信息技术有限公司 Database transaction processing method, device, equipment and computer readable medium

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110825752B (en) * 2019-10-16 2020-11-10 深圳巨杉数据库软件有限公司 Database multi-version concurrency control system based on fragment-free recovery
CN111400279B (en) * 2020-03-12 2021-02-12 腾讯科技(深圳)有限公司 Data operation method, device and computer readable storage medium
CN113419844A (en) * 2020-07-27 2021-09-21 阿里巴巴集团控股有限公司 Space recovery method and device, electronic equipment and computer storage medium
CN116244041B (en) * 2022-12-02 2023-10-27 湖南亚信安慧科技有限公司 Performance optimization method for database sub-transaction
CN116594808B (en) * 2023-04-26 2024-05-28 深圳计算科学研究院 Database rollback resource processing method, device, computer equipment and medium
CN117707607B (en) * 2023-12-29 2024-06-07 中正国际认证(深圳)有限公司 Concurrent management system and method for multi-version memory of data

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103744936A (en) * 2013-12-31 2014-04-23 华为技术有限公司 Multi-version concurrency control method in database and database system
CN106855858A (en) * 2015-12-08 2017-06-16 阿里巴巴集团控股有限公司 Database operation method and device
US20180330106A1 (en) * 2017-05-12 2018-11-15 Microsoft Technology Licensing, Llc Access Control Lists for High-Performance Naming Service
CN109710388A (en) * 2019-01-09 2019-05-03 腾讯科技(深圳)有限公司 Method for reading data, device, electronic equipment and storage medium
CN109871386A (en) * 2017-12-04 2019-06-11 Sap欧洲公司 Multi version concurrency control (MVCC) in nonvolatile memory
CN110825752A (en) * 2019-10-16 2020-02-21 深圳巨杉数据库软件有限公司 Database multi-version concurrency control system based on fragment-free recovery

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9922046B2 (en) * 2011-04-26 2018-03-20 Zettaset, Inc. Scalable distributed metadata file-system using key-value stores
CN110019140B (en) * 2017-12-29 2021-07-16 华为技术有限公司 Data migration method, device, equipment and computer readable storage medium
CN108363806B (en) * 2018-03-01 2020-07-31 上海达梦数据库有限公司 Multi-version concurrency control method and device for database, server and storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103744936A (en) * 2013-12-31 2014-04-23 华为技术有限公司 Multi-version concurrency control method in database and database system
CN106855858A (en) * 2015-12-08 2017-06-16 阿里巴巴集团控股有限公司 Database operation method and device
US20180330106A1 (en) * 2017-05-12 2018-11-15 Microsoft Technology Licensing, Llc Access Control Lists for High-Performance Naming Service
CN109871386A (en) * 2017-12-04 2019-06-11 Sap欧洲公司 Multi version concurrency control (MVCC) in nonvolatile memory
CN109710388A (en) * 2019-01-09 2019-05-03 腾讯科技(深圳)有限公司 Method for reading data, device, electronic equipment and storage medium
CN110825752A (en) * 2019-10-16 2020-02-21 深圳巨杉数据库软件有限公司 Database multi-version concurrency control system based on fragment-free recovery

Cited By (2)

Publication number Priority date Publication date Assignee Title
CN114428774A (en) * 2022-04-02 2022-05-03 北京奥星贝斯科技有限公司 Constraint relation checking method and device for database
CN114722125A (en) * 2022-04-11 2022-07-08 京东科技信息技术有限公司 Database transaction processing method, device, equipment and computer readable medium

Also Published As

Publication number Publication date
CN110825752B (en) 2020-11-10
CN110825752A (en) 2020-02-21
CA3130011A1 (en) 2021-04-22

Similar Documents

Publication Publication Date Title
WO2021073571A1 (en) Fragment-free recovery-based database multi-version concurrency control system
US9223805B2 (en) Durability implementation plan in an in-memory database system
JP3593366B2 (en) Database management method
US9183236B2 (en) Low level object version tracking using non-volatile memory write generations
US8181065B2 (en) Systems and methods for providing nonlinear journaling
US7266669B2 (en) File system with file management function and file management method
US6567928B1 (en) Method and apparatus for efficiently recovering from a failure in a database that includes unlogged objects
EP3493071B1 (en) Multi-version concurrency control (mvcc) in non-volatile memory
US7587429B2 (en) Method for checkpointing a main-memory database
US8560500B2 (en) Method and system for removing rows from directory tables
CN107735774B (en) SMR-aware append-only file system
US11409616B2 (en) Recovery of in-memory databases after a system crash
US20150347547A1 (en) Replication in a NoSQL System Using Fractal Tree Indexes
US8108356B2 (en) Method for recovering data in a storage system
CN110515705B (en) Extensible persistent transactional memory and working method thereof
CN113515501B (en) Nonvolatile memory database management system recovery method and device and electronic equipment
CN113220490A (en) Transaction persistence method and system for asynchronous write-back persistent memory
US20120317384A1 (en) Data storage method
CN114816224A (en) Data management method and data management device
US20230333939A1 (en) Chunk and snapshot deletions
CN115221145A (en) Method and system for solving PostgreSQL database table expansion based on Undo table space
CN117389696A (en) Parallel recovery method and storage medium applied to OLTP memory database
Yang et al. FlashTKV: a high-throughput transactional key-value store on flash solid state drives

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 20875992

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 3130011

Country of ref document: CA

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the EP bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 28.09.2022)

122 Ep: PCT application non-entry in European phase

Ref document number: 20875992

Country of ref document: EP

Kind code of ref document: A1