CN115809247A

CN115809247A - Database system and data processing method

Info

Publication number: CN115809247A
Application number: CN202211655150.7A
Authority: CN
Inventors: 杜嘉暄; 晏东; 邱礼胜; 吕静
Original assignee: Chengdu Ghostcloud Technology Co ltd
Current assignee: Chengdu Ghostcloud Technology Co ltd
Priority date: 2022-12-22
Filing date: 2022-12-22
Publication date: 2023-03-17

Abstract

The invention discloses a database system and a data processing method, relating to the field of databases. It addresses the problems that existing databases have a complex structure, must parse query statements, and cannot meet high-concurrency demands. The key points of the technical scheme are as follows. The system comprises a database operation entry module, which receives a user instruction and sends a corresponding trigger instruction; a transaction processing module, which controls the memory and the disk to execute a transaction when the trigger instruction is the creation of a transaction; a disk, which stores database information comprising database data, free page data and at least two sets of metadata, the database data being stored in a B+ tree structure; and a memory, onto which the database data and the free page data are mapped and in which transactions are executed at the memory level. The database operation entry module receives user instructions directly and controls the database to execute transactions or other operations, so no complex query statements are needed, no additional server has to be configured, and concurrent reads and writes can be supported.

Description

Database system and data processing method

Technical Field

The present invention relates to the field of databases, and more particularly, to a database system and a data processing method.

Background

In an information society, managing and using information resources fully and effectively is a precondition for scientific research and decision management. The database is a core part of a system and an important means for both. To some extent, the choice of database determines the quality of the system, but traditional databases have several disadvantages:

(1) Traditional relational databases structure data into rows that can only be accessed through SQL, which creates overhead when parsing and planning SQL statements.

(2) Most relational databases are independent servers that run independently of the application, adding to the overhead of serializing and transmitting data over the network.

(3) It is difficult to meet the high concurrency demands.

Disclosure of Invention

The present application aims to provide a database system and a data processing method, which solve the above problems.

One aspect of the present application provides a database system, implemented by the following technical solution. The system comprises:

a database operation entry module, used for receiving a user instruction and sending a trigger instruction corresponding to the user instruction;

a transaction processing module, used for controlling the memory and the disk to execute a transaction when the trigger instruction is the creation of a transaction; wherein,

the disk is used for storing database information, and the database information comprises database data, free page data and at least two sets of metadata, the database data being stored in a B+ tree structure; and the memory is used for mapping the database data and the free page data to the memory through the metadata and for executing the transaction at the memory level.

With this technical scheme, on the one hand, the database operation entry module receives the user instruction directly and controls the database to execute transactions or other operations; there are no complex query statements, so unnecessary overhead is reduced. On the other hand, the database system includes a disk that stores the database information, so no additional server needs to be configured and the overhead of serializing and transmitting data over the network is reduced. In addition, at least two sets of metadata are used: when a transaction is executed, the database information on the disk is mapped to the memory through the metadata, so concurrent reads and writes can be supported.

Further, the system also comprises a cache module which is used for storing part of database data by adopting a hash table structure.

Further, in the magnetic disk, the database information is stored in a block form.

Further, the memory includes:

the database module is used for mapping the database data and constructing a B+ tree containing the database data;

the node module is used for storing the database data located at a node of the B+ tree;

the free page module is used for mapping and storing the free page data;

and the cursor module is used for traversing and positioning within the database module when a transaction is executed.

Further, the database operation entry module includes a plurality of encapsulation interfaces, and the encapsulation interfaces include: an interface to open/close a database, an interface to update a database, and an interface to create a transaction.

The application also provides a data processing method, implemented by the following technical solution. Based on the above database system, the method comprises a write transaction with the following steps:

S110, initializing the database and the transaction: mapping the database information on the disk to the memory to obtain a B+ tree containing the database data;

S120, writing data: traversing the B+ tree, locating the B+ tree node corresponding to the user instruction, and writing data into the node;

S130, committing the transaction: balancing the B+ tree, updating the database information, and converting the updated database information into blocks that are flushed to the disk; and querying whether the B+ tree node corresponding to the user instruction exists in the cache module, and if so, marking the node as invalid.

Further, committing the transaction includes: splitting and merging the nodes of the B+ tree, allocating new blocks for the modified nodes, allocating a new block for the free page data, writing the new blocks to the disk, and finally writing the metadata to the disk.

Further, the method comprises a read transaction with the following steps:

S210, preliminary query: querying the cache module for the database data corresponding to the user instruction, and if the data exists, returning it; if not, initializing the database and the transaction: mapping the database information on the disk to the memory to obtain a B+ tree containing the database data;

S220, deep query: traversing the B+ tree, locating the B+ tree node corresponding to the user instruction, and returning the database data of that node.

Further, when a write transaction executes, a pessimistic lock is used to lock the database data until the transaction has been committed, after which the pessimistic lock is released.

Furthermore, the blocks are arranged in ascending order according to the identifiers, and are written into the disk file through sequential scanning.

Compared with the prior art, the method has the following beneficial effects:

The data on the disk can be accessed through the database operation entry module, query statements do not need to be parsed, and the system can run without a server; the data is stored in a B+ tree structure and accessed through the B+ tree index, so reads and writes are efficient and stable; read-read concurrency and read-write concurrency are ensured by using two sets of metadata together with memory mapping; and a cache module stores commonly used data, which further improves read performance.

Drawings

The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention. In the drawings:

FIG. 1 is a schematic structural diagram of a database system according to an embodiment of the present invention;

FIG. 2 is a flow chart of a write transaction according to an embodiment of the present invention;

FIG. 3 is a swimlane diagram of a write transaction according to an embodiment of the present invention;

FIG. 4 is a swimlane diagram of a read transaction according to an embodiment of the present invention.

Detailed Description

Hereinafter, the terms "includes" or "may include" used in various embodiments of the present application indicate the presence of the claimed function, operation, or element, and do not limit the addition of one or more functions, operations, or elements. Furthermore, as used in various embodiments of the present application, the terms "comprising," "having," and their derivatives, are intended to be inclusive and mean only that a particular feature, number, step, operation, element, component, or combination of the foregoing is meant, and should not be construed as first excluding the presence of, or adding to, one or more other features, numbers, steps, operations, elements, components, or combinations of the foregoing.

In various embodiments of the present application, the expression "or" or "at least one of B or/and C" includes any or all combinations of the words listed together. For example, the expression "B or C" or "at least one of B or/and C" may include B, may include C, or may include both B and C.

Expressions (such as "first", "second", and the like) used in various embodiments of the present application may modify various constituent elements in the various embodiments, but may not limit the respective constituent elements. For example, the above description does not limit the order and/or importance of the elements described. The foregoing description is for the purpose of distinguishing one element from another. For example, the first user device and the second user device indicate different user devices, although both are user devices. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of various embodiments of the present application.

It should be noted that if one constituent element is described as being "connected" to another constituent element, the first constituent element may be directly connected to the second constituent element, or a third constituent element may be connected between the first constituent element and the second constituent element. In contrast, when one constituent element is described as being "directly connected" to another constituent element, it is understood that there is no third constituent element between the first constituent element and the second constituent element.

The terminology used in the various embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the various embodiments of the present application. As used herein, the singular forms are intended to include the plural forms as well, unless the context clearly indicates otherwise. Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the various embodiments of the present application belong. The terms (such as those defined in commonly used dictionaries) should be interpreted as having a meaning that is consistent with their contextual meaning in the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein in various embodiments.

To make the purpose, technical solution and advantages of the present application more apparent, the present application is further described in detail below with reference to examples and drawings, and the exemplary embodiments and descriptions thereof are only used for explaining the present application and are not used as limitations of the present application.

The present application provides a database system whose structure is shown in FIG. 1. The system includes a database operation entry module, a transaction processing module and a cache module;

the transaction processing module is used for controlling the memory and the disk to execute a transaction when the trigger instruction is the creation of a transaction; the disk is used for storing database information, and the database information comprises database data, free page data and at least two sets of metadata, the database data being stored in a B+ tree structure; the memory is used for mapping the database data and the free page data to the memory through the metadata and for executing the transaction at the memory level;

and the cache module is used for storing part of the database data in a hash table structure, where the part of the database data refers to commonly used database data, which helps improve the read efficiency of the database.
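As an illustrative, non-limiting sketch, a hash-table cache of this kind could look as follows in Python. The names `HashCache`, `CacheEntry` and the `valid` flag are assumptions made for the example; the application only states that the cache uses a hash table and that entries can be marked invalid.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CacheEntry:
    value: bytes
    valid: bool = True  # cleared when a committed write makes the entry stale


class HashCache:
    """Hash-table cache for commonly used key/value data (hypothetical name)."""

    def __init__(self) -> None:
        self._table: Dict[bytes, CacheEntry] = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._table[key] = CacheEntry(value)

    def get(self, key: bytes) -> Optional[bytes]:
        entry = self._table.get(key)
        if entry is None or not entry.valid:
            return None  # miss: the caller falls back to the B+ tree
        return entry.value

    def invalidate(self, key: bytes) -> None:
        entry = self._table.get(key)
        if entry is not None:
            entry.valid = False
```

An invalid entry is treated exactly like a miss, so a reader never receives a value that a committed write has superseded.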

Further, in the disk, the database information is converted into blocks for storage.

Specifically, the disk stores a plurality of files; each file corresponds to one database and stores the database information of that database. The file is divided into blocks according to the page size, and data is read and written in units of blocks. When there are two sets of metadata, they are stored in the first two blocks; a dedicated block stores the free page data, namely the ids of the free blocks; the remaining blocks form the B+ tree structure and store the database data.

It should be noted that the metadata includes basic information of the database, such as an id of a free block, a root page of the database, and the like, and the root page of the database stores database data.
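The block layout can be illustrated with a short Python sketch. The page size of 4096 bytes and the four metadata fields (transaction id, root block id, free-page block id, block count) are assumptions for the example; the application only states that metadata occupies the first two blocks and records basic information such as the free block id and the root page.

```python
import struct

PAGE_SIZE = 4096        # assumed block size; the source only says "page size"
META_BLOCKS = (0, 1)    # the two metadata sets occupy the first two blocks
META_FORMAT = "<QQQQ"   # assumed fields, each an unsigned 64-bit integer


def block_offset(block_id: int) -> int:
    """Byte offset of a block inside the database file."""
    return block_id * PAGE_SIZE


def pack_meta(txid: int, root_id: int, freelist_id: int, block_count: int) -> bytes:
    """Serialize one metadata set into a full, zero-padded block."""
    body = struct.pack(META_FORMAT, txid, root_id, freelist_id, block_count)
    return body.ljust(PAGE_SIZE, b"\x00")


def unpack_meta(block: bytes) -> dict:
    """Read the metadata fields back from the start of a block."""
    txid, root_id, freelist_id, block_count = struct.unpack_from(META_FORMAT, block)
    return {
        "txid": txid,
        "root": root_id,
        "freelist": freelist_id,
        "blocks": block_count,
    }
```

For example, `unpack_meta(pack_meta(7, 3, 2, 4))` recovers the four fields, and `block_offset(2)` gives the position of the block that follows the two metadata blocks.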

Further, the memory includes:

the database module is used for mapping the database data and constructing a B+ tree containing the database data;

and the cursor module is used for traversing and positioning within the database module when a transaction is executed.

Specifically, the memory takes the metadata on the disk as the entry point and maps the database information on the disk to the memory through mmap, forming a database module, a node module and a free page module; wherein,

the database module can be regarded as a huge B+ tree in the memory on which the database data is stored. The B+ tree comprises a plurality of buckets; a bucket can store key-value pair data (key/value), can store index data, and can contain nested sub-buckets. The bucket is an abstraction encapsulated by the database; it is essentially a namespace, a collection of key-value pair data, and is equivalent to a table. Logically, each database system maintains a root page that stores all user-created buckets, and within a bucket the user can insert key-value pair data or create further nested buckets as needed, so the data stored in the whole database can be regarded as one huge B+ tree. Physically, each bucket is itself a B+ tree, and these B+ trees together form the huge tree according to the nesting relationship of the logical structure.

The node module corresponds to a single node of the B+ tree and contains the data of that node. The nodes comprise internal nodes and leaf nodes; the internal nodes store index data pointing to the nested sub-buckets, and the leaf nodes store key-value pair data. The nodes correspond to blocks on the disk: nodes are serialized into blocks, blocks are deserialized into nodes, and one node can correspond to a plurality of blocks.

The free page module stores the ids of the free blocks, which are candidates for allocation when blocks are allocated to nodes.

The cursor module encapsulates the low-level operations and provides methods such as traversal, positioning and searching for buckets.
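The relationship between nodes, blocks and free pages can be sketched in Python as follows. The length-prefixed encoding, the tiny 64-byte block size and the names `serialize_node`, `deserialize_node` and `FreePages` are assumptions chosen to keep the example short; the application does not specify an on-disk encoding.

```python
import struct
from typing import List, Tuple

BLOCK_SIZE = 64  # deliberately tiny so one node spans several blocks


def serialize_node(pairs: List[Tuple[bytes, bytes]]) -> List[bytes]:
    """Serialize a leaf node's key/value pairs into one or more blocks."""
    payload = struct.pack("<I", len(pairs))
    for key, value in pairs:
        payload += struct.pack("<II", len(key), len(value)) + key + value
    payload = struct.pack("<I", len(payload)) + payload  # total-length prefix
    blocks = [payload[i:i + BLOCK_SIZE] for i in range(0, len(payload), BLOCK_SIZE)]
    blocks[-1] = blocks[-1].ljust(BLOCK_SIZE, b"\x00")   # pad the last block
    return blocks


def deserialize_node(blocks: List[bytes]) -> List[Tuple[bytes, bytes]]:
    """Rebuild the key/value pairs from the blocks of one node."""
    raw = b"".join(blocks)
    (size,) = struct.unpack_from("<I", raw, 0)
    payload = raw[4:4 + size]
    (count,) = struct.unpack_from("<I", payload, 0)
    pos, pairs = 4, []
    for _ in range(count):
        klen, vlen = struct.unpack_from("<II", payload, pos)
        pos += 8
        key = payload[pos:pos + klen]
        pos += klen
        value = payload[pos:pos + vlen]
        pos += vlen
        pairs.append((key, value))
    return pairs


class FreePages:
    """Tracks free block ids; reuses them before growing the file."""

    def __init__(self, free_ids: List[int], next_id: int) -> None:
        self.free_ids = sorted(free_ids)
        self.next_id = next_id  # first block id past the end of the file

    def allocate(self) -> int:
        if self.free_ids:
            return self.free_ids.pop(0)
        block_id = self.next_id
        self.next_id += 1
        return block_id

    def release(self, block_id: int) -> None:
        self.free_ids.append(block_id)
        self.free_ids.sort()
```

In this sketch a node with two 50-byte values occupies three 64-byte blocks, and `FreePages([5, 3], 10)` hands out blocks 3 and 5 before extending the file with block 10.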

Specifically, the encapsulation interfaces encapsulate the database operations into interfaces for the client to use, for example: the open-database interface Open(), in which the user fills in the database name; the write-transaction interface Put(), in which the user only needs to fill in the key and the value; and the read-transaction interface Get(), in which the user fills in the key.
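The calling pattern can be sketched in Python with a toy stand-in. The names `open_db`, `put`, `get` and `close` mirror Open(), Put() and Get() but are hypothetical, and an ordinary dictionary takes the place of the B+ tree; the point is only that the client passes keys and values directly, with no query statement to parse.

```python
from typing import Dict, Optional


class Database:
    """Toy stand-in for the database operation entry module."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._data: Dict[bytes, bytes] = {}  # stands in for the B+ tree
        self._is_open = True

    def put(self, key: bytes, value: bytes) -> None:
        """Write path: the caller supplies only a key and a value."""
        self._check_open()
        self._data[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        """Read path: the caller supplies only a key."""
        self._check_open()
        return self._data.get(key)

    def close(self) -> None:
        self._is_open = False

    def _check_open(self) -> None:
        if not self._is_open:
            raise RuntimeError("database is closed")


def open_db(name: str) -> Database:
    """Open path: the caller supplies only the database name."""
    return Database(name)
```

A client would call `db = open_db("demo.db")`, then `db.put(b"k", b"v")` and `db.get(b"k")`, and finally `db.close()`.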

It should be noted that an app may be placed in front of the database operation entry module; the app integrates the encapsulation interfaces in order to interact with the user.

With this database system, data is accessed directly through the encapsulation interfaces without query statements, which reduces unnecessary overhead; the disk stores the database information, so the system can run without a server; read-read concurrency and read-write concurrency are ensured by using two sets of metadata together with memory mapping; the cache module placed in front of the database improves its read-write performance; and, because access is based on the B+ tree index, reads are efficient and stable.

The present application further provides a data processing method. As shown in FIGS. 2 to 4, embodiments of the present application provide a specific implementation of read and write transactions. It should be noted that every operation of the system is assigned to a transaction lightx. A transaction is a layer of abstraction that the database provides to the application; it hides all concurrency problems and the various possible software and hardware errors, and exposes only two states to the application: successful commit and abort. All modifications made in a transaction are either executed in full or rolled back; there is no half-executed intermediate state.

A write transaction comprises the following steps:

S120, writing data: traversing the B+ tree, locating the B+ tree node corresponding to the user instruction, and writing data into the node;

S130, committing the transaction: balancing the B+ tree, updating the database information, and converting the updated database information into blocks that are flushed to the disk; and querying whether the B+ tree node corresponding to the user instruction exists in the cache module, and if so, marking the node as invalid.

Specifically, before a write transaction, the app controls the database operation entry module ghdb to open and initialize the database DB. The user fills in the database name through the open-database interface Open(), and the database system reads the file with that name. If the file does not exist, it is created, information such as the metadata meta is written into it, and the file is mapped to the memory through mmap; if the file exists, the metadata meta in the file is read and mapped to the memory through mmap.
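The open step can be sketched with Python's standard `mmap` module. The file layout used here (two metadata blocks, one free-page block, one empty root block) and the two metadata fields are assumptions for the example, as are the names `open_database` and `read_meta`.

```python
import mmap
import os
import struct

PAGE_SIZE = 4096
META_FORMAT = "<QQ"  # assumed fields: transaction id, root block id


def open_database(path: str) -> mmap.mmap:
    """Create the file with initial metadata if missing, then map it read-only."""
    if not os.path.exists(path):
        with open(path, "wb") as f:
            for txid in (0, 1):
                # Assumption: the root node starts out in block 3.
                meta = struct.pack(META_FORMAT, txid, 3)
                f.write(meta.ljust(PAGE_SIZE, b"\x00"))
            # Block 2 holds the free page data, block 3 an empty root node.
            f.write(b"\x00" * PAGE_SIZE * 2)
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the mapping outlives the file object.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def read_meta(mapped: mmap.mmap, slot: int) -> dict:
    """Decode one of the two metadata sets directly from the mapping."""
    txid, root_id = struct.unpack_from(META_FORMAT, mapped, slot * PAGE_SIZE)
    return {"txid": txid, "root": root_id}
```

Because the metadata is read straight out of the mapping, no separate read call or copy is needed once the file is open.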

S110, initializing the database and the transaction: when the user fills in a key/value through the data writing interface Put(), the transaction processing module rightx is triggered to start a write transaction; the transaction GhTX is initialized and the metadata meta is copied to the transaction GhTX, that is, the database module Bucket is obtained and the B+ tree containing the database data is mapped to the memory.

S120, writing data: the database module Bucket is traversed through the cursor module cursor, the B+ tree node corresponding to the key is located, and the value is written into the node; if the write fails, the transaction is rolled back and ends; if the write succeeds, the transaction GhTX is committed.

S130, committing the transaction (commit GhTX): according to the structural requirements that the database module Bucket places on the B+ tree, the B+ tree is split and rebalanced (Rebalance); a new block is allocated for each modified node and for the free page data, that is, they are serialized; the new blocks are written to the disk, and finally the metadata meta is written to the disk; the transaction GhTX is then released and the write transaction is complete. Meanwhile, the cache module is queried for the B+ tree node corresponding to the key of the user instruction; if it exists, the node is marked as invalid, which avoids errors in later read transactions.
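The ordering of the commit step can be sketched in Python as follows. The in-memory `Disk` class, the alternation between the two metadata slots and the rule that the set with the highest transaction id is the current one are assumptions for the example; the application only states that at least two sets of metadata exist and that the metadata is written last.

```python
from typing import Dict, Iterable, List, Optional, Tuple


class Disk:
    """In-memory stand-in for the database file (hypothetical)."""

    def __init__(self) -> None:
        self.blocks: Dict[int, bytes] = {}
        self.metas: List[dict] = [
            {"txid": 0, "root": None},
            {"txid": 0, "root": None},
        ]
        self.log: List[Tuple[str, int]] = []  # order of writes, for illustration

    def current_meta(self) -> dict:
        # Assumption: the metadata set with the highest txid is the live one.
        return max(self.metas, key=lambda m: m["txid"])


def commit(disk: Disk, cache: Dict[bytes, dict], dirty_blocks: Dict[int, bytes],
           new_root: Optional[int], touched_keys: Iterable[bytes]) -> int:
    """Write new blocks first, metadata last, then invalidate cached keys."""
    txid = disk.current_meta()["txid"] + 1
    # The dirty blocks carry freshly allocated ids, so blocks that readers
    # reach through the older metadata set are never overwritten.
    for block_id in sorted(dirty_blocks):
        disk.blocks[block_id] = dirty_blocks[block_id]
        disk.log.append(("block", block_id))
    # The metadata goes to the slot that is not the current one, and goes last.
    disk.metas[txid % 2] = {"txid": txid, "root": new_root}
    disk.log.append(("meta", txid % 2))
    for key in touched_keys:
        if key in cache:
            cache[key]["valid"] = False
    return txid
```

Until the final metadata write lands, a reader that starts from the older metadata set still sees a complete, consistent tree, which is one way two metadata sets can support reading while a write is in progress.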

A read transaction comprises the following steps:

S210, preliminary query: querying the cache module for the database data corresponding to the user instruction, and if the data exists, returning it; if not, initializing the database and the transaction: mapping the database information on the disk to the memory to obtain a B+ tree containing the database data;

S220, deep query: traversing the B+ tree, locating the B+ tree node corresponding to the user instruction, and returning the database data of that node.

Specifically, in S210, when the user fills in a key through the data reading interface Get(), the transaction processing module lightx is triggered to execute a read transaction; the cache module is queried first, and if the key is not in the cache module, the database DB and the transaction GhTX are initialized with reference to step S110;

S220, the B+ tree in the database module Bucket is traversed through the cursor module cursor, the B+ tree node corresponding to the key is located, the value at that node is returned, the transaction GhTX is released, and the read transaction is complete.
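The two-stage read can be sketched in Python as follows. The dictionary-based node representation and the names `make_leaf`, `make_internal`, `seek` and `get` are assumptions for the example; the cache is a plain dictionary assumed to hold only valid entries.

```python
from bisect import bisect_left, bisect_right
from typing import Dict, List, Optional, Tuple


def make_leaf(pairs: List[Tuple[bytes, bytes]]) -> dict:
    """Leaf node: sorted keys with their values."""
    return {"leaf": True,
            "keys": [k for k, _ in pairs],
            "values": [v for _, v in pairs]}


def make_internal(keys: List[bytes], children: List[dict]) -> dict:
    """Internal node: keys[i] is the smallest key stored under children[i + 1]."""
    return {"leaf": False, "keys": keys, "children": children}


def seek(root: dict, key: bytes) -> Optional[bytes]:
    """Deep query: descend from the root to the leaf, then search the leaf."""
    node = root
    while not node["leaf"]:
        node = node["children"][bisect_right(node["keys"], key)]
    i = bisect_left(node["keys"], key)
    if i < len(node["keys"]) and node["keys"][i] == key:
        return node["values"][i]
    return None


def get(cache: Dict[bytes, bytes], root: dict, key: bytes) -> Optional[bytes]:
    """Preliminary query in the cache, then the deep query on a miss."""
    if key in cache:
        return cache[key]
    return seek(root, key)
```

On a cache hit the tree is never touched; on a miss the cost is one binary search per level of the tree.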

Furthermore, when a write transaction executes, a pessimistic lock is used to lock the database data until the transaction has been committed, after which the lock is released; thus, when several write transactions are initiated at the same time, only one of them modifies the disk file at any moment.
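A minimal Python sketch of this locking discipline uses a single mutex held for the whole write transaction. The class name `WriterGate` and the copy-then-publish commit are assumptions for the example.

```python
import threading
from typing import Callable, Dict


class WriterGate:
    """Serializes write transactions with one exclusive lock (hypothetical)."""

    def __init__(self) -> None:
        self._write_lock = threading.Lock()
        self.data: Dict[str, int] = {}

    def write(self, mutate: Callable[[Dict[str, int]], None]) -> None:
        # The lock is taken before any change and released only after the
        # commit, so concurrent writers run strictly one after another.
        with self._write_lock:
            staged = dict(self.data)  # private working copy
            mutate(staged)
            self.data = staged        # "commit": publish the new state
```

With four threads each performing 200 increments through `write`, the final counter is exactly 800; without the lock, two writers could copy the same state and one update would be lost.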

Furthermore, the blocks are arranged in ascending order of their identifiers and written to the disk file in a single sequential scan, which saves serialization and deserialization overhead.
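The ordered flush can be sketched in Python as follows. The 16-byte page size and the name `flush_blocks` are assumptions for the example, and an in-memory `io.BytesIO` object stands in for the disk file.

```python
import io
from typing import BinaryIO, Dict, List

PAGE_SIZE = 16  # tiny page size, for illustration only


def flush_blocks(f: BinaryIO, dirty: Dict[int, bytes]) -> List[int]:
    """Write dirty blocks in ascending id order so file offsets only increase."""
    written = []
    for block_id in sorted(dirty):
        f.seek(block_id * PAGE_SIZE)
        f.write(dirty[block_id].ljust(PAGE_SIZE, b"\x00"))
        written.append(block_id)
    return written
```

Calling `flush_blocks(io.BytesIO(), {3: b"c", 1: b"a", 2: b"b"})` writes blocks 1, 2 and 3 in that order, so the file position moves forward monotonically.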

In this data processing method, all operations before transaction commit are carried out in the memory, and the data on the disk is read into the memory through mmap, so the database can be accessed and operated on entirely in the memory. A write transaction holds the write lock exclusively and releases it only when the transaction is complete, which ensures that only one modification of the disk file takes place at a time.

The above embodiments are provided to further explain the objects, technical solutions and advantages of the present invention in detail, it should be understood that the above embodiments are merely exemplary embodiments of the present invention and are not intended to limit the scope of the present invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims

1. A database system, characterized in that it comprises:

a transaction processing module, used for controlling the memory and the disk to execute a transaction when the trigger instruction is the creation of a transaction; wherein,

2. A database system according to claim 1, wherein: the system also comprises a cache module which is used for storing part of database data by adopting a hash table structure.

3. A database system according to claim 2, wherein: in the disk, the database information is converted into blocks for storage.

4. A database system according to claim 1, wherein: the memory comprises:

5. A database system according to claim 1, wherein: the database operation entry module includes a plurality of encapsulation interfaces, the encapsulation interfaces including: an interface to open/close a database, an interface to update a database, and an interface to create a transaction.

6. A data processing method, characterized in that: it is based on a database system according to any of claims 3 to 5 and comprises a write transaction with the following steps:

S130, committing the transaction: balancing the B+ tree, updating the database information, and converting the updated database information into blocks that are flushed to the disk; and querying whether the B+ tree node corresponding to the user instruction exists in the cache module, and if so, marking the node as invalid.

7. A data processing method according to claim 6, characterized in that: committing the transaction comprises splitting and merging the nodes of the B+ tree, allocating new blocks for the modified nodes, allocating a new block for the free page data, writing the new blocks to the disk, and finally writing the metadata to the disk.

8. A data processing method according to claim 6, characterized in that: it further comprises a read transaction with the following steps:

S210, preliminary query: querying the cache module for the database data corresponding to the user instruction, and if the data exists, returning it; if not, initializing the database and the transaction: mapping the database information on the disk to the memory to obtain a B+ tree containing the database data;

S220, deep query: traversing the B+ tree, locating the B+ tree node corresponding to the user instruction, and returning the database data of that node.

9. A data processing method according to claim 8, characterized in that: when a write transaction executes, a pessimistic lock is used to lock the database data until the transaction has been committed, after which the pessimistic lock is released.

10. A data processing method according to claim 6, characterized by: the blocks are arranged in ascending order according to the identifiers and written into the disk file through sequential scanning.