CN116226038A

CN116226038A - File system acceleration method and system based on LevelDB

Info

Publication number: CN116226038A
Application number: CN202211524055.3A
Authority: CN
Inventors: 赵泽钧; 袁苇
Original assignee: Fujian Newland Communication Science Technologies Co ltd
Current assignee: Fujian Newland Communication Science Technologies Co ltd
Priority date: 2022-12-01
Filing date: 2022-12-01
Publication date: 2023-06-06

Abstract

The invention provides a file system acceleration method and system based on LevelDB, belonging to the technical field of software engineering. The method comprises the following steps: step S10, creating a LevelDB table in user space; step S20, requesting an inode number for each file to be stored through a global counter; step S30, setting a capacity threshold, and storing each file into either a local file system in kernel space or the LevelDB table based on the capacity threshold and the inode number; step S40, storing the update log of the LevelDB table to the local file system; and step S50, quickly accessing small files based on the LevelDB table in the local file system. The advantage of the invention is that the access performance of small files is greatly improved.

Description

File system acceleration method and system based on LevelDB

Technical Field

The invention relates to the technical field of software engineering, and in particular to a file system acceleration method and system based on LevelDB.

Background

In current file systems, directory entries are stored in a linear array in a single file and are simply associated with inode numbers. A file system such as ext4 uses a hash table for directory association operations, while XFS, ZFS, etc. use B-trees for indexing directories; at the same time, LFS, ZFS, etc. also use log techniques to batch changes to metadata in a file system and write in a sequential fashion, which can centralize all metadata needed to access a file.

With the advent of the big data era, large numbers of small files are being stored in file systems. Conventional file systems, however, are designed for high-bandwidth transfers of large files, and when a large number of small files is accessed they often perform poorly because of limited cache coverage.

Therefore, how to provide a LevelDB-based file system acceleration method and system that improves small-file access performance has become a technical problem to be solved urgently.

Disclosure of Invention

The invention aims to solve the technical problem of providing a file system acceleration method and a file system acceleration system based on a LevelDB, which can improve the access performance of small files.

In a first aspect, the present invention provides a file system acceleration method based on LevelDB, including the following steps:

step S10, creating a LevelDB table in user space;

step S20, requesting an inode number for each file to be stored through a global counter;

step S30, setting a capacity threshold, and storing each file into either a local file system in kernel space or the LevelDB table based on the capacity threshold and the inode number;

step S40, storing the update log of the LevelDB table to the local file system;

step S50, quickly accessing small files based on the LevelDB table in the local file system.

Further, in step S10, the LevelDB table adopts a key-value structure. The key is 128 bits long: the first 64 bits are the inode number of the parent directory, and the last 64 bits are the hash of the small file's name. The value of each row includes at least the file name of the small file, its inode number, access permissions, file size, timestamp, and the data carried by the small file.

Further, in the step S20, the inode number is 64 bits long.

Further, the step S30 specifically includes:

setting a capacity threshold, classifying files into large files and small files based on the capacity threshold, storing the large files into a local file system in kernel space based on the inode number, and storing the small files into the LevelDB table based on the inode number.

Further, the step S40 specifically includes:

storing the LevelDB table to the local file system, wherein, using the built-in write-ahead log function of LevelDB, the update log of the LevelDB table is synchronized to the local file system according to a preset synchronization period.

In a second aspect, the present invention provides a file system acceleration system based on LevelDB, including the following modules:

the LevelDB table creation module, used for creating a LevelDB table in user space;

the inode number application module, used for requesting an inode number for each file to be stored through a global counter;

the file classification storage module, used for setting a capacity threshold and storing each file into either a local file system in kernel space or the LevelDB table based on the capacity threshold and the inode number;

the LevelDB table storage module, used for storing the LevelDB table and the update log of the LevelDB table to the local file system;

the small file access module, used for quickly accessing small files based on the LevelDB table in the local file system.

Further, in the LevelDB table creation module, the LevelDB table adopts a key-value structure. The key is 128 bits long: the first 64 bits are the inode number of the parent directory, and the last 64 bits are the hash of the small file's name. The value of each row includes at least the file name of the small file, its inode number, access permissions, file size, timestamp, and the data carried by the small file.

Further, in the inode number application module, the inode number is 64 bits long.

Further, the file classification storage module is specifically configured to:

Further, the LevelDB table storage module specifically includes:

The invention has the advantages that:

By creating a LevelDB table, storing small files whose size is below the set capacity threshold in the LevelDB table based on their inode numbers, storing the directory entries of the small files in the same LevelDB table, and storing the LevelDB table in a local file system, all small files are concentrated in a single key-value LevelDB table. This solves the addressing problem that arises when many different small files are accessed frequently: the data of any small file can be located quickly through one LevelDB table, which greatly improves small-file access performance.

Drawings

The invention will be further described with reference to examples of embodiments with reference to the accompanying drawings.

FIG. 1 is a flow chart of a LevelDB-based file system acceleration method of the present invention.

FIG. 2 is a schematic structural diagram of a LevelDB-based file system acceleration system of the present invention.

FIG. 3 is a schematic diagram of the architecture of the present invention.

Detailed Description

According to the technical scheme in the embodiment of the application, the overall thought is as follows: the small files are stored into the LevelDB table based on the inode number, and then the LevelDB table is stored into a local file system, namely, all the small files are concentrated into the LevelDB table with a key value structure, so that the addressing problem when different small files are accessed frequently is solved, and the small file access performance is improved.

Referring to FIG. 1 to FIG. 3, a preferred embodiment of the LevelDB-based file system acceleration method of the present invention includes the following steps:

step S10, creating a LevelDB table in user space; small files can be stored in the LevelDB table, which improves small-file access performance, and the approach is compatible with most modern POSIX-compliant local file systems on Linux;

step S20, applying for an inode number for each file to be stored through a global counter; the global counter will self-increment when creating a new file or new directory;

step S40, storing the update log of the LevelDB table to a local file system;

The invention can be used in three modes: first, as a Linux kernel module embedded beneath the VFS, which gives the best performance; second, as an independent application-layer service separated from the kernel, with other applications interacting with the local file system through the FUSE library of Linux; third, as a library running directly inside the application, which has the advantage of being convenient to deploy and run.

In step S10, the LevelDB table adopts a key-value structure. The key is 128 bits long: the first 64 bits are the inode number of the parent directory, and the last 64 bits are the hash of the small file's name. The value of each row includes at least the file name of the small file, its inode number, access permissions, file size, timestamp, and the data carried by the small file.
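The 128-bit key layout can be illustrated with a minimal sketch. It assumes big-endian packing, so that byte-wise key ordering groups all entries of one directory together. The helper names `make_key`, `split_key`, and `name_hash`, and the use of a truncated SHA-256 as the file-name hash, are illustrative assumptions; the source does not specify a hash function.

```python
import hashlib
import struct


def name_hash(name):
    """Hypothetical 64-bit file-name hash: first 8 bytes of SHA-256 (an assumption)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def make_key(parent_inode, name):
    """Pack <parent inode number, hash(file name)> into a 16-byte (128-bit) key."""
    return struct.pack(">QQ", parent_inode, name_hash(name))


def split_key(key):
    """Unpack a 16-byte key back into (parent inode number, name hash)."""
    return struct.unpack(">QQ", key)


key = make_key(1, "foo")
parent, hashed = split_key(key)
```

Because the parent inode number occupies the most significant bytes, keys of entries in the same directory are adjacent in a sorted key-value store.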

All directory entries in the same directory share the same first 64 bits of the key. When readdir() is called, once the inode number of the target directory is obtained, all entries whose keys begin with that inode number are scanned sequentially in the LevelDB table. When the path of a single file is resolved, the lookup starts from the root directory, and the current inode number is repeatedly combined with the hash of the next path component to look up the table until the row representing the file is found.
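The directory scan and path lookup described here can be sketched as follows. This is only an illustration: the `Table` class is a toy in-memory ordered table standing in for the LevelDB table, the truncated SHA-256 hash is an assumption, and a root inode number of 0 is taken from the example rows in this description.

```python
import bisect
import hashlib
import struct


def make_key(parent_inode, name):
    # Hypothetical hash choice: first 8 bytes of SHA-256 of the file name.
    h = int.from_bytes(hashlib.sha256(name.encode("utf-8")).digest()[:8], "big")
    return struct.pack(">QQ", parent_inode, h)


class Table:
    """Toy ordered key-value table standing in for the LevelDB table."""

    def __init__(self):
        self._keys = []   # sorted list of 16-byte keys
        self._rows = {}   # key -> (inode number, file name)

    def put(self, parent_inode, name, inode):
        key = make_key(parent_inode, name)
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = (inode, name)

    def get(self, parent_inode, name):
        return self._rows.get(make_key(parent_inode, name))

    def readdir(self, dir_inode):
        """Scan every key whose first 64 bits equal dir_inode."""
        prefix = struct.pack(">Q", dir_inode)
        start = bisect.bisect_left(self._keys, prefix)
        names = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            names.append(self._rows[key][1])
        return names

    def resolve(self, path, root_inode=0):
        """Walk the path from the root, one table lookup per component."""
        inode = root_inode
        for part in (p for p in path.split("/") if p):
            row = self.get(inode, part)
            if row is None:
                return None
            inode = row[0]
        return inode


table = Table()
table.put(0, "home", 1)
table.put(1, "foo", 2)
table.put(1, "bar", 3)
table.put(2, "book", 5)
```

With these rows, `table.readdir(1)` lists the entries of `/home`, and `table.resolve("/home/foo/book")` performs three lookups, one per path component.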

When hard links exist, two or more rows would carry the same inode number, file attributes, and data, so special handling is needed. For such duplicate entries, only one row is kept to store the attributes and data: the first 64 bits of that row's key are the file's own inode number and the last 64 bits are null, while the other rows carry only a hard-link tag in their value.
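A minimal sketch of this hard-link handling follows, assuming a plain dictionary in place of the LevelDB table, tuple keys of the form (first 64 bits, last 64 bits), and a hypothetical `HARDLINK` tag value. The hash placeholders `"h4"`, `"h5"`, and `"h6"` follow the example table in this description.

```python
HARDLINK = "hardlink"  # hypothetical tag stored in the value of a link row


def lookup(table, parent_inode, hashed_name):
    """Return (inode number, attributes), following a hard-link tag if present."""
    inode, _name, payload = table[(parent_inode, hashed_name)]
    if payload == HARDLINK:
        # Attributes and data live in the single shared row <inode, null>.
        inode, payload = table[(inode, None)]
    return inode, payload


table = {
    (2, "h4"): (4, "apple", HARDLINK),
    (3, "h6"): (4, "pear", HARDLINK),
    (4, None): (4, {"size": 8192}),
    (2, "h5"): (5, "book", {"size": 100}),
}
```

Both link rows resolve to the same shared row, so an attribute update needs to touch only one place.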

LevelDB guarantees atomicity for a simple write to a single row, but not for a read-modify-write operation. Because all file attributes are stored in LevelDB as row values, read-modify-write operations are frequent; the implementation therefore uses lightweight fine-grained locks to ensure both safety and performance.
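One possible way to realize such fine-grained locking is lock striping, sketched below. This is an assumption for illustration: the source does not say how its locks are organized, and `LockedTable` with its dictionary storage is a stand-in for the real table.

```python
import threading


class LockedTable:
    """Toy table whose read-modify-write updates are guarded by striped locks."""

    def __init__(self, stripes=64):
        self._rows = {}
        self._locks = [threading.Lock() for _ in range(stripes)]

    def _lock_for(self, key):
        # Keys that hash to different stripes can be updated concurrently.
        return self._locks[hash(key) % len(self._locks)]

    def put(self, key, value):
        with self._lock_for(key):
            self._rows[key] = value

    def update(self, key, modify):
        """Read the row, apply `modify`, and write it back under one lock."""
        with self._lock_for(key):
            self._rows[key] = modify(self._rows[key])
            return self._rows[key]


attrs = LockedTable()
attrs.put(b"row", {"size": 0})
attrs.update(b"row", lambda row: {**row, "size": row["size"] + 512})
```

Striping keeps the number of locks fixed while still letting updates to unrelated rows proceed in parallel, which is one way to balance safety against lock overhead.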

The LevelDB table is exemplified as follows:

Key	Value
<0,h1>	1, "home", struct stat
<1,h2>	2, "foo", struct stat
<1,h3>	3, "bar", struct stat
<2,h4>	4, "apple", hard link
<2,h5>	5, "book", struct stat, inline small file (< 4 KB)
<3,h6>	4, "pear", hard link
<4,null>	4, struct stat, large file pointer (> 4 KB)

LevelDB is a key-value database built on an LSM tree (Log-Structured Merge Tree) and provides APIs such as GET, PUT, DELETE, and SCAN. The basic principle of LevelDB and the LSM tree is to manage multiple large sorted data arrays (SSTables) on the local disk in a log-structured manner. When data is added or updated, the new data is first written to a buffer in memory; when the buffer grows beyond a threshold, its contents are sorted and written to disk as an SSTable. When data is queried, the system searches the list of SSTables and returns the most recent value. To reduce the number of SSTables that must be searched, LevelDB maintains an index table in memory that records the key range of each SSTable, and uses Bloom filters to avoid unnecessary lookups. To speed up queries and remove deleted data, SSTables are also periodically merge-sorted, a process known as "compaction".
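The buffer, flush, lookup, and merge behaviour described above can be illustrated with a toy model. It is a simplified sketch, not LevelDB's implementation: everything is kept in memory, and deletion markers, Bloom filters, and the multi-level layout are omitted. The class name `ToyLSM` and its parameters are hypothetical.

```python
import bisect


class ToyLSM:
    """Toy LSM tree: an in-memory buffer plus immutable sorted runs (SSTables)."""

    def __init__(self, buffer_limit=4):
        self.buffer = {}        # in-memory write buffer
        self.sstables = []      # oldest run first; each run is a sorted list of (key, value)
        self.buffer_limit = buffer_limit

    def put(self, key, value):
        self.buffer[key] = value
        if len(self.buffer) >= self.buffer_limit:
            self.flush()

    def flush(self):
        """Sort the buffer and append it as a new run."""
        if self.buffer:
            self.sstables.append(sorted(self.buffer.items()))
            self.buffer = {}

    def get(self, key):
        if key in self.buffer:
            return self.buffer[key]
        for run in reversed(self.sstables):        # newest run first
            if run[0][0] <= key <= run[-1][0]:     # key-range check, like the index table
                i = bisect.bisect_left(run, (key,))
                if i < len(run) and run[i][0] == key:
                    return run[i][1]
        return None

    def compact(self):
        """Merge all runs into one, keeping the newest value for each key."""
        merged = {}
        for run in self.sstables:                  # oldest to newest, so newer values win
            merged.update(run)
        self.sstables = [sorted(merged.items())] if merged else []


db = ToyLSM(buffer_limit=2)
db.put("a", 1)
db.put("b", 2)   # buffer reaches the limit and is flushed as the first run
db.put("a", 3)
db.put("c", 4)   # second flush; "a" now exists in two runs
```

Before compaction a lookup of `"a"` must prefer the newer run; after `db.compact()` only one run remains and the stale value is gone.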

In step S20, the inode number is 64 bits long. Because the 64-bit number space is extremely large, no reclamation mechanism is needed for the time being to reclaim the inode numbers of deleted files.
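A global counter of this kind can be sketched as below. The class name `InodeAllocator`, the starting value of 1, and the use of a lock for thread safety are assumptions made for illustration. As a rough sense of scale, at an assumed rate of one million file creations per second a 64-bit counter would take on the order of 585,000 years to wrap, which is why reclamation can be deferred.

```python
import itertools
import threading

MAX_INODE = 2**64 - 1  # largest value representable in 64 bits


class InodeAllocator:
    """Global, monotonically increasing inode number counter (never reclaims)."""

    def __init__(self, first=1):
        self._counter = itertools.count(first)
        self._lock = threading.Lock()

    def allocate(self):
        """Return the next inode number; called for every new file or directory."""
        with self._lock:
            inode = next(self._counter)
        if inode > MAX_INODE:
            raise OverflowError("64-bit inode number space exhausted")
        return inode


allocator = InodeAllocator()
first_inode = allocator.allocate()
second_inode = allocator.allocate()
```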

The step S30 specifically includes:

setting a capacity threshold, classifying files into large files and small files based on the capacity threshold, storing the large files into a local file system in kernel space based on the inode number, and storing the small files into the LevelDB table based on the inode number; the local file system is preferably ext4; the capacity threshold is preferably 4 KB.

The large file is stored in a two-layer directory tree structure, and the large file with the inode number I is stored under the directory of "/LargeStore/J/I" (J=I/10000).
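The classification and the two-level path rule can be sketched as follows. The function names are hypothetical, J = I/10000 is read as integer division, and treating a file of exactly 4 KB as large is an assumption, since the text only distinguishes files below and above the threshold.

```python
import posixpath

CAPACITY_THRESHOLD = 4 * 1024  # 4 KB, the preferred threshold in the text


def large_file_path(inode, root="/LargeStore"):
    """Return /LargeStore/J/I with J = I // 10000 (integer division assumed)."""
    return posixpath.join(root, str(inode // 10000), str(inode))


def placement(inode, size):
    """Decide where a file goes: inline in the LevelDB table or on the local file system."""
    if size < CAPACITY_THRESHOLD:
        return ("leveldb", None)
    return ("local_fs", large_file_path(inode))


small = placement(5, 100)
large = placement(123456, 1_000_000)
```

Grouping 10,000 inode numbers per first-level directory keeps any single directory on the local file system from growing without bound.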

The step S40 specifically includes:

The LevelDB table can synchronize its update log to the local file system in either synchronous or asynchronous mode. To achieve a consistency guarantee similar to the ordered mode of ext4, the LevelDB table must be forced to commit the update log to the local file system synchronously at a preset synchronization period; the synchronization period is preferably 5 seconds.
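The periodic forced commit can be illustrated with a small append-only log that calls `os.fsync` once the synchronization period has elapsed. This is a stand-in sketch rather than LevelDB's own logging code: the class `PeriodicSyncLog`, its injectable `clock` parameter, and the file name used below are assumptions.

```python
import os
import time


class PeriodicSyncLog:
    """Append-only update log that is forced to disk at most once per period."""

    def __init__(self, path, period=5.0, clock=time.monotonic):
        self._file = open(path, "ab")
        self._period = period        # synchronization period in seconds
        self._clock = clock          # injectable so the period can be tested
        self._last_sync = clock()
        self.sync_count = 0

    def append(self, record):
        """Buffer a record; force a sync if the period has elapsed."""
        self._file.write(record)
        if self._clock() - self._last_sync >= self._period:
            self.sync()

    def sync(self):
        """Flush user-space buffers and ask the OS to persist the log."""
        self._file.flush()
        os.fsync(self._file.fileno())
        self._last_sync = self._clock()
        self.sync_count += 1

    def close(self):
        self.sync()
        self._file.close()
```

Writes between two syncs stay asynchronous and cheap, while the bounded period limits how much recent metadata can be lost after a crash.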

The invention relates to a preferred embodiment of a file system acceleration system based on a LevelDB, which comprises the following modules:

the LevelDB table creation module, used for creating a LevelDB table in user space; small files can be stored in the LevelDB table, which improves small-file access performance, and the approach is compatible with most modern POSIX-compliant local file systems on Linux;

the inode number application module, used for requesting an inode number for each file to be stored through a global counter; the global counter increments whenever a new file or new directory is created;

In the LevelDB table creation module, the LevelDB table adopts a key-value structure. The key is 128 bits long: the first 64 bits are the inode number of the parent directory, and the last 64 bits are the hash of the small file's name. The value of each row includes at least the file name of the small file, its inode number, access permissions, file size, timestamp, and the data carried by the small file.

The LevelDB table is exemplified as follows:

In the inode number application module, the inode number is 64 bits long. Because the 64-bit number space is extremely large, no reclamation mechanism is needed for the time being to reclaim the inode numbers of deleted files.

The file classification storage module is specifically used for:

The LevelDB table storage module specifically comprises:

In summary, the invention has the advantages that:

While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that the specific embodiments described are illustrative only and not intended to limit the scope of the invention, and that equivalent modifications and variations of the invention in light of the spirit of the invention will be covered by the claims of the present invention.

Claims

1. A file system acceleration method based on LevelDB, characterized by comprising the following steps:

step S10, creating a LevelDB table in user space;

step S40, storing the update log of the LevelDB table to a local file system;

2. A file system acceleration method based on LevelDB as claimed in claim 1, characterized in that: in step S10, the LevelDB table adopts a key-value structure, the key is 128 bits long, the first 64 bits of the key are the inode number of the parent directory, the last 64 bits of the key are the hash of the small file's name, and the value of each row includes at least the file name of the small file, the inode number, the access permissions, the file size, the timestamp, and the data carried by the small file.

3. A file system acceleration method based on LevelDB as claimed in claim 1, characterized in that: in step S20, the inode number is 64 bits long.

4. A file system acceleration method based on LevelDB as claimed in claim 1, characterized in that: step S30 specifically includes:

5. A file system acceleration method based on LevelDB as claimed in claim 1, characterized in that: step S40 specifically includes:

6. A file system acceleration system based on LevelDB, characterized in that: the system comprises the following modules:

7. A LevelDB-based file system acceleration system as set forth in claim 6, wherein: in the LevelDB table creation module, the LevelDB table adopts a key-value structure, the key is 128 bits long, the first 64 bits of the key are the inode number of the parent directory, the last 64 bits of the key are the hash of the small file's name, and the value of each row includes at least the file name of the small file, the inode number, the access permissions, the file size, the timestamp, and the data carried by the small file.

8. A LevelDB-based file system acceleration system as set forth in claim 6, wherein: in the inode number application module, the inode number is 64 bits long.

9. A LevelDB-based file system acceleration system as set forth in claim 6, wherein: the file classification storage module is specifically used for:

10. A LevelDB-based file system acceleration system as set forth in claim 6, wherein: the LevelDB table storage module specifically comprises: