CN117312256A - File system, operating system and electronic equipment - Google Patents

File system, operating system and electronic equipment

Info

Publication number: CN117312256A
Application number: CN202311615379.2A
Authority: CN (China)
Prior art keywords: block, file, compression, data, cache
Legal status: Granted; currently active
Other languages: Chinese (zh)
Other versions: CN117312256B (granted publication)
Inventors: 刘科 (Liu Ke), 张闯 (Zhang Chuang)
Current Assignee: Suzhou Metabrain Intelligent Technology Co Ltd
Original Assignee: Suzhou Metabrain Intelligent Technology Co Ltd
Events: application filed by Suzhou Metabrain Intelligent Technology Co Ltd; priority to CN202311615379.2A; publication of CN117312256A; application granted; publication of CN117312256B

Classifications

    • G06F 16/172 File systems; file servers: caching, prefetching or hoarding of files
    • G06F 16/13 File access structures, e.g. distributed indices
    • G06F 16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F 16/1744 Redundancy elimination performed by the file system using compression, e.g. sparse files
    • G06F 16/284 Relational databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the application provides a file system, an operating system and an electronic device. The file system is deployed in an intermediate layer added between the application layer and the local file system layer of the operating system, and the intermediate layer is allowed to track the read-write characteristics of files at the application layer. The file system comprises a cache module, a compression module and an encoding module. The cache module caches a logical file written into the file system according to the target read-write characteristics of the logical file at the application layer, obtaining the user data of the logical file. The compression module, in response to a received compression request, compresses the user data of the logical file to obtain the compressed data of the logical file. The encoding module encodes the compressed data of the logical file into a continuous physical file. The application thereby solves the technical problem of poor flexibility in file processing by file systems and achieves the technical effect of improving that flexibility.

Description

File system, operating system and electronic equipment
Technical Field
Embodiments of the present application relate to the field of computers, and in particular, to a file system, an operating system, and an electronic device.
Background
With the rapid development of the Internet and the Internet of Things, data is being generated at an explosive rate, which poses new challenges to the storage capacity and read-write performance of computer equipment. Compressing data before storing it can significantly save storage space and greatly reduce the input/output (IO) latency of disk reads.
Although a user can compress data with compression software, store it, and decompress it when reading, this approach is only suitable for managing a small number of files. A large data center has a huge number of files and a continuous stream of read-write requests, and users cannot process files one by one manually; because read-write requests arrive randomly, the large body of application software also cannot compress and decompress data by embedding compression and decompression code. Various file systems have therefore emerged to handle files in computer equipment.
However, although current file systems can cache, compress and store files, they are generally configured with a unified set of parameter rules and use a single processing flow for file storage and access. File processing is therefore inflexible and cannot adapt to today's increasingly diverse file types, file usage scenarios, and hardware structures that assist file processing.
No effective solution has yet been proposed for the problem in the related art that file systems process files inflexibly.
Disclosure of Invention
The embodiments of the present application provide a file system, an operating system and an electronic device, so as to at least solve the problem in the related art that file systems process files inflexibly.
According to one embodiment of the present application, a file system is provided, deployed in an intermediate layer added between an application layer and a local file system layer in an operating system, the intermediate layer being allowed to track the read-write characteristics of files at the application layer. The file system includes a cache module, a compression module and an encoding module. The cache module is configured to cache a logical file written into the file system according to the target read-write characteristics of the logical file at the application layer, obtaining the user data of the logical file; to detect whether the disk-flush time of the logical file has arrived; and, upon detecting that the flush time has arrived, to send a compression request to the compression module. The compression module is configured to compress the user data of the logical file in response to the received compression request, obtaining the compressed data of the logical file. The encoding module is configured to encode the compressed data of the logical file into a continuous physical file, where the physical file records the mapping relationship between the physical space occupied by the physical file and the logical space occupied by the logical file, and the physical file is what is written to disk.
In one exemplary embodiment, the cache module is configured to: obtain the block length of a block interval of the logical file; split the block length into a minimum number of length values, where each length value is an integer multiple of the block unit supported by the cache space; allocate the minimum number of cache blocks to each block interval of the logical file according to the minimum number of length values, where the length values correspond one-to-one to the cache blocks and each cache block is a contiguous cache space; and cache the data of each block interval of the logical file into such a group of cache blocks to obtain the user data of the logical file.
In one exemplary embodiment, the cache module is configured to: extract from the remaining length of the block length a length value that is the largest integer multiple of the block unit, until the remaining length is less than or equal to the block unit, where the remaining length is the difference between the block length and the data lengths already extracted; and determine the length values extracted from the block length, together with one length value of the block unit, as the minimum number of length values.
In one exemplary embodiment, the compression module is configured to: determine the number of compression computing units and the cache occupancy of each compression computing unit according to the cache capacity available for compression computation and the largest of the minimum number of length values; and create one or more target computing units according to that number, allocating the occupied cache to each target computing unit according to the cache occupancy. The cache module is configured to determine the transmission time interval between the minimum number of cache blocks according to the data transmission time and the data compression time per unit of data, and to transmit the minimum number of cache blocks of the corresponding block interval to each target computing unit at that transmission time interval.
In one exemplary embodiment, the cache module is configured to: when the transmission time interval arrives, construct a direct memory access (DMA) descriptor for the cache block currently to be transmitted among the minimum number of cache blocks; and transmit that cache block to the corresponding target computing unit via the DMA descriptor.
In one exemplary embodiment, the cache module is further configured to do one of the following: determine the block length according to a file block-size extended attribute edited by the user; determine the block length according to the file type supported by the file system; or determine the storage unit of the database used by the file system as the block length.
In an exemplary embodiment, the cache module is further configured to: assign a block index to the block intervals of the logical file and extract the meta information of each block interval, where the meta information includes a cache block state field, a cache block pointer array field and a descriptor field; the cache block state field records how the cache blocks of each block interval have been operated on, the cache block pointer array field records the locations of the cache blocks of each block interval, and the descriptor field records the descriptors used to transfer the cache blocks of each block interval; and build an index management structure over the meta information of the block intervals of the logical file according to the block index.
In one exemplary embodiment, the cache block state field includes a block index field, a cache block status field, and at least one of: an accelerator card synchronization field, an input-output synchronization field, a lock field, a compression flag field and a statistics field. The accelerator card synchronization field records whether the corresponding block interval has been transmitted to the compression module, the input-output synchronization field records whether the corresponding block interval has finished being written to disk, the lock field records whether the corresponding block interval may be released or swapped to disk, the compression flag field records whether compression of the corresponding block interval has finished, and the statistics field records the statistics generated while the corresponding block interval is operated on.
In one exemplary embodiment, the cache module includes a storage space that stores the compressed data of a plurality of block intervals, the compressed data of each block interval comprising compressed pages. The cache module is configured to execute a user's modification of the logical file and obtain the modified target user page. The compression module is configured, when the target block interval containing the target user page is not recompressed, to generate a patch page for the target user page, where the patch page records the positions and contents of the modification made to the logical file; and to store the patch page into the target compressed page of the target block interval in the storage space, where the patch page is used to apply the modification to the decompressed form of the target compressed page.
In one exemplary embodiment, the patch page records, from top to bottom, each position modified in the logical file together with the location within the patch page of the modified content for that position, and records the modified content itself from bottom to top.
In one exemplary embodiment, the file system further includes a cost evaluation module, configured to detect the read amplification coefficient of the user data of each block interval of the logical file, where the read amplification coefficient indicates the degree of read amplification of the data, and to decide, according to the read amplification coefficient, the compression information used when the user data of each block interval is recompressed, the compression information including whether to compress and which compression algorithm to use. The compression module is configured to recompress the user data of each block interval according to the compression information.
In one exemplary embodiment, the cache module is configured to: detect, for each read operation performed on the user data of each block interval, the amount of data read out and the amount of data actually accessed by the user; compute the ratio of the amount of data read to the amount of data accessed by the user to obtain the read amplification ratio of each read operation; and determine the accumulated sum of the read amplification ratios within a target time window as the read amplification coefficient.
In an exemplary embodiment, the cost evaluation module is further configured to: predict the compression performance benefit of the user data of each block interval of the logical file according to the hardware resource attributes of the file system, and control the compression module to compress the user data of each block interval according to the compression performance benefit.
In one exemplary embodiment, the cache module is configured to: detect the user's read-ahead coefficient for the logical file, where the read-ahead coefficient indicates the contiguous read size most frequently used by the user on the logical file; and, when the user's next read operation on the logical file is detected, read the physical file of the logical file according to the read-ahead coefficient and decompress the data read in parallel.
In an exemplary embodiment, the cache module is further configured to: before caching a logical file written into the file system, detect the type of accelerator card used by the file system, where the accelerator card provides hardware acceleration for data compression or decompression; and create a cache space matching the accelerator card type, where the logical file is cached in that cache space.
In one exemplary embodiment, the cache module is further configured to do at least one of the following: determine that the disk-flush time has arrived when a flush instruction initiated by the user is received; determine that the disk-flush time has arrived when the target period is detected to have elapsed; and determine that the disk-flush time has arrived when the amount of cached data is detected to be greater than or equal to a data-amount threshold.
In an exemplary embodiment, the physical file includes an index block, where the index block records the mapping relationship between the physical space occupied by the physical file and the logical space occupied by the logical file.
In one exemplary embodiment, the index block includes a window number field and a physical block field, where the window number field indicates the number of the block interval of the logical file and the physical block field indicates the physical blocks in the physical file that correspond to that block interval.
In one exemplary embodiment, the physical block field includes a physical block count field, a compressed block count field, a block number field and a compression algorithm field, where the physical block count field indicates how many physical blocks the corresponding block interval occupies, the compressed block count field indicates how many of those physical blocks are compressed blocks, the block number field indicates the numbers of the physical blocks occupied by the corresponding block interval, and the compression algorithm field indicates the compression algorithm used by the corresponding block interval.
In an exemplary embodiment, the physical file further includes a free space management block, wherein the free space management block is configured to record whether each physical block in the physical file is free.
In an exemplary embodiment, the header of the physical file further includes a file management block, wherein the file management block is used to record a location of each index block in the physical file and a location of the free space management block.
According to yet another embodiment of the present application, an operating system is further provided. The operating system includes an application layer and a local file system layer, with an intermediate layer added between them; the intermediate layer is allowed to track the application layer's read-write characteristics of files, and the file system described in any of the foregoing is deployed in the intermediate layer.
According to a further embodiment of the present application, there is also provided an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
According to the present application, an intermediate layer is added between the application layer and the local file system layer of the operating system, and the file system is deployed in that intermediate layer and processes files in units of logical files. The file system is allowed to track the read-write characteristics of files at the application layer, so the cache module can flexibly cache a logical file into user data according to the target read-write characteristics of the logical file at the application layer. After caching, the user data is not immediately compressed and written to disk; instead, a delayed-flush strategy is applied by detecting the disk-flush time, and compression and flushing are performed flexibly according to the actual requirements. Meta information describing the mapping between the physical space of the physical file and the logical space of the logical file is generated dynamically during encoding, so flexible random reads and writes of the data are supported. The technical problem that file systems process files inflexibly can thus be solved, and the technical effect of improving that flexibility is achieved.
Drawings
FIG. 1 is a block diagram of the hardware architecture of a server device with a file system deployed in an embodiment of the present application;
FIG. 2 is a schematic architecture diagram of a file system according to an embodiment of the present application;
FIG. 3 is a schematic architecture diagram of a computer device according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a data transmission scheme according to an embodiment of the present application;
FIG. 5 is a schematic diagram of another data transmission scheme according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a database partitioning approach according to an embodiment of the present application;
FIG. 7 is a schematic diagram of an index management structure according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a patch page according to an embodiment of the present application;
FIG. 9 is a schematic diagram of different buffering modes of a buffering module according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a cache module detecting a read-ahead coefficient according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a process of reading data according to an embodiment of the present application;
FIG. 12 is a schematic diagram of a process of determining file compression according to an embodiment of the present application;
FIG. 13 is a schematic diagram of a hardware platform according to an embodiment of the present application;
FIG. 14 is a flow chart of a method of compressed storage of files according to an embodiment of the present application;
FIG. 15 is a schematic diagram of a compressed storage process for files according to an embodiment of the present application;
FIG. 16 is a schematic diagram of a file compression and disk-flush process according to an embodiment of the present application;
FIG. 17 is a schematic diagram of a timed write back instruction triggered compressed storage process according to an embodiment of the present application;
FIG. 18 is a schematic diagram of a release space instruction triggered compressed storage process according to an embodiment of the present application;
FIG. 19 is a schematic diagram of a space release process according to an embodiment of the present application;
FIG. 20 is a schematic diagram of an optimized store instruction triggered compressed store process according to an embodiment of the present application;
FIG. 21 is a schematic diagram of a recording manner for files awaiting flush to disk according to an embodiment of the present application;
FIG. 22 is a schematic illustration of meta information of a file according to an embodiment of the present application;
FIG. 23 is a flow chart of a method of compression encoding of a file according to an embodiment of the present application;
FIG. 24 is a schematic illustration of meta information of another file according to an embodiment of the present application;
FIG. 25 is a schematic diagram of a free space management block in meta-information of a file according to an embodiment of the present application;
FIG. 26 is a schematic diagram of a file management block in meta information of a file according to an embodiment of the present application;
FIG. 27 is a schematic illustration of a file read operation process according to an embodiment of the present application;
FIG. 28 is a schematic diagram of a write operation process for a file according to an embodiment of the present application.
Detailed Description
Embodiments of the present application will be described in detail below with reference to the accompanying drawings in conjunction with the embodiments.
It should be noted that the terms "first," "second," and the like in the description and claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.
The embodiments provided in this application may be deployed in a server apparatus or a similar computing device. Taking deployment on a server device as an example, Fig. 1 is a hardware architecture block diagram of a server device on which a file system of an embodiment of the present application is deployed. As shown in Fig. 1, the server device may include, but is not limited to, a user layer, a kernel layer and a physical layer. Disks are deployed at the physical layer for storing files; a VFS (Virtual File System) is deployed at the kernel layer, and one or more different local file systems may be deployed under the VFS, such as FAT (File Allocation Table), NTFS (New Technology File System), EXT2 (second extended file system), F2FS (Flash-Friendly File System, an open-source file system designed for NAND-based storage devices), and the like. The user layer issues read-write operations on logical files; through the VFS of the operating system it invokes the operation functions corresponding to the local file system in the kernel layer, and the physical data on disk is read and written according to the storage layout defined by the local file system. The file system proposed in this application may be deployed at the user layer as shown in Fig. 1, but may also be deployed elsewhere, for example in the kernel layer, which is not limited in this application. Those of ordinary skill in the art will appreciate that the architecture shown in Fig. 1 is merely illustrative and does not limit the architecture of the server device described above. For example, the server device may include more or fewer components and layers than shown in Fig. 1, or have a different configuration from that shown in Fig. 1.
In this embodiment, a file system is provided. Fig. 2 is a schematic architecture diagram of the file system according to an embodiment of the present application. As shown in Fig. 2, the file system is deployed in an intermediate layer 206 added between an application layer 202 and a local file system layer 204 in an operating system, where the intermediate layer 206 is allowed to track the read-write characteristics of files at the application layer 202, and the file system includes: a cache module 208, a compression module 210 and an encoding module 212.
The cache module 208 is configured to cache, according to the target read-write characteristics at the application layer, a logical file written into the file system, obtaining the user data of the logical file; to detect whether the disk-flush time of the logical file has arrived; and, upon detecting that the flush time has arrived, to send a compression request to the compression module;
the compression module 210 is configured to compress the user data of the logical file in response to the received compression request, obtaining the compressed data of the logical file;
the encoding module 212 is configured to encode the compressed data of the logical file into a continuous physical file, where the physical file further records a mapping relationship between a physical space where the physical file is located and a logical space where the logical file is located, and the physical file is used for landing.
With this scheme, an intermediate layer is added between the application layer and the local file system layer of the operating system, the file system is deployed in that intermediate layer and processes files in units of logical files, and the file system is allowed to track the read-write characteristics of files at the application layer. The cache module of the file system can flexibly cache a logical file into user data according to the target read-write characteristics of the logical file at the application layer; after caching, the data is not immediately compressed and written to disk, but a delayed-flush strategy is applied by detecting the disk-flush time, and compression and flushing are performed flexibly according to the actual requirements. Meta information describing the mapping between the physical space of the physical file and the logical space of the logical file is generated dynamically during encoding, so flexible random reads and writes of the data are supported. The technical problem that file systems process files inflexibly can thus be solved, and the technical effect of improving that flexibility is achieved.
The file system may be deployed at the user layer of a computer device using, but not limited to, the Fuse (Filesystem in Userspace) technical framework. For example, Fig. 3 is a schematic architecture diagram of a computer device according to an embodiment of the present application; as shown in Fig. 3, a Fuse-based transparent compressed file system is provided. By creating a file system in user space, a user's operation on a logical file first passes through the user-space file system, which then sends the read-write request to the VFS (virtual file system) interface of the kernel layer through the libfuse (user-space file system library) module. In the user-space file system, user-defined operations can be applied to the reading and writing of the logical file, for example: block compression of the logical file, metadata generation through space-mapping management, read optimization through file caching, deciding whether to store compressed through cost estimation, and encoding the generated metadata and compressed blocks before writing them back to the physical file, thereby achieving transparent compressed reading and writing of the file. Because these operations target the logical file, the physical storage interface of the file remains unchanged, which keeps the system compatible with different local file systems.
It should be noted that the file system in this embodiment may also be deployed in the kernel layer of the computer device, or deployed in other ways, which is not limited in this embodiment. In this embodiment, deployment of the above file system at the user layer based on the Fuse framework is taken as an example.
In this embodiment, the file system is custom-defined in user space based on the Fuse framework. The Fuse-based file system may include, but is not limited to, the following functional modules: a VFS interface module, a cache module, a compression module, a cost estimation module, an encoding module and a write-back module. The VFS interface module implements the VFS interface callbacks for files in user space, including mount, open, write, read, flush and close, which respectively complete the mounting, writing, reading, synchronizing and closing of the file system. The cache module caches recently written or read data in a cache pool, reducing IO operations to disk and repeated compression/decompression computation, while tracking the read-write locality characteristics of each cache page so as to evaluate whether a cache block should be compressed. The compression module compresses the cached data and, after compression, generates the compressed data blocks and the compressed-length information; it also selects suitable hardware resources for the compression computation according to factors such as the system's computing resources and load, the remaining cache capacity and the user's flush instructions. The cost estimation module decides, according to the system's computing power, the IO throughput bandwidth, the locality characteristics of data-block access and the compressed size, whether a file block should be encoded with compression. The encoding module decides, according to the characteristics of the underlying file system and the file attributes set by the user, which encoding mode to use to assemble the compressed blocks into a continuous physical file. The write-back module writes the physical file encoded by the encoding module into the underlying local file system, either periodically or when forced by the user.
For the VFS interface module, support for application-layer file operations is implemented by registering the operation functions provided by the Fuse framework, including mounting, opening, closing, reading, writing, synchronizing, extended attribute setting and unmounting. The individual operation functions work as follows:
first, the corresponding mount callback function is installed, and the following functions can be executed: function 1, resolving user input parameters, comprising: the system compression accounting force, the hardware acceleration card type, the default compression algorithm, the IO bandwidth, the default compression ratio threshold (the compression of the data block cloth smaller than the compression ratio), whether to open pre-reading, whether to open patch pages, whether to open idle compression correction, and the parameters input by a user are stored. And 2, initializing a cache pool, applying for the memory of the cache pool and performing initialization operation.
The open function creates a file or its file cache and performs the following steps: if the opened file path does not exist, the file is created together with a 12 KB file header, whose encoding is completed by the encoding module; if the file path exists, the 12 KB file header is read. The reference count of the file path is then incremented.
The close function cleans up file resources: it decrements the reference count of the file path and releases the temporary caches used since the file was opened.
The read operation function submits a read request to the cache module and obtains data from it, performing the following steps: using the offset address as an index, obtain a cache block from the cache module; extract the data to be read from the cache block and return the extracted data to the user layer.
Assuming the cache holds the logical file in 16 KB cache blocks and the user reads a 4 KB data block at a file offset of 20 KB, the flow may be, but is not limited to: first, the VFS receives the user layer's 4 KB read request at the 20 KB offset of the file and obtains a 16 KB cache block from the cache module using the 20 KB offset as the index. From that 16 KB cache block, the data between 4 KB and 8 KB is extracted, corresponding to the data between 20 KB and 24 KB in the logical file. The extracted 4 KB of content is returned to the user layer, completing the read operation.
The write operation function copies the data written by the user to the corresponding position in the cache block, with the following steps: using the offset address as an index, obtain a cache block from the cache module; the VFS interface module overwrites the corresponding offset position in the cache block with the data written by the user; the cache block is then marked as unsynchronized.
Assuming the cache holds the logical file in 16 KB cache blocks and the user writes a 4 KB data block at a file offset of 20 KB, the flow may be, but is not limited to: obtain a 16 KB cache block from the cache module using the 20 KB offset as the index; write the user's data into the 4 KB to 8 KB range of that cache block, corresponding to the 20 KB to 24 KB range of the logical file; and mark the cache block as unsynchronized.
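Both worked examples above reduce to the same offset arithmetic. The following C fragment is a minimal sketch (names and the 16 KB constant are illustrative, not taken from the patent) showing how a 20 KB file offset maps to cache block 1 and the 4 KB position inside it:

#include <stdint.h>
#include <stdio.h>

#define BLOCK_SIZE (16 * 1024)   /* example block-interval size used in the text */

int main(void)
{
    uint64_t file_off  = 20 * 1024;                /* user read/write offset       */
    uint64_t block_idx = file_off / BLOCK_SIZE;    /* which cache block: 1         */
    uint64_t intra_off = file_off % BLOCK_SIZE;    /* offset inside it: 4096 bytes */
    printf("block %llu, intra-block offset %llu bytes\n",
           (unsigned long long)block_idx, (unsigned long long)intra_off);
    return 0;
}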
The synchronization function handles a forced synchronization command from the user, synchronizing written data to disk. It signals the cache module, which controls the compression and write-back operations, and blocks until the cache module has completed all operations before returning. The steps may be, but are not limited to: send the file path to the cache module, marking that the file must be synchronized to disk immediately; then wait for the cache module to compress all cache blocks marked as unsynchronized and write them to disk.
The extended attribute setting function marks the file's storage method, which may include, but is not limited to: whether compressed storage is enabled for the file, the file's default compression algorithm, the block size used for compressing the file, the file's default read-ahead size, the minimum compression ratio at which the file is stored compressed, and so on. The file system can pass these attributes as parameters to the other modules for use in caching, compression and encoding.
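As a concrete illustration of how such callbacks could be registered with libfuse, the following minimal C sketch is given; it assumes the libfuse high-level API, and the function names and empty bodies are hypothetical placeholders rather than the patent's implementation:

#define FUSE_USE_VERSION 31
#include <fuse.h>

static int xcfs_open(const char *path, struct fuse_file_info *fi)
{
    /* create the file and its 12 KB header if absent, bump the path's reference count */
    return 0;
}

static int xcfs_read(const char *path, char *buf, size_t size, off_t off,
                     struct fuse_file_info *fi)
{
    /* fetch the cache block indexed by off from the cache module, copy the range into buf */
    return (int)size;
}

static int xcfs_write(const char *path, const char *buf, size_t size, off_t off,
                      struct fuse_file_info *fi)
{
    /* overwrite the corresponding range of the cache block and mark it unsynchronized */
    return (int)size;
}

static int xcfs_flush(const char *path, struct fuse_file_info *fi)
{
    /* ask the cache module to compress and write back all unsynchronized blocks */
    return 0;
}

static const struct fuse_operations xcfs_ops = {
    .open  = xcfs_open,
    .read  = xcfs_read,
    .write = xcfs_write,
    .flush = xcfs_flush,
};

int main(int argc, char *argv[])
{
    return fuse_main(argc, argv, &xcfs_ops, NULL);
}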
The cache module caches recently written or read data, reducing repeated IO operations and compression/decompression computation while tracking the read-write characteristics of file blocks. The cache module in this embodiment allows each file to use cache blocks of different sizes and is designed around a hardware compression accelerator to improve the overall performance of compression/decompression computation; the hardware compression accelerator may be, but is not limited to, a dedicated device such as an FPGA or an ASIC.
The characteristics of the hardware compression accelerator include: it may have several hardware acceleration computing units, each containing an on-chip RAM of fixed size (16 KB to 1 MB). The host side specifies the host physical address and the accelerator-side RAM address, and transfers the data into the on-chip RAM by DMA by configuring DMA descriptors. Each hardware acceleration computing unit has compression/decompression capability and reads data from the on-chip RAM to perform hardware-accelerated computation of a specific compression/decompression algorithm (GZIP/LZ77/SNAPPY). During compression/decompression, the size of the original data is configured through a register, and after the computation is finished the resulting size is obtained by reading a register. When the data to be processed is larger than the RAM space, it is handled by fractional loading: a status register indicates whether the computing unit is idle, and when it is idle the next portion of data can be loaded.
For example, suppose a GZIP hardware acceleration computing unit has a 32 KB on-chip RAM whose PCIe-mapped bus address on the host side is 0x80000000, and compression is to be performed on 40 KB of data whose host-side physical address is 0x20000000. The computation flow may include, but is not limited to:
the first DMA descriptor is constructed by configuring the total length of data to be compressed to be 40KB through a register, and the content of 32KB size is transferred to the address location of 0x80000000 starting at the host side 0x 20000000. And judging whether the calculation is completed or not through a state register, and waiting for completion of the calculation of the 32KB data. A second DMA descriptor is constructed, starting with host side 0x20000000+32kb, 8KB sized content transferred to the 0x80000000 address location. And judging whether the calculation is completed or not through a state register, and waiting for completion of the calculation of the 8KB data. The compressed size is obtained through a register.
In one exemplary embodiment, the cache module may be, but is not limited to being, configured to: obtain the block length of a block interval of the logical file; split the block length into a minimum number of length values, where each length value is an integer multiple of the block unit supported by the cache space; allocate the minimum number of cache blocks to each block interval of the logical file according to the minimum number of length values, where the length values correspond one-to-one to the cache blocks and each cache block is a contiguous cache space; and cache the data of each block interval of the logical file into such a group of cache blocks to obtain the user data of the logical file.
The cache module divides the available memory into blocks of a specified size (the block unit, e.g. 4 KB) to form a cache pool. When allocating cache in 4 KB units, it splits the block length of a block interval of the user's logical file into several contiguous cache blocks whose sizes are integer powers of two, so that the physical blocks are kept contiguous as far as possible, the number of DMA descriptors is reduced, the time spent creating DMA descriptors drops, and the transmission efficiency between the host side and the accelerator card improves.
In one exemplary embodiment, the cache module is configured to: extract from the remaining length of the block length a length value that is the largest integer multiple of the block unit, until the remaining length is less than or equal to the block unit, where the remaining length is the difference between the block length and the data lengths already extracted; and determine the length values extracted from the block length, together with one length value of the block unit, as the minimum number of length values.
One way of splitting the block length, though not the only one, is to repeatedly extract from the remaining length a length value equal to the largest power-of-two multiple of the block unit, until the remaining length is less than or equal to the block unit, at which point one block unit is allocated. For example, with a block length of 40 KB and a block unit of 4 KB, a length value of 8 × 4 KB = 32 KB is extracted first, then from the remaining 8 KB a length value of 2 × 4 KB = 8 KB, giving two cache blocks of 32 KB + 8 KB. This process may be implemented, but is not limited to being implemented, by scanning the bits of the user-requested block length expressed in KB: starting from 1 KB, each right shift over the bit pattern doubles the candidate length. A 0 bit is skipped; a 1 bit causes a cache block of the corresponding length to be requested, and any block smaller than 4 KB is rounded up to 4 KB. For example, 40 KB is 101000 in binary: the first 1 bit is reached after 3 shifts, i.e. 2^3 = 8 KB, and the second 1 bit after 5 shifts, i.e. 2^5 = 32 KB.
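The bit-scanning split described above might look roughly like the following C sketch, a minimal illustration assuming a 4 KB block unit and lengths expressed in KB (the helper name is hypothetical):

#include <stdint.h>
#include <stdio.h>

#define BLOCK_UNIT_KB 4

/* Decompose a block length (in KB) into power-of-two sub-block lengths, each at
 * least one block unit; returns the number of sub-blocks written to out[]. */
static int split_block_len(uint32_t len_kb, uint32_t out[], int max_out)
{
    int n = 0;
    for (uint32_t bit = 1; bit <= len_kb && n < max_out; bit <<= 1) {
        if (len_kb & bit)
            out[n++] = bit < BLOCK_UNIT_KB ? BLOCK_UNIT_KB : bit;   /* round up to 4 KB */
    }
    return n;
}

int main(void)
{
    uint32_t parts[32];
    int n = split_block_len(40, parts, 32);        /* 40 KB = 101000b -> 8 KB + 32 KB */
    for (int i = 0; i < n; i++)
        printf("cache block %d: %u KB\n", i, parts[i]);
    return 0;
}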
In the conventional approach, the operating system cache is managed in 4 KB units, and when the block length requested by the user is allocated in 4 KB units, contiguity of the allocation is not guaranteed: all the 4 KB pages are allocated at random and may end up scattered through memory. When the corresponding pages are transferred to the hardware accelerator, a DMA descriptor has to be built for each page, which takes a long time and occupies excessive space.
For example, a 40 KB block length allocated in the conventional manner yields ten 4 KB cache blocks with no guarantee of contiguity between them, so ten DMA descriptors must be constructed, each carrying a source address, a destination address and a 4 KB length, to transfer the data to a hardware acceleration unit. In this embodiment, two cache blocks of 32 KB + 8 KB are allocated; since physical contiguity is guaranteed inside each of the two cache blocks, only two DMA descriptors need to be constructed. The time and memory spent creating DMA descriptors are greatly reduced, and the transmission efficiency between the host side and the accelerator card improves.
In one exemplary embodiment, the compression module is configured to: determine the number of compression computing units and the cache occupancy of each compression computing unit according to the cache capacity available for compression computation and the largest of the minimum number of length values; and create one or more target computing units according to that number, allocating the occupied cache to each target computing unit according to the cache occupancy. The cache module is configured to determine the transmission time interval between the minimum number of cache blocks according to the data transmission time and the data compression time per unit of data, and to transmit the minimum number of cache blocks of the corresponding block interval to each target computing unit at that transmission time interval.
In the traditional approach, block compression storage uses a fixed-length allocation strategy: all file blocks in the file system have the same length, the accelerator uses RAM spaces of the same size, parallel compression of multiple blocks is achieved by running multiple computing units in parallel, and each transfer and each computation consumes a fixed number of clock cycles. The mode of operation is simple: after a batch of cache blocks has been transmitted to the computing units, computation starts, the system waits for it to complete, and the next batch of data is loaded. In this embodiment, the cache module takes the on-chip cache space of the hardware accelerator's computing units as the baseline, sets the transmission block size, calculates the data transmission interval from the ratio of computation time to transfer time, and alternates DMA transfers of cached data to the accelerator with computation, reducing the end-to-end transfer and computation latency.
In one exemplary embodiment, the cache module is configured to: when the transmission time interval arrives, construct a direct memory access (DMA) descriptor for the cache block currently to be transmitted among the minimum number of cache blocks; and transmit that cache block to the corresponding target computing unit via the DMA descriptor.
In this embodiment, block compression storage supports a variable-length allocation policy, and the hardware accelerator supports compression of variable-length data through fractional loading as described above, so different files can use cache blocks of different lengths to improve data access performance. However, between the separate loads the previous computation must finish before the next transfer can start; if, as in the conventional approach, the next transfer only begins after the computation completes, considerable delay can result.
To solve this problem, this embodiment adopts an alternating transfer-and-compute strategy to improve overall performance. For example, suppose the hardware accelerator has 128 KB of on-chip RAM cache, a DMA transfer of 128 KB takes 128 clock cycles, the hardware acceleration algorithm takes 256 clock cycles to complete GZIP compression of 128 KB, so the compute-to-transfer time ratio is 2:1, and the longest user block length is 64 KB. Fig. 4 is a schematic diagram of a data transmission scheme according to an embodiment of the present application. As shown in Fig. 4, the conventional approach creates 2 computing units, each occupying 64 KB of cache space. In this embodiment, since the maximum of the minimum number of length values is 32 KB, 4 computing units can be created, each occupying 32 KB of cache space. When compression must be performed on 4 cache blocks of 64 KB each, the conventional approach loads twice, transmitting 64 KB each time for computation on the two computing units. In this embodiment, loading also happens twice, transferring 32 KB to each of the 4 computing units each time; because more computing units are available, computation starts earlier and lower latency is obtained.
Fig. 5 is a schematic diagram of another data transmission scheme according to an embodiment of the present application. As shown in Fig. 5, when compression must be performed on 4 cache blocks of 40 KB each, the conventional approach loads twice, transmitting 40 KB each time and computing on 2 computing units. In this embodiment, loading is also done in two parts: the first 32 KB of each of the 4 cache blocks is transferred to the computing units, the start time for the second 8 KB is calculated from the ratio of transfer time to computation time, and the second 8 KB is placed in the DMA transmission queue accordingly, so that a computing unit loads the next portion of data for computation immediately after finishing the computation on the previous portion, achieving lower overall latency.
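One way such alternating scheduling could be expressed is sketched below in C. The timing model (a fixed compute-to-transfer cycle ratio) and every helper name are assumptions made for illustration only; the patent does not specify this code.

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

struct sub_block { uint64_t host_addr; uint32_t len; };

/* Toy timing model: cycles to transfer len bytes, and compute time as a fixed
 * multiple (ratio) of the transfer time, as in the 2:1 example above. */
static uint32_t transfer_cycles(uint32_t len)            { return len / 512; }
static uint32_t compute_cycles(uint32_t len, int ratio)  { return (uint32_t)ratio * transfer_cycles(len); }

static void dma_enqueue(int unit, const struct sub_block *sb, uint32_t start_cycle)
{
    printf("unit %d: enqueue %u KB at cycle %u\n", unit, sb->len / 1024, start_cycle);
}

/* Stream the sub-blocks of one cache block (e.g. 32 KB + 8 KB) to one compute
 * unit, timing each DMA so it completes just as the previous computation ends. */
static void schedule_block(int unit, const struct sub_block subs[], size_t n, int ratio)
{
    uint32_t t = 0;                          /* start cycle of the current transfer */
    for (size_t i = 0; i < n; i++) {
        dma_enqueue(unit, &subs[i], t);
        uint32_t compute_done = t + transfer_cycles(subs[i].len)
                                  + compute_cycles(subs[i].len, ratio);
        if (i + 1 < n) {
            uint32_t next_xfer = transfer_cycles(subs[i + 1].len);
            t = compute_done > next_xfer ? compute_done - next_xfer : 0;
        }
    }
}

int main(void)
{
    struct sub_block subs[] = { { 0x20000000ULL, 32 * 1024 }, { 0x20008000ULL, 8 * 1024 } };
    schedule_block(0, subs, 2, 2);           /* one 40 KB block, compute:transfer = 2:1 */
    return 0;
}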
In an exemplary embodiment, the cache module is further configured to determine the block length in one of the following ways:
First, according to the file block-size extended attribute edited by the user;
Second, according to the file type supported by the file system;
Third, by taking the storage unit of the database used by the file system as the block length.
In this embodiment, the user can specify the block size used for compressing a file through the file block-size extended attribute, so that different files can use cache blocks of different lengths. The cache divides the logical file into several data blocks of the same size; the block length can be specified by the user through the file's extended attribute and is used to optimize spatially local access to different types of files. Fig. 6 is a schematic diagram of a database partitioning approach according to an embodiment of the present application. As shown in Fig. 6, the MySQL database stores data in units of 32 KB by default, while PostgreSQL stores data in units of 8 KB. When the application layer accesses the storage layer, MySQL, after accessing the first 4 KB of a 32 KB unit, is bound to access the next 28 KB of content; PostgreSQL, after accessing the first 4 KB, will access the following 4 KB, but not necessarily the 8 KB after that. This tendency to access the next page after accessing the preceding data is called spatial locality and arises because the application accesses a semantic whole (32 KB at a time for MySQL, 8 KB at a time for PostgreSQL). The spatial locality of the application layer is usually predictable at run time; for example, the database's configuration files show the unit size in which the database stores data. The spatial locality of different files can therefore be fully exploited for block compression, reducing the number of IO and compression/decompression operations and improving performance in a customized way. Assume a compression ratio of 4:1, that MySQL randomly accesses 0-32 KB and 64-96 KB, and that PostgreSQL randomly accesses 0-8 KB and 16-32 KB. If block compression follows the database's storage unit, as in Fig. 6 (a) and (b), MySQL only needs to read 16 KB of data and perform two decompression computations, and PostgreSQL only needs to read 4 KB and perform two decompression computations. If block compression does not follow the database's storage unit granularity: as in Fig. 6 (c), MySQL is divided into 16 KB blocks, smaller than the database's storage unit, so MySQL must read 16 KB and perform four decompression computations, which takes longer than (a) because each transfer and decompression has start-up overhead; as in Fig. 6 (d), PostgreSQL is divided into blocks larger than the database's storage unit, so it needs to read 8 KB of data for a single decompression computation, doubling the amount read and computed even though the user actually needs only half of the data, which causes unnecessary delay.
In this embodiment, several ways of configuring the block length are provided, and the user can flexibly configure it according to the requirements and the database in use, making the file system more flexible and efficient to use.
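For illustration only, setting such a per-file block-size extended attribute from an application might look like the following C sketch; the attribute name "user.xcfs.block_size" and the mount path are hypothetical examples, not names defined by the patent:

#include <sys/xattr.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *val = "32768";   /* e.g. match MySQL's 32 KB storage unit */
    if (setxattr("/mnt/xcfs/mysql/ibdata1", "user.xcfs.block_size",
                 val, strlen(val), 0) != 0)
        perror("setxattr");
    return 0;
}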
In an exemplary embodiment, the cache module is further configured to: assign a block index to the block intervals of the logical file and extract the meta information of each block interval, where the meta information includes a cache block state field, a cache block pointer array field and a descriptor field; the cache block state field records how the cache blocks of each block interval have been operated on, the cache block pointer array field records the locations of the cache blocks of each block interval, and the descriptor field records the descriptors used to transfer the cache blocks of each block interval; and build an index management structure over the meta information of the block intervals of the logical file according to the block index.
The cache module manages the file's cache through an index management structure, which may take the form of, but is not limited to, a balanced binary tree whose nodes are the meta information of the cache blocks. The meta information is responsible for managing a section of the file's logical address space, maintaining data transfers with the accelerator card and data synchronization with the disk.
In one exemplary embodiment, the cache block state field includes a block index field, a cache block status field, and at least one of: an accelerator card synchronization field, an input-output synchronization field, a lock field, a compression flag field and a statistics field. The accelerator card synchronization field records whether the corresponding block interval has been transmitted to the compression module, the input-output synchronization field records whether the corresponding block interval has finished being written to disk, the lock field records whether the corresponding block interval may be released or swapped to disk, the compression flag field records whether compression of the corresponding block interval has finished, and the statistics field records the statistics generated while the corresponding block interval is operated on.
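As a rough illustration, the per-block meta information might be laid out as in the following C sketch; the field names, widths, array bound and tree links are assumptions, since the patent names the fields but not their layout:

#include <stdint.h>

#define MAX_SUB_BLOCKS 16        /* power-of-two sub-blocks of one block interval (assumed bound) */

struct dma_desc { uint64_t src; uint64_t dst; uint32_t len; };

struct block_meta {
    uint64_t block_index;                  /* index of the block interval in the file  */
    uint32_t state;                        /* bit flags, see below                     */
    void    *cache_blk[MAX_SUB_BLOCKS];    /* pointers to the contiguous cache buffers */
    struct dma_desc desc[MAX_SUB_BLOCKS];  /* host <-> accelerator DMA descriptors     */
    uint64_t stats_reads, stats_writes;    /* per-block access statistics              */
    struct block_meta *left, *right;       /* balanced-binary-tree links               */
};

/* state bit flags */
enum {
    BLK_ACCEL_SYNCED = 1u << 0,   /* already transmitted to the compression module   */
    BLK_IO_SYNCED    = 1u << 1,   /* already written back to disk                     */
    BLK_LOCKED       = 1u << 2,   /* must not be released or swapped to disk          */
    BLK_COMPRESSED   = 1u << 3,   /* compression of this block interval has finished  */
};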
In the conventional manner, cache blocks are managed at a granularity of 4KB, and insertion and lookup of cache blocks are handled by tree containers such as a radix tree. After allocation of the cache and IO reading are completed, the page table entry is set to read-only mode; when a write operation is performed on a cache block, the page table entry triggers a page fault exception, and the page is marked as a dirty page in the page fault handler, indicating that the data in the cache is inconsistent with the data on the disk and needs a write-back operation.
However, in the context of compressed storage, a block contains multiple 4KB pages, and when any one 4KB page is modified, the entire block of data should be treated as a whole, for example by recompressing the entire block or processing it in some other way.
In this embodiment, the logical file is divided into blocks, each block includes a plurality of pages of 4KB size, and compressed storage is performed in units of the block size. FIG. 7 is a schematic diagram of an index management structure according to an embodiment of the present application. As shown in FIG. 7, each block interval is assigned a piece of meta information, which is managed using a balanced binary tree and inserted and searched using the block index. The fields in the meta information include: the state of the cache block, the cache block pointer array, the DMA descriptor, and so on. When the user accesses the logical file and touches a certain 4KB page, and that 4KB page has no physical memory or is write-protected, the hardware triggers a page fault exception. In this embodiment, the offset of the accessed 4KB page is converted into the block index of its block, and logical page memory on the host side and a physical page cache of the block size on the accelerator card side are applied for respectively; for an accelerator card supporting P2P, the accelerator-card physical pages are the off-chip DDR memory of the accelerator card, and for an accelerator card not supporting P2P, the physical pages are host-side memory. On the host side, a memory mapping from virtual addresses to physical addresses of one block size is established at a time. Meta-information initialization is then performed, including: initializing the cache block state; establishing the cache block pointer array, because a cache block on the host side is composed of a plurality of physical blocks whose sizes are integer powers of 2, so that the cache blocks are logically chained together; and establishing the DMA descriptor for transmission between the host side and the accelerator card. The meta information is inserted into the balanced binary tree using the block index. When the user accesses a 4KB page for the first time, a page fault exception is triggered because there is no physical page; in the page fault handler the memory of the block where the 4KB page is located is applied for and the mapping from physical addresses to virtual addresses is established, IO is started to read data from the disk if it is a read operation, and then all pages in the block are set to write-protected. When the user later reads pages of the block where the 4KB page is located, because the mapping between physical and virtual addresses has already been established, no page fault occurs, no IO needs to be started to fetch data from the disk, and the data is read directly from memory. When the user performs a write operation on a page of that block, because all pages in the block have been set to the write-protected state, a page fault exception is triggered; the fault is caused by write protection, and in the page fault handler the synchronization flags for the accelerator card and the disk are modified to indicate that the content of the page is inconsistent with the content on the disk. When the user performs a drop (write-back to disk) operation, the cache blocks that are inconsistent with the disk are traversed, and all memory pages are obtained quickly from the cache block pointer array without searching for each 4KB page one by one. Instead of using a lock flag bit for each page, the lock flag bit of the meta information locks all the 4KB memory of a block at once.
The data block is transmitted to the acceleration card for compression calculation according to the DMA descriptor; after the compression calculation result is obtained, the acceleration card synchronization flag is modified, and after the data is dropped to the disk, the IO synchronization flag is modified. The user's read/write access statistics on the cache block are also updated.
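Purely as an illustration, the page-fault-driven flow just described might look as follows in Python, reusing the BlockMetaInfo sketch above; the accelerator-card paths, DMA transfers and real write protection are elided, and all names are assumptions:

```python
PAGE_SIZE = 4 * 1024

class BlockCache:
    """Block-granular cache sketch; stands in for the balanced binary tree of FIG. 7."""

    def __init__(self, block_size):
        self.block_size = block_size      # e.g. 16KB or 32KB, set via the file's extended attribute
        self.blocks = {}                  # block index -> BlockMetaInfo

    def on_page_fault(self, page_number, is_write):
        """Handle a fault on a 4KB page: locate (or build) the whole block it belongs to."""
        block_index = (page_number * PAGE_SIZE) // self.block_size
        meta = self.blocks.get(block_index)
        if meta is None:
            # First touch of this block interval: allocate every page of the block,
            # read the block from disk, then (conceptually) write-protect its pages.
            meta = BlockMetaInfo(block_index=block_index)
            pages_per_block = self.block_size // PAGE_SIZE
            meta.cache_pages = [bytearray(PAGE_SIZE) for _ in range(pages_per_block)]
            self.read_block_from_disk(meta)
            meta.io_synced = True
            self.blocks[block_index] = meta
        if is_write:
            # A write-protect fault: the whole block is now out of sync with the
            # accelerator card and with the disk, so clear both synchronization flags.
            meta.accel_synced = False
            meta.io_synced = False
        return meta

    def read_block_from_disk(self, meta):
        pass  # placeholder for starting IO / decompression of the block
```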
In one exemplary embodiment, the cache module includes a storage space, the storage space stores the compressed data of the plurality of block intervals, and the compressed data of each block interval comprises compressed pages; the cache module is configured to execute the modification operation of the user on the logical file to obtain a modified target user page; the compression module is configured to generate a patch page corresponding to the target user page in the case that the target block interval where the target user page is located is not recompressed, wherein the patch page is used for recording the modification position and the modification content of the modification operation on the logical file; and the patch page is stored into the target compressed page of the target block interval in the storage space, the patch page being used for applying the modification operation to the decompressed page of the target compressed page.
In the cache module, the storage of pages is divided into logical pages (which may be called the user space) and physical pages (which may be called the storage space), corresponding respectively to the logical data seen by the user and the data actually stored on the disk. The physical pages include compressed pages and patch pages, and file blocks are stored with different storage strategies. The compressed pages store the compressed data of file blocks, possibly generated by different compression algorithms; the patch pages store the parts of data that have already been compressed or dropped to disk and have since undergone a small amount of modification by the user. If such original data were compressed or dropped again, the resulting delay cost would be too high, so for scenarios in which only a small amount of data is modified, the dropped data is not repeatedly compressed and the modified parts are marked using patch pages. Page patches fall into two categories: whole-page patches and segment patches. A whole-page patch replaces the content of an entire page, and a segment patch replaces part of the content in a compressed block.
A block interval thus comprises two kinds of pages: compressed pages storing the compressed data, and patch pages. The type of the page patch is marked in the meta information of the encoding module.
In one exemplary embodiment, each modification position of the modification operation to the logical file and the position of the modification content corresponding to each modification position in the patch page are recorded from top to bottom in the patch page, and the modification content of the modification operation to the logical file is recorded from bottom to top.
When compressed data is patched, the patch data is copied onto the decompressed page. FIG. 8 is a schematic diagram of a patch page according to an embodiment of the present application. As shown in FIG. 8, a segment patch page may, but is not limited to, adopt the following structure: the patch metadata (recording each modification position of the modification operation on the logical file and the position in the patch page of the modification content corresponding to each modification position) is stored from top to bottom in the segment patch page, and the metadata marks the start position of the modification in the decompressed page together with the offset and length of the patch data within the patch page; the patch data, i.e. the modification content, is stored sequentially from bottom to top in the segment patch page.
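A minimal sketch of this segment patch page layout is shown below (Python; the 12-byte record format and field widths are assumptions for illustration, not the actual on-disk encoding of the embodiment):

```python
import struct

PAGE_SIZE = 4 * 1024
ENTRY = struct.Struct("<III")   # (offset in the decompressed block, offset of the data in this page, length)

def add_patch(page: bytearray, entries: list, block_offset: int, data: bytes) -> None:
    """Record one modification: metadata grows from the top of the page, data from the bottom."""
    used_meta = len(entries) * ENTRY.size
    used_data = sum(length for _, _, length in entries)
    if used_meta + ENTRY.size + used_data + len(data) > PAGE_SIZE:
        raise ValueError("segment patch page is full")
    data_off = PAGE_SIZE - used_data - len(data)
    page[data_off:data_off + len(data)] = data
    ENTRY.pack_into(page, used_meta, block_offset, data_off, len(data))
    entries.append((block_offset, data_off, len(data)))

def apply_patches(decompressed: bytearray, page: bytes, count: int) -> None:
    """Copy every recorded modification onto the freshly decompressed block."""
    for i in range(count):
        block_offset, data_off, length = ENTRY.unpack_from(page, i * ENTRY.size)
        decompressed[block_offset:block_offset + length] = page[data_off:data_off + length]

# Example: patch 9 modified bytes at offset 100 of a 16KB block.
page, entries = bytearray(PAGE_SIZE), []
add_patch(page, entries, block_offset=100, data=b"new bytes")
decompressed = bytearray(16 * 1024)
apply_patches(decompressed, page, len(entries))
```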
For example, FIG. 9 is a schematic diagram of different caching manners of the cache module according to an embodiment of the present application. As shown in FIG. 9, storage methods for different scenarios of the cache module are provided. In the cache module, user data is managed in units of pages, and the page size is usually 4KB. A group of several pages is taken as the unit of block compression; for example, 4 pages form a group in FIG. 9, so the file is block-compressed in units of 16KB. The cache module stores the data read or written by the user in a binary tree structure, and pages are stored in the cache in the following modes (a selection sketch follows the list):
Mode 1: as shown in (1) of FIG. 9, the amount of writing in the data block is large and a large amount of data in the file block has been modified; compression calculation is performed on the user pages of the block, and the storage pages cache the data in compressed form.
Mode 2: as shown in (2) of FIG. 9, the amount of writing in the data block is small and only a small part of the file block has been modified; the user pages of the block are not compressed again, the modified part is recorded through a patch page, and the storage pages cache the data as compressed data combined with patch pages.
Mode 3: as shown in (3) of FIG. 9, the compression ratio of the data block is small or the read amplification coefficient is too large; the encoding module adopts non-compressed storage when the data is dropped, the cache module reads the non-compressed storage format, the user pages are effectively identical to the storage pages, and when a write operation is performed on the data only the modified page is updated in the storage pages.
Mode 4: as shown in (4) of FIG. 9, the user has not performed any read or write operation on this part of the file, and the cache module does not cache the data of this block interval.
Mode 5: as shown in (5) of FIG. 9, the user reads a piece of data in a file whose block interval is stored on disk as compressed data combined with patch pages; after the original data is read into the storage pages, it is decompressed into the user pages, and then the data in the patch is written onto the user pages, so that the consistency of the modified data is maintained.
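The five modes above can be summarized with a small selection sketch (Python; the enum values, thresholds and decision order are illustrative assumptions, not the embodiment's actual policy):

```python
from enum import Enum, auto

class CacheMode(Enum):
    COMPRESSED = auto()              # mode 1: heavy modification, recompress the whole block
    COMPRESSED_WITH_PATCH = auto()   # mode 2: light modification, keep old compressed data plus a patch page
    UNCOMPRESSED = auto()            # mode 3: low compression ratio or excessive read amplification
    NOT_CACHED = auto()              # mode 4: the block interval has never been read or written
    READ_DECOMPRESS_PATCH = auto()   # mode 5: read path, decompress then apply the patch pages

def choose_write_mode(modified_bytes, block_size, compression_ratio, read_amp, thresholds):
    """Toy policy for the write path; all threshold values are placeholders."""
    if compression_ratio < thresholds["min_ratio"] or read_amp > thresholds["max_read_amp"]:
        return CacheMode.UNCOMPRESSED
    if modified_bytes < thresholds["patch_fraction"] * block_size:
        return CacheMode.COMPRESSED_WITH_PATCH
    return CacheMode.COMPRESSED

# Example: 1KB modified in a 16KB block that compresses 4:1 -> keep compressed data + patch page.
print(choose_write_mode(1024, 16 * 1024, 4.0, 1.0,
                        {"min_ratio": 1.5, "max_read_amp": 4.0, "patch_fraction": 0.25}))
```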
In one exemplary embodiment, the cache module is configured to: detect a read-ahead coefficient of the user on the logical file, wherein the read-ahead coefficient indicates the contiguous read size most frequently used by the user on the logical file; and when the next read operation of the user on the logical file is detected, read the physical file of the logical file according to the read-ahead coefficient and decompress the read physical data in parallel.
In the cache module, the contiguous read sizes of the file are recorded and counted, and the most common read size is recorded as the read-ahead coefficient. During the period from file opening to file closing, the cache module records the contiguous read sizes of the file within a certain time window and takes the most common contiguous read/write size as the read-ahead coefficient. The read-ahead coefficient is used for the read-ahead operation on the file: a cache block larger than a single user request is pre-allocated, a data volume larger than the single user request is read contiguously, and decompression computation is started during the reading process, so as to improve the overall performance.
FIG. 10 is a schematic diagram of the cache module detecting the read-ahead coefficient according to an embodiment of the present application. As shown in FIG. 10, detection of the read-ahead coefficient by the cache module includes the following steps (a sketch follows the steps):
Step 1: set a time window, and record whether the accesses to the file within the time window are accesses to contiguous addresses.
Step 2: if the accesses to the file addresses within the time window are contiguous, trigger the read-ahead operation, and set the length of the file addresses contiguously accessed within the time window as the length of the cache window.
Step 3: apply for cache blocks of the size of the cache window, where one cache window contains a plurality of file blocks and each file block contains a plurality of 4KB pages.
Step 4: start IO reads of the compressed blocks, and start the decompression calculation after the reading of each block of data is finished.
Step 5: after the IO reading of all the compressed blocks in the read-ahead window is completed, judge whether the execution time of step 4 is longer than the time window.
Step 6: if the execution time of step 4 is longer than the window time, judge whether the user's accesses to the file are contiguous accesses and have covered the data blocks that have already been decompressed; if so, move the cache window backwards and continue to execute step 4, otherwise suspend the prefetch operation.
It is worth noting that in this scenario the user is continuously accessing dense data. For example: in a network transmission scenario, after the user accesses one 32KB page, the data is sent to the network card and the next 32KB page is accessed immediately, so the interval between accesses to two 32KB pages is extremely small. Therefore, the pause time of IO operations should be reduced as much as possible in this scenario: if the user has already accessed all the decompressed pages by the time the IO completes, this indicates that the user will keep issuing contiguous read instructions, and in order to reduce the time spent between IOs as much as possible, this embodiment immediately starts the next IO read and decompression in this scenario.
Step 7: if the execution time of step 4 is less than the window time, then when the user's accesses to the file are contiguous and the last file block in the cache window is accessed, move the cache window backwards and continue to execute step 4.
It is worth noting that in this scenario the user's accesses to the file are non-dense contiguous accesses. For example: in an AI model training scenario, the user first reads a 32KB page, performs a complex matrix operation, and then reads the next 32KB page. Although the two accesses to the file are contiguous, they are separated by a longer time. In this case, to reduce the total time consumption, the IO reading and decompression only needs to be completed before the user accesses the next 32KB.
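A simplified sketch of the read-ahead coefficient detection in the steps above is given below (Python; the time-window bookkeeping and the cache-window movement of steps 5-7 are omitted, and the interfaces are hypothetical):

```python
from collections import Counter

PAGE_SIZE = 4 * 1024

class ReadAheadTracker:
    """Tracks contiguous read runs and reports the most common run length."""

    def __init__(self):
        self.sizes = Counter()   # contiguous read size -> occurrence count
        self.last_end = None
        self.run_length = 0

    def on_read(self, offset, length):
        if self.last_end is not None and offset == self.last_end:
            self.run_length += length          # sequential access continues
        else:
            if self.run_length:
                self.sizes[self.run_length] += 1
            self.run_length = length           # a new run starts
        self.last_end = offset + length

    def read_ahead_coefficient(self):
        """Most frequently observed contiguous read size; defaults to one page."""
        counts = self.sizes.copy()
        if self.run_length:
            counts[self.run_length] += 1
        return counts.most_common(1)[0][0] if counts else PAGE_SIZE

# Example: four read requests of 4 contiguous pages each -> coefficient of 16KB.
tracker = ReadAheadTracker()
for base in (0, 64 * 1024, 256 * 1024, 512 * 1024):
    for i in range(4):
        tracker.on_read(base + i * PAGE_SIZE, PAGE_SIZE)
print(tracker.read_ahead_coefficient())   # 16384
```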
For example, FIG. 11 is a schematic diagram of a process of reading data according to an embodiment of the present application. As shown in FIG. 11, assume that the user issues four read requests within a time window, each accessing 4 pages, and the pages are contiguous. The cache module tracks each read request, records them over a time window, and finds that 16 contiguous pages are accessed most frequently; the locality of the page accesses is such that after the 1st page access is completed, the next 15 pages are typically accessed. When the read-ahead function is not turned on, as shown in FIG. 11 (a), each read request requires a read and a decompression operation, and the read and decompression operations are performed sequentially, resulting in a large access delay. After the read-ahead function is turned on, as shown in FIG. 11 (b), when the first read request is served, the cache module reads the data of 16 user pages; while the 0th page is being decompressed, the read operation for the 1st page is performed simultaneously, and similarly, while the 1st page is being decompressed, the 2nd and 3rd pages are read. The read and decompression operations are parallel in time, thus greatly reducing the overall delay of the 16 page accesses.
In an exemplary embodiment, the cache module is further configured to: before caching a logic file written in a file system, detecting an acceleration card type of an acceleration card used by the file system, wherein the acceleration card is used for hardware acceleration for data compression or decompression; a cache space is created that matches the accelerator card type, wherein the logical file is cached in the cache space.
Optionally, the cache module creates different types of cache pools according to different accelerator card types. Considering that compression and decompression computation consumes a large amount of CPU computing resources and causes large delays, hardware acceleration cards are often used in practice to accelerate compression and decompression. Because different hardware accelerator cards have different cache consistency maintenance mechanisms, in this embodiment the cache is set up separately for the different hardware acceleration devices:
for the situation without an acceleration card, the dynamic application cache forms a cache pool, and both the physical page and the logical page use the memory of the host side. For a PCIe accelerator card, because the PCIe accelerator card cannot guarantee the consistency of memories on the host side and the accelerator card side, read-write operation needs to be carried out by DMA to and fro transmission between the host side and the accelerator card. Therefore, in this embodiment, for the accelerator card supporting P2P transmission, the physical page uses the external memory DDR of the accelerator card; for accelerator cards that do not support P2P transport, the physical page uses host side memory. For the CXL acceleration card, the memory consistency of the host and the acceleration card can be automatically ensured on the CXL acceleration card hardware. Therefore, in this embodiment, the memory address space of the CXL accelerator card is mapped from the kernel mode to the user mode, and the memory address mapped to the user mode is partitioned, and the memory of the CXL accelerator card is directly used to form the cache pool. The memory of CXL can be used by both the logical page and the physical page, the physical page preferentially acquires the memory from the CXL, and when the memory in the CXL is insufficient, the memory is allocated from the host side.
In an exemplary embodiment, the cache module is further configured to determine the landing (disk write-back) timing in at least one of the following manners:
Mode a: in the case that a landing instruction initiated by the user is received, it is determined that the landing time has been reached. This corresponds to the landing time triggered by a forced landing instruction from the user.
Mode b: when the target period is reached, it is determined that the landing time has been reached. This corresponds to the landing time triggered by the periodically issued timed write-back instruction.
Mode c: when it is detected that the amount of cached data is greater than or equal to the data amount threshold, it is determined that the landing time has been reached. This corresponds to the landing time triggered by a release-space instruction or an optimize-storage instruction.
The compression module is used for completing the compression calculation of the pages that need to be dropped in the cache module: it reads the user pages in the cache, compresses them and stores them into the storage pages. The compression module may call the CPU or the hardware acceleration card to perform the compression calculation, and its workflow is as follows. Compression requests of the cache module are received, each request corresponding to the compression of one file, and the requests are prioritized; the rule may be, but is not limited to: the landing priority of a forced landing request by the user for a file is greater than that of the compression requests periodically sent by the cache module. The files to be compressed are traversed, the ID number of each file is passed to the cost estimation module, and the cost estimation module returns whether the file should be compressed and the compression mark of each cache block of the file. For a file to be compressed, the cache blocks of the file are traversed; the cost estimation module marks the cache blocks that have not yet been compressed, and the cache blocks with a large number of modifications, as cache blocks to be compressed. The compression calculation is then performed: if the obtained compression ratio is large, the compressed data is placed into the storage page of the cache block and the effective length of the compressed data in the storage page is marked; if the obtained compression ratio is small, the original data is stored into the storage page and the storage page is marked as being in the non-compressed state. For a cache block that has already been compressed and in which only a small number of user pages have been modified, the cost estimation module may mark it as not requiring compression; in this case the modified part of the user pages is taken as a patch page, stored into the storage page, and the size of the patch page is marked. After the compression calculation of all cache blocks in the whole file is completed, the average compression ratio of the file is updated. A sketch of this workflow follows.
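Purely as an illustration, the workflow just described might be organized as follows in Python; the request shapes, the cost-estimator callback and the mark values are assumptions, not interfaces disclosed by the embodiment:

```python
import heapq

FORCED_FLUSH, PERIODIC_FLUSH = 0, 1   # smaller value means higher landing priority

def run_compression_requests(requests, cost_estimator, compress, store):
    """Prioritize per-file compression requests, consult the cost estimator for each
    file and each cache block, then compress, keep raw data, or emit a patch page."""
    queue = [(priority, seq, file) for seq, (priority, file) in enumerate(requests)]
    heapq.heapify(queue)
    while queue:
        _, _, file = heapq.heappop(queue)
        should_compress, block_marks = cost_estimator(file["id"])
        if not should_compress:
            continue
        for block in file["blocks"]:
            mark = block_marks[block["index"]]
            if mark == "compress":
                data, ratio = compress(block["user_pages"])
                if ratio > 1.0:
                    block["storage"], block["compressed"] = data, True
                else:
                    block["storage"], block["compressed"] = block["user_pages"], False
            elif mark == "patch":
                block["patch"] = block["modified_ranges"]   # small edits become a patch page
        store(file)   # land the file and update its average compression ratio
```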
In one exemplary embodiment, the file system further comprises a cost evaluation module configured to detect the read amplification coefficient of the user data in each block interval of the logical file, wherein the read amplification coefficient indicates the degree of the read amplification phenomenon of the data; the cost evaluation module decides, according to the read amplification coefficient, the compression information used when the user data of each block interval is recompressed, wherein the compression information includes whether to compress and the compression algorithm; and the compression module recompresses the user data of each block interval according to the compression information.
The cache module tracks the cache invalidation frequency, the amount of data accessed in the user pages and the amount of data loaded into the storage pages, and calculates the read amplification coefficient. When the system cache is insufficient, cache blocks that have had no read or write operations for a long time are cleaned up, the cache of those file blocks becomes invalid, and when a read or write operation occurs again the data needs to be reloaded from the disk into the storage pages. The number of times a cache block is invalidated and then read or written again within a time window is taken as the cache invalidation frequency coefficient. During the valid period of a cache block, the amount of storage-page data read from the disk is recorded and divided by the amount of data accessed in the user pages, giving the read amplification ratio. Within a time window, the accumulated sum of the read amplification ratios is the read amplification coefficient, which is used as a spatial-locality evaluation index of the block to decide whether the block data is worth storing in compressed form when it is dropped again.
In one exemplary embodiment, the cache module is configured to: detect the amount of data read out and the amount of data accessed by the user each time a read operation is executed on the user data of each block interval; calculate the ratio between the amount of data read and the amount of data accessed by the user to obtain the read amplification ratio of each read operation; and determine the accumulated sum of the read amplification ratios within the target time window as the read amplification coefficient.
Because not all files' spatial-locality characteristics can be accurately predicted when they are accessed, in this embodiment the read amplification coefficient is introduced as an auxiliary means for dynamically judging whether a file should be compressed. For example, FIG. 12 is a schematic diagram of a judging process of file compression according to an embodiment of the present application. As shown in FIG. 12, assume that the file uses 4KB pages and 4 pages as the block compression length, the data compresses 2:1, and T2-T4 represents a time window (e.g. 5 seconds) in which 3 cache invalidations occur; within each valid period of the cache a read/write access of the data takes place. At time T1, blocks (a) and (b) use compressed storage and the required storage space is 8KB. At time T2, blocks (a) and (b) each access only one 4KB page, and the read amplification ratio is 2. At time T3, block (a) accesses only 4KB, with a read amplification ratio of 2, while block (b) accesses 8KB, with a read amplification ratio of 1. At time T4, block (a) accesses only 4KB, with a read amplification ratio of 2, while block (b) accesses 16KB, with a read amplification ratio of 0.5. At time T5, a write operation on the file occurs again; at this time the read amplification coefficient of block (a) is 6 and that of block (b) is 3.5, the file has been read 3 times and written 2 times, and the ratio of reads to writes is 1.5. After receiving these parameters, the cost estimation module evaluates whether the file block should be stored in compressed form according to the compression calculation bandwidth and the IO read/write bandwidth. Because the read amplification coefficient of block (a) is high, the proportion of invalid disk reads and decompression calculations in each read operation is high, and the read operations on the file outnumber the write operations, the cost estimation module may mark block (a) as uncompressed; the compression module then performs no compression calculation on it, and non-compressed storage is adopted to match the read/write locality characteristics of this block, reducing unnecessary disk reads and decompression calculation.
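The accumulation described above can be computed in a couple of lines (Python sketch; the event representation is an assumption, the figures are those of block (a) in FIG. 12):

```python
def read_amplification(events):
    """events: (bytes_read_from_disk, bytes_accessed_by_user) per read operation within
    one time window; the coefficient is the accumulated sum of the per-read ratios."""
    return sum(read / max(accessed, 1) for read, accessed in events)

# Block (a): three cache invalidations, each reloading the 8KB compressed block while
# the user accesses only one 4KB page -> coefficient 2 + 2 + 2 = 6.
print(read_amplification([(8 * 1024, 4 * 1024)] * 3))   # 6.0
```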
In an exemplary embodiment, the cost evaluation module is further configured to: predicting compression performance benefits of user data of each block interval of the logic file according to hardware resource attributes of the file system; and the compression module is controlled to compress the user data of each block interval according to the compression performance gain.
For the cost estimation module, this embodiment can target a hardware platform that, in addition to the CPU, memory and disks of a conventional storage server, may also contain dedicated hardware compression accelerator cards and different storage media. FIG. 13 is a schematic diagram of a hardware platform according to an embodiment of the present application. As shown in FIG. 13, the various applications of the user layer (MySQL, PostgreSQL, Oracle, etc.) may initiate read and write operations to the storage layer. The storage requirements of different data differ (preferentially saving storage space versus preferentially considering access delay), the system loads differ (CPU, accelerator card and storage device IO), the access characteristics of the data differ (sequential access and random access), and the storage characteristics of the storage media differ (the read/write bandwidth gap between a mechanical disk and an SSD is huge, and the SSD additionally has write amplification and read/write lifetime problems).
In compressed storage, how to select an optimal storage policy under a complex hardware configuration and load environment is a problem to be solved in the art. In this embodiment, the cost estimation module estimates the compression calculation and storage costs, performs priority classification processing on the different cache blocks, and then allocates the compression tasks to dedicated hardware or a general-purpose processor for compression calculation, while dynamically adjusting and adapting the data caching mode and the compression coding storage mode during reading and writing.
It should be noted that, in this embodiment, through the VFS interface module of Fuse, the user may set the hardware configuration parameters of the file system and the read/write storage performance parameters of a file, which are used in the calculations of the cost estimation module. The relevant parameters are as follows (a sketch of a container for these parameters follows the list):
1) The sequential read/write bandwidth and the random read/write bandwidth of all available storage devices. For example: the sequential bandwidth of a solid state disk, or the random read/write bandwidth of a U disk.
2) The computing bandwidth of all available compression computing devices for each compression algorithm. For example: the bandwidth of the CPU executing the gzip algorithm, the bandwidth of the FPGA executing the lz77 algorithm, and so on.
3) The priority of files when they are dropped. The user sets the drop priority of the file system when mounting the file system; the drop priority of an individual file can also be set through the vfs extended attribute, and the per-file priority can override the file-system priority.
4) A storage bias value, a read bias value and a write bias value. In the compressed storage scenario, the user sets through these bias values whether data storage is biased towards saving storage space, towards read delay or towards write delay.
5) An optimizing storage mode: either active optimization or passive optimization is selected.
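As an illustration only, these mount-time parameters could be gathered in a structure such as the following Python sketch; every field name here is a hypothetical placeholder, not an identifier taken from the embodiment:

```python
from dataclasses import dataclass, field

@dataclass
class CostEstimatorConfig:
    """Hypothetical container for the mount-time parameters listed above."""
    seq_bandwidth: dict = field(default_factory=dict)        # storage device -> sequential read/write bandwidth (MB/s)
    rand_bandwidth: dict = field(default_factory=dict)       # storage device -> random read/write bandwidth (MB/s)
    compress_bandwidth: dict = field(default_factory=dict)   # (compute device, algorithm) -> bandwidth (MB/s)
    fs_drop_priority: int = 0                                 # file-system landing priority set at mount time
    file_drop_priority: dict = field(default_factory=dict)    # per-file priority set via the vfs extended attribute
    storage_bias: float = 0.0                                 # bias towards saving storage space
    read_bias: float = 0.0                                    # bias towards low read delay
    write_bias: float = 0.0                                   # bias towards low write delay
    optimize_mode: str = "passive"                            # "active" or "passive" storage optimization
```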
In this embodiment, the cost estimation module has the following functions:
function 1, carrying out priority classification on read-write requests sent by an application layer. A large number of application programs exist in the server, different data read-write accesses are performed at any time, and a reasonable evaluation mechanism needs to be set to perform priority processing on the data. In the conventional manner, when user data needs to drop, the kernel of the operating system transmits all dirty page data in the memory to the corresponding storage medium one by one, and the lack of management of priority policy often brings poor experience. For example: when the user inserts the USB flash disk into the computer to work for a period of time, the data writing operation to the local disk and the USB flash disk is performed, and when the user needs to pull out the USB flash disk, the data in the memory needs to be written back into the USB flash disk. Under the condition of the traditional mode, the falling disk of the U disk data is not processed preferentially, but all the memory dirty page data at the current moment are written back into the respective storage medium, wherein the data pages written into the local disk are included. The USB flash disk data is usually pulled out after waiting for the disk and the USB flash disk data to be completely dropped, so that a user needs longer waiting time. In a similar scenario, the power system notifies that there is a 5 minute outage, and the user needs to keep as much important data of high priority as possible during the 5 minutes, such as: PPT, word document; finally, unimportant data are saved, such as: the downloaded movie, browser recording. In this embodiment, for the above situation, the cost estimation module may place the files in sequence according to the priority of the files, so as to ensure that the data more important to the user is placed in priority.
Function 2, selection of the storage format of the data, including: compressed storage, non-compressed storage, compression combined with patch page storage, and the compressed block size. Not all scenarios are suitable for compressed storage of the data. Firstly, compressing a data block with a low compression ratio not only fails to obtain a storage space benefit but also reduces the read/write performance. Secondly, when a plurality of pages are compressed together, the read amplification effect produced in scenarios where the user actually reads only a small number of pages increases the read delay. Thirdly, in scenarios with high IO bandwidth, although compression can obtain a storage space benefit, the write delay becomes higher than that of non-compressed storage. Finally, not in all scenarios can the user clearly know the locality characteristics of file accesses in advance; the read/write characteristics of a file may only become known during access, and compressing with the default block size may fail to achieve the ideal effect. Therefore, the cost estimation module in this embodiment estimates the performance benefits of the different modes according to the obtained compression ratio and the read/write characteristics of the data observed by the cache module, and determines the storage format adopted for the file in combination with the read/write requirements that the user has set for the file through the extended attributes.
Function 3, planning of compression calculation and storage tasks. Different compression algorithms have different compression ratios and calculation time consumption; meanwhile, the system contains computing units for compression and IO storage units, and the load of the computing units and the IO storage units at a given moment changes dynamically. Simply adopting a fixed algorithm regardless of the load state at that moment causes data that urgently needs to be landed to queue at the computing unit or the IO storage unit, producing unnecessary delay. Meanwhile, hardware accelerator cards often have the capability of computing different compression algorithms in parallel, which can greatly improve compression efficiency. Therefore, the cost estimation module selects a reasonable compression algorithm according to the complex conditions of the current system and the landing requirements of the user, and distributes the landing tasks of the cache blocks to reasonable computing units and storage IO units.
Function 4, storage space allocation. The sequential read performance and the random read performance of different storage media differ significantly, and sequential read performance is generally better than random read performance. When data is frequently modified, the change of the compression ratio may make it impossible to store the data at its original position; at this time, whether the compressed blocks are stored scattered in the existing free space or a new contiguous storage space is allocated is something the cost estimation module needs to weigh according to the access characteristics of the file.
Function 5, storage space arrangement and optimization. As compressed files are continuously modified, the lengthening of compressed blocks produces a large amount of scattered free space in the storage space and data blocks become randomly placed on the disk, which reduces the read/write performance of the data and the benefit of compressed storage. In the present application, the cost estimation module estimates the cost and benefit of re-encoding the file according to the cache module's monitoring information on the user's read/write accesses and the encoding information of the encoding module.
For a data write request of the user, the data is first cached in the cache module and written to the storage device in batches at an appropriate time. In this embodiment, the landing operation is triggered in the following 4 cases:
Case 1: a forced landing instruction, corresponding to the user executing a forced landing command on a certain file. For example: clicking the save button while writing a Word document, corresponding to the flush interface of the vfs file operations.
Case 2: a release-space instruction; when the memory in the cache module is insufficient, part of the cache is released in order to load new data, and the data written by the user in the released part is written back to the disk.
Case 3: a timed write-back instruction; the cache module writes the data written by the user back to the disk according to a timed period.
Case 4: an optimize-storage instruction; the cost estimation module initiates optimized storage of the file, and the data is rearranged and written to the disk.
In this embodiment, there is also provided a method for storing files in a compressed manner, and fig. 14 is a flowchart of a method for storing files in a compressed manner according to an embodiment of the present application, as shown in fig. 14, where the flowchart includes the following steps:
Step S1402, detecting a first landing parameter and a first compression parameter of the first file to be landed as indicated by a first instruction, wherein the first instruction is a landing instruction in which the landing priority of the data's landing delay time is higher than the landing priority of the data's landing storage space, the first landing parameter is used for indicating the number of cache blocks that can be landed within a target time, and the first compression parameter is used for indicating the number of cache blocks that can be compressed within the target time;
step S1404, performing disc-drop on a first group of cache blocks in the first file, which conform to the first disc-drop parameters, and compressing a second group of cache blocks in the first file, which conform to the first compression parameters, to obtain a first compressed block set;
In step S1406, the first compressed block set is dropped.
Through the above steps, when a landing instruction (namely the first instruction) is received in which the landing priority of the data's landing delay time is higher than that of the data's landing storage space, the first landing parameter and the first compression parameter of the first file to be landed are detected, namely the number of cache blocks that can be landed within the target time and the number of cache blocks that can be compressed within the target time; the first group of cache blocks in the first file conforming to the first landing parameter is landed while the second group of cache blocks conforming to the first compression parameter is compressed to obtain the first compressed block set, and the first compressed block set is then landed. Therefore, in the case that the landing priority of the data's landing delay time is higher than the landing priority of the data's landing storage space, part of the data in the file to be landed is compressed while the other part is landed directly. This can solve the technical problem of low file compression and storage efficiency and achieve the technical effect of improving file compression and storage efficiency.
The above steps may be performed by, but are not limited to, the cost estimation module.
In this embodiment, the first instruction is a landing instruction in which the landing priority of the data's landing delay time is higher than the landing priority of the data's landing storage space, and the first instruction may include, but is not limited to, the forced landing instruction, for which the requirement on the landing delay time is higher, for example: an instruction issued when the user manually clicks the save button, an instruction triggered when the user closes a file and the file is saved automatically, and the like.
For the cache to which the forced landing instruction applies, the landing delay time of the data is taken as the highest priority, and the data landing task is completed as soon as possible. In this scenario the user needs the data landing operation to be completed as soon as possible, and factors such as optimizing the data storage space or the read performance are not considered.
A landing command issued by a user may involve multiple files at the same time, and file blocks in the same file usually have similar compression ratios. The hardware accelerator card may execute multiple compression algorithms in parallel, and different compression algorithms obtain different compression ratios and require different compression times. While a file performs its IO storage operation, the accelerator card can simultaneously perform compression calculation operations.
In one exemplary embodiment, the first landing parameter and the first compression parameter of the first file to be landed indicated by the first instruction may be detected, but are not limited to being detected, in the following manner: determining a first time consumption for landing cache blocks and a second time consumption for compressing cache blocks using a first compression algorithm; determining, according to the first time consumption and the second time consumption, a target time within which the landing and the compression of cache blocks can be completed simultaneously; and calculating the number of cache blocks that can be landed within the target time as the first landing parameter, and the number of cache blocks that can be compressed within the target time as the first compression parameter.
In this embodiment, the number of cache blocks that are simultaneously compressed and landed may be determined, but is not limited to being determined, by the first time consumption for landing cache blocks and the second time consumption for compressing cache blocks with the first compression algorithm. The first compression algorithm may be, but is not limited to, one of the supported compression algorithms, such as a randomly selected compression algorithm, or the compression algorithm with the slowest compression speed, etc.
In one exemplary embodiment, the first time consumption for landing cache blocks and the second time consumption for compressing cache blocks using the first compression algorithm may be determined, but are not limited to being determined, in the following manner: calculating the landing time consumption of n cache blocks as the first time consumption, and calculating the compression time consumption of compressing n cache blocks using the first compression algorithm as the second time consumption, wherein the first compression algorithm is the compression algorithm with the slowest calculation speed among the plurality of compression algorithms allowed to be used. The target time within which the landing and the compression of cache blocks can be completed simultaneously may be determined according to the first time consumption and the second time consumption in the following manner: determining the least common multiple of the first time consumption and the second time consumption as the target time. Calculating the number of cache blocks that can be landed within the target time as the first landing parameter, and the number of cache blocks that can be compressed within the target time as the first compression parameter, includes: determining the first landing parameter as the quotient of the product of n and the target time divided by the first time consumption, and determining the first compression parameter as the quotient of the product of n and the target time divided by the second time consumption.
In this embodiment, n may be, but is not limited to, the number of cache blocks compressed at one time in the compression process. For example, if the compression algorithm compresses 4 cache blocks into one compressed block at a time, n may be, but is not limited to, 4. Denote the time consumption of the slowest compression algorithm for computing n cache blocks as T_c and the landing time consumption of sequentially storing n cache blocks as T_io; the target time is their least common multiple T = lcm(T_c, T_io). The number of cache blocks in the first packet is then n x T / T_io, and the number of cache blocks in the second packet is n x T / T_c. After this grouping, the time consumed by the first packet for direct IO storage is equal to the time consumed by the second packet for compression calculation.
For example, assuming that among the available algorithms the GZIP algorithm is the slowest to compute (the algorithm bandwidth is set by the extended attribute when the user mounts the file system), the time taken to compress 4 1MB cache blocks is 9ms, the time taken to sequentially store 4 1MB cache blocks is 21ms, and the least common multiple of the two is 63ms; therefore the number of blocks in the first group is 12 and the number of blocks in the second group is 28, and the IO time consumption of the first packet is the same as the compression calculation time consumption of the second packet.
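Under the assumption that timings are integer milliseconds, the grouping rule above can be reproduced with a short Python sketch (function and variable names are illustrative only):

```python
from math import gcd

def split_groups(n, t_compress_ms, t_store_ms):
    """Group sizes such that direct IO of group 1 and compression of group 2 finish together."""
    t = t_compress_ms * t_store_ms // gcd(t_compress_ms, t_store_ms)   # least common multiple
    group1 = n * t // t_store_ms      # cache blocks sent straight to the storage IO queue
    group2 = n * t // t_compress_ms   # cache blocks compressed in parallel
    return t, group1, group2

# Figures quoted above: GZIP needs 9ms for 4 x 1MB blocks, sequential storage needs 21ms.
print(split_groups(4, 9, 21))   # (63, 12, 28)
```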
In one exemplary embodiment, the first group of cache blocks in the first file conforming to the first landing parameter may be landed, but is not limited to being landed, in the following manner: extracting the first group of cache blocks conforming to the first landing parameter from the first file; and inserting the first group of cache blocks into the storage input/output queue, where the first group of cache blocks is landed preferentially in the storage input/output queue.
In this embodiment, the first group of cache blocks is enqueued in the storage IO queue so that the storage device processes these cache blocks preferentially. Since this type of cache block is processed preferentially in a dequeuing manner, the current dynamic load of the storage device is not considered in this scenario.
In one exemplary embodiment, the first set of compressed blocks may be obtained by, but not limited to, compressing a second set of cache blocks in the first file that meet the first compression parameter by: extracting a second group of cache blocks conforming to the first compression parameters from the first file; and carrying out compression operation on the second group of cache blocks in parallel by adopting a compression algorithm in the first compression algorithm set to obtain a plurality of compression block sets, wherein the first compression block set comprises a plurality of compression block sets, the first compression algorithm set comprises N compression algorithms, and N is larger than 1.
In this embodiment, the second group of cache blocks is inserted into the task queues of all available compression computing units, and all available computing units perform compression calculation in parallel, so as to obtain the compression ratios and calculation time consumption of the different modes. For example, the compression ratio and calculation time of the GZIP/LZ77/Snappy algorithms executed with hardware acceleration, and the compression ratio and calculation time of the GZIP/LZ77/Snappy algorithms computed by the CPU.
In one exemplary embodiment, the first set of compressed blocks may be dropped, but is not limited to, by: under the condition that the compression algorithm which is allowed to be used is a first compression algorithm set, the compression blocks obtained by adopting a second compression algorithm in the first compression block set are dropped, wherein the first compression algorithm set comprises N compression algorithms, N is larger than 1, the second compression algorithm is the compression algorithm with the shortest dropping time in the N compression algorithms, the first compression block set comprises a plurality of compression block sets, and the plurality of compression block sets are obtained by adopting N compression algorithms to carry out compression operation on a second group of cache blocks in parallel.
In this embodiment, the N compression algorithms compress the entire second group of cache blocks in parallel, and the compressed blocks obtained by the compression algorithm with the shortest landing time are landed, which better guarantees a short landing time and improves landing efficiency. For example: assume a 12MB cache block and a disk sequential write bandwidth of 120MB/s, with the time consumption of the different compression modes being 100ms for GZIP and 40ms for Snappy, and the compression ratios being 4:1 for GZIP and 2:1 for Snappy. The landing delays of the various modes are: 100ms for the uncompressed mode, 100ms + 25ms = 125ms for the GZIP mode, and 40ms + 50ms = 90ms for the Snappy mode. Although the GZIP algorithm can obtain a higher storage space benefit, the forced landing instruction scenario requires a low landing delay, so the Snappy algorithm is evaluated as having the compression benefit. Meanwhile, early sampling has judged that the compression ratio of each part of the file is basically consistent, so the later data also adopts this compressed storage mode and a shorter landing delay can be obtained. A sketch of the delay comparison follows.
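A minimal Python sketch of this delay comparison, using only the figures quoted above (the function name and signature are illustrative, not part of the embodiment):

```python
def landing_delay_ms(size_mb, seq_write_mb_per_s, compress_ms=0, ratio=1.0):
    """Landing delay = compression time + time to write the (possibly smaller) data."""
    return compress_ms + (size_mb / ratio) / seq_write_mb_per_s * 1000

# 12MB cache block, 120MB/s sequential write bandwidth (figures quoted above).
print(landing_delay_ms(12, 120))                              # uncompressed: 100.0 ms
print(landing_delay_ms(12, 120, compress_ms=100, ratio=4))    # GZIP:         125.0 ms
print(landing_delay_ms(12, 120, compress_ms=40, ratio=2))     # Snappy:        90.0 ms
```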
In one exemplary embodiment, after the dropping of the compressed blocks of the first set of compressed blocks that are obtained using the second compression algorithm, the dropping of the first file may continue, but is not limited to, by: and compressing and landing the remaining non-landing cache blocks in the first file by adopting a second compression algorithm.
In this embodiment, the second compression algorithm with the shortest landing time may be used to compress and land the remaining un-landed cache blocks in the first file, so as to maximize the benefit in landing time.
In one exemplary embodiment, the compression blocks of the first set of compression blocks obtained using the second compression algorithm may be, but are not limited to, dropped by: detecting the compression ratio of a compression block obtained by each compression algorithm in N compression algorithms; and under the condition that the compression ratios of the N compression algorithms are consistent, the compression blocks obtained by adopting the second compression algorithm in the first compression block set are dropped.
In this embodiment, landing with the second compression algorithm with the shortest landing time may, but is not limited to, apply to the case in which the compression ratio of each compression algorithm over the file is basically consistent. Performing the comparison of compression ratios already implies that at least one compression algorithm can obtain a storage delay benefit for a given cache block. However, since the data in the file blocks does not necessarily all have the same compression ratio, there may be blocks with high benefit and blocks with low or even no benefit. In this embodiment, a block sampling manner is adopted to judge whether the cache blocks obtained after dividing the file into blocks have similar compression ratios, so that different storage strategies can be adopted.
For example, a plurality of file cache blocks are divided into a group, the cache blocks within the group are compressed using the same compression mode, the compression ratios obtained by the cache blocks within the group are compared, and the variance δ is calculated. A small variance means that the file content has similarity and similar compression ratios can be obtained; otherwise the content of the file blocks has no similarity and different positions of the content within the file have different compression ratios. A sketch of this judgment follows.
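A sketch of the sampling judgment in Python, assuming a population-variance threshold chosen purely for illustration:

```python
from statistics import pvariance

def ratios_are_similar(sample_ratios, max_variance=0.05):
    """True when the sampled cache blocks compress uniformly enough to share one strategy."""
    return pvariance(sample_ratios) <= max_variance

print(ratios_are_similar([4.0, 3.9, 4.1, 4.0]))   # True: similar content across the file
print(ratios_are_similar([4.0, 1.1, 2.5, 1.0]))   # False: different parts compress differently
```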
In one exemplary embodiment, after detecting the compression ratio of the compressed block obtained by each of the N compression algorithms, the first file may continue to be dropped by, but is not limited to: under the condition that the compression ratios of N compression algorithms are inconsistent, obtaining a compression block with the shortest landing time corresponding to each cache block in the second group of cache blocks from the first compression block set to obtain a second compression block set; and (5) carrying out disc dropping on the second compression block set.
When the variance δ of the compression ratio of the data blocks is large, no single compression mode can obtain a uniform benefit ratio for these data blocks. Therefore, in this scenario the embodiment adopts multi-algorithm parallel computation, and after the computation is completed only the algorithms with overall benefit are retained. For example, assume that there are 28 cache blocks within packet 2 and three different compression algorithms, Gzip, LZ77 and Snappy, are adopted. For each data block, the computation time consumption of the different algorithms plus the storage time consumption after compression are compared to obtain the compression algorithm giving the shortest landing time for that compressed block. If none of the 28 cache blocks takes Gzip as its optimal algorithm, the Gzip algorithm is eliminated from subsequent computations.
The data is then landed using the compression mode with the minimum IO delay. Since packet 2 has already paid the cost of compression calculation at the current moment, for the landing of packet 2 only the minimum IO cost is considered, and the data is landed.
In one exemplary embodiment, after the second set of compressed blocks is dropped, the dropping of the first file may continue, but is not limited to, by: detecting a second landing parameter and a second compression parameter of the rest cache blocks except the first group of cache blocks and the second group of cache blocks in the first file, wherein the second landing parameter is used for indicating the number of the cache blocks capable of landing in the reference time, and the second compression parameter is used for indicating the number of the cache blocks capable of being compressed in the reference time; the third group of cache blocks which are in the residual cache blocks and accord with the second disc-dropping parameter are subjected to disc-dropping, and meanwhile the fourth group of cache blocks which are in the residual cache blocks and accord with the second compression parameter are compressed to obtain a third compression block set; and carrying out disc dropping on the third compression block set.
In this embodiment, the remaining data in the first file may continue to be regrouped and landed in a manner in which storage and compression calculation run in parallel. The slowest algorithm is selected from the algorithms retained in the compression set, and the remaining data is regrouped using the grouping manner described above. While the X-th set is being stored, compression calculation is performed on the (X+1)-th set in parallel, so that when the X-th set completes its storage operation, the (X+1)-th set has completed the parallel calculation of the various compression algorithms and the cache blocks with the least IO time consumption are selected for the landing operation.
In one exemplary embodiment, the third compressed block set may be obtained by, but is not limited to, compressing a fourth group of cache blocks of the remaining cache blocks that meet the second compression parameter by: extracting a fourth group of cache blocks conforming to the second compression parameter from the remaining cache blocks; and carrying out compression operation on the fourth group of cache blocks in parallel by adopting a compression algorithm in the second compression algorithm set to obtain a third compression block set, wherein the second compression algorithm set comprises N-m compression algorithms in the first compression algorithm set, m is greater than or equal to 1, and the m compression algorithms are compression algorithms which are not used by the compression blocks in the second compression block set.
In this embodiment, assume that the time consumption for compressing 4 1MB cache blocks is 4ms and the time consumption for sequentially storing 4 1MB cache blocks is 16ms; the least common multiple of the two is 16ms, so the number of blocks in the first group is 4 and the number of blocks in the second group is 16, and the IO time consumption of the two groups is the same 16ms as the compression calculation time consumption. At the 16ms moment, the first packet data completes landing; assuming the optimal compression ratio of the second packet is 4:1, the compressed blocks to be landed correspond to 4 1MB cache blocks, requiring 4ms of storage time, and since the calculation time is 16ms, 12 more 1MB cache blocks are added to the second group and 16 cache blocks are allocated to the third group. At the 32ms moment, the second packet data completes landing in compressed and uncompressed format, while the third packet data completes its calculation. According to the scheme of this embodiment, in the case that the compression ratio of the blocks in the file cannot be determined, full-load operation of the storage IO is guaranteed preferentially while compression calculation is inserted, so as to obtain the possible benefits.
In one exemplary embodiment, the first set of compressed blocks may be dropped, but is not limited to, by: detecting compression benefit information of the first compression block set, wherein the compression benefit information is used for indicating whether benefits of landing delay exist in the process of landing the first compression block set relative to the process of landing the second group of cache blocks; under the condition that compression benefit information is used for indicating benefits of disc landing delay, carrying out disc landing on the first compression block set; and under the condition that the compression benefit information is used for indicating that the benefit of no landing delay exists, carrying out non-compression landing on the second group of cache blocks and the rest cache blocks except the first group of cache blocks and the second group of cache blocks in the first file.
In this embodiment, when the first compression block set is dropped, whether all of the compression results yield a compression benefit may be calculated. Specifically: the time consumption and compression ratio of each compression mode are recorded, and, together with the disk sequential read-write bandwidth configured by the user, the time needed to store the data to disk without compression and with each compression algorithm is calculated, so as to evaluate whether compressed storage can obtain a drop-latency benefit. If no compression algorithm can obtain a compression benefit, non-compressed storage is used for all subsequent cache blocks of the file, which saves compression time and improves drop efficiency. For example: if, for every compression algorithm, the sum of the compression time and the post-compression storage time is larger than the direct storage time, the compression ratio of the file content is too small, the file is judged unsuitable for compressed storage, and all cache blocks adopt the non-compressed storage mode.
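Illustrating with a minimal sketch of this benefit check (the measured times are illustrative assumptions):

def drop_latency_benefit(direct_store_ms, compress_ms, compressed_store_ms):
    # Compression only helps if compressing and then writing the smaller result
    # beats writing the raw cache block directly.
    return compress_ms + compressed_store_ms < direct_store_ms

def any_beneficial(results, direct_store_ms):
    # results: {algorithm: (compress_ms, compressed_store_ms)}; if this returns False,
    # every following cache block of the file is stored uncompressed.
    return any(drop_latency_benefit(direct_store_ms, c, s) for c, s in results.values())

print(any_beneficial({"lz4": (2.0, 4.0), "gzip": (6.0, 2.0)}, direct_store_ms=10.0))  # True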
Based on the above process, this embodiment performs cache-block drop processing on a per-file basis and uses the parallel computing characteristics of the disk IO and the accelerator card to minimize the drop latency of the cache blocks. Fig. 15 is a schematic diagram of a file compression storage process according to an embodiment of the present application; as shown in fig. 15, the process may include, but is not limited to, the following steps:
step 1: and obtaining the cache blocks of the same file index. Traversing from the leftmost side of the red black tree, sequentially acquiring file indexes according to the weighted landing priority, and finding out the cache block pointed by the file indexes.
Step 2: the buffered blocks are sampled and grouped. And obtaining the 1 st group of cache blocks and the 2 nd group of cache blocks which have the same landing time and compression time.
Step 3: insert the 1st group of cache blocks into the storage IO (input output) queue so that the storage device processes these cache blocks preferentially. Since this type of cache block is processed preferentially by jumping the queue, the current dynamic load of the storage device is not considered in this scenario.
Step 4: and inserting the group 2 cache blocks into task queues of all available compression calculation units, and performing compression calculation on all available calculation units in a parallel calculation mode to obtain compression ratios and calculation time consumption in different modes.
Step 5: based on the results of step 4, calculate whether any of the compression results yields a compression benefit. If no compression algorithm can obtain a compression benefit, step 9 is performed.
Step 6: judge whether the compression ratios are consistent. Reaching step 6 shows that there is at least one compression algorithm that can obtain a storage-latency benefit for some cache block. However, since the data in the blocks of a file do not necessarily all have the same compression ratio, some blocks may have a high benefit while others have a low benefit or none at all. In this embodiment, block sampling is used to judge whether the cache blocks obtained by dividing the file have similar compression ratios, so that different storage strategies can be adopted accordingly.
Step 7: drop the data blocks in the way that takes the shortest drop time. When compressed storage yields a drop-latency benefit and the file content has a consistent compression ratio, the process proceeds to step 7. The 1st group of cache blocks is inserted into the disk IO queue for sequential storage in uncompressed format; for the 2nd group of cache blocks, the compression ratio and the compression time consumption of the compressed data under each compression mode have been obtained. The compression scheme corresponding to the fastest drop mode is then calculated: the shortest-time drop scheme is derived from the compression time consumption, compression ratio and disk latency of the different compression algorithms, and the compressed blocks produced by that shortest-drop mode are dropped.
Step 8: all remaining data that has not yet been dropped is calculated according to the shortest-time drop mode and then dropped.
Step 9: all subsequent cache blocks of the file use uncompressed storage.
Step 10: calculate the calculation time consumption and IO storage delay of the different compression modes for the 2nd group of cache blocks. Because the variance of the compression ratio across the data blocks is large, no single compression mode can obtain a uniform gain proportion for all of the data blocks. Therefore, in this scenario, multi-algorithm parallel computation is adopted, and only the algorithms with an overall benefit are retained after the computation is completed.
Step 11: drop the data with the minimum IO delay among the different compression modes. Since the compression calculation cost of the 2nd group of cache blocks has already been paid at this point, only the minimum IO cost is considered when dropping the 2nd group.
Step 12: after the remaining data is regrouped in the manner of step 2, it is dropped with storage and compression calculation running in parallel, as sketched below. Fig. 16 is a schematic diagram of a file compression and drop process according to an embodiment of the present application; as shown in fig. 16, the slowest algorithm among the algorithms retained in step 10 is selected, and the remaining data is regrouped in the manner of step 2. While the X-th group of cache blocks is being stored, the (X+1)-th group is compressed in parallel, so that when the X-th group finishes its storage operation the (X+1)-th group has finished the parallel calculation of the various compression algorithms, and the cache blocks with the least IO time consumption are selected for the drop operation.
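The shape of this store-while-compress pipeline can be sketched as follows (illustrative only; the helper callables compress_all, least_io and store are assumptions standing in for the accelerator-card queues and the storage IO queue):

import concurrent.futures as cf

def pipelined_drop(groups, compress_all, least_io, store):
    # compress_all(group) runs the retained algorithms in parallel and returns their
    # results; least_io(results) picks the result with the smallest IO time;
    # store(result) blocks until the disk IO for that result completes.
    with cf.ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(compress_all, groups[0])
        for nxt in groups[1:]:
            ready = least_io(pending.result())        # group X is ready to drop
            pending = pool.submit(compress_all, nxt)  # compress group X+1 ...
            store(ready)                              # ... while group X is stored
        store(least_io(pending.result()))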
In one exemplary embodiment, a drop instruction in which the drop priority of the data's storage space is higher than the drop priority of the data's drop delay time may be processed by, but is not limited to, the following: when a second instruction is received, detecting a storage parameter of each cache block of the second file to be dropped as indicated by the second instruction, where the second instruction is such a drop instruction and the storage parameter is used to indicate the performance of storing each cache block; searching a plurality of storage modes for a target storage mode matching each cache block according to the storage parameters; and dropping each cache block using its target storage mode.
In this embodiment, the second instruction is a landing instruction with a landing priority of the landing storage space of the data higher than that of the landing delay time of the data, and the second instruction may, but is not limited to, include the above-mentioned timing write-back instruction, where the requirement of the landing storage space is higher, for example: timing save instructions for operating systems, timing save instructions for applications, and the like.
For the timed write-back instruction, this scenario takes into account the user's settings of the file's extended read-write attributes, weighing both the saving of storage space and the latency of data reads.
In one exemplary embodiment, the storage parameters of each cache block in the second file of the landing tray indicated by the second instruction may be, but are not limited to, detected by: detecting a storage cost parameter and a read-write cost parameter of each cache block in each storage mode of a plurality of storage modes, wherein the storage cost parameter is used for indicating benefits of using the corresponding storage mode on a storage space, the read-write cost parameter is used for indicating benefits of using the corresponding storage mode on read-write performance, and the plurality of storage modes at least comprise: non-compressed storage, each of a plurality of compression algorithms is stored.
In this embodiment, the plurality of storage modes at least includes: non-compressed storage, each of a plurality of compression algorithms is stored, and the storage mode used can be selected by combining the stored benefits and the read-write benefits without limitation. The selected storage mode better meets the condition and the requirement of each cache block.
In one exemplary embodiment, the storage cost parameter and the read-write cost parameter for each cache block in each of the plurality of storage modes may be detected, but are not limited to, by: detecting the storage space occupied by each cache block in each storage mode as a storage cost parameter; and detecting the write cost and the read cost of each cache block in each storage mode as read-write cost parameters, wherein the write cost is used for indicating the time consumption of writing data into a disk, and the read cost is used for indicating the time consumption of reading data from the disk.
In this embodiment, the storage cost parameter is used to indicate benefits on the storage space by using a corresponding storage manner, for example: occupies memory space, compression ratio, etc. The read-write cost parameter is used for indicating benefits in read-write performance by using a corresponding storage mode, such as: write time, read time, etc.
Taking non-compressed storage, gzip compressed storage and Snappy compressed storage as examples, the storage cost parameter corresponding to the non-compressed storage is 1MB of storage space, and the read-write cost parameter is 10ms of time consumed for writing and 5ms of time consumed for reading. The storage cost parameter corresponding to the Gzip compression storage is 100KB after compression, and the read-write cost parameter is 4ms in writing time and 3ms in reading time. The storage cost parameter corresponding to the Snappy compressed storage is 200KB after compression, the read-write cost parameter is 5ms for writing and 2ms for reading.
In one exemplary embodiment, the write cost and the read cost of each cache block in each storage manner may be detected as the read-write cost parameter by, but not limited to: detecting the time consumption of single writing and the time consumption of single reading of each cache block in each storage mode; acquiring the writing times and the reading times of each cache block in a target time period before the current time; the product of the write once time and the number of times of writing is determined as the write cost, and the product of the read once time and the number of times of reading is determined as the read cost.
In this embodiment, the estimated values of the read-write cost of the cache blocks in different storage modes are calculated according to the statistics of the read-write times. And the cache module performs statistics of the read-write times, and obtains the read-write cost estimation values of different modes by considering the accumulation time of multiple reads and writes under different storage schemes. Illustrating: assume that a 1MB size cache block needs to be processed. The hardware conditions are as follows: IO write bandwidth 100MB/s and IO read bandwidth 200MB/s; the compression bandwidth of a certain compression algorithm is 500MB/s, and the decompression bandwidth is 100MB/s. The compression ratio obtained was 2:1. And in a certain time period, the buffer module monitors that the data is written for 2 times and read for 10 times, and meanwhile, the reading amplification factor is 2:1. The cost of non-compressed storage includes: writing 10ms, reading 2.5ms (the whole 1MB data takes 5ms, because the reading amplification factor is 2:1, it means that only half of the data is actually read, and under the non-compressed storage condition, only the required data can be read, so that the time is 2.5 ms). After the reading and writing accumulation, the cost estimation value is as follows: 20ms is written and 25ms is read. The cost of compressed storage includes: compression takes 2ms, 500KB after compression, IO writing takes 5ms, writing takes 7ms in total, file reading takes 2.5ms, decompression takes 5ms, and reading takes 7.5ms in total. After the reading and writing accumulation, the cost estimation value is as follows: 14ms is written and 75ms is read.
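Illustrating with a minimal sketch of how these estimates accumulate (the figures reproduce the example's assumptions: 2 writes, 10 reads, read amplification 2:1):

def accumulated_cost(write_once_ms, read_once_ms, writes, reads):
    # The cache module's read/write counters turn single-operation times into
    # estimates over the tracked period.
    return write_once_ms * writes, read_once_ms * reads

print(accumulated_cost(10, 2.5, 2, 10))   # uncompressed: (20, 25.0) ms
print(accumulated_cost(7, 7.5, 2, 10))    # compressed:   (14, 75.0) ms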
In one exemplary embodiment, the target storage means matching each cache block may be searched from a plurality of storage means according to the storage parameters by, but not limited to: obtaining a storage bias value and a read-write bias value of each cache block, wherein the storage bias value is used for indicating the requirement of a user for saving storage space, and the read-write bias value is used for indicating the requirement of the user for saving writing time and reading time; determining the total storage benefit of each storage mode according to the storage bias value, the read-write bias value and the storage parameter; and determining the storage mode with the highest total storage benefit among the plurality of storage modes as a target storage mode.
In this embodiment, the drop format of the cache block is determined according to the storage bias value of the file. The storage bias value of a file expresses how the user expects the data to be stored: whether to favour saving storage space or to favour read-write performance, and, within read-write, whether reads or writes matter more. The compressed size multiplied by the storage bias value is compared with the original data size to decide whether the compression benefit should be ignored, and the read-write cost estimates of the different modes multiplied by the read-write bias values decide which compression scheme to use.
Illustrating: from the user's point of view, saving space and improving read-write performance at the same time would be the best outcome, but when these cannot all be satisfied, trade-offs have to be made. For instance, the user has a small data volume or a sufficient budget, does not care about disk space occupation, and can accept write delay when data is modified, but is extremely sensitive to data read delay. The bias values set by the user are: a storage bias of 10, a write bias of 10 and a read bias of 6. For a 1MB data block, assume the cost estimates of the different storage modes are as follows:
non-compressed storage: the memory space is 1MB, the writing time is 10ms, and the reading time is 5ms.
Gzip compression storage: 100KB after compression, 4ms is consumed for writing, and 3ms is consumed for reading.
Snappy compressed storage: 200KB after compression, the writing takes 5ms and the reading takes 2ms.
After being multiplied by the storage bias value, the compressed sizes of the different compression modes come out larger than the original data, so the storage-space benefit is ignored in the cost estimation strategy. The read-write performance benefit is then judged by computing the total weighted cost of each storage mode; after combining the write and read costs, the cost value of Snappy is the smallest, so the cache block is stored with the Snappy compression algorithm.
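A sketch of the weighted read-write decision follows. The exact way the bias values combine with the time estimates is not spelled out here, so the weights below are illustrative assumptions chosen to reflect a read-sensitive user; with them, Snappy comes out cheapest, matching the conclusion above:

WRITE_WEIGHT, READ_WEIGHT = 6, 10   # assumed weights: read time counts more heavily

def weighted_cost(write_ms, read_ms):
    return write_ms * WRITE_WEIGHT + read_ms * READ_WEIGHT

costs = {
    "uncompressed": weighted_cost(10, 5),  # 110
    "gzip": weighted_cost(4, 3),           # 54
    "snappy": weighted_cost(5, 2),         # 50  <- smallest, so Snappy is chosen
}
print(min(costs, key=costs.get))           # snappy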
FIG. 17 is a schematic diagram of a timed write-back instruction triggered compressed storage process according to an embodiment of the present application, as shown in FIG. 17, which may include, but is not limited to, the following steps:
step 1: and transmitting the cache block to an accelerator card, and executing different compression algorithms in parallel. Such as: and transmitting the cache blocks to an FPGA memory, and executing available compression algorithms in parallel to obtain compression ratios of different compression algorithms.
Step 2: obtain the available storage space information from the encoding module, i.e. whether the current data block may be allocated to discrete storage space. Because data is modified continuously, a large amount of fragmented space accumulates in the encoding module; although the read-write performance of discrete storage space is lower, it saves storage space.
Step 3: calculate the single write time and read time of the different storage formats. Based on the obtained compression ratios, calculate the write and read time consumption of the uncompressed format, the compression + patch format, continuous-space storage and discrete-space storage. Illustrating: assume a 1MB cache block needs to be processed. The hardware conditions are: IO sequential write bandwidth of 100MB/s and IO sequential read bandwidth of 200MB/s; a certain compression algorithm has a compression bandwidth of 500MB/s and a decompression bandwidth of 100MB/s. Uncompressed storage cost of the file: IO writing takes 10ms and reading takes 5ms.
Assuming an obtained compression ratio of 5:1, then the storage cost is compressed: compression takes 2ms, 200KB after compression, IO writing takes 2ms, total writing takes 4ms, file reading takes 1ms, decompression takes 2ms, and total reading takes 3ms. In the scene, better benefits can be obtained by adopting compressed storage for reading and writing, so that benefits in two aspects of reading and writing can be obtained from compression in the cache block.
Assuming an obtained compression ratio of 2:1, then the storage cost is compressed: compression takes 2ms, 500KB after compression, IO writing takes 5ms, writing takes 7ms in total, file reading takes 2.5ms, decompression takes 5ms, and reading takes 7.5ms in total. In this scenario, the writing obtains performance benefits, but the reading performance is reduced, and further cost estimation is performed according to the reading and writing times of the file later.
Assume instead that a 1MB cache block has already been stored on disk with a compression ratio of 5:1, actually occupying 200KB of storage space, and the user then modifies 64KB of the data. In this case the cost of the compressed-storage-plus-patch-page scheme is calculated: the original compressed blocks are not modified, and only the time needed to drop the 64KB modified page is counted in the cost estimation of the subsequent steps (see the sketch after step 5).
Step 4: and calculating the cache block read-write cost estimation values of different storage modes according to the read-write times. And the cache module performs statistics of the read-write times, and obtains the read-write cost estimation values of different modes by considering the accumulation time of multiple reads and writes under different storage schemes.
Step 5: determine the drop format of the cache block according to the storage bias value of the file. The storage bias value of a file expresses how the user expects the data to be stored: whether to favour saving storage space or favour read-write performance, and, within read-write, whether reads or writes matter more; the final storage format is selected according to the bias values set by the user. The compressed size multiplied by the storage bias value is compared with the original data size to decide whether the compression benefit should be ignored, and the read-write cost estimates of the different modes multiplied by the read-write bias values decide which compression scheme to use.
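Illustrating the patch-page branch of step 3 with a minimal sketch (bandwidth figures are the example's assumptions; the helper name is illustrative):

IO_WRITE_MB_PER_MS = 0.1    # 100 MB/s sequential write (assumed above)

def patch_page_write_cost(modified_kb):
    # The original compressed blocks stay untouched; only the modified pages are dropped.
    return (modified_kb / 1024) / IO_WRITE_MB_PER_MS

print(patch_page_write_cost(64))   # ~0.625 ms to drop a 64KB patch page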
In one exemplary embodiment, an instruction to free up memory space may be processed, but is not limited to, by: under the condition that a third instruction is received, acquiring a target cache block to be dropped from a target memory space, wherein the third instruction is used for indicating to release the target memory space; compressing the target cache block and detecting the compression ratio of the target cache block; and landing the target cache block according to the compression ratio of the target cache block and releasing the memory space occupied by the target cache block.
In this embodiment, the third instruction is used to instruct to release the target memory space, and the third instruction may, but is not limited to, include the release space instruction described above, and this instruction is used to sort the memory space, and may be triggered by a user or automatically by a system.
For the space release instruction, quickly releasing memory space is the highest priority, and the focus is on freeing as much memory as possible. When the system memory space is insufficient and the remaining memory falls below the low water mark, a space release instruction may be triggered in the expectation of releasing more memory space. Because the transmission time and the calculation time of the hardware compression algorithm are generally much shorter than disk IO time, and the hardware compression card has a certain amount of memory space, the card can serve as a temporary backup cache.
In one exemplary embodiment, the target cache block may be compressed and the compression ratio of the target cache block detected by, but not limited to: compressing the target cache block by adopting a compression algorithm with the highest compression ratio to obtain a compressed cache block; the actual compression ratio of the compressed cache block is detected.
In this embodiment, the compression mode adopted by the target cache block of the memory space to be released may be, but not limited to, a compression algorithm with the highest compression ratio, so as to release as much memory space as possible.
In this embodiment, the compression process may be performed by, but is not limited to, an FPGA.
In one exemplary embodiment, the target cache block may be dropped and the memory space occupied by the target cache block released according to the compression ratio of the target cache block by, but not limited to, the following: replacing the target cache block in the target memory space with the compressed cache block according to the actual compression ratio, and releasing the memory space that the target cache block occupies in excess of the compressed cache block; transmitting the compressed cache block to the memory input/output queue for dropping according to drop priority; and releasing the memory space occupied by the compressed cache block once it has finished dropping.
In this embodiment, the target cache block in the target memory space is replaced by the compressed cache block according to the actual compression ratio, and the memory space occupied by the target cache block compared with the compressed cache block is released, so that part of the memory is released as soon as possible.
In one exemplary embodiment, the target cache block in the target memory space may be replaced with a compressed cache block according to the actual compression ratio by, but not limited to: comparing the actual compression ratio with a compression ratio threshold; under the condition that the actual compression ratio is greater than or equal to the compression ratio threshold, replacing the target cache block in the target memory space with the compression cache block; inserting the target cache block into a memory input/output queue under the condition that the actual compression ratio is smaller than a compression ratio threshold value, wherein the target cache block is dropped in the memory input/output queue preferentially; and under the condition that the target cache block finishes the disc landing, releasing the memory space occupied by the target cache block.
In this embodiment, a target cache block with a low compression ratio is dropped directly as an uncompressed cache block; that is, cache blocks with a low compression ratio are dropped as soon as possible so that their memory space is released as soon as possible.
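A minimal sketch of this decision (the ratio threshold and the data structures are illustrative assumptions; the text only distinguishes "high" from "low" compression ratios):

from collections import deque
from dataclasses import dataclass
from typing import Optional

RATIO_THRESHOLD = 1.5   # assumed threshold between "high" and "low" compression ratios

@dataclass
class CacheBlock:
    data: bytes
    compressed: Optional[bytes] = None

def stage_for_release(block: CacheBlock, compress, io_queue: deque):
    # compress() stands in for the accelerator's highest-compression-ratio operator.
    block.compressed = compress(block.data)
    ratio = len(block.data) / max(1, len(block.compressed))
    if ratio >= RATIO_THRESHOLD:
        io_queue.append(block.compressed)   # drop later by normal priority
        block.data = b""                    # the surplus host memory is freed right away
    else:
        block.compressed = None
        io_queue.appendleft(block.data)     # jump the queue: drop the raw block first

q = deque()
stage_for_release(CacheBlock(b"x" * 1024), lambda d: d[:256], q)   # 4:1 -> compressed copy queued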
Based on the above process, in the embodiment, for a landing scene of a space instruction, firstly, data to be dropped is put into an FPGA, and the highest compression ratio operator is executed; after the calculation is completed, for the cache block with high compression ratio, using a temporary replacement strategy to release part of host side cache; and for the cache block with low compression ratio, after the data is dropped as soon as possible by using the queue inserting mode, releasing the caches at the host side and the FPGA side. FIG. 18 is a schematic diagram of a release space instruction triggered compressed storage process, as shown in FIG. 18, according to an embodiment of the present application, which may include, but is not limited to, the steps of:
step 1: transfer the cache blocks to the DDR memory (Double Data Rate synchronous dynamic random access memory) of the FPGA (field programmable gate array) and enable the highest-compression-ratio operator at the same time. Starting from the cache block with the highest drop priority, cache blocks are put into the task queue of the FPGA's highest-compression-ratio operator until the FPGA has no more allocatable memory.
Step 2: and replacing the high compression ratio cache block at the host side, releasing redundant cache at the host side, and releasing the memory of the FPGA cache block. And for the cache block with high compression ratio, transmitting the compressed cache block in the FPGA to the Host side, replacing the original cache block by using the compressed cache block, releasing redundant Host side cache, and releasing the memory of the cache block corresponding to the FPGA.
Step 3: insert the compressed cache blocks into the IO storage queue by category, and release the cache blocks whose IO has completed. If the compression ratio of a cache block is low, the original cache block jumps the queue and is inserted at the head of the IO storage queue; if the compression ratio is high, the compressed cache block is placed at the tail of the IO storage queue. For cache blocks whose IO in the storage queue has completed, the corresponding host-side and FPGA-side caches are released.
Step 4: and repeatedly executing the step 1 until the residual memory space at the host side is higher than the low water level mark.
Fig. 19 is a schematic diagram of a space release process according to an embodiment of the present application. As shown in fig. 19, the memory occupied by host-side cache blocks 1 to 5 needs to be released. Cache blocks 1 to 5 are sent to the DDR memory of the FPGA (field programmable gate array) and the highest-compression-ratio operator is started; the compression ratios from high to low are cache block 1, cache block 5, cache block 2, cache block 4 and cache block 3, and the compression ratio of cache block 3 is almost 1:1. Cache block 3 is therefore dropped directly and its memory space released, while cache blocks 1, 5, 2 and 4 are replaced with their corresponding compressed blocks and the surplus memory space is released; after their drop is complete, all remaining memory space is released.
In one exemplary embodiment, an instruction to optimize storage space may be processed by, but is not limited to, the following: when a fourth instruction is received, obtaining storage data from a target storage space, where the fourth instruction is used to instruct optimization of the target storage space, and the storage data is file data which, when it was originally dropped, gave the drop priority of the data's storage space a lower priority than the other drop dimensions; and re-dropping the storage data with the drop priority of the data's storage space as the highest priority.
In this embodiment, the fourth instruction is used to instruct the optimization arrangement of the storage space, and the fourth instruction may, but is not limited to, include the above-mentioned optimization storage instruction, where the instruction is used to arrange the storage space, and may be triggered by a user or automatically by a system.
For the optimized storage instruction: in the forced drop instruction and the cache release instruction, in order to achieve the primary goals of dropping data and releasing cache as quickly as possible, the user's bias-value settings for file access and the cache module's read-write tracking of the data blocks are not taken into account. In this embodiment, when the computing unit and the storage unit are idle, the data access characteristics collected by the cache module during data access are used to further optimize the data storage. The optimized storage instruction has the lowest priority: its tasks are inserted at the tail of the task queues of the computing unit and the storage unit, and when the FPGA task queue is full, the tasks submitted by the optimized storage instruction are withdrawn first.
In one exemplary embodiment, the stored data may be, but is not limited to, obtained from the target storage space by: obtaining a third file which is dropped by the first instruction from the target storage space and/or obtaining a fourth file which is dropped by the third instruction from the target storage space, wherein the third instruction is an instruction for indicating to release the target memory space; and extracting storage data from the acquired file.
In this embodiment, the data for executing the optimized store instruction may include, but is not limited to, files that force a drop instruction to drop, files that release a space instruction to drop, and so on.
In one exemplary embodiment, storage data may be extracted from the acquired file by, but not limited to, the following: detecting the target proportion of scattered (hash) storage in the acquired file; determining all of the data of the acquired file as storage data when the target proportion is greater than or equal to a proportion threshold; and extracting, as storage data, the data whose storage performance allows improvement from the acquired file when the target proportion is smaller than the proportion threshold.
In this embodiment, files with a larger proportion of scattered storage are optimized and stored in their entirety, while for files with a smaller proportion of scattered storage only the data whose storage performance allows improvement is optimized.
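A one-function sketch of this decision (the threshold value is an illustrative assumption, not specified above):

SCATTER_THRESHOLD = 0.5   # assumed proportion threshold

def optimization_scope(scattered_blocks, total_blocks):
    # Large scattered-storage proportion -> rewrite the whole file; otherwise only
    # touch the blocks whose storage can still be improved.
    return "full" if scattered_blocks / total_blocks >= SCATTER_THRESHOLD else "partial"

print(optimization_scope(30, 40), optimization_scope(3, 40))   # full partial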
In one exemplary embodiment, stored data may be re-dropped with the drop priority of the drop storage space of the data as the highest priority by, but is not limited to: reading the storage data into a memory space of a system to obtain memory data; and executing a second instruction on the memory data, wherein the second instruction is a landing instruction with a landing priority of a landing storage space of the data higher than a landing priority of a landing delay time of the data.
In this embodiment, first, the storage data to be optimally stored is read into the memory space, and then a landing instruction with a landing priority higher than that of the landing delay time of the data in the landing storage space of the data is executed to perform optimization, for example, a timing write-back function of the timing write-back instruction is called.
For the optimized storage instruction, in the active mode, the method is started when the task queues of the computing unit and the storage unit are short: the uncompressed storage data and the scattered storage data on the disk are scanned, the most frequently accessed files are extracted according to the history access records in the cache module, and the cost and benefit of recompression and sequential storage are calculated.
In the passive mode, when file data is read into the cache module, for a file that is accessed more frequently, the other cache blocks of the file are also loaded into the cache module, the proportion of uncompressed data blocks and the proportion of discretely stored blocks in the file are scanned, and the cost and benefit of re-storing them are calculated.
In both modes, the processing is performed in logical file units.
Because of the spatial locality characteristic of data access, most files in the system are rarely accessed, and few files are frequently accessed in a period of time, and better user experience can be obtained by optimizing the files. Based on the characteristics, the embodiment provides a file-oriented optimized storage strategy, and can perform storage optimization on files frequently accessed by a user when the system is idle. FIG. 20 is a schematic diagram of an optimized store instruction triggered compressed store process, as shown in FIG. 20, according to an embodiment of the present application, which may include, but is not limited to, the steps of:
step 1: meta information of the file is loaded from the encoding module. In the active mode, when the computing unit and the storage unit are idle, the storage optimization processing is actively carried out on the file with the highest access frequency in the cache module. In the passive mode, when a file is loaded into the cache module, the access frequency of the file is judged, and storage optimization processing is carried out on the file.
Step 2: count the number of data blocks dropped by forced file drops. When a forced drop instruction is executed, little consideration is given to overall storage performance optimization; only fast dropping is considered. This part of the data therefore has significant optimization value; forced-drop data blocks are identified by checking the corresponding meta-information flag in the data block.
Step 3: count the number of data blocks of the file dropped by the cache release instruction. When cache is released, the highest-compression-ratio algorithm is uniformly used to free as much cache as possible, without considering read-write performance or storage-space optimization, so this part of the data can be further optimized.
Step 4: count the proportion of scattered (hash) storage blocks in the file. Scattered storage leads to random disk access and degrades IO read performance, so when a file has a large proportion of scattered storage, optimizing the file into continuous storage space is considered.
Step 5: decide, according to the statistics, whether the file is partially or fully optimized. When the proportion of scattered storage in the file is large, reading out and reorganizing all of the data better improves the file's storage-space benefit and access performance; when the proportion of scattered storage is small, performing only partial optimization is more reasonable.
Step 6: read the target data blocks into the cache module and call the timed write-back function. For partial optimization the data blocks can be read in batches; for full optimization all of the data is read into the cache module. The data is then re-dropped according to the drop rules of the timed write-back instruction.
In one exemplary embodiment, the sequence of dropping files to be dropped may be recorded by file priority as follows: recording a file to be dropped according to a file priority, wherein the file priority is used for indicating the emergency degree of file dropping, the file priority is determined according to the file type and the instruction type corresponding to the file, and the instruction type is the type of the instruction for indicating file dropping.
In this embodiment, the file priority may, but is not limited to, consider the disk-dropping requirement of the file itself and the disk-dropping requirement of the disk-dropping operation triggered by the above-mentioned various instructions. The file priority may be determined based on the file type and the instruction type to which the file corresponds.
In one exemplary embodiment, files to be dropped may be recorded by file priority, but not limited to, by: recording files to be dropped by adopting a red black tree structure, wherein the red black tree structure takes the file priority as an index, each red black tree node is a file linked list, each linked list node in the file linked list is an inode index of one file to be dropped under the corresponding file priority, and the inode index points to a cache of the corresponding file to be dropped.
In this embodiment, the file to be dropped may be recorded by using a structure of a red-black tree, which may be, but not limited to, indexed from high to low of the priority of the file, so that the file can be recorded and searched on the one hand, and the searching efficiency can be improved on the other hand.
In this embodiment, a recording method for files to be dropped is provided. Fig. 21 is a schematic diagram of a recording method for files to be dropped according to an embodiment of the present application. As shown in fig. 21, this container structure is used to manage the data blocks to be dropped: a red-black tree stores the data blocks of different priorities, and the weighted drop priority of the file is used as the index at insertion time. Each node of the red-black tree is a linked list, and each node in the linked list is the inode index of a file; since the inodes of different files differ, each node on the linked list represents one file. Each inode index points to the cache of that file, which is organized as a binary tree indexed by the file's block number. The cost estimation module may put a cache block directly into the storage IO queue, or assign it to the task queue of an available computing unit, for example an operator in an FPGA (field programmable gate array) or a core of a CPU (central processing unit), for compression calculation, after which the compressed data is submitted to the IO task queue. The IO tasks submitted under the various modes are merged and then written to the disk in batches, improving write performance.
The drop priority of a file ranges from 0 to 10. When a file is inserted into the red-black tree, the priority is weighted according to the scenario; for example, the weighting rule may be, but is not limited to: forced drop instruction plus 100000, free space instruction plus 10000, timed write-back instruction plus 1000, optimized storage instruction plus 100. When cache blocks are dropped, target cache blocks are first obtained from the right side of the red-black tree. Through this weighting the drop priorities are classified by scenario, while the processing order of the drop priorities of different files within the same scenario is preserved. Thanks to the self-balancing property of the red-black tree, the search time for high-priority cache blocks is reduced.
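The container can be sketched as follows (illustrative; an ordinary sorted mapping stands in for the red-black tree, and the scene weights are the example values from the weighting rule above):

import bisect
from collections import defaultdict

SCENE_WEIGHT = {"forced_drop": 100000, "free_space": 10000,
                "timed_write_back": 1000, "optimized_store": 100}

class DropContainer:
    """Files to drop, keyed by weighted priority; each key holds a list of inodes."""
    def __init__(self):
        self.keys = []                    # sorted weighted priorities
        self.files = defaultdict(list)    # weighted priority -> [inode, ...]

    def insert(self, inode: int, file_priority: int, scene: str):
        key = SCENE_WEIGHT[scene] + file_priority     # file_priority in 0-10
        if key not in self.files:
            bisect.insort(self.keys, key)
        self.files[key].append(inode)

    def pop_most_urgent(self):
        # Highest weighted priority first (the "right side" of the red-black tree).
        key = self.keys[-1]
        inode = self.files[key].pop(0)
        if not self.files[key]:
            del self.files[key]
            self.keys.pop()
        return key, inode

c = DropContainer()
c.insert(12, 7, "timed_write_back"); c.insert(34, 3, "forced_drop")
print(c.pop_most_urgent())   # (100003, 34): forced drops outrank timed write-backs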
For the encoding module in the file system: the encoding module performs encoding after the file has been divided into blocks and compressed, encoding the file cache blocks in the cache module into a continuous physical file. The user's files in the logical space are divided according to the specified block interval, and the block-compressed data is encoded into a continuous physical file by the encoding module. When a physical file is created, a 12KB file header is created first; it contains three 4KB areas for recording the meta-information of the physical file, namely the file management block, the free space management block and the index block. After the various access operations, the meta-information is dynamically updated and inserted. Fig. 22 is a schematic diagram of the meta-information of a file according to an embodiment of the present application; as shown in fig. 22, the information in the file management block, the free space management block and the index block of the meta-information adapts dynamically to the access operations on the file, for example, after a small modification, the data in the block interval with window number 1 may change to being dropped as a compressed block combined with a patch block.
In an exemplary embodiment, the physical file includes an index block, where the index block is used to record a mapping relationship between a physical space in which the physical file is located and a logical space in which the logical file is located.
In one exemplary embodiment, the index block includes: the window number field is used for indicating the number of the block interval of the logic file, and the physical block field is used for indicating the corresponding physical block of the block interval of the logic file in the physical file.
In one exemplary embodiment, the physical block field includes: the system comprises a physical block digital section, a compressed block digital field, a block number field and a compression algorithm field, wherein the physical block digital section is used for representing the number of physical blocks occupied by a corresponding block interval, the compressed block digital section is used for representing the number of compressed blocks in the physical blocks occupied by the corresponding block interval, the block number field is used for representing the number of the physical blocks occupied by the corresponding block interval, and the compression algorithm field is used for representing the compression algorithm adopted by the corresponding block interval.
In an exemplary embodiment, the physical file further includes a free space management block, wherein the free space management block is configured to record whether each physical block in the physical file is free.
In an exemplary embodiment, the header of the physical file further includes a file management block, wherein the file management block is used to record a location of each index block in the physical file and a location of the free space management block.
Also provided in this embodiment is a method for compression encoding a file, and fig. 23 is a flowchart of a method for compression encoding a file according to an embodiment of the present application, as shown in fig. 23, where the flowchart includes the following steps:
step S2302, detecting a target access operation of target data performed on a target logical file;
step S2304, decompressing the first file data corresponding to the target data in the first physical file according to the initial meta information, and executing the target access operation to obtain the second file data, wherein the initial meta information is used for recording the corresponding relationship between the block interval in the target logical file and the data physical block in the first physical file;
step S2306, in the case of compressing and landing the second file data, compressing and encoding the second file data by adopting a target compression mode corresponding to the second file data to obtain third file data and reference meta information of the third file data;
in step S2308, the third file data and the reference meta information are inserted into the first physical file to obtain a second physical file corresponding to the target logical file and carrying the target meta information, where the target meta information is used to record a corresponding relationship between the partition interval in the target logical file and the physical block of the data in the second physical file.
Through the above steps, when a target access operation on target data of the target logical file is detected, the first file data related to the target data in the first physical file corresponding to the target logical file is decompressed according to the initial meta-information and the target access operation is executed, which realizes random read-write access to the file. When the second file data obtained after the access operation needs to be compressed and dropped, it is compression-encoded with the corresponding target compression mode to form the third file data and the reference meta-information, and these are inserted into the first physical file, which realizes dynamic generation and insertion of the meta-information of the physical file. Therefore, the technical problem of poor file access flexibility can be solved, and the technical effect of improving file access flexibility is achieved.
Wherein the steps may be performed by, but not limited to, the encoding module and the like.
In this embodiment, the target access operation may be, but is not limited to being, a random read operation or a random write operation, or the like. The user can modify the file at any time through writing operation and call the file through reading operation.
In this embodiment, the target logical file is a file in the logical space, and the target data is part or all of the data in the target logical file that the user wishes to access. The first physical file is the physical file corresponding to the target logical file in the physical space before the target access operation is executed, and the initial meta-information is the meta-information of the first physical file, which records the relationship between the target logical file and the first physical file. The first file data is the part of the first physical file corresponding to the target data; the second file data is the file data after decompression and execution of the target access operation; the third file data is the file data after compression encoding; and the reference meta-information is the meta-information dynamically generated for the third file data. The second physical file is the physical file corresponding to the target logical file in the physical space after the target access operation is executed, and the target meta-information is the meta-information of the second physical file, which records the relationship between the target logical file and the second physical file.
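A high-level sketch of the flow of steps S2302-S2308 (all helper names are illustrative assumptions standing in for the encoding module's internals, not the claimed implementation):

def random_write(logical_file, window_no, offset, new_bytes,
                 lookup_index, read_blocks, decompress, apply_write,
                 choose_compression, compress, place_blocks, update_meta):
    # S2302/S2304: locate the window's physical blocks via the initial meta-information,
    # decompress them and apply the write to obtain the second file data.
    entry = lookup_index(logical_file, window_no)
    raw = decompress(read_blocks(entry), entry.algorithm)
    modified = apply_write(raw, offset, new_bytes)

    # S2306: re-encode with the compression mode chosen for the modified window.
    algorithm = choose_compression(modified)
    encoded = compress(modified, algorithm)

    # S2308: insert the encoded data plus its freshly generated meta-information
    # back into the physical file.
    new_blocks = place_blocks(logical_file, window_no, encoded, entry)
    update_meta(logical_file, window_no, new_blocks, algorithm)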
In an exemplary embodiment, the second file data may be compression encoded by, but not limited to, using a target compression method corresponding to the second file data, to obtain the third file data and the reference meta information of the third file data: compressing the second file data by adopting a target compression mode corresponding to the second file data to obtain reference compressed data; and encoding the reference compressed data according to the relation between the reference compressed parameter and the initial compressed parameter to obtain the third file data and the reference meta information of the third file data, wherein the reference compressed parameter is the compressed parameter of the reference compressed data, and the initial compressed parameter is the compressed parameter of the initial compressed data before the first file data is decompressed.
In this embodiment, the target compression method may include, but is not limited to, not compressing, a compression algorithm, compressing a combined patch, and the like.
In this embodiment, the reference compression parameter is a compression parameter of the reference compressed data, and the initial compression parameter is a compression parameter of the initial compressed data before decompression of the first file data. The file data is encoded according to the relation between the compression condition of the reference compressed data and the compression condition of the initial compressed data, the compression condition of the same part of data in the logic file before and after access may change due to the access, and the data is encoded and meta-information is dynamically generated according to the change.
In one exemplary embodiment, the reference compression data may be encoded according to a relationship between the reference compression parameter and the initial compression parameter, but is not limited to, to obtain the third file data and the reference meta information of the third file data by: acquiring a window number of a block section where each compressed data block is located in reference compressed data; distributing data physical blocks for the compressed data blocks in each block interval in a physical space of the first physical file according to the relation between the reference compression parameters and the initial compression parameters to obtain third file data, and acquiring the block number of each data physical block; and recording the window number and the block number with the corresponding relation to obtain reference index information corresponding to the third file data, wherein the reference element information comprises the reference index information.
In this embodiment, a physical block of data conforming to the current compression condition of the block section is allocated to each block section according to the relationship between the compression condition of the reference compression data and the compression condition of the initial compression data, and a window number and a block number having a corresponding relationship are recorded as reference index information in the reference meta information.
In an exemplary embodiment, the reference index information corresponding to the third file data may be obtained by, but not limited to, recording the window number and the block number having the correspondence relationship in the following manner: creating a window number field and a physical block information field which have a corresponding relation; generating physical block information corresponding to the window number of each block interval according to the physical block information field according to the attribute and the block number of the compressed data block in each block interval; and adding the window number and the physical block information with the corresponding relation into the window number field and the physical block information field with the corresponding relation to obtain reference index information.
In this embodiment, the window number field may be used to record a window number, and the physical block information field may be used to record various information corresponding to a physical block, such as: physical block location related information, compression related information, physical block type related information, and the like.
In one exemplary embodiment, the window number field and the physical block information field having a correspondence relationship may be created, but are not limited to, by one of the following:
in a first mode, a window number field, a physical block number field, a block type field and a block number field which have a corresponding relation are created, wherein the physical block number field is used for recording the number of physical blocks included in a corresponding window number, and the block type field is used for recording the type of the physical blocks included in the corresponding window number;
in a second mode, a window number field, a section number field, a block type field, a block start field and a block end field with a corresponding relation are created, wherein the section number field is used for recording the number of continuous physical block sections in a corresponding window number, the block type field is used for recording the type of physical blocks included in the corresponding window number, the block start field is used for recording the start block number in one physical block section in the corresponding window number, and the block end field is used for recording the end block number in one physical block section in the corresponding window number.
In one exemplary embodiment, the block type field includes a compressed block number field, wherein the compressed block number field is used to record the number of data blocks obtained by compression; and/or, under the condition that compression of the second file data allows adoption of multiple storage modes, the window number field and the physical block information field with the corresponding relation also comprise the window number field and the storage mode field with the corresponding relation, wherein the multiple storage modes at least comprise: and (3) non-compressed storage, wherein each compression algorithm in one or more compression algorithms is stored, and a storage mode field is used for recording the identification of the storage mode adopted by the corresponding window number.
In this embodiment, the index block is used to manage the mapping relationship between the window number after the file is partitioned and the physical file block, and the physical block attribute, and according to the size of the partition specified by the user, one index block manages the mapping between a segment of logical space and a segment of physical space, and when the index in the index block is insufficient, one physical block can be allocated from the idle block as the index block.
An index block, in the form shown in fig. 20, contains fixed-size index entries, each consisting of: window number (4B) + physical block number (4B) + compressed block number (4B) + compression algorithm (4B) + (window length / 4KB) block-number fields; lookup uses the window number. Each index entry contains the following fields:
Window number: after the logical file is divided into blocks, each block interval forms a compression window, and compression and decompression are performed on the logical file in units of compression windows. When a user reads or writes data in a certain window, the window number is used as the index to look up the corresponding physical blocks.
Physical block number: the number of storage blocks occupied by the data after block compression and the patch blocks together.
Number of compressed blocks: records the number of compressed blocks after the file is compressed; the number of patch blocks is then the difference between the number of physical blocks and the number of compressed blocks. Alternatively, with a small modification, the number of patch blocks could be recorded in the index block instead; since patch blocks are physically located after the compressed blocks, the number of compressed blocks is then the difference between the number of physical blocks and the number of patch blocks.
Block numbers 1-N: bit 0 of the block number is used to indicate the type of block: original blocks, compressed blocks, segment patch blocks, page patch blocks. Bits 1-4 of the block number are used for representing the logical page number corresponding to the page patch page; the remaining bits are used to represent the corresponding physical block number after compression within the window.
Compression algorithm: bits 0-1 represent the scene when the compression window drops: forced drop instructions, cache release instructions, timed write-back instructions, and optimized store instructions. The remaining bits represent the compression algorithm ID number used by the data within the window, and are represented by 0 when stored uncompressed.
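A minimal sketch of packing and unpacking the compression-algorithm field described above (bits 0-1 carry the drop scenario and the remaining bits carry the algorithm ID, 0 meaning uncompressed; the helper names are illustrative):

SCENES = ("forced_drop", "cache_release", "timed_write_back", "optimized_store")

def pack_compression_field(scene: str, algorithm_id: int) -> int:
    return (algorithm_id << 2) | SCENES.index(scene)

def unpack_compression_field(field: int):
    return SCENES[field & 0b11], field >> 2

assert unpack_compression_field(pack_compression_field("timed_write_back", 3)) == ("timed_write_back", 3)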
In an exemplary embodiment, in the case that the number of consecutive physical block intervals in the corresponding window number is 1, the block start field is used to record a start block number in one physical block interval in the corresponding window number, and the block end field is used to record an end block number in one physical block interval in the corresponding window number; when the number of consecutive physical block intervals in the corresponding window number is greater than 1, the block start field is used for pointing to interval index information of the corresponding window number, the interval index information is used for recording each physical block interval in the plurality of physical block intervals, and the block end field is used for recording an end block number in the last physical block interval in the corresponding window number.
In this embodiment, the index block adopts an equal-amount metadata allocation method for logical blocks and physical blocks, where one logical block corresponds to one physical block and requires one 8B metadata entry. This can be extended with interval blocks: expressing blocks by the start and end of an interval reduces the metadata required. Free space is preferentially allocated as a single contiguous interval, and when no contiguous space is available, multiple interval blocks are allocated, each internally contiguous. Fig. 24 is a schematic diagram of meta information of another file according to an embodiment of the present application. As shown in fig. 24, the interval count indicates how many contiguous physical block intervals exist after the block interval is compressed. When there is only one interval (the vast majority of cases, because contiguous space is allocated whenever possible), the block start and the block end in the index block record the start and end of that interval directly. When there is more than one interval, the block start points to another index block containing a plurality of intervals of indefinite length, and the block end represents the last data block of the last interval.
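Purely as an illustration of the interval form described above, a hypothetical C layout could look as follows; interval_index_t, interval_t and the field names are assumptions, not identifiers used by this application.

#include <stdint.h>

/* Interval form of an index entry (hypothetical layout). */
typedef struct {
    uint32_t interval_cnt;  /* number of contiguous physical block intervals          */
    uint32_t block_start;   /* interval_cnt == 1: first block of the single interval;
                               interval_cnt  > 1: points to an interval index block   */
    uint32_t block_end;     /* last block of the last interval                        */
} interval_index_t;

/* One entry of the interval index block referenced when interval_cnt > 1. */
typedef struct {
    uint32_t start;         /* first physical block of this interval */
    uint32_t end;           /* last physical block of this interval  */
} interval_t;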
In one exemplary embodiment, after the block number of each physical block of data is acquired, the free space management information may also be generated as reference meta information, but is not limited to, by: and recording the block numbers with the corresponding relation and the idle marks to obtain idle space management information corresponding to the third file data, wherein the reference element information further comprises the idle space management information, and the idle marks are used for indicating whether the corresponding physical blocks are idle or not.
In the present embodiment, the idle condition of the physical blocks may be managed via the idle flags, but not limited to, with the idle space management information serving as the reference meta information. The idle flag may be 0 or 1, where 0 indicates idle and 1 indicates occupied.
In one exemplary embodiment, the free space management information corresponding to the third file data may be obtained by, but not limited to, recording the block number and the free identifier having the correspondence relationship in the following manner: under the condition that data physical blocks are redistributed for compressed data blocks in a block interval, recording a block number and a first idle mark of an original data physical block with a corresponding relation, and a block number and a second idle mark of the redistributed data physical block with the corresponding relation, wherein the original data physical block is the data physical block occupied by the block interval in a first physical file, the first idle mark is used for indicating that the corresponding physical block is idle, and the second idle mark is used for indicating that the corresponding physical block is not idle.
In this embodiment, if a new physical block of data is reassigned to the third file data, the physical block it would otherwise occupy may be released and marked as free.
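A minimal sketch of recording these flags is shown below; it assumes a per-block bitmap with 0 meaning idle and 1 meaning occupied, and the function names are illustrative only.

#include <stddef.h>
#include <stdint.h>

enum { BLK_IDLE = 0, BLK_USED = 1 };

/* Set one block's flag in a bitmap (0 = idle, 1 = occupied). */
static void set_flag(uint8_t *bitmap, uint32_t blk, int used)
{
    if (used) bitmap[blk / 8] |= (uint8_t)(1u << (blk % 8));
    else      bitmap[blk / 8] &= (uint8_t)~(1u << (blk % 8));
}

/* After reallocating data physical blocks for a block interval, the original
 * blocks are marked idle (first flag) and the newly allocated blocks are
 * marked occupied (second flag), as described above.                        */
static void record_realloc(uint8_t *bitmap,
                           const uint32_t *old_blks, size_t n_old,
                           const uint32_t *new_blks, size_t n_new)
{
    for (size_t i = 0; i < n_old; i++) set_flag(bitmap, old_blks[i], BLK_IDLE);
    for (size_t i = 0; i < n_new; i++) set_flag(bitmap, new_blks[i], BLK_USED);
}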
In one exemplary embodiment, the third file data may be obtained, but is not limited to being obtained, by allocating data physical blocks for the compressed data blocks in each block interval in the physical space of the first physical file according to the relationship between the reference compression parameter and the initial compression parameter in the following manner: comparing the first storage space with the second storage space, wherein the reference compression parameter comprises the first storage space, the initial compression parameter comprises the second storage space, the first storage space is the storage space required by the compressed data blocks in the reference block interval for which data physical blocks are currently being allocated in the reference compressed data, and the second storage space is the storage space occupied by the reference block interval in the first physical file; in the case that the first storage space is larger than the second storage space, clearing the data of the reference block interval in the first physical file, acquiring idle blocks conforming to the first storage space from the first physical file, and adding the data of the compressed data blocks of the reference block interval into the acquired idle blocks; in the case that the first storage space is smaller than or equal to the second storage space, clearing the data of the reference block interval in the first physical file, and adding the data of the compressed data blocks of the reference block interval into the data physical blocks whose data has been cleared.
In this embodiment, if the physical space originally occupied by the file data is insufficient to store the accessed file data, the free blocks may be reallocated for storing the accessed file data.
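The space comparison above can be sketched with a toy in-memory model as follows; the 4KB block size, the occupancy array and all function names are assumptions made only for illustration.

#include <stdint.h>
#include <string.h>

#define N_BLOCKS 1024
static uint8_t used[N_BLOCKS];          /* 1 = occupied, 0 = idle  */
static char    store[N_BLOCKS][4096];   /* toy 4KB physical blocks */

/* First-fit search for `need` consecutive idle blocks; -1 if none. */
static int find_idle_run(uint32_t need)
{
    for (uint32_t i = 0; i + need <= N_BLOCKS; i++) {
        uint32_t j = 0;
        while (j < need && !used[i + j]) j++;
        if (j == need) return (int)i;
        i += j;                          /* skip past the occupied block */
    }
    return -1;
}

static void release_run(uint32_t first, uint32_t count)
{
    for (uint32_t i = 0; i < count; i++) used[first + i] = 0;
}

/* Place a recompressed block interval: if the required space (first storage
 * space) still fits into the originally occupied space (second storage
 * space), clear and rewrite in place; otherwise release the old blocks and
 * move to an idle run of the required size.  Returns the new first block.  */
static int place_interval(const char *data, uint32_t need,
                          uint32_t old_first, uint32_t old_count)
{
    int dst;
    release_run(old_first, old_count);               /* clear the old data */
    if (need <= old_count) {
        dst = (int)old_first;                        /* reuse in place     */
    } else {
        dst = find_idle_run(need);                   /* acquire a new run  */
        if (dst < 0) return -1;
    }
    for (uint32_t i = 0; i < need; i++) {
        memcpy(store[dst + i], data + (size_t)i * 4096, 4096);
        used[dst + i] = 1;
    }
    return dst;
}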
In one exemplary embodiment, the idle blocks conforming to the first storage space may be obtained from the first physical file by, but not limited to, the following manner: searching the first physical file for a first continuous idle block conforming to the first storage space; in the case that the first continuous idle block is found, determining the first continuous idle block as the acquired idle block; in the case that the first continuous idle block is not found, applying for a second continuous idle block conforming to the first storage space at the end of the first physical file, and determining the second continuous idle block as the acquired idle block.
In this embodiment, a continuous storage space is preferentially allocated to the physical file, so that the read-write efficiency of data is improved.
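A sketch of this two-step acquisition (search inside the existing physical file first, otherwise extend the file at its tail) is given below; the structure and helper names are assumptions for illustration.

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    uint32_t total_blocks;   /* current size of the physical file, in blocks */
    uint8_t  *used;          /* per-block occupancy, 1 = occupied            */
} phys_file_t;

/* Scan the existing file for `need` consecutive idle blocks; -1 if none. */
static int scan_for_run(const phys_file_t *f, uint32_t need)
{
    for (uint32_t i = 0; i + need <= f->total_blocks; i++) {
        uint32_t j = 0;
        while (j < need && !f->used[i + j]) j++;
        if (j == need) return (int)i;
        i += j;
    }
    return -1;
}

/* Grow the file (here: just the occupancy map) by `extra` blocks. */
static int grow_file(phys_file_t *f, uint32_t extra)
{
    uint8_t *p = realloc(f->used, f->total_blocks + extra);
    if (!p) return -1;
    memset(p + f->total_blocks, 0, extra);
    f->used = p;
    f->total_blocks += extra;
    return 0;
}

/* First continuous idle run inside the file if it exists, otherwise a
 * second continuous run appended at the tail of the file.               */
static int alloc_contiguous(phys_file_t *f, uint32_t need)
{
    int first = scan_for_run(f, need);
    if (first >= 0) return first;
    uint32_t tail = f->total_blocks;
    if (grow_file(f, need) != 0) return -1;
    return (int)tail;
}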
In one exemplary embodiment, a first contiguous free block conforming to a first memory space may be found from a first physical file by, but is not limited to: determining an operation heat of the reference block section, wherein the operation heat is used for indicating the frequency of the reference block section operated; and searching a first continuous idle block conforming to the first storage space from the first physical file under the condition that the operation heat is greater than a heat threshold.
In this embodiment, continuous storage positions are allocated according to the operation heat of the partition section, if the operation heat is greater than the heat threshold, the frequency of the reference partition section being operated is higher, and continuous space is preferentially allocated for storage, so as to ensure that frequently accessed data can be accessed smoothly and efficiently.
In one exemplary embodiment, after determining the operation heat of the reference block interval, the manner of storing the data may be optimized, but is not limited to, by: determining a storage difference between the first storage space and the second storage space when the operation heat is less than or equal to the heat threshold; searching a target idle block conforming to the storage difference from the first physical file; and determining the data physical blocks occupied by the reference block interval in the first physical file together with the target idle block as the acquired idle blocks.
In this embodiment, storage locations are allocated according to the operation heat of the block interval. If the operation heat is less than or equal to the heat threshold, the reference block interval is operated on infrequently, so the storage space it originally occupied is reused preferentially, and only the additional difference is taken from other storage space.
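The heat-dependent choice described in the last two paragraphs can be summarized by the following sketch; the threshold, extent_t and the output convention are illustrative assumptions.

#include <stdbool.h>
#include <stdint.h>

typedef struct { uint32_t first, count; } extent_t;

/* Hot intervals (operation heat above the threshold) are relocated to one
 * fully contiguous run of `need` blocks; cold intervals keep their original
 * extent and only the shortfall is taken from other idle space.            */
static void choose_placement(uint32_t heat, uint32_t heat_threshold,
                             uint32_t need, extent_t original,
                             uint32_t *blocks_to_acquire, bool *relocate)
{
    if (heat > heat_threshold) {
        *relocate = true;                 /* search a continuous idle run */
        *blocks_to_acquire = need;
    } else {
        *relocate = false;                /* reuse the original blocks    */
        *blocks_to_acquire = need > original.count ? need - original.count : 0;
    }
}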
In an exemplary embodiment, the third file data and the reference meta information may be, but are not limited to, inserted into the first physical file to obtain a second physical file corresponding to the target logical file and carrying the target meta information in the following manner: and inserting the third file data into the physical space where the first physical file is located according to the reference meta information, and updating the initial meta information by using the reference meta information to obtain a second physical file carrying the target meta information.
In this embodiment, the file data is dynamically inserted into the physical space where the physical file is located, and the original initial meta information is updated by using the dynamically generated reference meta information, so as to obtain the second physical file carrying the target meta information.
In one exemplary embodiment, the initial meta information may be updated using the reference meta information, but is not limited to, by: and updating initial index information included in an initial index block in the first physical file by using reference index information included in the reference meta information to obtain a target index block, wherein the initial meta information comprises the initial index information, and the reference index information is used for recording a window number of a block interval included in third file data with a corresponding relationship and a block number of a data physical block occupied by the third file data in a physical space.
In the present embodiment, the update of the meta information includes an update of the index information in the index block.
In one exemplary embodiment, the initial index information included in the initial index block in the first physical file may be updated using the reference index information included in the reference meta information, but is not limited to, by: searching a target window number recorded in the reference index information in the initial index block; the block number below the target window number in the initial index block is replaced with the block number recorded in the reference index information.
In this embodiment, the window number is used as an index to search, and the information under the searched window number is replaced.
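A minimal sketch of this lookup-and-replace update is given below; the entry layout mirrors the earlier index-entry sketch and all identifiers are illustrative.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGES_PER_WINDOW 4

typedef struct {
    uint32_t window_no;
    uint32_t block_no[PAGES_PER_WINDOW];
} idx_entry_t;

/* Find the entry whose window number matches the reference index information
 * and overwrite the block numbers recorded under that window.               */
static bool update_index(idx_entry_t *entries, size_t n_entries,
                         uint32_t target_window, const uint32_t *new_blocks)
{
    for (size_t i = 0; i < n_entries; i++) {
        if (entries[i].window_no == target_window) {
            memcpy(entries[i].block_no, new_blocks, sizeof(entries[i].block_no));
            return true;
        }
    }
    return false;   /* target window number not present in this index block */
}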
In one exemplary embodiment, after updating the initial index information included in the initial index block in the first physical file using the reference index information included in the reference meta information, the free space management information in the meta information may be dynamically updated, but is not limited to, by: and updating initial idle space management information included in the idle space management block in the first physical file by using the idle space management information included in the reference meta information to obtain a target idle space management block, wherein the initial meta information comprises the initial idle space management information, the idle space management information is used for recording a block number and an idle identifier of a data physical block occupied by third file data with a corresponding relationship in the physical space, and the idle identifier is used for indicating whether the corresponding physical block is idle or not.
In the present embodiment, the updating of the meta information further includes updating of the free space management information. Fig. 25 is a schematic diagram of a free space management block in the meta information of a file according to an embodiment of the present application. The free space management block records the free space management information and, as shown in fig. 25, is used for managing free space in the physical file; when the file is opened, all the free space blocks are loaded into memory through a linked list and used as metadata of the file. The form of the free space management block is as follows: 1 bit in the free space block indicates whether one 4KB physical block is used. Free space blocks other than the first one are generated dynamically when the free space in the physical file is insufficient, and are inserted at the tail of the physical file by increasing the size of the physical file. The tail of a free space block uses 16 bytes to store the offset position and the length of the next free space block; when a large amount of data is written into the file at one time, a plurality of free space blocks may be generated consecutively, and the tail 16 bytes between adjacent free space blocks are empty. By reading the free space blocks in the physical file, a free block linked list can be formed for managing the free space.
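As an illustration of loading the free block linked list when a file is opened, the sketch below assumes fixed 4KB free space management blocks whose last 16 bytes hold the offset and length of the next block (0 meaning end of chain); the structure and function names are assumptions.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MGMT_BLOCK_SIZE 4096

typedef struct free_blk {
    uint64_t offset;                        /* position of this block in the file */
    uint64_t length;                        /* its length in bytes                */
    uint8_t  bitmap[MGMT_BLOCK_SIZE - 16];  /* 1 bit per 4KB physical block       */
    struct free_blk *next;
} free_blk_t;

/* Walk the chain of free space management blocks and build an in-memory list. */
static free_blk_t *load_free_chain(FILE *fp, uint64_t first_off, uint64_t first_len)
{
    free_blk_t *head = NULL, **tail = &head;
    uint64_t off = first_off, len = first_len;

    while (off != 0) {
        free_blk_t *b = calloc(1, sizeof(*b));
        if (!b) break;
        b->offset = off;
        b->length = len;
        if (fseek(fp, (long)off, SEEK_SET) != 0 ||
            fread(b->bitmap, 1, sizeof(b->bitmap), fp) != sizeof(b->bitmap)) {
            free(b);
            break;
        }
        /* Last 16 bytes: offset and length of the next free space block. */
        uint64_t nxt[2] = {0, 0};
        if (fread(nxt, sizeof(uint64_t), 2, fp) != 2) { nxt[0] = 0; }
        *tail = b;
        tail = &b->next;
        off = nxt[0];
        len = nxt[1];
    }
    return head;
}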
In one exemplary embodiment, after updating the initial free space management information included in the free space management block in the first physical file using the free space management information included in the reference meta information, file management information in the meta information may be updated, but is not limited to, by: updating a file management block in the first physical file according to the target index block and the target free space management block to obtain a target file management block, wherein the initial meta information comprises the file management block, the file management block is used for recording the file information of the first physical file, the position of the initial index block and the position of the free space management block, the target file management block is used for recording the file information of the second physical file, and the position of the target index block and the position of the target free space management block.
In the present embodiment, the updating of the meta information further includes updating of the file management information. Fig. 26 is a schematic diagram of a file management block in the meta information of a file according to an embodiment of the present application. As shown in fig. 26, the encoding rule of the file management block in a physical file is as follows: the file management block is used for managing the overall information of the file blocks and the positions of the index blocks, and comprises the following fields. Magic number: used as the management block flag. Version number: used for version management. Free space block address: marks the head address of the free space block in the physical space. Index block address: used for marking the addresses of the index blocks in the physical file; the management mode is similar to ext3, using 3-level indexes, where a level-1 index points directly to an index block and is used for looking up the mapping relationship between the logical space and the physical space, a level-2 index block points to a plurality of level-1 index blocks, and a level-3 index block points to a plurality of level-2 index blocks.
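A rough C layout of the file management block fields listed above is shown below; the magic value, the number of level-1 index slots and the field names are assumptions for illustration.

#include <stdint.h>

#define MGMT_MAGIC 0x46435342u   /* illustrative magic value for the management block */

typedef struct {
    uint32_t magic;              /* management block flag                               */
    uint32_t version;            /* version management                                  */
    uint64_t free_space_addr;    /* head address of the free space block                */
    uint64_t index_addr[12];     /* level-1 indexes: point directly at index blocks     */
    uint64_t index2_addr;        /* level-2 index block: points at level-1 index blocks */
    uint64_t index3_addr;        /* level-3 index block: points at level-2 index blocks */
} file_mgmt_block_t;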
In an exemplary embodiment, the second file data may be obtained by, but not limited to, decompressing the first file data corresponding to the target data in the first physical file according to the initial meta information and performing the target access operation in the following manner: extracting first file data corresponding to the target data from the first physical file according to the initial meta information; decompressing the first file data to obtain decompressed data; and executing target access operation on the decompressed data to obtain second file data.
In one exemplary embodiment, the first file data corresponding to the target data may be extracted from the first physical file according to the initial meta information by, but not limited to: determining a target block section in which target data are located in the block sections of the target logic file; searching a target data physical block corresponding to the target block interval from the initial meta information; the target data physical block is extracted from the first physical file as first file data.
In this embodiment, the corresponding data is extracted according to the block intervals: first, the target block interval in which the target data is located among the block intervals of the target logical file is determined, and the target data physical block corresponding to the target block interval is looked up from the initial meta information, for example by the corresponding physical block number; the target data physical block is then extracted from the first physical file as the first file data.
In order to support random access to files, the methods of using the index block and the free space management block are as follows:
During a read operation, the cache module splits the read-write request according to the block intervals, obtains the index block from the encoding module, reads the indices in the index block to obtain the physical blocks corresponding to each block interval, and performs decompression and patch application after the physical blocks are read.
For example, fig. 27 is a schematic diagram of a file reading operation procedure according to an embodiment of the present application. As shown in fig. 27, a user randomly reads data in the range of 12KB to 20KB of the file. The cache module finds from the encoding module that 16KB is used as the block interval and splits the request into 2 read requests, 0KB-16KB and 16KB-32KB, whose corresponding compression window numbers are 0 and 1. The index block is read to obtain the physical blocks corresponding to compression windows 0 and 1. For 0-16KB, physical blocks 3 and 4 are read, physical block 3 is decompressed and then patched with physical block 4. For 16KB-32KB, physical blocks 5 and 6 are read and decompressed. The cache module thus obtains 0-32KB of cached data through the encoding module, and the VFS interface module extracts the data between 12KB and 20KB from the 0-32KB cache.
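The window-aligned splitting in this example can be sketched as follows, assuming a 16KB compression window; the callback that reads, decompresses and patches one window is only a placeholder for the cache and encoding modules described above.

#include <stdint.h>

#define WINDOW_SIZE (16u * 1024u)

/* Assumed per-window handler: reads the window's physical blocks via the
 * index block, decompresses and applies patches, producing WINDOW_SIZE bytes. */
typedef void (*load_window_fn)(uint32_t window_no, uint8_t *out);

/* Serve a random read [offset, offset+length) by loading whole windows and
 * copying only the requested bytes (e.g. 12KB-20KB touches windows 0 and 1). */
static void read_range(uint64_t offset, uint64_t length,
                       load_window_fn load_window, uint8_t *dst)
{
    uint32_t first = (uint32_t)(offset / WINDOW_SIZE);
    uint32_t last  = (uint32_t)((offset + length - 1) / WINDOW_SIZE);
    uint8_t  win[WINDOW_SIZE];

    for (uint32_t w = first; w <= last; w++) {
        load_window(w, win);                         /* decompress + patch window w */
        uint64_t win_start = (uint64_t)w * WINDOW_SIZE;
        uint64_t lo = offset > win_start ? offset : win_start;
        uint64_t hi = offset + length < win_start + WINDOW_SIZE
                        ? offset + length : win_start + WINDOW_SIZE;
        for (uint64_t i = lo; i < hi; i++)           /* copy requested bytes only */
            dst[i - offset] = win[i - win_start];
    }
}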
During a write operation, the cache module splits the read-write request according to the block intervals defined by the encoding module. Data is first written into the cache module; when the encoding module encodes the modified data in the cache module, if only a small amount of data is modified, the cost estimation module classifies the modification as a patch, and a patch page is used to mark the modification position; if a larger amount of data is modified, the block interval is recompressed and an idle block is sought for writing, and if the original space is insufficient because the compression ratio has changed after recompression, the original space is released and a new idle space block is sought for storage; if the modified interval is stored uncompressed, only the data blocks at the corresponding positions in the physical space are modified, and no compression calculation is performed.
For example, fig. 28 is a schematic diagram illustrating a file writing operation procedure according to an embodiment of the present application. As shown in fig. 28, a user writes data between 8KB and 9KB; since the modified content is small, the cost estimation module finds that recompressing the original interval would cost more, so the modified content is recorded in a patch page. The user writes data between 20KB and 24KB; since much of the content is modified, patching the original interval is no longer worthwhile, so the interval is recompressed. The compression ratio changes from the original 2:1 to 4:3, so the data can no longer be stored at the original position; the original position is therefore released to the idle blocks, and 3 continuous idle blocks are found elsewhere to store the modified compressed data. The field information in the index is modified at the same time to mark the physical blocks corresponding to the window interval. It is noteworthy that releasing the 2 original blocks and applying for 3 new blocks, instead of overwriting the original positions and applying for only 1 new idle block, is done to allocate continuous storage space as much as possible, because on ordinary storage media discontinuous reads are slower than continuous reads. The user writes data between 40KB and 44KB; since this interval uses an uncompressed storage mode (as determined by the cost estimation module), the modified data is written directly according to the block size and no compression calculation is performed.
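The per-interval write policy sketched by these examples can be summarized as below; the single dirty-byte threshold stands in for the cost estimation module's decision, and the enum and field names are assumptions.

#include <stdint.h>

typedef enum { STORE_COMPRESSED, STORE_RAW } store_mode_t;
typedef enum { ACTION_PATCH, ACTION_RECOMPRESS, ACTION_WRITE_RAW } write_action_t;

typedef struct {
    store_mode_t mode;           /* how the block interval is currently stored */
    uint32_t     dirty_bytes;    /* bytes modified in this interval's cache    */
    uint32_t     interval_len;   /* block interval length, e.g. 16KB           */
} interval_state_t;

/* Uncompressed intervals are overwritten in place; small modifications of a
 * compressed interval become a patch page; larger modifications trigger
 * recompression (and possibly reallocation of the physical blocks).          */
static write_action_t choose_write_action(const interval_state_t *st,
                                           uint32_t patch_threshold)
{
    if (st->mode == STORE_RAW)              return ACTION_WRITE_RAW;
    if (st->dirty_bytes <= patch_threshold) return ACTION_PATCH;
    return ACTION_RECOMPRESS;
}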
In the file block compression coding scheme provided by this embodiment, free space management blocks are dynamically inserted for allocating physical storage space; index blocks are dynamically inserted for managing the mapping relationship before and after block compression; and patch blocks are introduced for reducing IO and compression calculation delay in scenarios with a small amount of modification.
In the encoding module, variable-length data blocks and free space are managed through dynamic insertion of meta information so as to support random read-write access after file block compression. By inserting a free space management block into the file, a certain volume of free space is managed; when the free space in the file is insufficient, an index is inserted at the tail of the free space block to point to a newly created free space block, which manages the storage space of the newly added part of the file. By inserting an index block into the file, the mapping relationship between the logical space and the physical space of the file within a certain range is managed; the logical space of the file is of fixed length, while the physical space is of variable length. The index entries in the index block also manage the marking of the physical space, including: whether compressed storage is used, the compression algorithm adopted, and the numbers of compressed blocks and patch blocks. By inserting a management block into the file, the file as a whole is managed: index block location, free block location, logical and physical file sizes, compression algorithm flags, and read-write access statistics. Patch blocks act on compressed blocks with a small number of modifications, reducing IO consumption and repeated compression calculation. Mixed storage with different compression algorithms is supported, and file blocks may also be stored uncompressed; file blocks with severe read amplification or a low compression ratio are stored uncompressed to improve read performance. The allocation of free blocks considers continuous allocation under burst writes to reduce the performance delay caused by discontinuous reads of free blocks. Continuous or discontinuous storage space is selected according to the sequential and random access performance of different storage media. Defragmentation is triggered according to a free space fragmentation threshold, so that physical file blocks are stored continuously. The access heat of file blocks is recorded, and file blocks with higher access heat are preferentially stored physically contiguously, improving read performance. To improve sequential access performance, all written data may be uniformly appended to the file, with the original position marked as invalid (the free space block then needs 2 bits per block to represent the state), and defragmentation is performed again through block recycling when the system is idle.
The write-back module in the file system is used for performing write-back operations on the encoded data and writing it into the underlying file system. In terms of write-back timing, it is divided into forced write-back and periodic write-back. A forced synchronization call (flush) issued by the user indicates that the user needs the data flushed to disk immediately; at this point the latency experienced by the user is emphasized rather than overall throughput. Therefore, the cache module, the compression module and the encoding module immediately process the file data that is still cached, and after writing the data into the underlying file system the write-back module blocks and waits for the underlying file system to complete. When the user issues no forced synchronization call, overall throughput is the focus: when the amount of data to be written back in the cache is large or the interval since the last write-back is long, a batch write-back operation is performed, improving overall throughput.
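A compact sketch of this latency-versus-throughput choice follows; the stub functions and thresholds are placeholders, not the write-back module's actual interface.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static void flush_now(void)       { puts("forced write-back: flush and wait for completion"); }
static void batch_writeback(void) { puts("periodic write-back: flush the batch"); }

/* A user-issued flush favours latency: process cached data at once and wait
 * for the underlying file system.  Otherwise favour throughput: write back
 * in batches when enough data has accumulated or enough time has passed.    */
static void maybe_writeback(bool user_flush,
                            uint64_t dirty_bytes, uint64_t dirty_threshold,
                            uint64_t elapsed_ms, uint64_t interval_ms)
{
    if (user_flush) {
        flush_now();
        return;
    }
    if (dirty_bytes >= dirty_threshold || elapsed_ms >= interval_ms)
        batch_writeback();
}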
Transparent compression and decompression are realized for the file system in user mode; it is compatible with the existing file system, and compressed storage is achieved without reformatting the hard disk or re-copying the data. The operating system kernel does not need to be modified, so no system downtime risk is introduced, and user-mode applications do not need to modify code for adaptation or compatibility. By providing a method for dynamically inserting metadata, index management and free space management of the compressed storage blocks are carried out, and random access to the compressed file is supported. Users can set file system parameters through the VFS interface to realize customized compression and read-write performance optimization, such as a configurable compression block size, compression ratio threshold and compression algorithm selection. Through the user-mode cache management strategy, commonly used pages are cached and the read-write locality characteristics of pages are tracked; pages with poor locality are stored uncompressed, reducing the read amplification effect caused by random access and reducing IO consumption. A patch page mode is adopted for compressed pages, reducing repeated compression calculation and IO operations. Through a delayed compression strategy, the frequent compression and writing caused by random writes are reduced. Compression calculation in different scenarios is handled separately through the cost estimation strategy: data blocks with small compression gain are stored uncompressed, and a management strategy for uncompressed data is added, improving random read-write performance.
In this embodiment, the file system described above may be, but is not limited to, applicable to the following scenarios:
The file system installed by the user does not support transparent compression, but the user needs to save storage space and improve read-write performance, while random access to files also needs to be supported.
Users are reluctant to switch to kernel file systems that support transparent compression, for reasons that may include: the storage medium or usage scenario is not suitable, the time cost of re-copying data, the loss of original file system characteristics, and the unknown risks of migrating to a new file system.
Users are reluctant to pay additional cost to purchase expensive computing storage hardware, and it is more desirable to implement compressed storage of data at a lower cost.
User data may not all achieve the desired compression ratio, and it is desirable to be able to control the compression of files in a more flexible manner, including: the method comprises the steps of performing compression storage on files in part of folders, setting compression window sizes for different files to achieve optimal access efficiency, designating different compression algorithms for different files, supporting external compression and decompression hardware accelerator cards and the like.
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method described in the embodiments of the present application.
In this embodiment, the term "module" may be a combination of software and/or hardware that implements the predetermined function. While the means described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
It should be noted that each of the above modules may be implemented by software or hardware, and for the latter, it may be implemented by, but not limited to: the modules are all located in the same processor; alternatively, the above modules may be located in different processors in any combination.
The embodiment of the application also provides an operating system which comprises an application layer and a local file system layer, wherein a middle layer is further added between the application layer and the local file system layer, the middle layer is allowed to track the read-write characteristics of the application layer on files, and the middle layer deploys the file system.
The embodiment of the application also provides electronic equipment, which comprises a memory and a processor, wherein the memory stores a computer program, the processor is provided with the file system, and the processor realizes the operation of the file system when executing the computer program.
In an exemplary embodiment, the electronic device may further include a transmission device connected to the processor, and an input/output device connected to the processor.
Specific examples in this embodiment may refer to the examples described in the foregoing embodiments and the exemplary implementation, and this embodiment is not described herein.
It will be appreciated by those skilled in the art that the modules or steps of the application described above may be implemented in a general purpose computing device, they may be concentrated on a single computing device, or distributed across a network of computing devices, they may be implemented in program code executable by computing devices, so that they may be stored in a storage device for execution by computing devices, and in some cases, the steps shown or described may be performed in a different order than that shown or described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple modules or steps of them may be fabricated into a single integrated circuit module. Thus, the present application is not limited to any specific combination of hardware and software.
The foregoing description is only of the preferred embodiments of the present application and is not intended to limit the same, but rather, various modifications and variations may be made by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the principles of the present application should be included in the protection scope of the present application.

Claims (23)

1. A file system, characterized in that,
the file system is disposed in an intermediate layer added between an application layer and a local file system layer in an operating system, the intermediate layer is allowed to track read-write characteristics of files in the application layer, and the file system comprises: the device comprises a buffer module, a compression module and an encoding module, wherein,
the caching module is used for caching the logic file written in the file system according to the target read-write characteristics of the logic file corresponding to the application layer to obtain user data of the logic file; detecting whether the disk flush time of the logic file is reached; and sending a compression request to the compression module under the condition that it is detected that the disk flush time has arrived;
the compression module is used for responding to the received compression request and compressing the user data of the logic file to obtain the compressed data of the logic file;
The encoding module is configured to encode the compressed data of the logical file into a continuous physical file, where the physical file further records a mapping relationship between a physical space where the physical file is located and a logical space where the logical file is located, and the physical file is used for flushing to disk.
2. The file system of claim 1, wherein the file system is configured to store the file system,
the cache module is used for:
obtaining the block length of a block interval of the logic file;
dividing the block length into a minimum number of length values, wherein each length value is an integer multiple of a block unit supported by a cache space;
distributing the minimum number of cache blocks for each block interval of the logic file according to the minimum number of length values, wherein the length values correspond to the cache blocks one by one, and each cache block is a continuous cache space;
and caching the data of each block interval of the logic file to a group of the minimum number of cache blocks to obtain the user data of the logic file.
3. A file system according to claim 2, wherein,
the cache module is used for:
extracting a length value of the maximum integer multiple of the block unit from a remaining length from the block length until the remaining length is less than or equal to the block unit, wherein the remaining length is a difference between the block length and the extracted data length;
And determining a length value extracted from the block length and a length value of the block unit as the minimum number of length values.
4. A file system according to claim 2, wherein,
the compression module is used for: determining the number of units of the compression calculation units and the cache occupation amount of each compression calculation unit according to the cache capacity of the compression calculation and the maximum length value in the minimum number of length values; creating one or more target computing units according to the number of the units, and distributing occupied caches for each target computing unit according to the occupied cache amount;
the buffer module is used for determining the transmission time interval between the minimum number of buffer blocks according to the data transmission time and the data compression time of the unit data; and transmitting the least number of cache blocks in the corresponding block interval to each target computing unit according to the transmission time interval.
5. The file system of claim 4, wherein the file system is configured to store the file system in a file system,
the cache module is used for:
constructing direct memory access descriptors of the current buffer blocks to be transmitted in the minimum number of buffer blocks under the condition that the transmission time interval is reached;
And transmitting the current buffer memory block to be transmitted to a corresponding target computing unit through the direct memory access descriptor.
6. A file system according to claim 2, wherein,
the buffer module is further configured to one of the following:
determining the block length according to the file block size expansion attribute edited by the user;
determining the block length according to the file types supported by the file system;
the storage unit of the database used by the file system is determined as the block length.
7. The file system of claim 1, wherein the file system is configured to store the file system,
the cache module is further configured to:
distributing block indexes to the block intervals of the logic file and extracting meta-information of each block interval of the logic file, wherein the meta-information comprises: the system comprises a state field of a cache block, a cache block pointer array field and a descriptor field, wherein the state field of the cache block is used for recording information of a cache block included in each block interval to be operated, the cache block pointer array field is used for recording the position of the cache block included in each block interval, and the descriptor field is used for recording a descriptor used for carrying out data transmission on the cache block included in each block interval;
And constructing an index management structure of meta information of the block intervals of the logic file according to the block index.
8. The file system of claim 7, wherein the file system is configured to store the file system,
the status field of the cache block includes: a block index field, and the status field of the cache block further includes at least one of: an acceleration card synchronization field, an input and output synchronization field, a locking field, a compression mark field and a statistic information field, wherein the acceleration card synchronization field is used for recording whether the corresponding block interval has been transmitted to the compression module, the input and output synchronization field is used for recording whether the corresponding block interval has completed flushing to disk, the locking field is used for recording whether the corresponding block interval is allowed to be released or swapped out to disk, the compression mark field is used for recording whether compression of the corresponding block interval has been completed, and the statistic information field is used for recording the statistic information generated during the operation of the corresponding block interval.
9. The file system of claim 1, wherein the file system is configured to store the file system,
the buffer module comprises: the storage space stores compressed data of the plurality of block intervals, and the compressed data of each block interval comprises a compressed page;
The cache module is used for executing the modification operation of the user on the logic file to obtain a modified target user page;
the compression module is used for generating a patch page corresponding to the target user page under the condition that the target block interval where the target user page is located is not recompressed, wherein the patch page is used for recording the modification position and the modification content of the modification operation to the logic file; and storing the patch page into a target compressed page of the target block interval in the storage space, wherein the patch page is used for executing the modification operation on a decompressed page of the target compressed page.
10. The file system of claim 9, wherein the file system is configured to store the file system,
and recording each modification position of the modification operation on the logic file and the position of the modification content corresponding to each modification position in the patch page from top to bottom, and recording the modification content of the modification operation on the logic file from bottom to top.
11. The file system of claim 1, wherein the file system is configured to store the file system,
the file system further includes: a cost evaluation module, wherein,
The buffer module is used for detecting the read amplification coefficient of the user data in each block interval of the logic file, wherein the read amplification coefficient is used for indicating the degree of the read amplification phenomenon of the data;
the cost evaluation module is configured to determine, according to the read amplification coefficient, compression information for recompressing the user data in each block interval, where the compression information includes: whether to compress and the compression algorithm to use;
and the compression module is used for recompressing the user data of each block interval according to the compression information.
12. The file system of claim 11, wherein the file system is configured to store the file system,
the cache module is used for:
detecting the data quantity read out and the data quantity accessed by a user when each reading operation is executed on the user data of each block interval;
calculating the ratio between the read data quantity and the data quantity accessed by the user to obtain the reading amplification ratio of each reading operation;
and determining the accumulated sum of the read amplification ratios within a target time window as the read amplification coefficient.
13. The file system of claim 11, wherein the file system is configured to store the file system,
the cost evaluation module is further configured to:
Predicting compression performance benefits of user data of each block interval of the logical file according to the hardware resource attribute of the file system;
and controlling the compression module to compress the user data of each block interval according to the compression performance gain.
14. The file system of claim 1, wherein the file system is configured to store the file system,
the cache module is used for:
detecting a read-ahead coefficient of a user on the logic file, wherein the read-ahead coefficient is used for indicating the continuous read size which is used most by the user on the logic file;
and when the next reading operation of the user on the logic file is detected, reading the physical file of the logic file according to the pre-reading coefficient, and decompressing the read physical file in parallel.
15. The file system of claim 1, wherein the file system is configured to store the file system,
the cache module is further configured to:
before the logic file written in the file system is cached, detecting the type of an acceleration card used by the file system, wherein the acceleration card is used for hardware acceleration for data compression or decompression;
and creating a cache space matched with the accelerator card type, wherein the logic file is cached in the cache space.
16. The file system of claim 1, wherein the file system is configured to store the file system,
the buffer module is further configured to at least one of:
determining that the disk flush time is reached under the condition that a disk flush instruction initiated by a user is received;
determining that the disk flush time is reached under the condition that it is detected that a target period has been reached;
and determining that the disk flush time is reached when the cached data volume is detected to be greater than or equal to a data volume threshold.
17. The file system of claim 1, wherein the file system is configured to store the file system,
the physical file comprises an index block, wherein the index block is used for recording the mapping relation between the physical space where the physical file is located and the logical space where the logical file is located.
18. The file system of claim 17, wherein the file system is configured to store the file system,
the index block includes: the window number field is used for indicating the number of the block interval of the logic file, and the physical block field is used for indicating the physical block corresponding to the block interval of the logic file in the physical file.
19. The file system of claim 18, wherein the file system is configured to store the file system,
the physical block field includes: a physical block number field, a compressed block number field, a block number field and a compression algorithm field, wherein the physical block number field is used for representing the number of physical blocks occupied by a corresponding block interval, the compressed block number field is used for representing the number of compressed blocks among the physical blocks occupied by the corresponding block interval, the block number field is used for representing the numbers of the physical blocks occupied by the corresponding block interval, and the compression algorithm field is used for representing the compression algorithm adopted by the corresponding block interval.
20. The file system of claim 17, wherein the file system is configured to store the file system,
the physical file also comprises a free space management block, wherein the free space management block is used for recording whether each physical block in the physical file is free.
21. The file system of claim 20, wherein the file system is configured to store the file system,
the header of the physical file further comprises a file management block, wherein the file management block is used for recording the position of each index block and the position of the free space management block in the physical file.
22. An operating system, characterized in that,
The operating system comprises an application layer and a local file system layer, wherein an intermediate layer is further added between the application layer and the local file system layer, the intermediate layer is allowed to track the read-write characteristics of the application layer on files, and the intermediate layer deploys the file system as claimed in any one of claims 1-21.
23. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that,
the processor having disposed thereon a file system as claimed in any of claims 1-21, the processor implementing the running of the file system when executing the computer program.
CN202311615379.2A 2023-11-29 2023-11-29 File system, operating system and electronic equipment Active CN117312256B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311615379.2A CN117312256B (en) 2023-11-29 2023-11-29 File system, operating system and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311615379.2A CN117312256B (en) 2023-11-29 2023-11-29 File system, operating system and electronic equipment

Publications (2)

Publication Number Publication Date
CN117312256A true CN117312256A (en) 2023-12-29
CN117312256B CN117312256B (en) 2024-02-27

Family

ID=89285163

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311615379.2A Active CN117312256B (en) 2023-11-29 2023-11-29 File system, operating system and electronic equipment

Country Status (1)

Country Link
CN (1) CN117312256B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1425986A (en) * 2003-01-17 2003-06-25 清华大学 Automatic compressing/decompressing file system and its compressing algorithm
US9612754B1 (en) * 2015-06-29 2017-04-04 EMC IP Holding Company LLC Data storage system with window allocation using window cache
CN115269517A (en) * 2022-07-11 2022-11-01 深圳Tcl新技术有限公司 File management method, device, storage medium and terminal

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117493386A (en) * 2023-12-29 2024-02-02 苏州元脑智能科技有限公司 Database access method and device, storage medium and electronic equipment
CN117493386B (en) * 2023-12-29 2024-03-01 苏州元脑智能科技有限公司 Database access method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN117312256B (en) 2024-02-27

Similar Documents

Publication Publication Date Title
US9965394B2 (en) Selective compression in data storage systems
US10664453B1 (en) Time-based data partitioning
US10635359B2 (en) Managing cache compression in data storage systems
CN106662981B (en) Storage device, program, and information processing method
US20230013281A1 (en) Storage space optimization in a system with varying data redundancy schemes
EP3036616B1 (en) Management of extent based metadata with dense tree structures within a distributed storage architecture
US9430156B1 (en) Method to increase random I/O performance with low memory overheads
US9916258B2 (en) Resource efficient scale-out file systems
KR20170054299A (en) Reference block aggregating into a reference set for deduplication in memory management
CN117312256B (en) File system, operating system and electronic equipment
US9727479B1 (en) Compressing portions of a buffer cache using an LRU queue
WO2017149592A1 (en) Storage device
US10360189B2 (en) Data object storage across multiple storage nodes
KR20220139784A (en) Object storage method and distributed object storage and system using the same
US11200159B2 (en) System and method for facilitating efficient utilization of NAND flash memory
JP2019128906A (en) Storage device and control method therefor
US9122620B2 (en) Storage system with reduced hash key memory
US8904128B2 (en) Processing a request to restore deduplicated data
US20200394304A1 (en) Optimizing storage system performance using storage device topology
US9600200B1 (en) Method to extend SSD lifespan in caching applications by aggregating related content into large cache units
CN111752479A (en) Method and system for efficient storage of data
US11327929B2 (en) Method and system for reduced data movement compression using in-storage computing and a customized file system
US11675789B2 (en) Tracking utilization of data blocks in a storage system
US10474572B2 (en) Intelligent redundant array of independent disks with high performance recompaction
CN117312261B (en) File compression encoding method, device storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant