CN109101639B - Aggregation mode for improving performance of file system

Info

Publication number: CN109101639B
Application number: CN201810948103.9A
Authority: CN
Inventors: 吴火城
Original assignee: Cyphy Technology Xiamen Co ltd
Current assignee: Cyphy Technology Xiamen Co ltd
Priority date: 2018-08-21
Filing date: 2018-08-21
Publication date: 2021-03-23
Anticipated expiration: 2038-08-21
Also published as: CN109101639A

Abstract

The invention discloses an aggregation mode for improving the performance of a file system. No temporary large file is generated: file header information is spliced together and each small file is read in sequence and uploaded as if it were one large file, which avoids one write of the file to disk and one subsequent read. The header of the aggregate file stores the database information of all files it contains, so if the metadata database of the file system is damaged, the database information can be recovered from the header information of the aggregate files. When a backed-up file needs to be restored, a single small file is restored individually from the aggregate file instead of reading the whole aggregate file and restoring every file in it, which avoids wasting system resources and long restoration times. Because the metadata database information for all files is recorded in the file header, restoring the database requires reading only the header rather than the whole aggregate file.

Description

Aggregation mode for improving performance of file system

Technical Field

The invention relates to the field of big data storage, in particular to an aggregation mode for improving the performance of a file system.

Background

Blu-ray equipment stores data for more than 10 years by burning it onto Blu-ray discs, without manual intervention. Because Blu-ray discs have a large storage capacity, a large amount of data can be stored for a long time.

The cloud platform provides storage services externally in the form of object storage. Data files are saved to disk. The archiving system runs automatically in the background and, according to rules configured by the user, automatically backs up files to the Blu-ray equipment when the configured conditions are met.

The archiving system is slow backup software running at the back end of the cloud platform. It backs up files on disk to the Blu-ray equipment and can restore them when they are needed.

The size of files uploaded by users is essentially unlimited: a file may be only a few bytes, or hundreds of gigabytes, or even tens of terabytes.

The Blu-ray device has two physical requirements: files uploaded to it must be between 256 MB and 5 GB, and the number of burn operations must stay within 7,000. The upload rate to the Blu-ray device is highest for data files of about 4 GB.

Given these requirements of the Blu-ray device, the archiving system must use an aggregation mode: a large number of small files are aggregated into one large file of about 4 GB, which is then backed up to the Blu-ray device through a network interface. This has three benefits: 1. file transfer over the network is faster; 2. the number of burn operations on the Blu-ray device is reduced, extending the service life of the equipment; 3. the number of files per volume on the Blu-ray device is reduced, improving the efficiency of the device.
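The grouping of small files into aggregates of about 4 GB can be sketched as follows. This is only an illustration under stated assumptions: the function name plan_batches, the greedy strategy, and the handling of oversized files are hypothetical and are not taken from the patent; only the 256 MB, 5 GB and 4 GB figures come from the text above. The sketch also ignores the space taken by the aggregate header.

```python
from typing import Iterable, List, Tuple

MB = 1024 * 1024
GB = 1024 * MB
MIN_AGGREGATE = 256 * MB   # lower bound accepted by the device (from the text)
MAX_AGGREGATE = 5 * GB     # upper bound accepted by the device (from the text)
TARGET_AGGREGATE = 4 * GB  # size at which the upload rate is said to be highest


def plan_batches(files: Iterable[Tuple[str, int]],
                 target: int = TARGET_AGGREGATE,
                 limit: int = MAX_AGGREGATE) -> List[List[str]]:
    """Greedily group (name, size) pairs into batches of roughly `target` bytes.

    A batch is closed once it reaches `target`, or earlier if adding the next
    file would push it past `limit`. Splitting files larger than `limit` is
    outside the scope of this sketch, so such files raise an error.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for name, size in files:
        if size > limit:
            raise ValueError(f"{name} exceeds the per-upload limit")
        if current and current_size + size > limit:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
        if current_size >= target:
            batches.append(current)
            current, current_size = [], 0
    if current:
        # The last batch may still be below MIN_AGGREGATE; a real system
        # would presumably hold it back until more files arrive.
        batches.append(current)
    return batches
```

A greedy pass keeps the plan simple and preserves scan order; a bin-packing heuristic could fill aggregates more tightly, at the cost of reordering files.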

Disclosure of Invention

The present invention aims to provide an aggregation mode for improving the performance of a file system, so as to solve the problems described in the background above.

To achieve this purpose, the invention provides the following technical scheme:

An aggregation mode for improving the performance of a file system, comprising the following specific steps:

(1) a database, a compression unit, an aggregation storage unit, a Blu-ray equipment system, a reading unit and a database updating unit are arranged in sequence in the aggregation mode system, and system data read by the reading unit is fed back to the aggregation storage unit;

(2) an archiving task library and a recovery task library are arranged in the archiving system.

As a further scheme of the invention: the database in step (1) stores the files to be aggregated, and the set of files to be archived is obtained by scanning the database.

As a further scheme of the invention: the compression unit in step (1) compresses the files that need to be compressed.

As a further scheme of the invention: in the aggregation storage unit in step (1), files are aggregated without generating a temporary large file. File header information is spliced together and the small files are read in sequence and uploaded as one large file, which avoids one write of the file to disk and one subsequent read; the header of the aggregate file stores the database information of all files in the aggregate file. In essence, the technique reads all file information and all file contents in turn into a block of memory and then uploads that memory to the Blu-ray device as a large temporary aggregate file.
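A minimal sketch of this idea, assuming a hypothetical on-wire layout that the patent does not specify: an 8-byte big-endian length prefix, a JSON header listing each file's name, offset, size and compression flag, then the file bodies back to back. Offsets here are relative to the start of the payload region, which is also an assumption. The generator yields the aggregate piece by piece, so no temporary large file is written to disk.

```python
import json
import os
import struct
from typing import Iterator, List


def stream_aggregate(paths: List[str],
                     chunk_size: int = 512 * 1024) -> Iterator[bytes]:
    """Yield the bytes of an aggregate file without writing it to disk.

    Hypothetical layout: [8-byte header length][JSON header][file bodies].
    """
    entries = []
    offset = 0
    for path in paths:
        size = os.path.getsize(path)
        entries.append({"name": os.path.basename(path), "offset": offset,
                        "size": size, "compressed": False})
        offset += size
    header = json.dumps(entries).encode("utf-8")

    # Header first, so a reader can recover the metadata without the payload.
    yield struct.pack(">Q", len(header))
    yield header

    # Then each small file in sequence, read straight from its own location.
    for path in paths:
        with open(path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                yield chunk
```

An uploader would consume this generator directly. The sketch assumes the source files do not change between the size scan and the read; a production system would need to guard against that.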

As a further scheme of the invention: when a file needs to be restored, the recovery task library in step (2) searches the metadata database of the archiving task library using the object's information to obtain the information under which the file is stored on the Blu-ray equipment system: the IP address of the Blu-ray equipment system, the volume, the file name, the offset within the aggregate file, the file size, and whether the file is compressed within the aggregate file. Using this information and the offset within the aggregate file, the sub-file is read on its own and, if it was compressed, decompressed to restore the original file.
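The single-file restore described above might look like the following sketch. The ArchiveRecord class and its field names are hypothetical stand-ins for a metadata database row, the offset is assumed to be an absolute byte position within the aggregate file, and zlib is an assumed codec, since the patent does not name a compression algorithm.

```python
import zlib
from dataclasses import dataclass
from typing import BinaryIO


@dataclass
class ArchiveRecord:
    """Hypothetical metadata row; fields mirror the items listed in the text."""
    device_ip: str
    volume: str
    aggregate_name: str
    offset: int       # assumed: absolute byte offset inside the aggregate file
    size: int         # stored size (the compressed size if compressed)
    compressed: bool


def restore_one(aggregate: BinaryIO, record: ArchiveRecord) -> bytes:
    """Read only the sub-file's byte range and decompress it if needed."""
    aggregate.seek(record.offset)
    data = aggregate.read(record.size)
    if len(data) != record.size:
        raise IOError("aggregate file is shorter than the recorded range")
    # zlib is an assumption of this sketch, not the patent's stated codec.
    return zlib.decompress(data) if record.compressed else data
```

Only record.size bytes are read, so the cost of restoring one small file does not depend on the size of the aggregate that contains it.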

As a still further scheme of the invention: the reading unit in step (1) reads the metadata database information of the files; the files are then read out in sequence, loaded into a buffer of fixed size, such as 512 KB, and uploaded to the Blu-ray equipment system as a single file.
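The fixed-size buffering can be illustrated with a small re-chunking generator. The name fixed_size_blocks is hypothetical, and the sketch assumes the input arrives as an iterable of byte strings of arbitrary length (for example, a header followed by file contents); only the 512 KB figure comes from the text.

```python
from typing import Iterable, Iterator

BUFFER_SIZE = 512 * 1024  # the fixed size given as an example in the text


def fixed_size_blocks(parts: Iterable[bytes],
                      block_size: int = BUFFER_SIZE) -> Iterator[bytes]:
    """Re-chunk arbitrary byte pieces into blocks of exactly block_size bytes.

    Only the final block may be shorter. File boundaries are ignored, so the
    receiver sees one continuous stream.
    """
    buf = bytearray()
    for part in parts:
        buf.extend(part)
        while len(buf) >= block_size:
            yield bytes(buf[:block_size])
            del buf[:block_size]
    if buf:
        yield bytes(buf)
```

With a block size of 3, the pieces b"ab", b"cdefg" and b"h" come out as b"abc", b"def" and b"gh": the blocks cross the boundaries of the original pieces, which is what lets many small files travel as one upload.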

As a still further scheme of the invention: the database updating unit in step (1) updates the data stored in the database, and the next round of aggregation storage is carried out until the archiving task is complete.
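One way to picture the database scan and the update after each round is the sketch below, which uses SQLite purely for illustration. The table layout, the column names and the helper functions are all hypothetical; the patent does not describe a schema or name a database engine.

```python
import sqlite3
from typing import List, Tuple


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a hypothetical metadata table for files awaiting archiving."""
    conn.execute(
        """CREATE TABLE files (
               name TEXT PRIMARY KEY,
               size INTEGER NOT NULL,
               aggregate_name TEXT,   -- NULL until the file is archived
               byte_offset INTEGER    -- position inside the aggregate file
           )"""
    )


def pending_files(conn: sqlite3.Connection) -> List[Tuple[str, int]]:
    """Scan the database for files that have not been archived yet."""
    return conn.execute(
        "SELECT name, size FROM files "
        "WHERE aggregate_name IS NULL ORDER BY name"
    ).fetchall()


def mark_archived(conn: sqlite3.Connection, aggregate_name: str,
                  placements: List[Tuple[str, int]]) -> None:
    """Record where each file landed once the aggregate has been uploaded."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "UPDATE files SET aggregate_name = ?, byte_offset = ? "
            "WHERE name = ?",
            [(aggregate_name, offset, name) for name, offset in placements],
        )
```

After mark_archived runs, the same scan returns only files that still need archiving, so the loop of scan, aggregate, upload and update ends when pending_files comes back empty.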

As a still further scheme of the invention: if the metadata database of the archiving task library is damaged, the database information can be recovered from the file header information of the aggregate files. Because the metadata database information for all files is recorded in the header of the aggregate file, restoring the database requires reading only the header rather than the whole aggregate file, and keeping the key metadata database information together improves the read efficiency of the optical disc. When a backed-up file needs to be restored, a single small file is restored individually from the aggregate file instead of reading the whole aggregate file and restoring every file in it, which avoids wasting system resources and long restoration times.
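Header-only recovery can be sketched as follows, again assuming a hypothetical layout of an 8-byte big-endian length prefix followed by a JSON header; the patent does not define the actual header encoding. The functions read_header and rebuild_metadata are illustrative names only.

```python
import json
import struct
from typing import BinaryIO, Dict, List


def read_header(aggregate: BinaryIO) -> List[Dict]:
    """Return the per-file metadata stored at the head of an aggregate file.

    Only the length prefix and the header are read; the payload is untouched.
    """
    prefix = aggregate.read(8)
    if len(prefix) != 8:
        raise ValueError("truncated aggregate file")
    (header_len,) = struct.unpack(">Q", prefix)
    header = aggregate.read(header_len)
    if len(header) != header_len:
        raise ValueError("truncated header")
    return json.loads(header.decode("utf-8"))


def rebuild_metadata(aggregates: Dict[str, BinaryIO]) -> List[Dict]:
    """Rebuild metadata rows from the headers of the given aggregate files."""
    rows = []
    for aggregate_name, stream in aggregates.items():
        for entry in read_header(stream):
            rows.append({"aggregate": aggregate_name, **entry})
    return rows
```

Because the metadata sits in one contiguous region at the front, recovery reads a few kilobytes per aggregate rather than gigabytes, which matters on optical media where seeks and reads are slow.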

Compared with the prior art, the invention has the following beneficial effects. In the aggregation mode, no temporary large file is generated: file header information is spliced together and the small files are read in sequence and uploaded as one large file, which avoids one write of the file to disk and one subsequent read. The header of the aggregate file stores the database information of all files in the aggregate file, so if the metadata database of the file system is damaged, the database information can be recovered from the file header information of the aggregate files. Because the metadata database information for all files is recorded in the header, restoring the database requires reading only the header rather than the whole aggregate file, and keeping the key metadata database information together improves the read efficiency of the optical disc. When a backed-up file needs to be restored, a single small file is restored individually from the aggregate file instead of reading the whole aggregate file and restoring every file in it, which avoids wasting system resources and long restoration times.

Drawings

FIG. 1 is a flow block diagram of an aggregation mode for improving file system performance.

FIG. 2 is a block diagram of an embodiment of an aggregation mode for improving file system performance.

FIG. 3 is a diagram illustrating the aggregate file structure in an aggregation mode for improving file system performance.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In the embodiment of the invention, an aggregation mode for improving the performance of a file system comprises the following specific management steps:

(1) A database, a compression unit, an aggregation storage unit, a Blu-ray equipment system, a reading unit and a database updating unit are arranged in sequence in the aggregation mode system, and system data read by the reading unit is fed back to the aggregation storage unit. The database stores the files to be aggregated, and the set of files to be archived is obtained by scanning the database. The compression unit compresses the files that need to be compressed. In the aggregation storage unit, files are aggregated without generating a large file stored on disk: file header information is spliced together and the small files are read in sequence and uploaded as one large file (equivalent to generating the aggregate file in memory), which avoids one write of the file to disk and one subsequent read; the header of the aggregate file stores the database information of all files in the aggregate file. The reading unit reads the metadata database information of the files in sequence; each file is loaded into a buffer of fixed size, such as 512 KB (network upload is fastest when data is sent in large blocks of fixed size), and the result is uploaded to the Blu-ray equipment system as a single file. The database updating unit updates the data stored in the database, and the next round of aggregation storage is carried out until the archiving task is complete.

(2) An archiving task library and a recovery task library are arranged in the archiving system. When a file needs to be restored, the recovery task library searches the metadata database of the archiving task library using the object's information to obtain the information under which the file is stored on the Blu-ray equipment system: the IP address of the Blu-ray equipment system, the volume, the file name, the offset within the aggregate file, the file size, and whether the file is compressed within the aggregate file. Using the information on the Blu-ray equipment and the offset within the aggregate file, that section of the file is read on its own and, if it was compressed, decompressed to restore the original file. If the metadata database of the archiving task library is damaged, the database information can be recovered from the file header information of the aggregate files. Because the metadata database information for all files is recorded in the header of the aggregate file, restoring the database requires reading only the header rather than the whole aggregate file, and keeping the key metadata database information together improves the read efficiency of the optical disc. When a backed-up file needs to be restored, a single small file is restored individually from the aggregate file instead of reading the whole aggregate file and restoring every file in it, which avoids wasting system resources and long restoration times.
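The remark in step (1) that the upload is equivalent to generating the aggregate file in memory can be illustrated with a file-like wrapper. The class name VirtualAggregate and its interface are hypothetical; the sketch only shows how an uploader calling read() could be given the impression of one contiguous file while the data actually comes from a header and several separate sources.

```python
import io
from typing import BinaryIO, List


class VirtualAggregate:
    """File-like view over a header followed by several source streams.

    read(n) behaves as if the parts were one contiguous file, although no
    such file exists on disk.
    """

    def __init__(self, header: bytes, sources: List[BinaryIO]) -> None:
        self._parts: List[BinaryIO] = [io.BytesIO(header), *sources]
        self._index = 0

    def read(self, n: int) -> bytes:
        """Return up to n bytes, crossing part boundaries; b'' at the end."""
        out = bytearray()
        while len(out) < n and self._index < len(self._parts):
            chunk = self._parts[self._index].read(n - len(out))
            if chunk:
                out.extend(chunk)
            else:
                self._index += 1  # current part exhausted, move to the next
        return bytes(out)
```

For example, wrapping the header b"HDR" with two sources containing b"abcd" and b"ef" and calling read(4) repeatedly returns b"HDRa", b"bcde", b"f" and finally b"". Calling read() with the fixed buffer size would give the uploader equal-sized blocks until the last one.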

Example (b):

It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.

Furthermore, it should be understood that although this description is organized by embodiments, not every embodiment contains only a single independent technical solution; the description is written this way for clarity only. Those skilled in the art should take the description as a whole, and the technical solutions in the embodiments may be combined as appropriate to form other embodiments understood by those skilled in the art.

Claims

1. An aggregation mode for improving the performance of a file system, characterized by comprising the following specific management steps:

(2) an archiving task library and a recovery task library are arranged in the archiving system;

the database in step (1) stores the files to be aggregated, and the set of files to be archived is obtained by scanning the database;

the compression unit in step (1) compresses the files that need to be compressed and skips the files that do not need to be compressed;

in the aggregation storage unit in step (1), files are aggregated without generating a temporary large file: file header information is spliced together and the small files are read in sequence and uploaded as one large file, which avoids one write of the file to disk and one subsequent read, and the header of the aggregate file stores the database information of all files in the aggregate file.

2. The aggregation mode for improving the performance of a file system according to claim 1, wherein, when a file needs to be restored, the recovery task library in step (2) searches the metadata database of the archiving task library using the object's information to obtain the information under which the file is stored on the Blu-ray equipment system: the IP address of the Blu-ray equipment system, the volume, the file name, the offset within the aggregate file, the file size, and whether the file is compressed within the aggregate file; using the information in the database and the offset within the aggregate file, that section of the file is read on its own and, if it was compressed, decompressed to restore the original file.

3. The aggregation mode for improving the performance of a file system according to claim 1, wherein the reading unit in step (1) reads the metadata database information of the files, and each file is read out in sequence, loaded into a fixed-size buffer, and then uploaded to the Blu-ray equipment system as a single file; this is a virtual file aggregation technique in which file header information is spliced together with the content of each file, and the spliced data is passed in sequence to the external interface through the fixed-size buffer, so that the external interface treats the content it reads as coming from a single complete large file.

4. The aggregation mode for improving the performance of a file system according to claim 1, wherein the database updating unit in step (1) updates the data stored in the database, and the next round of aggregation storage is carried out until the archiving task is complete.

5. The aggregation mode for improving the performance of a file system according to claim 4, wherein, if the metadata database of the archiving system is damaged, the database information can be recovered from the file header information of the aggregate files; because the metadata database information for all files is recorded in the header of the aggregate file, restoring the database requires reading only the header rather than the whole aggregate file, and keeping the key metadata database information together improves the read efficiency of the optical disc; when a backed-up file needs to be restored, a single small file is restored individually from the aggregate file instead of reading the whole aggregate file and restoring every file in it, which avoids wasting system resources and long restoration times.