CN107766374B - Optimization method and system for storage and reading of massive small files - Google Patents

Info

Publication number
CN107766374B
Authority
CN
China
Prior art keywords
block
data
disk
metadata
block group
Prior art date
Legal status
Active
Application number
CN201610697088.6A
Other languages
Chinese (zh)
Other versions
CN107766374A (en)
Inventor
丁晓杰
颜新波
曹敬涛
朱雷军
徐启亮
Current Assignee
Shanghai Kaixiang Information Technology Co ltd
Original Assignee
Shanghai Kaixiang Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Kaixiang Information Technology Co ltd filed Critical Shanghai Kaixiang Information Technology Co ltd
Priority to CN201610697088.6A priority Critical patent/CN107766374B/en
Publication of CN107766374A publication Critical patent/CN107766374A/en
Application granted granted Critical
Publication of CN107766374B publication Critical patent/CN107766374B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06F16/172: Information retrieval; File systems; Caching, prefetching or hoarding of files
    • G06F3/0611: Interfaces specially adapted for storage systems; Improving I/O performance in relation to response time
    • G06F3/0647: Interfaces specially adapted for storage systems; Migration mechanisms
    • G06F3/0665: Interfaces specially adapted for storage systems; Virtualisation aspects at area level, e.g. provisioning of virtual or logical volumes
    • G06F3/0676: Interfaces specially adapted for storage systems; Magnetic disk device

Abstract

The invention discloses an optimization method and system for the storage and reading of massive small files, which solve the problem of insufficient small-file storage performance limiting service capacity and accelerate overall access efficiency without significantly increasing cost or significantly changing existing workflows. The technical scheme is as follows: metadata is stored on one or more added high-speed disks to accelerate data access and improve overall service performance. The newly added high-speed disk and the original partition are combined into a single device, and the freshly formatted file system is then optimized: the metadata portion of the file system is migrated to the high-speed disk, and the original metadata storage area together with the original real-data area serves as a new real-data area. Thereafter, all metadata operations (reading, writing, creation, deletion) and all directory operations are performed on the high-speed disk, accelerating overall access.

Description

Optimization method and system for storage and reading of massive small files
Technical Field
The invention relates to the field of data storage in computing, and in particular to a technique for optimizing the storage and reading of massive numbers of small files.
Background
When a computer stores data, it stores files through software called a file system. In every general-purpose file system, a file is stored in two parts: one part is the metadata (inode), which serves as an index; the other part is the real data.
When data is read or written, the metadata (inode) is read first, and the real data is then read according to the information in the metadata. The metadata stores the file name, creation time, owner and so on, and, most importantly, the location where the real data is stored.
A huge number of small files now exist in Internet applications, such as video files split into short segments (e.g. for the iOS platform), product pictures on e-commerce sites such as Taobao, and pictures on news websites; a large website may store billions of pictures, and the efficiency of storing and reading this data becomes a key factor affecting service performance.
General-purpose file systems are designed for large-file workloads and are not optimized for massive numbers of small files. When a large number of small files is stored, metadata is read about as often as real data, and since metadata reads are not optimized the overall read performance is poor.
To address the growing number of applications dominated by small files, the industry has gradually converged on an optimization: several small files are merged into one medium-sized file to reduce the amount of metadata, and the position of each small file within the merged file is recorded separately.
The working principle is as follows: several logical files share one physical file, i.e. many small files are merged and stored in one large file, achieving efficient small-file storage. A mapping table is added for the large file; merging small files into one large file effectively adds a thin small-file-system layer, so the mechanism suits write-once, read-many workloads and is not suitable for files that are written many times.
The implementation is as follows: a mapping database, referred to here as the mapping table, is implemented at the application level. Each entry of the mapping table is a quadruple: small file name, large file name, start position, and length.
The general flow of data access is shown in fig. 1 and fig. 2. Fig. 1 shows the write flow with small-file merging: the application calls a function, passing in the small file name; the file-mapping module computes a hash of the small file name and checks whether a mapping for that name already exists; if a record exists, it is returned, containing the large file name, the start position within the large file, and the current length; if not, a large file with remaining space is located and recorded, and a record containing the large file name, the start position within the large file, and the current length is returned. The large file is then opened directly and writing starts at that start position.
Fig. 2 shows the read flow with small-file merging: the corresponding large file name and position information are first looked up in the mapping table using the small file name, and the data is then read from the large file according to the returned information.
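The following is a minimal Python sketch of the mapping-table scheme just described, given only to make the flow concrete. The class and file names (SmallFileMap, MappingRecord, large_0000.bin) are illustrative, the hash lookup of fig. 1 is simplified to an in-memory dictionary, and no persistence, transactionality or fault tolerance is implemented.

import os
from dataclasses import dataclass

@dataclass
class MappingRecord:
    large_name: str   # name of the merged large file
    start: int        # byte offset of the small file inside the large file
    length: int       # current length of the small file's data

class SmallFileMap:
    def __init__(self, large_file_size=64 * 1024 * 1024):
        self.records = {}                       # small file name -> MappingRecord
        self.large_file_size = large_file_size
        self.current_large = "large_0000.bin"   # naming convention used below
        self.current_offset = 0

    def write(self, small_name, data):
        rec = self.records.get(small_name)
        if rec is None:
            # Roll over to a new large file if the current one lacks space.
            if self.current_offset + len(data) > self.large_file_size:
                seq = int(self.current_large[6:10]) + 1
                self.current_large = f"large_{seq:04d}.bin"
                self.current_offset = 0
            rec = MappingRecord(self.current_large, self.current_offset, len(data))
            self.records[small_name] = rec
            self.current_offset += len(data)
        # Open the large file directly and write at the recorded start position.
        mode = "r+b" if os.path.exists(rec.large_name) else "w+b"
        with open(rec.large_name, mode) as f:     # overwrite growth is not handled here
            f.seek(rec.start)
            f.write(data)
        return rec

    def read(self, small_name):
        rec = self.records[small_name]            # 1st step: mapping-table lookup
        with open(rec.large_name, "rb") as f:     # 2nd step: access into the large file
            f.seek(rec.start)
            return f.read(rec.length)

Note that read() needs two steps, one mapping-table lookup plus one access into the large file, which is exactly the extra indirection criticized in the disadvantages below.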
The advantage of this optimization is that file fragmentation is greatly reduced and the amount of metadata shrinks dramatically: with 1 MB small files merged into 64 MB large files, 64 small files share one inode, so the merged metadata amounts to only about 1.5% of the original.
However, this optimization also has significant disadvantages:
1. Locating a small file requires first accessing the mapping table and then accessing the indicated position in the large file, so every read or write needs two operations to locate the actual address; this adds one extra disk access and hurts performance.
2. Following from point 1, the mapping-table update and the operation on the real file must be performed transactionally, and implementing this transactionality with adequate fault tolerance is demanding.
3. The wide variation in file sizes makes the implementation considerably harder.
Disclosure of Invention
The following presents a simplified summary of one or more aspects in order to provide a basic understanding of such aspects. This summary is not an extensive overview of all contemplated aspects, and is intended to neither identify key or critical elements of all aspects nor delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more aspects in a simplified form as a prelude to the more detailed description that is presented later.
The object of the invention is to provide an optimization method and system for the storage and reading of massive small files that solves the problem of insufficient small-file storage performance limiting service capability and accelerates overall access efficiency, without significantly increasing cost or significantly changing existing workflows.
The technical scheme of the invention is as follows: the invention discloses an optimization method for the storage and reading of massive small files, comprising the following steps:
Step one: forming a logical volume based on the original disk and a newly added disk;
Step two: formatting the logical volume;
Step three: performing migration adjustment of the metadata and remapping between data structures, reserving one of the newly added disk and the original disk for storing metadata and using the other for storing real data.
According to an embodiment of the optimization method for the storage and reading of massive small files, step one comprises:
integrating the original disk and the newly added disk;
creating a physical volume from each of the original disk and the newly added disk;
combining the physical volumes into one physical volume group;
allocating all space from the physical volume group to form a logical volume.
According to an embodiment of the optimization method for the storage and reading of massive small files, in step three the migration adjustment of the metadata and the remapping between data structures are performed immediately after formatting finishes and before the file system is used.
According to an embodiment of the optimization method for the storage and reading of massive small files, step three further comprises:
Step 1: reading the first block group of the first meta block group, copying the data from the super block through the metadata index table of that block group to the starting position of the first block group, and recording the end position as an offset;
Step 2: for each subsequent block group in the first meta block group, sequentially copying the data from the super block through the metadata index table of that block group to the address beginning at the offset, updating the value of the offset, and clearing the content of the copied addresses;
Step 3: for each meta block group after the first meta block group, sequentially copying the metadata of all its block groups to the front part of the file system in the manner of step 1 and step 2, so that all metadata is migrated in full to one of the original disk and the newly added disk;
Step 4: storing the address of the first block group of the subsequent starting meta block group, beginning at the offset, as the starting address of the block data into the block bitmap/metadata index table within the first metadata, clearing the bitmap information, and updating the value of the offset to the address of the next block group; if the currently processed meta block group is finished, updating the value of the offset to the address of the first block group of the next meta block group;
Step 5: repeating step 4 until all meta block groups are processed, associating all real data with the metadata;
Step 6: repopulating all special metadata, including the super block and the block group descriptor table, constructing the data structure of the block group descriptor table according to the distribution of the block data, and filling it into, or keeping it in, the block group descriptor table.
According to an embodiment of the optimization method for the storage and reading of massive small files, the read/write speed of the newly added disk is higher than that of the original disk.
According to an embodiment of the optimization method for the storage and reading of massive small files, the method further comprises:
Step four: when a directory is created, storing the real data of the directory on the newly added disk, so that subsequent read and write operations on the directory are all performed on the newly added disk.
The invention also discloses an optimization system for the storage and reading of massive small files, comprising:
a logical volume generation module, which forms a logical volume based on the original disk and the newly added disk;
a formatting module, which formats the logical volume;
and a data migration and structure reconstruction module, which performs migration adjustment of the metadata and remapping between data structures, so that one of the newly added disk and the original disk is reserved for storing metadata and the other is used for storing real data.
According to an embodiment of the optimization system for the storage and reading of massive small files, the logical volume generation module comprises:
an integration unit, which integrates the original disk and the newly added disk;
a physical volume creation unit, which creates a physical volume from each of the original disk and the newly added disk;
a physical volume group creation unit, which combines the physical volumes into one physical volume group;
and a logical volume creation unit, which allocates all space from the physical volume group to form a logical volume.
According to an embodiment of the optimization system for the storage and reading of massive small files, the data migration and structure reconstruction module performs the migration adjustment of the metadata and the remapping between data structures immediately after the formatting module completes formatting and before the file system is used.
According to an embodiment of the optimization system for the storage and reading of massive small files, the system further comprises:
a directory processing module, which stores the real data of a directory on the newly added disk when the directory is created, so that subsequent read and write operations on the directory are all performed on the newly added disk.
Compared with the prior art, the invention has the following beneficial effects: the invention stores the metadata on one or more added high-speed disks (such as SSDs) to accelerate data access and improve overall service performance. The newly added high-speed disk and the original partition are combined into a single device, and the freshly formatted file system is then optimized: the metadata portion of the file system is migrated to the high-speed disk, and the original metadata storage area together with the original real-data area serves as a new real-data area. Thereafter, all metadata operations (reading, writing, creation, deletion) and all directory operations are performed on the high-speed disk, accelerating overall access.
Drawings
Fig. 1 shows a write flow diagram of small file merging in a conventional optimization method.
Fig. 2 shows a read flow diagram of small file merging in a conventional optimization method.
Fig. 3 is a schematic diagram showing a data distribution manner of a general file system.
Fig. 4A is a schematic diagram showing the distribution of metadata and real data, marked in different colors.
FIG. 4B is a diagram illustrating the distribution of metadata integration into a high-speed disk.
FIG. 5 is a schematic diagram of the integration of low speed disks to store only real data.
FIG. 6 is a diagram showing the overall distribution of disk data after metadata migration.
FIG. 7 is a diagram illustrating comparison of small file read and write performance before and after metadata migration.
FIG. 8 illustrates a migration of space occupied by a directory to a high-speed disk.
Fig. 9 shows a flowchart of a first embodiment of the optimization method for storage and reading of mass small files.
Fig. 10 shows a flowchart of a second embodiment of the optimization method for storage and reading of mass small files.
FIG. 11 illustrates a detailed flow chart of metadata migration adjustment and data structure remapping of the present invention.
Fig. 12 shows a schematic diagram of a first embodiment of the optimization system for storage and reading of mass small files of the present invention.
FIG. 13 is a schematic diagram of a second embodiment of the present invention optimizing system for storage and reading of a large number of small files.
Detailed Description
The above features and advantages of the present disclosure will be better understood upon reading the detailed description of embodiments of the disclosure in conjunction with the following drawings. In the drawings, components are not necessarily drawn to scale, and components having similar relative characteristics or features may have the same or similar reference numerals.
Although many file systems are available on computers, the present invention is described with reference to the Ext4 file system, which is currently mainstream; the approach is also applicable to file systems such as Ext3.
To better illustrate the embodiments of the present invention, the working principle of the system is first explained before the embodiments are introduced.
In general, a disk on a server must be formatted with a file system before use; the layout of the formatted disk is shown in fig. 3. In the normal file system data layout, the first row shows the logical layout of the entire disk, running from meta block group 0 (Meta Block Group 0) up to meta block group n. A meta block group is generally composed of 64 block groups (Block Group), and each block group occupies 128 MB.
As shown in fig. 3, the layout of the 5th through the 63rd block group mainly follows that of the 3rd block group (i.e. Block Group 2), which is used for storing data, and all subsequent meta block groups likewise mainly follow the layout of Block Group 2.
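As a quick sanity check of these sizes, the following sketch assumes the common ext4 default of 4 KiB blocks (an assumption; the text above only states 128 MB per block group). With one block bitmap per block group, a block group covers 8 × 4096 blocks.

block_size = 4 * 1024                    # bytes per block (assumed ext4 default)
blocks_per_group = 8 * block_size        # one block-bitmap block tracks 8*block_size blocks
block_group_size = block_size * blocks_per_group
print(block_group_size // (1024 * 1024)) # -> 128 (MiB per block group)

groups_per_meta = 64                     # "a meta block group is composed of 64 block groups"
meta_group_size = groups_per_meta * block_group_size
print(meta_group_size // (1024 ** 3))    # -> 8 (GiB per meta block group)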
After formatting is complete, writing a file /ab/c.dat, for example, requires the following process:
Step 1: the metadata index table (IT) is accessed to see whether the ab directory exists.
Step 2: if the ab directory does not exist, the following creation operations are needed: (1) create an ab entry in the metadata index table and set the corresponding bit of the metadata bitmap (IB) to 1, indicating that the inode entry is occupied; (2) allocate a data block to the ab directory and set the corresponding bit of the block bitmap (BB) to 1, indicating that the data block is in use; (3) record the data block occupied by the directory in its metadata index table entry.
Step 3: if c.dat does not exist, create a c.dat entry in the metadata index table and set the corresponding bit of the metadata bitmap to 1, indicating that the inode entry is occupied.
Step 4: if c.dat already has a data block and the last allocated block is not full, write into that block; otherwise allocate a block to c.dat, set the corresponding bit of the block bitmap to 1 to indicate that the data block is occupied, and record the newly occupied block in the file's metadata index table entry.
For example, reading the file /ab/c.dat requires the following process:
Step 1: read the ab entry in the metadata index table, read the block to which it points, and confirm that c.dat exists.
Step 2: read the c.dat entry in the metadata index table, then read the block to which it points.
Step 3: read the data of the block and return it; if more data needs to be read, read all the blocks occupied by the file over multiple operations.
The data referred to in the present invention is divided into two types: data blocks (Block Data) storing real data, and everything else, including the super block (SB), block bitmaps, metadata index tables and so on, collectively referred to as metadata. As can be seen from the figure, real data and metadata are interleaved on the disk: the first part of every 128 MB block group is metadata, generally occupying a few MB, and the remainder is roughly 120-plus MB of real-data space.
As the read/write flows above show, most of the operations during reading and writing touch the metadata portion, and for small files the proportion of metadata operations is even higher because small files occupy few blocks. For a small file smaller than 128 KB, the metadata is read once and the real data is read once.
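The point can be illustrated with a rough counting model (an illustration only, not the invention's method): assume one inode lookup per file plus one data read per 128 KiB chunk, ignoring caching and directory lookups.

def reads_for_file(file_size, chunk=128 * 1024):
    metadata_reads = 1                           # one lookup in the metadata index table
    data_reads = max(1, -(-file_size // chunk))  # ceil(file_size / chunk) data reads
    return metadata_reads, data_reads

for size in (4 * 1024, 64 * 1024, 1024 * 1024, 64 * 1024 * 1024):
    m, d = reads_for_file(size)
    print(f"{size:>9} bytes: {m} metadata read(s), {d} data read(s)")
# For small files the ratio is roughly 1:1, so metadata latency matters as much
# as data latency; for large files the single metadata read becomes negligible.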
First embodiment of optimization method for storage and reading of massive small files
Fig. 9 shows a flow of a first embodiment of the optimization method for storage and reading of a large number of small files in the present invention. Referring to fig. 9, the following is a detailed description of each implementation step of the optimization method of this embodiment.
Step S11: form a logical volume (LV) based on the original disk and the newly added disk.
For convenience of description, this embodiment assumes the original disk is a low-speed disk and the newly added disk is a high-speed disk. This step is accomplished as follows: first, LVM is used to integrate the original disk and the newly added disk; the pvcreate command creates a physical volume (PV) on each of the high-speed disk and the low-speed disk; the vgcreate command then combines the physical volumes into one volume group (VG); finally, the lvcreate command allocates all space from the volume group to form one logical volume (LV).
Step S12: the logical volume is formatted.
For example, the logical volume is formatted into an ext4 file system format in accordance with the normal flow of formatting.
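A hedged sketch of steps S11 and S12 using the standard LVM and ext4 tools named above (pvcreate, vgcreate, lvcreate, mkfs.ext4) is shown below. The device paths and the names vg_small and lv_small are examples only, not prescribed by the invention; the commands are destructive and must be run as root on disks whose contents may be erased.

import subprocess

ORIG_DISK = "/dev/sdb"   # existing low-speed disk (example path)
NEW_DISK = "/dev/sdc"    # newly added high-speed disk, e.g. an SSD (example path)

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Step S11: integrate the two disks into one logical volume.
run(["pvcreate", ORIG_DISK, NEW_DISK])                              # physical volumes
run(["vgcreate", "vg_small", ORIG_DISK, NEW_DISK])                  # one volume group
run(["lvcreate", "-l", "100%FREE", "-n", "lv_small", "vg_small"])   # one LV over all free space
# Step S12: format the logical volume as ext4 in the normal way.
run(["mkfs.ext4", "/dev/vg_small/lv_small"])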
Step S13: perform the migration adjustment of the metadata and the remapping between data structures, reserving one of the newly added disk and the original disk for storing metadata and using the other for storing real data.
Since the new magnetic disk is a high-speed magnetic disk and the original magnetic disk is a low-speed magnetic disk in this embodiment, the high-speed magnetic disk is used for storing metadata, and the low-speed magnetic disk is used for storing real data.
Because the distribution of metadata becomes irregular once the file system has been used, the migration of the metadata must be started before the file system is used. At that point no real files are recorded at the metadata locations, so the migrated metadata is highly contiguous, which improves the performance of subsequent accesses.
Performing the separation on an unused file system guarantees the contiguity of the metadata storage, so all subsequent metadata can be stored on the high-speed disk without fragmentation; it also avoids the migration and reorganization of large amounts of real data that would be needed if the separation were done while the file system is in use. After a file system has seen day-to-day reads, writes, creations and deletions, the distribution of real data is quite scattered, so reorganizing the data afterwards is very time-consuming, and the larger the data volume the longer it takes.
A refinement of this step is shown in fig. 11; the concrete process is as follows.
Step S31: read the first block group (Block Group 0) of the first meta block group (Meta Block Group 0), copy the data from the super block (Super Block, SB) through the metadata index table (IT) of this block group to the starting position of the first block group, and record the end position as an offset.
Step S32: for each subsequent block group in the first meta block group, sequentially copy the data from the super block through the metadata index table of that block group to the address beginning at the offset, then update the value of the offset and clear the content of the copied addresses.
Specifically, read the second block group (Block Group 1) of the first meta block group, copy its data from the super block SB through the metadata index table IT to the address beginning at the offset, record the new end position as the offset, and clear the content of the copied addresses.
The third block group (Block Group 2) is then processed in the same way, and so on, so that the metadata of the first meta block group is gradually copied to the front part of the file system (i.e. the high-speed disk) and the copied addresses are cleared.
Step S33: for each meta block group after the first, sequentially copy the metadata of all its block groups to the front part of the file system in the manner of steps S31 to S32, so that all metadata is migrated in full to one of the original disk and the newly added disk.
In this embodiment, since the new disk is a high-speed disk, all metadata is migrated to the new disk.
Step S34: store the address of the first block group of the subsequent starting meta block group, beginning at the offset, as the starting address of the block data (Block Data) into the block bitmap (Block Bitmap, BB) / metadata bitmap (Inode Bitmap, IB) / metadata index table (IT) within the first metadata, and clear the bitmap information (since no space has been used yet); then update the value of the offset to the address of the next block group. If the currently processed meta block group is finished, update the value of the offset to the address of the first block group of the next meta block group.
Step S35: repeat step S34 until all meta block groups are processed and all real data is associated with the metadata.
Step S36: refill all special metadata, including the super block (SB) and the block group descriptor table (GDT); construct the data structure of the block group descriptor table according to the distribution of the block data and fill it into, or keep it in, the block group descriptor table.
At this point the metadata migration adjustment is finished and the mapping between the data structures is complete, so normal file system service can begin.
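A much-simplified, in-memory model of steps S31 to S36 is sketched below to show the bookkeeping only: the metadata of every block group is packed contiguously at the front of the volume (the region backed by the high-speed disk), and each group records where its real data now begins. It does not operate on real ext4 on-disk structures, and all names (BlockGroup, migrate_metadata, fast_region) are illustrative.

from dataclasses import dataclass

@dataclass
class BlockGroup:
    metadata: bytes        # super block ... metadata index table, as one blob
    data: bytearray        # real-data area of the block group
    data_start: int = -1   # new start address of the block data (step S34)

def migrate_metadata(fast_region: bytearray, groups: list) -> int:
    offset = 0
    # Steps S31-S33: pack each block group's metadata contiguously at the front
    # of the volume, keeping a running end position in `offset`, and clear the
    # copied source ("clear the content of the copied addresses").
    for g in groups:
        end = offset + len(g.metadata)
        fast_region[offset:end] = g.metadata
        g.metadata = b""
        offset = end
    # Steps S34-S35: the area freed by the metadata joins the real-data area;
    # record where each group's block data now begins (in a real file system
    # this would be written into the block bitmap / inode table entries).
    data_cursor = offset
    for g in groups:
        g.data_start = data_cursor
        data_cursor += len(g.data)
    # Step S36: a real implementation would now rebuild the super block and the
    # block group descriptor table (GDT) to match the new layout.
    return offset          # total size of the packed metadata region

In the real file system the same bookkeeping is performed in place on the block bitmap, inode bitmap, inode table and GDT, as steps S34 to S36 describe.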
After adjustment in the manner of this embodiment, the data is laid out as shown in fig. 4B, with the metadata consolidated on the high-speed (cache) disk. Fig. 5 shows that after adjustment the low-speed disk stores only real data. Fig. 6 shows the overall distribution of the disk data after the metadata migration.
Second embodiment of optimization method for storage and reading of massive small files
Fig. 10 shows a flow of a second embodiment of the optimization method for storage and reading of mass small files in the present invention. Referring to fig. 10, the following is a detailed description of each implementation step of the optimization method of this embodiment.
Step S21: form a logical volume (LV) based on the original disk and the newly added disk.
Step S22: format the logical volume.
Step S23: perform the migration adjustment of the metadata and the remapping between data structures, reserving one of the newly added disk and the original disk for storing metadata and using the other for storing real data.
Steps S21 to S23 are carried out exactly as steps S11 to S13 of the first embodiment: the original (low-speed) disk and the newly added (high-speed) disk are integrated with LVM (pvcreate, vgcreate, lvcreate), the logical volume is formatted as ext4, and, immediately after formatting and before the file system is used, the metadata migration adjustment and data-structure remapping of steps S31 to S36 (fig. 11) are performed, so that all metadata ends up on the high-speed (newly added) disk while the low-speed (original) disk stores only real data, as shown in figs. 4B, 5 and 6.
Step S24: when a directory is created, the real-data area of the directory, which stores information such as the names of the files under the directory, is relatively small and frequently used, so it is also stored on the high-speed (newly added) disk, and subsequent read and write operations on it are performed on the high-speed (newly added) disk.
By storing the real data area of the directory in the high-speed disk, high-speed access to the directory can be realized.
First embodiment of optimization system for storage and reading of massive small files
Fig. 12 shows the principle of the first embodiment of the optimization system for the storage and reading of massive small files of the present invention. Referring to fig. 12, the optimization system of this embodiment comprises a logical volume generation module 11, a formatting module 12, and a data migration and structure reconstruction module 13.
The logical volume generation module 11 forms a logical volume based on the original disk and the newly added disk, and further comprises an integration unit 111, a physical volume creation unit 112, a physical volume group creation unit 113, and a logical volume creation unit 114. For convenience of description, this embodiment assumes the original disk is a low-speed disk and the newly added disk is a high-speed disk.
The integration unit 111 integrates the original disk and the newly added disk using LVM. The physical volume creation unit 112 creates a physical volume (PV) on each of the original disk and the newly added disk using the pvcreate command. The physical volume group creation unit 113 combines the physical volumes into one volume group (VG) using the vgcreate command. The logical volume creation unit 114 allocates all space from the volume group into one logical volume (LV) using the lvcreate command.
The formatting module 12 formats the logical volume. For example, the logical volume is formatted into an ext4 file system format in accordance with the normal flow of formatting.
The data migration and structure reconstruction module 13 performs metadata migration adjustment and remapping between data structures, so that one of the newly added disk and the original disk is reserved for storing metadata, and the other of the newly added disk and the original disk is used for storing actual data.
Since the new magnetic disk is a high-speed magnetic disk and the original magnetic disk is a low-speed magnetic disk in this embodiment, the high-speed magnetic disk is used for storing metadata, and the low-speed magnetic disk is used for storing real data.
Because the distribution of metadata becomes irregular once the file system has been used, the data migration and structure reconstruction module 13 performs the migration adjustment of the metadata and the remapping between data structures immediately after the formatting module 12 completes formatting and before the file system is used. At that point no real files are recorded at the metadata locations, so the migrated metadata is highly contiguous, which improves the performance of subsequent accesses.
The rationale for separating on an unused file system and the detailed migration process (steps S31 to S36, fig. 11) are the same as described for the first method embodiment. After the adjustment, the metadata is consolidated on the high-speed disk as shown in fig. 4B, the low-speed disk stores only real data as shown in fig. 5, and fig. 6 shows the overall distribution of the disk data after the metadata migration.
Second embodiment of optimization system for storage and reading of massive small files
Fig. 13 shows the principle of the second embodiment of the optimization system for the storage and reading of massive small files of the present invention. Referring to fig. 13, the optimization system of this embodiment comprises a logical volume generation module 21, a formatting module 22, a data migration and structure reconstruction module 23, and a directory processing module 24.
The logical volume generation module 21 forms a logical volume based on the original disk and the newly added disk, and further comprises an integration unit 211, a physical volume creation unit 212, a physical volume group creation unit 213, and a logical volume creation unit 214. For convenience of description, this embodiment assumes the original disk is a low-speed disk and the newly added disk is a high-speed disk.
The integration unit 211 integrates the original disk and the newly added disk using LVM. The physical volume creation unit 212 creates a physical volume (PV) on each of the original disk and the newly added disk using the pvcreate command. The physical volume group creation unit 213 combines the physical volumes into one volume group (VG) using the vgcreate command. The logical volume creation unit 214 allocates all space from the volume group into one logical volume (LV) using the lvcreate command.
The formatting module 22 formats the logical volume. For example, the logical volume is formatted into an ext4 file system format in accordance with the normal flow of formatting.
The data migration and structure reconstruction module 23 performs metadata migration adjustment and remapping between data structures, so that one of the newly added disk and the original disk is reserved to store metadata, and the other of the newly added disk and the original disk is used for storing actual data.
Since the new magnetic disk is a high-speed magnetic disk and the original magnetic disk is a low-speed magnetic disk in this embodiment, the high-speed magnetic disk is used for storing metadata, and the low-speed magnetic disk is used for storing real data.
As in the first system embodiment, the data migration and structure reconstruction module 23 performs the migration adjustment of the metadata and the remapping between data structures (steps S31 to S36, fig. 11) immediately after the formatting module 22 completes formatting and before the file system is used, while the distribution of the metadata is still regular. The resulting layout, with the metadata consolidated on the high-speed disk and only real data on the low-speed disk, is shown in figs. 4B, 5 and 6.
The directory processing module 24 stores the real data of a directory on the newly added (high-speed) disk when the directory is created, so that subsequent read and write operations on the directory are all performed on the newly added (high-speed) disk.
By storing the real data area of the directory in the high-speed disk, high-speed access to the directory can be realized.
Combining the four embodiments above with the inventive idea of the present invention, the improvements made by the invention are summarized as follows:
1. High-speed/low-speed separation of metadata and real data is performed immediately on a newly created file system. Separating on an unused file system guarantees the contiguity of the metadata storage, so all subsequent metadata can be stored on the high-speed disk without fragmentation. It also avoids the migration and reorganization of large amounts of real data that would be required if the separation were done during use; after a file system has seen day-to-day reads, writes, creations and deletions, the distribution of real data is quite scattered, so reorganizing the data afterwards is very time-consuming, and the larger the data volume the longer it takes.
2. The real-data area of each directory is also stored on the high-speed disk to achieve high-speed access to directories.
3. The high-speed disk is fully exploited to obtain higher performance. A high-speed disk, typically an SSD or an even faster NVMe device, is used to store the metadata. Compared with an ordinary SATA/SAS disk, the read iops of a high-speed disk is about two orders of magnitude higher, which greatly improves overall performance (iops, the number of read/write operations a disk can perform per second, is a standard measure of disk performance). For a small file, assume each access costs one metadata read and one real-data read, with the SSD rated at 10000 iops and SATA at 100 iops: with the SSD storing the metadata, one read takes 0.1 ms + 10 ms = 10.1 ms, while using SATA alone it takes 10 ms + 10 ms = 20 ms.
Thus, if the file is small enough for its data to be read in one operation, the SSD-accelerated configuration reaches about twice the performance of SATA alone. If the file is slightly larger, around 1 MB, several more reads are needed to read it completely; with SSD acceleration the total time is 0.1 ms + 10 ms + 9 × 3 ms = 37.1 ms, while using SATA alone it is 10 ms + 10 ms + 9 × 3 ms = 47 ms, and (47 - 37.1) / 47 ≈ 20%, so for files of about 1 MB the performance improves by roughly 20%.
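The arithmetic above can be reproduced directly; the per-operation times below are derived from the quoted iops figures, and the 3 ms cost of each follow-on sequential read is taken as given in the text.

ssd_op = 1000.0 / 10000   # 0.1 ms per metadata read at 10000 iops (SSD)
sata_op = 1000.0 / 100    # 10 ms per random read at 100 iops (SATA)
seq_op = 3.0              # ms per additional sequential data read, as stated above

# Small file: one metadata read + one data read.
print(ssd_op + sata_op)   # 10.1 ms with SSD-backed metadata
print(sata_op + sata_op)  # 20.0 ms with SATA only -> roughly a 2x speed-up

# ~1 MB file: one metadata read + first data read + 9 follow-on reads.
with_ssd = ssd_op + sata_op + 9 * seq_op    # 37.1 ms
sata_only = sata_op + sata_op + 9 * seq_op  # 47.0 ms
print((sata_only - with_ssd) / sata_only)   # ~0.21 -> about a 20% improvement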
Test analysis of read/write performance after the metadata migration is shown in fig. 7, which compares small-file read/write performance before and after the migration. As fig. 7 shows, the optimization effect of the metadata migration decreases as the file size grows, so the invention brings the largest gains for files smaller than 1 MB.
4. The scheme is transparent to applications: applications use the file system without any change to their logic. Because the adjustments of the invention are made at the file system level, they are transparent to upper-level applications, no logic changes are required, and the existing analysis and inspection tools of the file system remain usable. The existing solution described in the background requires changes to application logic and cannot use some of the analysis and inspection tools provided by the original file system.
5. The improvement is modular with respect to the existing implementation: applications read and write the file system as before, the flow is not lengthened, and no extra complexity is introduced.
The embodiments described above are merely examples; variations that are likewise based on the inventive idea of the present invention should also be considered to fall within its scope. Examples are as follows.
1. The disk newly added to the original system need not be a high-speed disk as in the embodiments above; a low-speed disk may also be added, and after the addition the new disk likewise requires formatting and the subsequent reconstruction.
2. For the migration adjustment of the metadata, the embodiments explain how the migration is performed; migrating all of the metadata or only part of it affects the degree of performance optimization, and the concrete effect depends on the application scenario.
3. The data blocks occupied by directories can also be stored on the high-speed disk, i.e. the space occupied by directories is migrated to the high-speed disk as shown in fig. 8.
In each meta block group, space is reserved for the data blocks that store directory contents; the latter part of Block Group 63 in fig. 8 holds this directory data. The reasons for migrating directories to the high-speed disk are the same: directories do not occupy much space, directory information is used frequently, and once this part is stored on the high-speed disk its reads are accelerated.
Migrating the content information (data blocks) of directories to the high-speed disk therefore yields a further performance gain. On typical equipment the memory is limited and cannot cache all directory data, so directory data must be read from disk frequently; and in massive-small-file applications the file names of the small files are stored in directories, so the space occupied by directories grows with the number of small files. A high-speed disk can offer hundreds of gigabytes or even terabytes of capacity, so this data can easily be kept on the high-speed disk to accelerate access to massive numbers of small files.
While, for purposes of simplicity of explanation, the methodologies are shown and described as a series of acts, it is to be understood and appreciated that the methodologies are not limited by the order of acts, as some acts may, in accordance with one or more embodiments, occur in different orders from, and/or concurrently with, other acts shown and described herein or not shown and described herein, as would be understood by one skilled in the art.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The various illustrative logical blocks, modules, and circuits described in connection with the embodiments disclosed herein may be implemented or performed with a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
In one or more exemplary embodiments, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software as a computer program product, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media include both computer storage media and communication media, including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a web site, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the spirit or scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (9)

1. An optimization method for storage and reading of massive small files comprises the following steps:
the method comprises the following steps: forming a logical volume based on the original disk and the newly added disk;
step two: formatting the logical volume;
step three: performing migration adjustment of metadata and remapping between data structures, reserving one of the newly added disk and the original disk for storing the metadata, and using the other of the newly added disk and the original disk for storing real data;
wherein the third step further comprises:
step 1: reading the first block group in the first meta block group, copying the data from the super block through the metadata index table of the first block group to the initial position of the first block group, and recording the end position as an offset;
step 2: for each subsequent block group in the first meta block group, sequentially copying the data from the super block through the metadata index table of that block group to the address beginning at the offset, updating the value of the offset, and clearing the content at the copied addresses;
step 3: for each meta block group after the first meta block group, sequentially copying the metadata of all the meta block groups to the front part of the file system in the manner of step 1 and step 2, so that all the metadata is completely migrated to one of the original disk and the newly added disk;
step 4: storing the address of the first block group of the subsequent starting block group, with the offset as the starting address of the block data, into the block bitmap/metadata index table in the first metadata, clearing the bitmap information, and updating the value of the offset to the address of the next block group; if the currently processed meta block group is finished, updating the value of the offset to the address of the first block group of the next meta block group;
step 5: repeating step 4 until all the meta block groups have been processed, so that all the real data is associated with the metadata;
step 6: re-populating all the special metadata containing the super block/block description table, constructing the data structure of the block description table according to the distribution of the block data, and filling it into the block description table or keeping it in the block description table.
2. The method for optimizing the storage and reading of massive small files according to claim 1, wherein the first step comprises:
integrating the original disk and the newly added disk;
respectively creating a physical volume from each of the original disk and the newly added disk;
combining the physical volumes into a physical volume group;
drawing all the space from the physical volume group to create a logical volume.
3. The method for optimizing the storage and reading of massive small files according to claim 1, wherein in step three, the migration adjustment of the metadata and the remapping between the data structures are performed immediately after the formatting is finished and before the file system is used.
4. The method for optimizing the storage and reading of massive small files according to claim 1, wherein the reading and writing speed of the newly added disk is higher than that of the original disk.
5. The method for optimizing the storage and reading of massive small files according to claim 4, wherein the method further comprises:
step four: when a directory is created, storing the real data of the directory on the newly added disk, so that subsequent read and write operations on the directory are all performed on the newly added disk.
6. An optimization system for storage and reading of a large number of small files comprises:
the logical volume generating module is used for forming a logical volume based on the original magnetic disk and the newly added magnetic disk;
the formatting module is used for formatting the logical volume;
the data migration and structure reconstruction module performs migration adjustment of metadata and remapping between data structures, so that one of the newly added disk and the original disk is reserved for storing the metadata, and the other one of the newly added disk and the original disk is used for storing real data, wherein the data migration and structure reconstruction module is configured to execute the following processing:
step 1: reading the first block group in the first meta block group, copying the data from the super block through the metadata index table of the first block group to the initial position of the first block group, and recording the end position as an offset;
step 2: for each subsequent block group in the first meta block group, sequentially copying the data from the super block through the metadata index table of that block group to the address beginning at the offset, updating the value of the offset, and clearing the content at the copied addresses;
step 3: for each meta block group after the first meta block group, sequentially copying the metadata of all the meta block groups to the front part of the file system in the manner of step 1 and step 2, so that all the metadata is completely migrated to one of the original disk and the newly added disk;
step 4: storing the address of the first block group of the subsequent starting block group, with the offset as the starting address of the block data, into the block bitmap/metadata index table in the first metadata, clearing the bitmap information, and updating the value of the offset to the address of the next block group; if the currently processed meta block group is finished, updating the value of the offset to the address of the first block group of the next meta block group;
step 5: repeating step 4 until all the meta block groups have been processed, so that all the real data is associated with the metadata;
step 6: re-populating all the special metadata containing the super block/block description table, constructing the data structure of the block description table according to the distribution of the block data, and filling it into the block description table or keeping it in the block description table.
7. The system for optimizing the storage and reading of massive small files according to claim 6, wherein the logical volume generation module comprises:
an integration unit, which integrates the original disk and the newly added disk;
a physical volume creating unit, which creates a physical volume from each of the original disk and the newly added disk;
a physical volume group creating unit, which combines the physical volumes into a physical volume group;
and a logical volume creating unit, which draws all the space from the physical volume group to form a logical volume.
8. The system for optimizing the storage and reading of massive small files according to claim 6, wherein the data migration and structure reconstruction module performs the migration adjustment of the metadata and the remapping between the data structures immediately after the formatting module completes the formatting and before the file system is used.
9. The optimization system for storage and reading of massive small files according to claim 6, wherein the system further comprises:
a directory processing module, which, when a directory is created, stores the real data of the directory on the newly added disk, so that subsequent read and write operations on the directory are all performed on the newly added disk.
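For reference, the metadata migration recited in steps 1 to 6 of claims 1 and 6 above can be summarized with the following non-normative sketch. The in-memory data model and all names in it are hypothetical simplifications introduced only for illustration; the claim text remains authoritative.

# Non-normative, in-memory sketch of the metadata migration of claims 1 and 6.
# The MetaBlockGroup/BlockGroup model with opaque byte blobs and explicit data
# addresses is a hypothetical simplification, not the actual on-disk layout.
from dataclasses import dataclass
from typing import List

@dataclass
class BlockGroup:
    metadata: bytes      # super block .. metadata index table of this group
    data_address: int    # where this group's real data blocks reside
    data_length: int

@dataclass
class MetaBlockGroup:
    groups: List[BlockGroup]

def migrate_metadata(meta_groups: List[MetaBlockGroup]):
    packed = bytearray()   # metadata compacted at the front (the fast disk)
    offset = 0             # the running 'offset' of steps 1 and 2

    # Steps 1-3: copy every block group's metadata, in order, to the front
    # of the file system, then clear the copied source region.
    for mg in meta_groups:
        for bg in mg.groups:
            packed += bg.metadata
            offset += len(bg.metadata)
            bg.metadata = b""

    # Steps 4-5: record, for each block group, the starting address of its
    # real data in the block bitmap / metadata index table, so all real data
    # stays associated with the compacted metadata.
    index_table = [bg.data_address for mg in meta_groups for bg in mg.groups]

    # Step 6: rebuild the special metadata (the block description table)
    # from the new distribution of block data.
    block_description_table = list(index_table)
    return bytes(packed), index_table, block_description_table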
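Similarly, the logical-volume construction of claims 2 and 7 could, on a Linux system, be realized with standard LVM2 tooling as sketched below. The device and volume names are placeholders and the use of LVM is an assumption; the claims do not prescribe any particular volume manager.

# Hypothetical illustration of claims 2 and 7 using Linux LVM2 tooling.
# Device and volume names (/dev/sda, /dev/sdb, vg_small, lv_small) are
# placeholders only.
import subprocess

ORIGINAL_DISK = "/dev/sda"   # existing low-speed disk (placeholder)
NEW_DISK = "/dev/sdb"        # newly added high-speed disk (placeholder)

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

run(["pvcreate", ORIGINAL_DISK])                        # physical volume on each disk
run(["pvcreate", NEW_DISK])
run(["vgcreate", "vg_small", ORIGINAL_DISK, NEW_DISK])  # combine into a volume group
run(["lvcreate", "-l", "100%FREE", "-n", "lv_small", "vg_small"])  # draw all space
# The resulting logical volume is then formatted (step two of the method),
# after which the metadata migration is performed before first use.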
CN201610697088.6A 2016-08-19 2016-08-19 Optimization method and system for storage and reading of massive small files Active CN107766374B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610697088.6A CN107766374B (en) 2016-08-19 2016-08-19 Optimization method and system for storage and reading of massive small files

Publications (2)

Publication Number Publication Date
CN107766374A CN107766374A (en) 2018-03-06
CN107766374B true CN107766374B (en) 2021-05-25

Family

ID=61262622

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610697088.6A Active CN107766374B (en) 2016-08-19 2016-08-19 Optimization method and system for storage and reading of massive small files

Country Status (1)

Country Link
CN (1) CN107766374B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108804671B (en) * 2018-06-11 2021-03-09 泰康保险集团股份有限公司 Method and device for recombining physical file
CN110019041A (en) * 2019-04-12 2019-07-16 苏州浪潮智能科技有限公司 NFS server-side catalogue read method, device, equipment and storage medium
CN110147203B (en) * 2019-05-16 2022-11-04 北京金山云网络技术有限公司 File management method and device, electronic equipment and storage medium
CN111190550B (en) * 2019-12-31 2024-03-29 深圳市安云信息科技有限公司 Metadata acceleration method and device and storage equipment
CN112506698B (en) * 2020-11-19 2022-11-25 苏州浪潮智能科技有限公司 Small file data reconstruction recovery method, system, terminal and storage medium
CN112558881A (en) * 2020-12-18 2021-03-26 上海七牛信息技术有限公司 Method and system for migrating storage system
CN113419897B (en) * 2021-01-19 2023-12-22 阿里巴巴集团控股有限公司 File processing method and device, electronic equipment and storage medium thereof
CN113986838B (en) * 2021-12-28 2022-03-11 成都云祺科技有限公司 Mass small file processing method and system based on file system and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101567003A (en) * 2009-05-27 2009-10-28 清华大学 Method for managing and allocating resource in parallel file system
CN101997918A (en) * 2010-11-11 2011-03-30 清华大学 Method for allocating mass storage resources according to needs in heterogeneous SAN (Storage Area Network) environment
CN103838853A (en) * 2014-03-17 2014-06-04 华中科技大学 Mixed file system based on different storage media
CN104536903A (en) * 2014-12-25 2015-04-22 华中科技大学 Mixed storage method and system for conducting classified storage according to data attributes

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101656094B (en) * 2009-09-25 2012-04-18 杭州华三通信技术有限公司 Data storage method and storage device
CN102364474B (en) * 2011-11-17 2014-08-20 中国科学院计算技术研究所 Metadata storage system for cluster file system and metadata management method
CN103744875B (en) * 2013-12-19 2017-11-24 记忆科技(深圳)有限公司 Data quick migration method and system based on file system
CN105824846B (en) * 2015-01-09 2021-04-13 阿里巴巴集团控股有限公司 Data migration method and device
CN105631010A (en) * 2015-12-29 2016-06-01 成都康赛信息技术有限公司 Optimization method based on HDFS small file storage

Also Published As

Publication number Publication date
CN107766374A (en) 2018-03-06

Similar Documents

Publication Publication Date Title
CN107766374B (en) Optimization method and system for storage and reading of massive small files
CN110799960B (en) System and method for database tenant migration
CN109952564B (en) Formation and manipulation of test data in a database system
KR102564170B1 (en) Method and device for storing data object, and computer readable storage medium having a computer program using the same
US10248556B2 (en) Forward-only paged data storage management where virtual cursor moves in only one direction from header of a session to data field of the session
WO2018064962A1 (en) Data storage method, electronic device and computer non-volatile storage medium
US20130262758A1 (en) Systems and Methods for Tracking Block Ownership
CN105740303B (en) The method and device of improved object storage
KR102431806B1 (en) Method and device for constructing on-line real-time updating of massive audio fingerprint database
CN106326229B (en) File storage method and device of embedded system
CN102306168B (en) Log operation method and device and file system
EP3495964B1 (en) Apparatus and program for data processing
CN111459884B (en) Data processing method and device, computer equipment and storage medium
CN113535670B (en) Virtual resource mirror image storage system and implementation method thereof
WO2013075306A1 (en) Data access method and device
US20170168735A1 (en) Reducing time to read many files from tape
US20230342054A1 (en) Method, electronic device and computer program product for processing data
CN113867627B (en) Storage system performance optimization method and system
CN107609011A (en) The maintaining method and device of a kind of data-base recording
US10409799B2 (en) Supporting updatable repeated values over variable schema
JP6006740B2 (en) Index management device
US20230409536A1 (en) File system metadata layout for append-only storage
CN107846327A (en) A kind of processing method and processing device of network management performance data
KR102214697B1 (en) A computer program for providing space managrment for data storage in a database management system
Lee et al. Boosting compaction in B-tree based key-value store by exploiting parallel reads in flash ssds

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right
Denomination of invention: An optimization method and system for storing and reading massive small files
Effective date of registration: 20220330
Granted publication date: 20210525
Pledgee: Societe Generale Bank Co.,Ltd. Qingpu Branch of Shanghai
Pledgor: SHANGHAI KAIXIANG INFORMATION TECHNOLOGY CO.,LTD.
Registration number: Y2022980003249