CN108132759A

CN108132759A - Method and apparatus for managing data in a file system

Info

Publication number: CN108132759A
Application number: CN201810035872.XA
Authority: CN
Inventors: 林烽; 陈文生
Original assignee: Wangsu Science and Technology Co Ltd
Current assignee: Wangsu Science and Technology Co Ltd
Priority date: 2018-01-15
Filing date: 2018-01-15
Publication date: 2018-06-08
Anticipated expiration: 2038-01-15
Also published as: CN108132759B

Abstract

The invention discloses a method and apparatus for managing data in a file system, and belongs to the technical field of data storage. The method includes: formatting an original file system, establishing a new file system based on a pre-allocated high-IOPS storage medium partition and a low-IOPS storage medium, and setting up an I-node list in the high-IOPS storage medium partition; storing metadata to be stored in the I-node list, and storing file data to be stored in the low-IOPS storage medium. The invention can improve the service performance of the file system.

Description

Method and apparatus for managing data in a file system

Technical field

The present invention relates to the technical field of data storage, and in particular to a method and apparatus for managing data in a file system.

Background art

A CDN (Content Delivery Network) server generally stores a large number of files and maintains a file system for managing them. Data in a file system can be divided into file data and metadata. File data is the actual content of a file. Metadata is system data that describes file attributes, such as access permissions, the file owner, the layout of storage regions, and description information of the file system (such as its available space).

A CDN server needs a very large data storage capacity, so mechanical hard disks, which offer large capacity at low cost, are generally chosen to store the data in the file system. Specifically, the file system for managing files can be established on a mechanical hard disk. When the CDN server stores a file, both the metadata and the file data of the file can be stored on the mechanical hard disk, and the file is managed through the file system. When an external file access request for a file is received, the CDN server can perform I/O (Input/Output) operations on the mechanical hard disk through the file system, obtain the metadata of the file from the mechanical hard disk, locate the storage position of the file data through the metadata, and then return the file data stored on the mechanical hard disk to the requester.

In the course of implementing the present invention, the inventors found that the prior art has at least the following problem:

A CDN server often needs to process a large number of file access requests concurrently, so its demand for IOPS (Input/Output Operations Per Second) from the storage medium keeps growing. The IOPS capability of a mechanical hard disk is poor and cannot provide fast responses to file access requests, so the service performance of the file system is poor.

Summary of the invention

In order to solve the problem in the prior art, embodiments of the present invention provide a method and apparatus for managing data in a file system. The technical solution is as follows:

In a first aspect, a method for managing data in a file system is provided, the method including:

formatting an original file system, establishing a new file system based on a pre-allocated high-IOPS storage medium partition and a low-IOPS storage medium, and setting up an I-node list in the high-IOPS storage medium partition;

storing metadata to be stored in the I-node list, and storing file data to be stored in the low-IOPS storage medium.

Optionally, the method further includes:

if the high-IOPS storage medium partition has remaining storage space, storing directory files and indirect block data to be stored in the remaining storage space;

when the remaining storage space is insufficient, continuing to store the directory files and indirect block data to be stored in the low-IOPS storage medium.

Optionally, the method further includes:

estimating the average file size of all files to be stored, and estimating the number of I-nodes based on the capacity of the low-IOPS storage medium and the average file size;

determining the storage capacity of the high-IOPS storage medium partition according to the number of I-nodes, the capacity of a single I-node, and the storage unit size of the low-IOPS storage medium.

Optionally, storing the file data to be stored in the low-IOPS storage medium includes:

dividing the low-IOPS storage medium into multiple contiguous storage regions of uniform size according to the maximum number of parallel read/write processes of the file system;

for a file to be stored, selecting a target storage region among the multiple storage regions using a preset random algorithm, and writing the file data of the file to be stored starting from the first available storage unit in the target storage region.

Optionally, selecting a target storage region among the multiple storage regions using a preset random algorithm, and writing the file data of the file to be stored starting from the first available storage unit in the target storage region, includes:

if the file size of the file to be stored is greater than a preset value, selecting a target storage region among the multiple storage regions using the preset random algorithm, and writing the file data of the file to be stored starting from the first available storage unit in the target storage region;

if the file size of the file to be stored is not greater than the preset value, writing the file data of the file to be stored starting from the free storage unit pointed to by a position cursor, and updating the free storage unit pointed to by the position cursor, wherein the free storage unit pointed to by the position cursor is always the first free storage unit in the multiple storage regions.

Optionally, the high-IOPS storage medium is a solid-state disk, and the low-IOPS storage medium is a mechanical hard disk.

In a second aspect, an apparatus for managing data in a file system is provided, the apparatus including:

an establishing module, configured to format an original file system, establish a new file system based on a pre-allocated high-IOPS storage medium partition and a low-IOPS storage medium, and set up an I-node list in the high-IOPS storage medium partition;

a storage module, configured to store metadata to be stored in the I-node list, and store file data to be stored in the low-IOPS storage medium.

Optionally, the storage module is further configured to:

Optionally, the apparatus further includes:

an estimation module, configured to estimate the average file size of all files to be stored, and estimate the number of I-nodes based on the capacity of the low-IOPS storage medium and the average file size;

a determining module, configured to determine the storage capacity of the high-IOPS storage medium partition according to the number of I-nodes, the capacity of a single I-node, and the storage unit size of the low-IOPS storage medium.

Optionally, the storage module is specifically configured to:

In a third aspect, a file storage device is provided. The file storage device includes a processor and a memory, the memory stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the method for managing data in a file system according to any one of claims 1 to 6.

In a fourth aspect, a computer-readable storage medium is provided. The storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the method for managing data in a file system according to any one of claims 1 to 6.

The beneficial effects of the technical solution provided by the embodiments of the present invention are as follows:

In the embodiments of the present invention, the original file system is formatted, a new file system is established based on a pre-allocated high-IOPS storage medium partition and a low-IOPS storage medium, an I-node list is set up in the high-IOPS storage medium partition, metadata to be stored is stored in the I-node list, and file data to be stored is stored in the low-IOPS storage medium. In this way, metadata is stored on a high-IOPS storage medium with fast response capability, such as a solid-state disk, which takes over the IOPS pressure that metadata access in the file system would otherwise place on a low-IOPS storage medium such as a mechanical hard disk, thereby improving the service performance of the file system.

Brief description of the drawings

To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.

Fig. 1 is a schematic diagram of data storage in a file system according to an embodiment of the present invention;

Fig. 2 is a flowchart of a method for managing data in a file system according to an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of an apparatus for managing data in a file system according to an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of an apparatus for managing data in a file system according to an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of a file storage device according to an embodiment of the present invention.

Detailed description of the embodiments

To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

An embodiment of the present invention provides a method for managing data in a file system. The method may be executed by a file storage device, which may be any device that stores a large number of files and has file management functions, such as a terminal or a server. The file storage device contains at least two storage media with different IOPS capabilities, and it may build a local file system on a single storage medium or on multiple storage media. The file storage device may be provided with a processor, a memory, and a transceiver. The processor may be used to carry out the processing for managing data in the file system, the memory may be used to store the data required and generated in the processing described below, and the transceiver may be used for data exchange between the file storage device and the outside. In this embodiment, the description takes as an example the case in which the file storage device is a CDN server, the high-IOPS storage medium is a solid-state disk, and the low-IOPS storage medium is a mechanical hard disk; other cases are similar and are not described one by one. It can be understood that the method described in the embodiments of the present invention is also applicable to servers or file storage devices outside the CDN field.

Fig. 1 is a schematic diagram of data storage on the solid-state disk and the mechanical hard disk in this embodiment. A technician on the CDN server side can partition the solid-state disk in advance to obtain a solid-state disk partition used for establishing the file system. The solid-state disk partition may include an I-node list and reserved storage space. The I-node list consists of a large number of I-nodes (information nodes) of identical size. Each file in the file system corresponds to one I-node in the list, and the I-node records attribute information of the file, such as the file size, the file owner, and the creation time. The mechanical hard disk may include a large number of storage units for storing file data. All storage units have the same size, and each has a unique number.

The processing flow shown in Fig. 2 is described in detail below with reference to specific embodiments. The content may be as follows:

Step 201: format the original file system, establish a new file system based on the pre-allocated high-IOPS storage medium partition and the low-IOPS storage medium, and set up an I-node list in the high-IOPS storage medium partition.

In implementation, a technician on the CDN server side can have the CDN server store the data inside the file system separately. Specifically, when the CDN server detects a preset trigger condition for the process of managing data in the file system described below, it can format the original file system already established on the mechanical hard disk and establish a new file system based on the pre-allocated solid-state disk partition and the mechanical hard disk. The CDN server can then set up the I-node list in the solid-state disk partition. It can be understood that the preset trigger condition can be set freely by the technician according to the operating state of the CDN server. For example, the condition may be that a start instruction entered by a user is received, that a preset time point is reached, or that the current load is detected to be not greater than a preset load value.

Step 202: store the metadata to be stored in the I-node list, and store the file data to be stored in the low-IOPS storage medium.

In implementation, after the CDN server establishes the new file system based on the solid-state disk partition and the mechanical hard disk, it can record the start and end addresses of the solid-state disk partition and of the mechanical hard disk. The CDN server can then obtain the metadata and file data of a file to be stored, store the metadata in the I-node list, and store the file data on the mechanical hard disk.
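The separation described in step 202 can be illustrated with a minimal sketch. It is not the patented implementation: the class `SplitFileSystem`, the `Inode` fields, and the in-memory dictionaries that stand in for the solid-state disk partition and the mechanical hard disk are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    """Hypothetical metadata record kept in the I-node list."""
    size: int
    owner: str
    block_numbers: list = field(default_factory=list)


class SplitFileSystem:
    """Toy model: metadata on the high-IOPS medium, file data on the low-IOPS medium."""

    def __init__(self, hdd_units: int, unit_size: int = 4096):
        self.unit_size = unit_size
        self.inode_list = {}                     # stands in for the SSD partition
        self.hdd = {}                            # stands in for the mechanical disk
        self.free_units = list(range(hdd_units))

    def store(self, name: str, owner: str, data: bytes) -> Inode:
        inode = Inode(size=len(data), owner=owner)
        # File data is cut into storage units and written to the low-IOPS medium.
        for offset in range(0, len(data), self.unit_size):
            unit_no = self.free_units.pop(0)
            self.hdd[unit_no] = data[offset:offset + self.unit_size]
            inode.block_numbers.append(unit_no)
        # Metadata goes to the I-node list on the high-IOPS medium.
        self.inode_list[name] = inode
        return inode

    def read(self, name: str) -> bytes:
        inode = self.inode_list[name]            # metadata lookup touches only the SSD model
        return b"".join(self.hdd[n] for n in inode.block_numbers)
```

In this sketch a read first consults the I-node list and only then touches the storage units, which mirrors how metadata access is moved off the mechanical hard disk.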

Optionally, the CDN server can also store directory files and indirect block data preferentially on the solid-state disk. The corresponding processing may be as follows: if the high-IOPS storage medium partition has remaining storage space, the directory files and indirect block data to be stored are stored in the remaining storage space; when the remaining storage space is insufficient, the directory files and indirect block data to be stored continue to be stored in the low-IOPS storage medium.

To manage file directories, a file directory is usually saved in the form of a file. Such a file is called a directory file, and it is a record-oriented file with fixed-length records.

If the size of the file data exceeds the capacity of one storage unit on the mechanical hard disk, the file data needs to be stored in multiple storage units. The CDN server can record the distribution of the file data in the disk sequence number list contained in the I-node corresponding to the file. When the number of storage units occupied by the file data exceeds the capacity of the disk sequence number list, the numbers of some of the storage units need to be recorded in a storage unit outside the disk sequence number list, and a pointer to that storage unit is recorded in the disk sequence number list. The CDN server can find those storage units through this pointer, and this pointer can be regarded as indirect block data.
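The overflow rule for the disk sequence number list can be sketched as follows. This is an illustrative assumption, not a layout specified by the source: the function name `split_block_numbers` is hypothetical, and the sketch assumes that one slot of the list is given up to hold the pointer to the external storage unit.

```python
def split_block_numbers(block_numbers, list_capacity):
    """Return (numbers kept in the I-node list, numbers moved to an external unit).

    Assumption: when the list overflows, its last slot is reserved for the
    pointer to the external storage unit, so only list_capacity - 1 numbers
    stay in the I-node itself.
    """
    if len(block_numbers) <= list_capacity:
        return list(block_numbers), []
    direct = list(block_numbers[:list_capacity - 1])
    overflow = list(block_numbers[list_capacity - 1:])
    return direct, overflow
```

A file occupying three units fits entirely in a four-slot list, while a file occupying six units keeps three numbers in the list and moves the other three to the external unit.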

In implementation, if the solid-state disk partition still has available storage space besides the I-node list (that is, remaining storage space), the CDN server can store the directory files and indirect block data to be stored in that remaining storage space. Specifically, after the new file system is built, the CDN server stores the data to be stored file by file: it first stores the metadata in the I-node list and then stores the related directory files and indirect block data in the solid-state disk partition. When the remaining storage space in the solid-state disk partition is insufficient and no more directory files or indirect block data can be stored there, the CDN server continues to store subsequent directory files and indirect block data on the mechanical hard disk.
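The fallback behaviour described above can be sketched in a few lines. The class `TieredAllocator` and its byte-counting model are hypothetical and only illustrate the decision rule; real space accounting would be more involved.

```python
class TieredAllocator:
    """Place directory files and indirect block data on the SSD while space remains."""

    def __init__(self, ssd_reserved_bytes: int):
        self.ssd_free = ssd_reserved_bytes   # remaining space beside the I-node list
        self.placement = {}                  # item name -> "ssd" or "hdd"

    def place(self, name: str, size: int) -> str:
        if size <= self.ssd_free:
            # Remaining storage space is sufficient: keep the item on the SSD.
            self.ssd_free -= size
            tier = "ssd"
        else:
            # Remaining storage space is insufficient: fall back to the mechanical disk.
            tier = "hdd"
        self.placement[name] = tier
        return tier
```

With 8 KB reserved and 4 KB items, the first two items land on the solid-state disk model and the third falls back to the mechanical hard disk model.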

Optionally, the CDN server can set the storage capacity of the solid-state disk partition in advance according to how the files are expected to be stored. The corresponding processing may be as follows: estimate the average file size of all files to be stored, and estimate the number of I-nodes based on the capacity of the low-IOPS storage medium and the average file size; then determine the storage capacity of the high-IOPS storage medium partition according to the number of I-nodes, the capacity of a single I-node, and the storage unit size of the low-IOPS storage medium.

In implementation, the CDN server can estimate the average file size of all files to be stored in advance, and then estimate the number of files the mechanical hard disk can hold, that is, the number of I-nodes, from the capacity of the mechanical hard disk and the average file size. For example, if the estimated average file size is 1 MB and the capacity of the mechanical hard disk is 1 TB, at least 1,000,000 I-nodes are needed. The CDN server can then calculate the storage capacity occupied by the I-node list from the number of I-nodes and the capacity of a single I-node. If a single I-node takes 128 B and there are 1,000,000 I-nodes, the I-node list occupies 128 MB. Further, part of the storage space in the solid-state disk partition can be reserved for directory files and indirect block data. The size of this reserved space can be calculated from the number of directory files or indirect block data items and the storage unit size of the mechanical hard disk. Assuming that one quarter of the 1,000,000 I-nodes have directory files or indirect block data and that the storage unit size of the mechanical hard disk is 4 KB, a storage space of 1,000,000 × 1/4 × 4 KB (about 1 GB) needs to be reserved. Adding the storage capacity of the I-node list to the reserved storage space gives the storage capacity of the solid-state disk partition.
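The arithmetic in this example can be reproduced with a short sketch. The function name `estimate_ssd_partition` and its parameters are hypothetical, and decimal units (1 KB = 1000 B) are assumed so that the results match the round figures quoted in the text.

```python
import math

KB, MB, TB = 10**3, 10**6, 10**12  # assumption: decimal units, matching the text's round numbers


def estimate_ssd_partition(hdd_capacity, mean_file_size, inode_size,
                           hdd_unit_size, aux_fraction):
    """Return (inode_count, inode_list_bytes, reserved_bytes, partition_bytes)."""
    # Number of files the mechanical disk can hold, one I-node per file.
    inode_count = -(-hdd_capacity // mean_file_size)   # integer ceiling division
    inode_list_bytes = inode_count * inode_size
    # One storage unit is reserved per directory file or indirect block item.
    reserved_bytes = math.ceil(inode_count * aux_fraction) * hdd_unit_size
    return inode_count, inode_list_bytes, reserved_bytes, inode_list_bytes + reserved_bytes


count, list_bytes, reserved, total = estimate_ssd_partition(
    hdd_capacity=1 * TB, mean_file_size=1 * MB,
    inode_size=128, hdd_unit_size=4 * KB, aux_fraction=0.25)
print(count, list_bytes // MB, reserved // MB, total // MB)
```

Under these assumptions the partition needs roughly 128 MB for the I-node list plus 1,000 MB of reserved space, which is small compared with the 1 TB mechanical hard disk.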

Optionally, the mechanical hard disk can be divided into storage regions, and a storage region can be selected at random among them when file data is stored. Correspondingly, step 202 may be as follows: divide the low-IOPS storage medium into multiple contiguous storage regions of uniform size according to the maximum number of parallel read/write processes of the file system; for a file to be stored, select a target storage region among the multiple storage regions using a preset random algorithm, and write the file data of the file to be stored starting from the first available storage unit in the target storage region.

In implementation, the CDN server can count the maximum number of parallel read/write processes of the file system, that is, the maximum number of file operations in progress during the same period, and then divide the mechanical hard disk into multiple contiguous storage regions of uniform size according to this maximum, such that the number of storage regions is not greater than the maximum number of parallel processes. Each storage region can include multiple storage units. Later, when storing the file data of a file to be stored, the CDN server can use a preset random algorithm to select one storage region (the target storage region) among the multiple storage regions, and then write the file data of the file starting from the first available storage unit in the target storage region. This effectively avoids the situation in which several files written at the same time are interleaved on disk and produce a large number of fragmented files.
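A minimal sketch of this region-based placement follows. The class `RegionedDisk` is hypothetical, Python's `random.Random` stands in for the unspecified preset random algorithm, and the sketch assumes an append-only disk so that the first available unit of a region can be tracked with one pointer per region.

```python
import random


class RegionedDisk:
    """Divide a disk of numbered storage units into equal, contiguous regions."""

    def __init__(self, total_units: int, max_parallel: int, rng=None):
        # Using exactly max_parallel regions satisfies "not greater than" the maximum.
        self.region_count = max_parallel
        self.region_size = total_units // max_parallel
        # First available unit of each region (append-only assumption, no deletions).
        self.next_free = [r * self.region_size for r in range(self.region_count)]
        self.rng = rng or random.Random()

    def write(self, units_needed: int):
        region = self.rng.randrange(self.region_count)   # random target region
        start = self.next_free[region]
        region_end = (region + 1) * self.region_size
        if start + units_needed > region_end:
            raise RuntimeError("target region is full")
        self.next_free[region] = start + units_needed
        return region, list(range(start, start + units_needed))
```

Because each concurrent write tends to land in a different region and occupies consecutive units there, the units of different files are less likely to interleave.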

Optionally, files can be divided into large files and small files, which are stored in different ways. The corresponding processing may be as follows: if the file size of the file to be stored is greater than a preset value, select a target storage region among the multiple storage regions using the preset random algorithm, and write the file data of the file to be stored starting from the first available storage unit in the target storage region; if the file size of the file to be stored is less than or equal to the preset value, write the file data of the file to be stored starting from the free storage unit pointed to by a position cursor, and update the free storage unit pointed to by the position cursor.

The free storage unit pointed to by the position cursor is always the first free storage unit in the multiple storage regions.

In implementation, the CDN server can set a value (the preset value) as the criterion for distinguishing large files from small files according to the size distribution of the files to be stored. A file whose size is greater than the preset value is a large file, and a file whose size is less than or equal to the preset value is a small file. When storing the file data of a file to be stored, if the file is a large file, the CDN server can use the preset random algorithm to select a target storage region among the multiple storage regions and write the file data starting from the first available storage unit in the target storage region. If the file is a small file, the CDN server can write the file data starting from the first free storage unit in the multiple storage regions, which reduces the IOPS pressure on the mechanical hard disk when the file system writes small files concurrently. Here, the CDN server can use a position cursor to point to the first free storage unit in the multiple storage regions. When storing the file data of a small file, it can find the free storage unit through the position cursor, and after the storage is complete it can update the free storage unit that the position cursor points to.
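The large-file and small-file paths can be sketched together as follows. The class `Placement`, the unit-based threshold, and the boolean occupancy list are assumptions for illustration, and `random.Random` again stands in for the unspecified preset random algorithm.

```python
import random


class Placement:
    """Large files go to a random region, small files follow a position cursor."""

    def __init__(self, total_units, region_count, threshold_units, rng=None):
        self.region_count = region_count
        self.region_size = total_units // region_count
        self.threshold = threshold_units          # the "preset value", in storage units
        self.used = [False] * (self.region_size * region_count)
        self.cursor = 0                           # first free unit over all regions
        self.rng = rng or random.Random()

    def _take(self, start, end, count):
        # Collect the first `count` free units in [start, end), then mark them used.
        free = [u for u in range(start, end) if not self.used[u]][:count]
        if len(free) < count:
            raise RuntimeError("not enough free storage units")
        for u in free:
            self.used[u] = True
        return free

    def store(self, size_units):
        if size_units > self.threshold:
            # Large file: random target region, starting at its first available unit.
            region = self.rng.randrange(self.region_count)
            start = region * self.region_size
            units = self._take(start, start + self.region_size, size_units)
        else:
            # Small file: start from the unit the position cursor points to.
            units = self._take(self.cursor, len(self.used), size_units)
        # Update the cursor so it again points to the first free unit.
        while self.cursor < len(self.used) and self.used[self.cursor]:
            self.cursor += 1
        return units
```

In this sketch small files are packed one after another from the front of the disk, so concurrent small writes stay close together, while large files are spread across regions.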

In the embodiments of the present invention, the original file system is formatted, a new file system is established based on a pre-allocated high-IOPS storage medium partition and a low-IOPS storage medium, metadata to be stored is stored in the I-node list, and file data to be stored is stored in the low-IOPS storage medium. In this way, metadata is stored on a high-IOPS storage medium with fast response capability, such as a solid-state disk, which takes over the IOPS pressure that metadata access in the file system would otherwise place on a low-IOPS storage medium such as a mechanical hard disk, thereby improving the service performance of the file system.

In addition, the ratio of metadata to file data in a typical file system is below 1:100. The high-IOPS storage medium, such as a solid-state disk, is used only for metadata, so the required storage capacity is relatively low. An appropriately sized solid-state disk partition can be estimated and allocated for the application scenario, which saves storage resources on the high-IOPS storage medium and effectively reduces cost.

Based on the same technical concept, an apparatus for managing data in a file system is provided. As shown in Fig. 3, the apparatus includes:

an establishing module 301, configured to format an original file system, establish a new file system based on a pre-allocated high-IOPS storage medium partition and a low-IOPS storage medium, and set up an I-node list in the high-IOPS storage medium partition;

a storage module 302, configured to store metadata to be stored in the I-node list, and store file data to be stored in the low-IOPS storage medium.

Optionally, the storage module 302 is further configured to:

Optionally, as shown in Fig. 4, the apparatus further includes:

an estimation module 303, configured to estimate the average file size of all files to be stored, and estimate the number of I-nodes based on the capacity of the low-IOPS storage medium and the average file size;

a determining module 304, configured to determine the storage capacity of the high-IOPS storage medium partition according to the number of I-nodes, the capacity of a single I-node, and the storage unit size of the low-IOPS storage medium.

Optionally, the storage module 302 is specifically configured to:

It should be noted that, when the apparatus for managing data in a file system provided by the above embodiment manages data in the file system, the division into the functional modules described above is only an example. In practical applications, the functions can be assigned to different functional modules as required, that is, the internal structure of the apparatus can be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus for managing data in a file system provided by the above embodiment and the method embodiment for managing data in a file system belong to the same concept. For the specific implementation process, refer to the method embodiment, which is not repeated here.

Fig. 5 is a schematic structural diagram of a file storage device according to an embodiment of the present invention. The file storage device 500 may vary considerably depending on configuration or performance, and may include one or more central processing units 522 (for example, one or more processors), a memory 532, and one or more storage media 530 (for example, one or more mass storage devices) that store application programs 542 or data 544. The memory 532 and the storage medium 530 may provide transient or persistent storage. A program stored in the storage medium 530 may include one or more modules (not marked in the figure), and each module may include a series of instruction operations for the file storage device. Further, the central processing unit 522 may be configured to communicate with the storage medium 530 and to execute, on the file storage device 500, the series of instruction operations in the storage medium 530.

The file storage device 500 may also include one or more power supplies 529, one or more wired or wireless network interfaces 550, one or more input/output interfaces 558, one or more keyboards 556, and/or one or more operating systems 541, such as Windows Server™, Mac OS X™, Unix™, Linux™, and FreeBSD™.

The file storage device 500 may include a memory and one or more programs, where the one or more programs are stored in the memory and are configured to be executed by one or more processors, and the one or more programs include instructions for performing the above management of data in a file system.

A person of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.

The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims

1. A method for managing data in a file system, characterized in that the method includes:

formatting an original file system, establishing a new file system based on a pre-allocated high-IOPS storage medium partition and a low-IOPS storage medium, and setting up an I-node list in the high-IOPS storage medium partition;

storing metadata to be stored in the I-node list, and storing file data to be stored in the low-IOPS storage medium.

2. The method according to claim 1, characterized in that the method further includes:

if the high-IOPS storage medium partition has remaining storage space, storing directory files and indirect block data to be stored in the remaining storage space;

when the remaining storage space is insufficient, continuing to store the directory files and indirect block data to be stored in the low-IOPS storage medium.

3. The method according to claim 1, characterized in that the method further includes:

estimating the average file size of all files to be stored, and estimating the number of I-nodes based on the capacity of the low-IOPS storage medium and the average file size;

determining the storage capacity of the high-IOPS storage medium partition according to the number of I-nodes, the capacity of a single I-node, and the storage unit size of the low-IOPS storage medium.

4. The method according to claim 1, characterized in that storing the file data to be stored in the low-IOPS storage medium includes:

dividing the low-IOPS storage medium into multiple contiguous storage regions of uniform size according to the maximum number of parallel read/write processes of the file system;

for a file to be stored, selecting a target storage region among the multiple storage regions using a preset random algorithm, and writing the file data of the file to be stored starting from the first available storage unit in the target storage region.

5. The method according to claim 4, characterized in that selecting a target storage region among the multiple storage regions using a preset random algorithm, and writing the file data of the file to be stored starting from the first available storage unit in the target storage region, includes:

if the file size of the file to be stored is greater than a preset value, selecting a target storage region among the multiple storage regions using the preset random algorithm, and writing the file data of the file to be stored starting from the first available storage unit in the target storage region;

if the file size of the file to be stored is not greater than the preset value, writing the file data of the file to be stored starting from the free storage unit pointed to by a position cursor, and updating the free storage unit pointed to by the position cursor, wherein the free storage unit pointed to by the position cursor is always the first free storage unit in the multiple storage regions.

6. The method according to any one of claims 1 to 5, characterized in that the high-IOPS storage medium is a solid-state disk, and the low-IOPS storage medium is a mechanical hard disk.

7. An apparatus for managing data in a file system, characterized in that the apparatus includes:

an establishing module, configured to format an original file system, establish a new file system based on a pre-allocated high-IOPS storage medium partition and a low-IOPS storage medium, and set up an I-node list in the high-IOPS storage medium partition;

a storage module, configured to store metadata to be stored in the I-node list, and store file data to be stored in the low-IOPS storage medium.

8. The apparatus according to claim 7, characterized in that the storage module is further configured to:

9. The apparatus according to claim 7, characterized in that the apparatus further includes:

an estimation module, configured to estimate the average file size of all files to be stored, and estimate the number of I-nodes based on the capacity of the low-IOPS storage medium and the average file size;

a determining module, configured to determine the storage capacity of the high-IOPS storage medium partition according to the number of I-nodes, the capacity of a single I-node, and the storage unit size of the low-IOPS storage medium.

10. The apparatus according to claim 7, characterized in that the storage module is specifically configured to:

11. The apparatus according to claim 10, characterized in that the storage module is specifically configured to:

12. The apparatus according to any one of claims 7 to 11, characterized in that the high-IOPS storage medium is a solid-state disk, and the low-IOPS storage medium is a mechanical hard disk.

13. A file storage device, characterized in that the file storage device includes a processor and a memory, the memory stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the method for managing data in a file system according to any one of claims 1 to 6.

14. A computer-readable storage medium, characterized in that the storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the method for managing data in a file system according to any one of claims 1 to 6.