CN107943867A - High-performance hierarchical storage system supporting heterogeneous storage

Info

Publication number: CN107943867A
Application number: CN201711106687.7A
Authority: CN
Inventors: 佘平; 高超; 邹仕华; 张楠; 李程; 程裕强; 谢彬; 李宁波
Original assignee: No32 Research Institute Of China Electronics Technology Group Corp
Current assignee: No32 Research Institute Of China Electronics Technology Group Corp
Priority date: 2017-11-10
Filing date: 2017-11-10
Publication date: 2018-04-20
Anticipated expiration: 2037-11-10
Also published as: CN107943867B

Abstract

The invention provides a high-performance hierarchical storage system supporting heterogeneous storage, comprising: a file system module, which provides uniform access to data files on different storage media; a data block storage module, which slices complete data and stores the slices across different nodes; a metadata management module, which combines the advantages of centralized and decentralized management and uses multi-copy distributed redundant storage; a storage scheduling module, which schedules data copies among nodes and also schedules data blocks among three storage media, namely memory, SSD, and HDD; and a visualization module, which shows the usage of the whole storage system, displays real-time monitoring, and provides visual operations for dynamic adjustment. The invention effectively addresses the low efficiency of access to massive data and improves the data storage and access efficiency of the platform.

Description

High-performance hierarchical storage system supporting heterogeneous storage

Technical field

The present invention relates to storage systems, and in particular to a high-performance hierarchical storage system supporting heterogeneous storage.

Background art

Under the demands of massive big data, the distribution of stored data and the speed of data access have a large influence on data processing. Sound big-data storage and distribution capabilities determine data accessibility, and efficient data access determines computation speed.

In image data processing platforms, data is mainly stored centrally. When facing high-resolution, large-volume image data files, if the platform's storage processing is inefficient, the system cannot complete data processing at the required scale when time is tight.

Summary of the invention

In view of the defects in the prior art, the object of the present invention is to provide a high-performance hierarchical storage system supporting heterogeneous storage. By using heterogeneous storage to store big data in tiers, together with a suitable hierarchical data storage algorithm, it can effectively solve the problem of inefficient access to massive data and improve the data storage and access efficiency of the platform.

According to one aspect of the present invention, there is provided a high-performance hierarchical storage system supporting heterogeneous storage, characterized in that it comprises:

A file system module, which provides unified access to data files on different storage media. The file system is responsible for organizing, managing, and maintaining all data files stored in the distributed file system; files are stored in the file system as master files, directory files, and meta files;

A data block storage module, which slices complete data and stores the slices across different nodes, so that data can be accessed concurrently to improve access speed. The data block storage module divides data into blocks using a fixed-length blocking algorithm. The algorithm cuts a file using a predefined block size and computes a weak checksum and an MD5 strong checksum for each block. The weak checksum mainly serves to improve the performance of differential encoding: the weak checksum is computed first and looked up in a hash table; only if a match is found is the MD5 strong checksum computed and a further hash lookup made. Because the weak checksum is much cheaper to compute than MD5, encoding performance is effectively improved;

A metadata management module. Metadata records the correspondence between files and data blocks in the file system. The metadata management module combines the advantages of centralized and decentralized management and uses multi-copy distributed redundant storage; it provides multi-backup processing and version control of metadata to achieve fault tolerance and high availability of metadata;

A storage scheduling module, which on the one hand schedules data copies among nodes and on the other hand schedules data blocks among three different storage media: memory, SSD, and HDD;

A visualization module, which mainly shows the usage of the whole storage system, displays real-time monitoring, and provides visual operations for dynamic adjustment.
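The fixed-length blocking and two-level checksum lookup described for the data block storage module can be sketched as follows. This is a minimal illustration, not the patented implementation: the text does not name the weak checksum, so Adler-32 from Python's zlib is assumed here, and the names BLOCK_SIZE, split_fixed, and BlockIndex are hypothetical.

```python
import hashlib
import zlib
from typing import Dict, List, Optional

BLOCK_SIZE = 4  # hypothetical predefined block size in bytes (tiny, for the demo)


def split_fixed(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Cut data into fixed-length blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


class BlockIndex:
    """Two-level lookup: weak checksum first, MD5 only after a weak hit."""

    def __init__(self) -> None:
        # weak checksum -> {MD5 hex digest -> block id}
        self._table: Dict[int, Dict[str, int]] = {}
        self.strong_checks = 0  # number of MD5 computations done by find()

    def add(self, block: bytes, block_id: int) -> None:
        weak = zlib.adler32(block)  # assumed weak checksum
        strong = hashlib.md5(block).hexdigest()
        self._table.setdefault(weak, {})[strong] = block_id

    def find(self, block: bytes) -> Optional[int]:
        bucket = self._table.get(zlib.adler32(block))
        if bucket is None:
            return None  # weak miss: the MD5 is never computed
        self.strong_checks += 1
        return bucket.get(hashlib.md5(block).hexdigest())
```

A lookup for a block whose weak checksum is unknown returns immediately, which is where the saving over computing MD5 for every block comes from.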

Preferably, the file system module further comprises two submodules: a file description submodule and a file body submodule. The file description submodule records descriptive information such as file size, type, file identifier, and access permissions; the file body submodule records the actual data of the file.

Preferably, the file system module includes the following functions:

File creation: all metadata operations, as well as opening files for reading and writing, are performed through the file system object, which returns a file stream object;

File writing: the file system obtains file information by accessing the metadata block and, together with the file system scheduling module, selects the storage location of the data blocks; when writing data blocks, the file system can select the storage tier in which they are stored;

File reading: file metadata can be modified through the file system object, or a file can be read by obtaining input streams one by one.

Preferably, the data block storage module includes the following functions:

Data block allocation: all files are stored as blocks; the block storage module allocates fixed-length data blocks according to file size to store file data, and distributes the data blocks across different machine nodes;

Data block redundancy: to ensure high data reliability, the underlying layer keeps multiple copies of data through data block redundancy; the number of copies can be preset;

Data block rebalancing: ensures that data blocks remain evenly distributed while the file system is in use, so that data access performance is not degraded by concentration of data blocks.

Preferably, the metadata management module includes the following functions:

Central repository: centralized storage holds all metadata and serves metadata access when the system is running normally;

Distributed redundant backup: metadata is backed up according to the network distance between each node in the system and the central repository, so that it can be recovered promptly if the metadata is damaged; in extreme cases, when a storage medium is damaged, data blocks can still be accessed through the metadata backups;

Version control of metadata: a certain number of historical versions are kept in a tree structure, so that users can restore between different versions, reducing the impact of erroneous changes on the system.

Preferably, the storage scheduling module offers the following three strategies for allocation among nodes:

Greedy allocation strategy: the data block is allocated to the first storage node that has sufficient space;

Maximum free space allocation strategy: the data block is allocated to the storage node with the most free space;

Round-robin allocation strategy: the node for a data block is selected by round-robin scheduling.

Preferably, the visualization module includes the following functions:

Adding and deleting data: provides basic management operations such as uploading and deleting data at every tier of the storage system, for the convenience of users;

System display: the system display module shows an overview of the usage of the whole system, the load on each storage tier, and so on;

Monitoring management: monitors the running state of the whole system in real time, so that overloaded or damaged storage media are found promptly and the data on them is backed up and migrated;

Storage configuration: the storage configuration management module of the system, which provides visual configuration of the system and eases cluster management; it also provides version control of configuration files, enabling fast recovery between versions.

Compared with the prior art, the present invention has the following beneficial effects:

(1) Because data is stored in tiers, copies of hot data are preferentially loaded into high-performance storage, and the system exploits the high performance of heterogeneous storage according to the characteristics of the data.

(2) Data is adjusted dynamically. Existing distributed storage schemes store data copies statically and do not migrate data; this scheme can store data dynamically according to its timeliness.

(3) Distributed memory is used to cache data. Compared with centralized storage, only one copy of hot data resides in memory or high-speed storage, while the other copies can be persisted to disk or low-speed storage, which on the one hand accelerates distributed computation and on the other hand improves disk utilization.

Brief description of the drawings

Other features, objects, and advantages of the invention will become more apparent upon reading the detailed description of non-limiting embodiments with reference to the following drawing:

Fig. 1 is a functional block diagram of the high-performance hierarchical storage system supporting heterogeneous storage according to the present invention.

Detailed description of the embodiments

The present invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the invention, but do not limit it in any way. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the inventive concept, all of which fall within the scope of protection of the present invention.

As shown in Fig. 1, the high-performance hierarchical storage system supporting heterogeneous storage according to the present invention comprises:

A file system module, which provides unified access to data files on different storage media. The file system is responsible for organizing, managing, and maintaining all data files stored in the distributed file system; files are stored in the file system as master files, directory files, and meta files.

A data block storage module, which slices complete data and stores the slices across different nodes, so that data can be accessed concurrently to improve access speed. The data block storage module divides data into blocks using a fixed-length blocking algorithm. The algorithm cuts a file using a predefined block size and computes a weak checksum and an MD5 strong checksum for each block. The weak checksum mainly serves to improve the performance of differential encoding: the weak checksum is computed first and looked up in a hash table; only if a match is found is the MD5 strong checksum computed and a further hash lookup made. Because the weak checksum is much cheaper to compute than MD5, encoding performance is effectively improved;

A metadata management module. Metadata records the correspondence between files and data blocks in the file system. The metadata management module combines the advantages of centralized and decentralized management and uses multi-copy distributed redundant storage. It provides multi-backup processing and version control of metadata to achieve fault tolerance and high availability of metadata;

The file system module further comprises two submodules: a file description submodule and a file body submodule. The file description submodule records descriptive information such as file size, type, file identifier, and access permissions; the file body submodule records the actual data of the file.

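The file system module includes the functions set out in the summary above: file creation, file writing, and file reading.

The data block storage module includes the functions set out in the summary above: data block allocation, data block redundancy, and data block rebalancing.

The metadata management module includes the functions set out in the summary above: a central repository, distributed redundant backup, and version control of metadata.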

Version control of metadata: a certain number of historical versions (the number can be set by the user) are kept in a tree structure, so that users can restore between different versions, reducing the impact of erroneous changes on the system.
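A tree-structured, bounded version history of this kind might look like the sketch below. The text does not say how versions are linked or pruned, so the parent-pointer tree, the "drop the oldest version and re-parent its children" rule, and the names Version and MetadataHistory are all assumptions made for illustration; max_versions is assumed to be at least 1.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Version:
    vid: int
    parent: Optional[int]  # parent version id; None for a root
    metadata: dict


class MetadataHistory:
    """Keeps at most max_versions metadata versions as a tree."""

    def __init__(self, max_versions: int) -> None:
        self.max_versions = max_versions  # user-configurable limit, >= 1
        self.versions: Dict[int, Version] = {}
        self.current: Optional[int] = None
        self._next_id = 0

    def commit(self, metadata: dict) -> int:
        vid = self._next_id
        self._next_id += 1
        self.versions[vid] = Version(vid, self.current, dict(metadata))
        self.current = vid
        self._prune()
        return vid

    def restore(self, vid: int) -> dict:
        # Later commits branch off the restored version, forming a tree.
        self.current = vid
        return dict(self.versions[vid].metadata)

    def _prune(self) -> None:
        while len(self.versions) > self.max_versions:
            oldest = min(v for v in self.versions if v != self.current)
            gone = self.versions.pop(oldest)
            for version in self.versions.values():
                if version.parent == oldest:
                    version.parent = gone.parent
```

Restoring an older version and committing again creates a sibling branch rather than overwriting the newer history, which is what lets a user move between versions after a bad change.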

The storage scheduling module offers three strategies for allocation among nodes:

(1) Greedy allocation strategy: the data block is allocated to the first storage node that has sufficient space;

(2) Maximum free space allocation strategy: the data block is allocated to the storage node with the most free space;

(3) Round-robin allocation strategy: the node for a data block is selected by round-robin scheduling.
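The three allocation strategies can be sketched as follows. The Node type and function names are hypothetical, and two details are assumptions not stated in the text: the maximum free space strategy returns nothing when even the emptiest node is too small, and round-robin skips nodes that lack space.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free: int  # free space in bytes


def greedy(nodes: List[Node], size: int) -> Optional[Node]:
    """First node, in list order, with enough free space."""
    return next((n for n in nodes if n.free >= size), None)


def max_free(nodes: List[Node], size: int) -> Optional[Node]:
    """Node with the most free space, if the block fits there."""
    best = max(nodes, key=lambda n: n.free, default=None)
    return best if best is not None and best.free >= size else None


class RoundRobin:
    """Cycles through the nodes, skipping any that are too full."""

    def __init__(self) -> None:
        self._next = 0

    def pick(self, nodes: List[Node], size: int) -> Optional[Node]:
        for step in range(len(nodes)):
            index = (self._next + step) % len(nodes)
            if nodes[index].free >= size:
                self._next = (index + 1) % len(nodes)
                return nodes[index]
        return None
```

Greedy is the cheapest to evaluate but tends to fill early nodes first; the other two trade a little bookkeeping for a more even spread.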

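The visualization module includes the functions set out in the summary above: adding and deleting data, system display, monitoring management, and storage configuration.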

Storage scheduling of data among memory, SSD, and HDD is based mainly on a data temperature evaluation mechanism: as far as possible, the most frequently accessed data copies are cached in memory, SSD holds the copies with the next-highest access frequency, and HDD holds cold copies. The temperature of data can be set by the system for each user, or a temperature can be specified by the user when uploading data. Temperature is divided into four levels, 0 to 3, each of which sets the storage tiers of three copies on different nodes, as shown in Table 1. This storage strategy delivers high access efficiency and improves the utilization of storage capacity while ensuring high data reliability.

Table 1. Data temperature storage strategy

Temperature level	Primary copy	Secondary copy	Secondary copy
0	Memory	SSD	SSD
1	Memory	SSD	HDD
2	SSD	HDD	HDD
3	HDD	HDD	HDD
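Table 1 can be expressed as a simple lookup. The sketch below is illustrative only: the names TIER_PLAN and place_copies are hypothetical, and the rule that the first three distinct candidate nodes receive the copies is an assumption.

```python
from typing import List, Sequence, Tuple

# Temperature level -> tiers of (primary, secondary, secondary), as in Table 1.
TIER_PLAN = {
    0: ("memory", "ssd", "ssd"),
    1: ("memory", "ssd", "hdd"),
    2: ("ssd", "hdd", "hdd"),
    3: ("hdd", "hdd", "hdd"),
}


def place_copies(level: int, nodes: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair three distinct nodes with the tiers for this temperature level."""
    tiers = TIER_PLAN[level]
    # Drop duplicate node names while keeping the caller's order.
    chosen = list(dict.fromkeys(nodes))[:len(tiers)]
    if len(chosen) < len(tiers):
        raise ValueError("three distinct nodes are required")
    return list(zip(chosen, tiers))
```

For example, place_copies(1, ["n1", "n2", "n3"]) puts the primary copy in memory on n1 and the secondary copies on SSD at n2 and HDD at n3.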

Distributed file systems use copy redundancy to achieve high data availability, but in practice not every copy is accessed heavily. The system dynamically migrates hot data copies: hot copies are loaded into memory, and cold copies are persisted to disk.

Storage migration ensures that data migrates between the storage tiers within a node according to actual demand or actual usage. In general, the average I/O of a file over the most recent period reflects the temperature of the file. However, some files are accessed infrequently yet must be available promptly when they are accessed. A configurable file importance parameter is therefore provided to the user.

If a file is designated important, one copy is kept in memory and never demoted. If a file is designated of secondary importance, one copy is kept in SSD, and whether to migrate it back is decided by the file's recent average I/O. If a file is designated unimportant, a copy is kept in HDD, and whether to migrate it back is likewise decided by the file's recent average I/O. Taking storage capacity into account, the final storage tier is determined by three parameters: file importance, average I/O, and remaining disk capacity.
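The importance rule can be sketched as a small decision function. This is an assumption-laden illustration: the enum values, the name retained_copy, and the single migrate_back_threshold are hypothetical, "migrate back" is read here as moving a copy up to a faster tier when its recent average I/O is high, and the remaining-capacity parameter is left out for brevity.

```python
from enum import Enum
from typing import Tuple


class Importance(Enum):
    IMPORTANT = "important"
    SECONDARY = "secondary"
    UNIMPORTANT = "unimportant"


# Tier in which one copy is retained for each importance level.
HOME_TIER = {
    Importance.IMPORTANT: "memory",
    Importance.SECONDARY: "ssd",
    Importance.UNIMPORTANT: "hdd",
}


def retained_copy(importance: Importance, avg_io: float,
                  migrate_back_threshold: float) -> Tuple[str, bool]:
    """Return (home tier, whether the copy is a migrate-back candidate)."""
    tier = HOME_TIER[importance]
    if importance is Importance.IMPORTANT:
        # Pinned in memory and never demoted, so nothing to migrate.
        return tier, False
    return tier, avg_io > migrate_back_threshold
```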

The strategy for determining a file's tier based on average I/O is as follows:

The average temperature is approximated by formula (1):

$$ I/O_{avg} = \frac{A_{new} - A_{old}}{S \cdot T} \tag{1} $$

where A_new is the most recent I/O activity statistic for the file, read from the log records, A_old is the earliest statistic, S is the file size, and T is a preset time value.

The workflow is as follows: at each time interval T, I/O_avg is computed for all files in the tier, and the maximum value max and minimum value min are found. An I/O threshold is set, and files whose value falls below it are placed in the migration queue. For the files in the lower tier, a migrate-back threshold is set; files whose value exceeds it are placed in the migrate-back queue.
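Formula (1) and the queueing workflow can be sketched as below. The units of the I/O statistics are not given in the text, so the numbers are arbitrary; the function names and the (a_new, a_old, size) tuple layout are hypothetical.

```python
from typing import Dict, List, Tuple


def io_avg(a_new: float, a_old: float, size: float, period: float) -> float:
    """Formula (1): (A_new - A_old) / (S * T)."""
    return (a_new - a_old) / (size * period)


def score_tier(files: Dict[str, Tuple[float, float, float]],
               period: float) -> Dict[str, float]:
    """Compute I/O_avg for every file in a tier.

    files maps a file name to (a_new, a_old, size).
    """
    return {name: io_avg(a_new, a_old, size, period)
            for name, (a_new, a_old, size) in files.items()}


def migration_queue(scores: Dict[str, float], threshold: float) -> List[str]:
    """Files whose average I/O falls below the threshold."""
    return sorted(name for name, s in scores.items() if s < threshold)


def migrate_back_queue(scores: Dict[str, float], threshold: float) -> List[str]:
    """Lower-tier files whose average I/O exceeds the migrate-back threshold."""
    return sorted(name for name, s in scores.items() if s > threshold)
```

With a period of 4 and files {"a": (900, 100, 100), "b": (120, 100, 10), "c": (300, 100, 50)}, the scores are 2.0, 0.5, and 1.0, so a threshold of 0.75 queues only "b" for migration.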

The strategy for determining a file's tier based on disk space: the speed and capacity of the three storage tiers, memory, SSD, and HDD, form a pyramid. To address the problem that the space of an upper tier fills up more easily, reserved-space utilization thresholds, a maximum Cmax and a minimum Cmin, are set for the memory tier and the SSD tier respectively. If actual utilization exceeds Cmax, the file system forces files to be moved out; if actual utilization is between Cmin and Cmax, data write operations are suspended and move-out operations are allowed; if actual utilization is below Cmin, move-out operations are suspended and data writes are allowed.
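The Cmin/Cmax rule amounts to a three-way decision on utilization. In this sketch the function name and the returned flags are hypothetical; how the exact boundary values are treated, and the suspension of writes while move-out is forced, are assumptions the text does not settle.

```python
from typing import Dict


def tier_actions(utilization: float, c_min: float, c_max: float) -> Dict[str, bool]:
    """Decide which operations a tier permits at a given utilization (0 to 1)."""
    if not 0.0 <= c_min < c_max <= 1.0:
        raise ValueError("expected 0 <= c_min < c_max <= 1")
    if utilization > c_max:
        # Above Cmax: move-out is forced (writes assumed suspended).
        return {"write": False, "move_out": True, "forced": True}
    if utilization >= c_min:
        # Between Cmin and Cmax: writes suspended, move-out allowed.
        return {"write": False, "move_out": True, "forced": False}
    # Below Cmin: move-out suspended, writes allowed.
    return {"write": True, "move_out": False, "forced": False}
```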

Files not marked with an importance level are moved out first, and among the less important data a choice is made of which data to move out. The following three move-out strategies are provided:

Greedy reclamation strategy: arbitrary blocks are removed until space of the required size is freed.

LRU reclamation strategy: the least recently used data blocks are removed until space of the required size is freed.

Partial LRU reclamation strategy: removal is based on least recently used, but the storage directory with the most free space is selected and data blocks are removed only from that directory.
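The three move-out strategies can be sketched over a list of candidate blocks. The Block fields and function names are hypothetical; "arbitrary" is taken to mean list order, and filtering by importance is assumed to have happened before these functions are called.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Block:
    block_id: str
    size: int
    last_access: float  # timestamp of the most recent access
    directory: str      # storage directory holding the block


def _take(candidates: Iterable[Block], needed: int) -> List[str]:
    """Pick blocks in the given order until enough space is freed."""
    chosen, freed = [], 0
    for block in candidates:
        if freed >= needed:
            break
        chosen.append(block.block_id)
        freed += block.size
    return chosen


def evict_greedy(blocks: List[Block], needed: int) -> List[str]:
    """Remove arbitrary blocks (here: in list order)."""
    return _take(blocks, needed)


def evict_lru(blocks: List[Block], needed: int) -> List[str]:
    """Remove least recently used blocks first."""
    return _take(sorted(blocks, key=lambda b: b.last_access), needed)


def evict_partial_lru(blocks: List[Block], needed: int,
                      free_by_dir: Dict[str, int]) -> List[str]:
    """LRU, restricted to the directory with the most free space."""
    target = max(free_by_dir, key=free_by_dir.get)
    return evict_lru([b for b in blocks if b.directory == target], needed)
```

Each function returns the ids of the blocks to remove; if the candidates cannot free the required space, the returned list simply contains all of them.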

The present invention is a dynamically adjusted, efficient storage technology with which an efficient data storage system can be built. The data storage system consists of different storage media such as memory, SSD, and HDD, with data across the different storage media managed uniformly through software definition. The data storage system mainly comprises the file system module, the data block storage module, the metadata management module, and the storage scheduling module, and provides externally functions such as data communication and network interfaces, data management, and data storage operations.

Specific embodiments of the present invention have been described above. It should be understood that the invention is not limited to the particular embodiments described; those skilled in the art can make various variations or modifications within the scope of the claims without affecting the substance of the invention.

Claims

1. A high-performance hierarchical storage system supporting heterogeneous storage, characterized by comprising:

a file system module, which provides unified access to data files on different storage media; the file system is responsible for organizing, managing, and maintaining all data files stored in the distributed file system, and files are stored in the file system as master files, directory files, and meta files;

a data block storage module, which slices complete data and stores the slices across different nodes, so that data can be accessed concurrently to improve access speed; the data block storage module divides data into blocks using a fixed-length blocking algorithm; the algorithm cuts a file using a predefined block size and computes a weak checksum and an MD5 strong checksum; the weak checksum mainly serves to improve the performance of differential encoding: the weak checksum is computed first and looked up in a hash table, and only if a match is found is the MD5 strong checksum computed and a further hash lookup made; because the weak checksum is much cheaper to compute than MD5, encoding performance is effectively improved;

a metadata management module, wherein metadata records the correspondence between files and data blocks in the file system; the metadata management module combines the advantages of centralized and decentralized management and uses multi-copy distributed redundant storage; it provides multi-backup processing and version control of metadata to achieve fault tolerance and high availability of metadata;

a storage scheduling module, which on the one hand schedules data copies among nodes and on the other hand schedules data blocks among three different storage media: memory, SSD, and HDD;

a visualization module, which mainly shows the usage of the whole storage system, displays real-time monitoring, and provides visual operations for dynamic adjustment.
2. The high-performance hierarchical storage system supporting heterogeneous storage according to claim 1, characterized in that the file system module further comprises two submodules, a file description submodule and a file body submodule; the file description submodule records descriptive information such as file size, type, file identifier, and access permissions; the file body submodule records the actual data of the file.
3. The high-performance hierarchical storage system supporting heterogeneous storage according to claim 1, characterized in that the file system module includes the following functions:

file creation: all metadata operations, as well as opening files for reading and writing, are performed through the file system object, which returns a file stream object;

file writing: the file system obtains file information by accessing the metadata block and, together with the file system scheduling module, selects the storage location of the data blocks; when writing data blocks, the file system can select the storage tier in which they are stored;

file reading: file metadata can be modified through the file system object, or a file can be read by obtaining input streams one by one.
4. The high-performance hierarchical storage system supporting heterogeneous storage according to claim 1, characterized in that the data block storage module includes the following functions:

data block allocation: all files are stored as blocks; the block storage module allocates fixed-length data blocks according to file size to store file data, and distributes the data blocks across different machine nodes;

data block redundancy: to ensure high data reliability, the underlying layer keeps multiple copies of data through data block redundancy; the number of copies can be preset;

data block rebalancing: ensures that data blocks remain evenly distributed while the file system is in use, so that data access performance is not degraded by concentration of data blocks.
5. The high-performance hierarchical storage system supporting heterogeneous storage according to claim 1, characterized in that the metadata management module includes the following functions:

central repository: centralized storage holds all metadata and serves metadata access when the system is running normally;

distributed redundant backup: metadata is backed up according to the network distance between each node in the system and the central repository, so that it can be recovered promptly if the metadata is damaged; in extreme cases, when a storage medium is damaged, data blocks can still be accessed through the metadata backups;

version control of metadata: a certain number of historical versions are kept in a tree structure, so that users can restore between different versions, reducing the impact of erroneous changes on the system.
6. The high-performance hierarchical storage system supporting heterogeneous storage according to claim 1, characterized in that the storage scheduling module offers the following three strategies for allocation among nodes:

greedy allocation strategy: the data block is allocated to the first storage node that has sufficient space;

maximum free space allocation strategy: the data block is allocated to the storage node with the most free space;

round-robin allocation strategy: the node for a data block is selected by round-robin scheduling.
7. The high-performance hierarchical storage system supporting heterogeneous storage according to claim 1, characterized in that the visualization module includes the following functions:

adding and deleting data: provides basic management operations such as uploading and deleting data at every tier of the storage system, for the convenience of users;

system display: the system display module shows an overview of the usage of the whole system, the load on each storage tier, and so on;

monitoring management: monitors the running state of the whole system in real time, so that overloaded or damaged storage media are found promptly and the data on them is backed up and migrated;

storage configuration: the storage configuration management module of the system, which provides visual configuration of the system and eases cluster management; it also provides version control of configuration files, enabling fast recovery between versions.