Distributed metadata management method and system
Technical field
The present invention relates to the technical field of metadata management, and in particular to a distributed metadata management method and system.
Background
As the volume of data keeps growing, existing storage approaches, constrained by performance and price, are increasingly unable to meet demand. The market calls for data storage systems that offer large capacity, scalability, security, and high availability, and distributed storage emerged in response to this demand.
To manage the storage nodes of a distributed storage system effectively, distributed file systems generally store metadata and data separately, in line with the different storage and access characteristics of the two. The metadata storage system is the bridge between users and the data storage servers. Efficient metadata management is therefore essential to the performance and scalability of a distributed storage system, and distributed metadata management has become an active research topic. Existing metadata management strategies suffer from problems such as load imbalance, large-scale metadata movement caused by rename operations, and poor scalability of the metadata management system.
Summary of the invention
The object of the present invention is to provide a distributed metadata management method and system that solve the foregoing problems of the prior art.
To achieve this object, the present invention adopts the following technical solution:
A distributed metadata management method, comprising a static load balancing method for metadata and a dynamic load balancing method for metadata;
The static load balancing method for metadata is: using a consistent hash function based on virtual nodes, together with a metadata server list, distributing the metadata to the metadata server nodes; wherein the metadata server list is a table recording the mapping of all virtual nodes to metadata servers, and each metadata server node stores a list of the virtual nodes held on that node;
The dynamic load balancing method for metadata is: by means of metadata migration, moving part of the metadata from an overloaded node to an underloaded node.
Preferably, the static load balancing method for metadata comprises the following steps:
A1, after the system starts, the metadata server manager generates the metadata server list from the information of each metadata server and the list-item configuration information;
A2, according to the full path of a file, the consistent hash function is used to find the corresponding item in the metadata server list, and thereby the corresponding target metadata server;
A3, according to the list of virtual nodes stored on the metadata server node, the metadata information is added to the corresponding virtual node of the target metadata server.
Preferably, the number of items that each metadata server occupies in the metadata server list is calculated using the following function:
wherein $U_i$ denotes the number of times the i-th metadata server appears in the list, $C$ denotes the number of items in the list, and $N$ denotes the total number of metadata servers.
Preferably, the consistent hash function is:
NameNode_Locator = Hash(f) mod NNT_Length,
wherein NameNode_Locator denotes the selected item in the metadata server list, f is the full path name of the file, and NNT_Length is the total number of items in the metadata server list.
Preferably, the dynamic load balancing method for metadata comprises the following steps:
B1, each metadata server periodically collects its load information and sends it to the metadata server manager;
B2, the metadata server manager periodically calculates the load balance degree of each metadata server; if the load balance degree of a metadata server exceeds the set threshold, that metadata server is an overloaded node; if the load balance degree of a metadata server does not reach the set threshold, that metadata server is an underloaded node;
B3, the metadata server manager moves part of the metadata from the overloaded node to the underloaded node;
B4, the overloaded node and the underloaded node update their load information and send it to the metadata server manager.
Preferably, the load balance degree of the metadata server is calculated using the following formula:
$T_i = \eta_1 d_i + \eta_2 m_i$,
where:
$j_i$ is the load balance index of node i at time t;
$w_i$ is the load index of the i-th metadata server node at time t;
$N$ is the number of metadata servers;
$\eta_1 + \eta_2 = 1$;
$T_i$ is the load index of item i in the metadata server list at time t, with n items in total;
$d_i$ is the operation response delay of item i in the metadata server list at time t;
$m_i$ is the number of metadata entries of item i in the metadata server list at time t.
Preferably, the dynamic load balancing method for metadata further comprises the step of:
calculating the overall load degree of the system and, if the overall load degree exceeds the set threshold, adding metadata server nodes to the system; wherein the overall load degree of the system is calculated using the following function:
where:
$E$ is the load index of the system;
$N$ is the number of metadata server nodes;
$w_i$ is the load index of the i-th metadata server node at time t.
Preferably, the method further comprises using a directory redirection table to delay the movement of metadata, thereby solving the metadata local consistency problem, specifically:
A directory path redirection table is maintained on each metadata server; the table records metadata information that is not held on the current metadata server;
Each item in the directory path redirection table is a key-value pair <Hash(directory path), virtual node>, where the former is the hash value of the directory path after renaming and the latter is the current storage location of the metadata that needs to be moved.
A distributed metadata management system, comprising a metadata server manager and metadata servers, wherein the metadata server manager comprises a metadata server list maintenance module, a metadata server selection module, and a load balancing module, and each metadata server comprises a metadata processing module and a load measurement module;
The metadata server list maintenance module is responsible for maintaining the correct correspondence between virtual nodes and metadata server nodes;
The metadata server selection module is used to complete the random distribution of metadata;
The load balancing module is used to receive the load information of each metadata server, calculate the system load value, and rank the metadata servers by load; it moves metadata when the system load is unbalanced or the metadata server cluster needs to be adjusted;
The load measurement module is responsible for collecting the load information on the current server: it calculates the load of each virtual node, derives the load of the current server from these values, and sends the load information to the metadata server manager;
The metadata processing module comprises a read module, a write module, and a modification module for metadata; the read module is responsible for obtaining metadata, the write module is responsible for storing metadata, and the modification module is responsible for processing metadata after a rename operation and maintains a directory redirection table, which records metadata information that is not held on the current metadata server.
Preferably, the system further comprises backup servers, including a backup server for the metadata server manager and a backup server for the metadata servers; the backup server for the metadata server manager takes over the manager's work when the manager fails and restores its data; the backup server for the metadata servers restores a metadata server node's data when that node fails.
The beneficial effects of the present invention are as follows: the distributed metadata management method and system provided by the embodiments of the present invention combine two strategies, static load balancing when metadata is distributed and dynamic load balancing while the system runs, which improves the utilization of metadata service resources, keeps the system load balanced, and improves the scalability of the system. In addition, the delayed metadata movement scheme based on the directory redirection table solves the problem of large-scale metadata movement caused by rename operations and keeps system efficiency stable.
Brief description of the drawings
Fig. 1 is an architecture diagram of a distributed file system containing the metadata management system of the present invention;
Fig. 2 is an example of the NameNode list in the metadata management system provided by an embodiment of the present invention;
Fig. 3 is the metadata read flow provided by an embodiment of the present invention;
Fig. 4 is a diagram of the reliability assurance strategy of the metadata management system provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.
Embodiment one
This embodiment of the present invention provides a distributed metadata management method, comprising a static load balancing method for metadata and a dynamic load balancing method for metadata;
The static load balancing method for metadata is: using a consistent hash function based on virtual nodes, together with a metadata server list, distributing the metadata to the metadata server nodes; wherein the metadata server list is a table recording the mapping of all virtual nodes to metadata servers, and each metadata server node stores a list of the virtual nodes held on that node;
The dynamic load balancing method for metadata is: by means of metadata migration, moving part of the metadata from an overloaded node to an underloaded node.
In this embodiment, the architecture of a distributed file system that includes the distributed metadata management system is shown in Fig. 1. As Fig. 1 shows, the overall architecture of the distributed file system comprises four parts:
the data storage server DN (DataNode), which serves as the storage node for application data and stores the data blocks into which files are split;
the metadata server NN (NameNode), which serves as the node that answers and updates metadata requests and is responsible for maintaining the global namespace, including file and folder attributes; the NN maintains the namespace tree and keeps the mapping from the data blocks of a file to DNs; a cluster contains one or more NNs;
the user Client, which supports file system operations such as reading and writing files, deleting files, and creating and deleting directories; the Client exchanges control information (metadata) with the NN and exchanges data streams (application data) with the DN;
the NameNode management node NNManager, which is responsible for periodically collecting the status information of each NN and for maintaining the NameNode list.
In addition, the NameNode list NNT stores the NameNode items; the NameNode Personal list (NNP) stores the information of the items that belong to the NN on which it resides; NNT and NNP are maintained and updated by NNManager; the directory redirection table DPRT lists the directory information whose metadata is not held on the current metadata server, and one DPRT is maintained on each NN.
In the above method, when metadata is initially distributed, it is assigned to metadata server nodes using a consistent hash function optimized with virtual nodes, which ensures load balance at the time of static distribution;
As the system runs, the load across metadata server nodes can become unbalanced; by means of metadata migration, part of the metadata is moved from overloaded nodes to underloaded nodes, which balances the load among the metadata server nodes. When the system stores a sufficiently large amount of metadata, the overall load degree of the system can exceed its threshold; in that case the system load is reduced by adding metadata server nodes to the system;
Thus, by combining static load balancing when metadata is distributed with dynamic load balancing while the system runs, this embodiment improves the utilization of metadata service resources, keeps the system load balanced, and improves the scalability of the system.
In a preferred embodiment of the present invention, the static load balancing method for metadata comprises the following steps:
A1, after the system starts, the metadata server manager generates the metadata server list from the information of each metadata server and the list-item configuration information;
A2, according to the full path of a file, the consistent hash function is used to find the corresponding item in the metadata server list, and thereby the corresponding target metadata server;
A3, according to the list of virtual nodes stored on the metadata server node, the metadata information is added to the corresponding virtual node of the target metadata server.
The metadata server manager maintains a metadata server list (referred to as the NameNode list, or NNT). The NameNode list is the table that records the mapping of all virtual nodes to metadata servers. After the system starts, the number of items in the table is fixed, that is, the number of virtual nodes does not change. To make load adjustment among metadata servers more flexible and finer-grained, the number of items should be sufficiently large, within a reasonable range.
The number of items that each metadata server occupies in the metadata server list is calculated using the following function:
wherein $U_i$ denotes the number of times the i-th metadata server appears in the list, $C$ denotes the number of items in the list, and $N$ denotes the total number of metadata servers.
Fig. 2 shows an example of a metadata server list. The list has 7 items, which correspond to 4 metadata servers: A, B, C, and D.
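As an illustration of the two tables involved, the following minimal Python sketch models a 7-item NameNode list and derives from it the per-server list of virtual nodes. The specific assignment of items to servers A to D and the name `build_nnp` are assumptions made for the example; the text does not give the layout shown in Fig. 2.

```python
from collections import defaultdict

# Hypothetical 7-item NameNode list (NNT). The index is the virtual node
# number and the value is the metadata server that owns that virtual node.
# The actual arrangement in Fig. 2 is not given in the text; this is assumed.
NNT = ["A", "B", "C", "D", "A", "B", "C"]


def build_nnp(nnt):
    """Derive, for each server, the list of virtual nodes it holds.

    This plays the role of the per-node list of virtual nodes that each
    metadata server keeps (hypothetical helper, not named in the source).
    """
    nnp = defaultdict(list)
    for virtual_node, server in enumerate(nnt):
        nnp[server].append(virtual_node)
    return dict(nnp)


nnp = build_nnp(NNT)
for server in sorted(nnp):
    print(server, nnp[server])
```

Because the number of items is fixed after startup, rebalancing in this model only changes which server a given index points to; it never changes the length of the list.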
In practice, the above method can be implemented as follows:
The client obtains the NameNode list from the metadata server manager on its first access. Afterwards, whenever the NameNode list changes while the system is running, the metadata server manager sends the latest metadata server list to the client. When the client reads a file, it calculates the number of the virtual node that stores the metadata from the hash value of the file's full path name, and then looks up in the NameNode list which metadata server holds it.
In a preferred embodiment of the present invention, the consistent hash function is:
NameNode_Locator = Hash(f) mod NNT_Length,
wherein NameNode_Locator denotes the selected item in the metadata server list, f is the full path name of the file, and NNT_Length is the total number of items in the metadata server list.
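The locator formula can be sketched in a few lines of Python. The text does not specify which function Hash is, so the use of MD5 here is an assumption chosen only because it gives a stable value across runs; the function names and the sample list are likewise hypothetical.

```python
import hashlib


def namenode_locator(full_path: str, nnt_length: int) -> int:
    """Compute Hash(f) mod NNT_Length for a file's full path name.

    MD5 is an assumed stand-in for the unspecified Hash function.
    """
    digest = hashlib.md5(full_path.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % nnt_length


def locate_server(full_path: str, nnt: list) -> str:
    """Return the metadata server recorded at the selected list item."""
    return nnt[namenode_locator(full_path, len(nnt))]


# Hypothetical 7-item list over servers A to D.
example_nnt = ["A", "B", "C", "D", "A", "B", "C"]
index = namenode_locator("/home/user/report.txt", len(example_nnt))
server = locate_server("/home/user/report.txt", example_nnt)
print(index, server)
```

Since the list length stays fixed after startup, the index computed for a given path does not change when servers are added or removed; only the server stored at that index changes.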
In a preferred embodiment of the present invention, the dynamic load balancing method for metadata comprises the following steps:
B1, each metadata server periodically collects its load information and sends it to the metadata server manager;
B2, the metadata server manager periodically calculates the load balance degree of each metadata server; if the load balance degree of a metadata server exceeds the set threshold, that metadata server is an overloaded node; if the load balance degree of a metadata server does not reach the set threshold, that metadata server is an underloaded node;
B3, the metadata server manager moves part of the metadata from the overloaded node to the underloaded node;
B4, the overloaded node and the underloaded node update their load information and send it to the metadata server manager.
The load balance degree of the metadata server is calculated using the following formula:
$T_i = \eta_1 d_i + \eta_2 m_i$,
where:
$j_i$ is the load balance index of node i at time t;
$w_i$ is the load index of the i-th metadata server node at time t;
$N$ is the number of metadata servers;
$\eta_1 + \eta_2 = 1$;
$T_i$ is the load index of item i in the metadata server list at time t, with n items in total;
$d_i$ is the operation response delay of item i in the metadata server list at time t;
$m_i$ is the number of metadata entries of item i in the metadata server list at time t.
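The per-item load index is a weighted sum of two measurements, which the following Python sketch evaluates directly. The function name, the default weight, and the sample values are assumptions; the text fixes only the constraint that the two weights sum to 1, and it does not say how the delay and the entry count are scaled against each other.

```python
def load_index(delay: float, entry_count: float, eta1: float = 0.5) -> float:
    """Compute T_i = eta1 * d_i + eta2 * m_i with eta1 + eta2 = 1.

    delay       -- d_i, operation response delay of list item i
    entry_count -- m_i, number of metadata entries of list item i
    eta1        -- weight of the delay term (assumed default of 0.5)
    """
    if not 0.0 <= eta1 <= 1.0:
        raise ValueError("eta1 must lie in [0, 1] so that eta1 + eta2 = 1")
    eta2 = 1.0 - eta1
    return eta1 * delay + eta2 * entry_count


# Hypothetical measurements for one list item.
t_i = load_index(delay=2.0, entry_count=10.0, eta1=0.25)
print(t_i)
```

A larger eta1 makes the index more sensitive to response delay, while a smaller one makes it track the amount of metadata held.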
In this embodiment, the dynamic load balancing method for metadata further comprises the step of:
calculating the overall load degree of the system and, if the overall load degree exceeds the set threshold, adding metadata server nodes to the system; wherein the overall load degree of the system is calculated using the following function:
where:
$E$ is the load index of the system;
$N$ is the number of metadata server nodes;
$w_i$ is the load index of the i-th metadata server node at time t.
In this embodiment, the dynamic load balancing method for metadata covers two cases:
First, when the load of a metadata server node exceeds the threshold set by the system, the most heavily loaded virtual node is selected from the most heavily loaded metadata server, and the metadata information on that virtual node is moved to the least loaded metadata server. Second, when the load of the whole system exceeds the set threshold, the metadata server cluster at its current scale can no longer meet the system's demand; metadata servers must be added, after which load balancing is carried out according to the strategy of the first case. Once the adjustment is complete, the metadata server manager updates the NameNode list and sends the latest NameNode list to the clients, the metadata servers, and the data storage servers.
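The first case can be sketched as a selection over per-virtual-node loads. In this minimal Python sketch, a server's load is taken to be the sum of the loads of its virtual nodes, which is an assumption, as are the function names, the threshold, and the sample values.

```python
def server_loads(nnt, vnode_loads):
    """Sum virtual-node loads per server (assumed aggregation rule)."""
    loads = {server: 0.0 for server in nnt}
    for vnode, server in enumerate(nnt):
        loads[server] += vnode_loads[vnode]
    return loads


def pick_migration(nnt, vnode_loads, threshold):
    """Choose (virtual node, source server, target server), or None.

    The source is the most loaded server, provided it exceeds the threshold;
    the target is the least loaded server; the virtual node moved is the
    most loaded one on the source.
    """
    loads = server_loads(nnt, vnode_loads)
    source = max(loads, key=loads.get)
    if loads[source] <= threshold:
        return None
    target = min(loads, key=loads.get)
    if target == source:
        return None
    candidates = [v for v, s in enumerate(nnt) if s == source]
    vnode = max(candidates, key=lambda v: vnode_loads[v])
    return vnode, source, target


def apply_migration(nnt, move):
    """Return an updated list in which the moved item points at the target."""
    vnode, _source, target = move
    updated = list(nnt)
    updated[vnode] = target
    return updated


# Hypothetical list and per-virtual-node loads.
nnt = ["A", "B", "C", "D", "A", "B", "C"]
vnode_loads = [5.0, 1.0, 2.0, 0.5, 3.0, 1.0, 1.0]
move = pick_migration(nnt, vnode_loads, threshold=6.0)
new_nnt = apply_migration(nnt, move)
print(move, new_nnt)
```

The updated list returned by `apply_migration` corresponds to the adjusted NameNode list that the manager distributes after the move; a real implementation would also transfer the metadata itself.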
The distributed metadata management method provided by this embodiment may further comprise using a directory redirection table to delay the movement of metadata, thereby solving the metadata local consistency problem, specifically:
A directory path redirection table is maintained on each metadata server; the table records metadata information that is not held on the current metadata server;
Each item in the directory path redirection table is a key-value pair <Hash(directory path), virtual node>, where the former is the hash value of the directory path after renaming and the latter is the current storage location of the metadata that needs to be moved.
The directory redirection table is referred to as the DPRT;
As shown in Fig. 3, the above method can be carried out as follows:
When a client accesses a file, it calculates the number of the virtual node that stores the metadata from the hash value of the file's full path name, and looks up in the metadata server list which server holds the metadata, that is, the target metadata server. Because the system uses delayed metadata movement, the metadata being accessed may not yet be on the target server. Therefore, when querying the target metadata information, the target metadata server first checks whether the DPRT it maintains contains an item for the hash value of the file's full path name. If it does, the metadata being queried is not on the target metadata server; the target metadata information is then queried on the server where the metadata currently resides, the metadata information is moved to the target metadata server, and the corresponding item is deleted from the DPRT. If it does not, the target metadata information is queried directly on the target metadata server.
In this method, metadata is not moved at the moment a directory or file is renamed; it is moved only when it is accessed. Because the movement is deferred in this way, the stability of system throughput can be maintained even when a large amount of metadata would otherwise have to be moved.
Embodiment two
This embodiment of the present invention provides a distributed metadata management system, comprising a metadata server manager and metadata servers, wherein the metadata server manager comprises a metadata server list maintenance module, a metadata server selection module, and a load balancing module, and each metadata server comprises a metadata processing module and a load measurement module;
The metadata server list maintenance module is responsible for maintaining the correct correspondence between virtual nodes and metadata server nodes;
The metadata server selection module is used to complete the random distribution of metadata;
The load balancing module is used to receive the load information of each metadata server, calculate the system load value, and rank the metadata servers by load; it moves metadata when the system load is unbalanced or the metadata server cluster needs to be adjusted;
The load measurement module is responsible for collecting the load information on the current server: it calculates the load of each virtual node, derives the load of the current server from these values, and sends the load information to the metadata server manager;
The metadata processing module comprises a read module, a write module, and a modification module for metadata; the read module is responsible for obtaining metadata, the write module is responsible for storing metadata, and the modification module is responsible for processing metadata after a rename operation and maintains a directory redirection table, which records metadata information that is not held on the current metadata server.
The method by which a distributed metadata management system of the above structure manages metadata has been described in detail in Embodiment one and is not repeated here.
A distributed metadata management system of this structure provides the following functions: static load balancing when metadata is distributed and dynamic load balancing while the system runs, which improve the utilization of metadata service resources, keep the system load balanced, and improve the scalability of the system; in addition, the delayed metadata movement scheme based on the directory redirection table solves the problem of large-scale metadata movement caused by rename operations and keeps system efficiency stable.
The distributed metadata management system provided by this embodiment may further comprise backup servers, including a backup server for the metadata server manager and a backup server for the metadata servers; the backup server for the metadata server manager takes over the manager's work when the manager fails and restores its data; the backup server for the metadata servers restores a metadata server node's data when that node fails.
The backup servers provide the reliability assurance of the metadata management system. In practice they can work as follows:
The backup server of the metadata server manager, backNNM, uses a redundancy mechanism; the backup server of the metadata servers, backNN, uses a log mechanism.
The primary NNM and backNNM run the same program at the same time and communicate over the network through their read/write modules. Under normal conditions backNNM is mainly responsible for monitoring the primary NNM and analyzes the state of the primary NNM through its message processing module. The primary NNM periodically sends heartbeat messages to backNNM to report its own state. If backNNM receives no heartbeat message from the primary NNM for more than one period, the primary NNM is considered to have failed; backNNM then takes over all of the primary NNM's work, provides service to the system, and recovers the primary NNM. Once the primary NNM has recovered, it sends a heartbeat message to backNNM to report that it is back to normal and takes all work back from backNNM, and backNNM returns to the listening state. This strategy ensures that service is not interrupted.
An NN (metadata server) likewise interacts with backNN through a communication module. After the NN and backNN establish a connection, they send and receive data, which mainly consists of log files and metadata images. backNN then uses its synthesis module to merge the log files and the metadata image in memory into a new metadata image file. This strategy can cause a service interruption. See Fig. 4 for details.
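The heartbeat-based takeover between the primary NNM and backNNM can be sketched as a small state machine. In this minimal Python sketch, timestamps are passed in explicitly so the behavior is deterministic; the class name, method names, and the period value are assumptions, and recovery of the primary's data is left out.

```python
class BackupManager:
    """Simplified model of backNNM monitoring the primary NNM."""

    def __init__(self, period: float):
        self.period = period          # heartbeat period, in seconds
        self.last_heartbeat = 0.0     # time of the most recent heartbeat
        self.active = False           # True while serving in place of the primary

    def on_heartbeat(self, now: float) -> None:
        """Record a heartbeat; a live primary takes its work back."""
        self.last_heartbeat = now
        self.active = False           # return to the listening state

    def check(self, now: float) -> bool:
        """Take over if no heartbeat arrived for more than one period."""
        if now - self.last_heartbeat > self.period:
            self.active = True
        return self.active


backup = BackupManager(period=5.0)
backup.on_heartbeat(0.0)
listening = backup.check(3.0)      # within one period: keep listening
took_over = backup.check(6.0)      # more than one period: take over
backup.on_heartbeat(7.0)           # primary recovered and reports in
after_recovery = backup.check(8.0)
print(listening, took_over, after_recovery)
```

A production implementation would also need to guard against both managers serving at once when heartbeats are merely delayed by the network, which the text does not address.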
The technical solution disclosed by the present invention achieves the following beneficial effects: the distributed metadata management method and system provided by the embodiments of the present invention combine static load balancing when metadata is distributed with dynamic load balancing while the system runs, which improves the utilization of metadata service resources, keeps the system load balanced, and improves the scalability of the system; in addition, the delayed metadata movement scheme based on the directory redirection table solves the problem of large-scale metadata movement caused by rename operations and keeps system efficiency stable.
Specifically, distributing metadata with a consistent hash function optimized with virtual nodes achieves static load balancing at distribution time; when the load balance degree of a metadata server node exceeds the set threshold, metadata migration achieves dynamic load balancing; when the overall load of the system exceeds the set threshold, adding metadata server nodes reduces the system load; and when a rename operation would cause large-scale metadata movement, the delayed metadata movement scheme based on the directory redirection table keeps system efficiency stable.
The embodiments in this specification are described progressively; each embodiment focuses on its differences from the others, and for the parts that are the same or similar the embodiments may be referred to one another.
Those skilled in the art should understand that the order of the method steps provided in the above embodiments can be adapted to actual conditions, and that the steps can also be carried out concurrently where conditions allow.
All or part of the steps in the methods of the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a storage medium readable by a computer device and is used to perform all or part of the steps described in the methods of the above embodiments. Examples of the computer device include personal computers, servers, network devices, smart mobile terminals, smart home devices, wearable smart devices, and in-vehicle smart devices; examples of the storage medium include RAM, ROM, magnetic disks, magnetic tapes, optical discs, flash memory, USB flash drives, portable hard drives, memory cards, memory sticks, network server storage, and network cloud storage.
Finally, it should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", and any variants of them are intended to cover non-exclusive inclusion, so that a process, method, article, or device that comprises a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, method, article, or device. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that comprises it.
The above is only a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art can make further improvements and modifications without departing from the principles of the present invention, and such improvements and modifications shall also be regarded as falling within the protection scope of the present invention.