CN103729436A - Distributed metadata management method and system - Google Patents


Publication number
CN103729436A
Authority
CN
China
Prior art keywords
node
metadata
data
primary copy
check
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310741599.XA
Other languages
Chinese (zh)
Inventor
王海平
王树鹏
张永铮
吴广君
周晓阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS
Priority to CN201310741599.XA
Publication of CN103729436A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases

Abstract

The invention relates to a distributed metadata management method and system. The method comprises a storing step: partitioning independent metadata nodes for storing metadata and independent user table nodes for storing user tables, and using a plurality of metadata nodes to store multiple replicas of the metadata, forming a primary replica node and secondary replica nodes that all store the same metadata; a checking step: performing data checks on the primary replica node and the secondary replica nodes to ensure that the metadata they store is consistent; and a repairing step: using ZooKeeper to build a monitoring ring based on the primary replica node and the secondary replica nodes, so that when the ring detects that a primary or secondary replica node has failed, it triggers switching between the primary and secondary replica nodes to repair the failed node. The distributed metadata management system corresponds one-to-one to the technical schemes of the method. The method and system solve the single point of failure problem in metadata management and achieve consistency among multiple replicas.

Description

A distributed metadata management method and system
Technical Field
The invention belongs to the field of mass data storage management, and in particular relates to metadata management for big data storage. It provides a distributed metadata management method and system.
Background
In recent years, with the development of the information society, more and more information has been digitized, and with the growth of the Internet in particular, data volumes have grown explosively. First, storage capacity has expanded sharply, creating greater demand for storage servers; second, data must be kept for longer; finally, higher requirements are placed on the management of data storage. In particular, the diversity of data, its geographic dispersion, and the protection of important data all place higher demands on data management. With the explosion of unstructured data, distributed databases have also entered a golden period of development: from high-performance computing to data centers, and from data sharing to Internet applications, they have penetrated every aspect of data applications. Most distributed databases keep metadata separate from data, that is, they separate the control flow from the data flow, in order to obtain better scalability and I/O concurrency. The metadata management model is therefore critical, since it directly affects scalability, performance, reliability, and stability.
Storage capacity grows without end, which also raises the requirements on metadata management. In distributed storage, many machines read and write the metadata tables at the same time, so metadata management must provide a highly stable, high-performance metadata service. Existing metadata management approaches fall roughly into three classes: centralized metadata management, approaches without a metadata service, and distributed metadata management. A centralized metadata service provides one central metadata server that is responsible for metadata storage and client query requests; it provides a unified namespace and handles access control functions such as location resolution and data location. Its shortcomings are prominent, the two most critical being the performance bottleneck and the single point of failure. An approach without a metadata service uses an elastic hash algorithm and drops the metadata service altogether, storing metadata together with the data. This makes data consistency more complex, makes read and write operations inefficient, and leaves no global monitoring and management function. It also pushes more functions onto the client, which increases client load and consumes considerable CPU and memory. Traditional distributed metadata management uses multiple servers that form a cluster and cooperate to provide a metadata service for the distributed database, which eliminates the performance bottleneck and single point of failure of the centralized model, as well as the inefficiency and the difficulty of global monitoring of the approach without a metadata service. However, traditional distributed metadata management has its own defects, such as performance overhead and consistency among multiple replicas.
Therefore, in view of the limitations of metadata management in the prior art, the present invention proposes a new distributed metadata management method and system.
Summary of the Invention
The technical problem to be solved by the present invention is to provide a distributed metadata management method and system that address problems of existing metadata management such as the single point of failure and inconsistency among multiple replicas.
The technical scheme by which the present invention solves the above technical problem is as follows: a distributed metadata management method, comprising the following steps:
Storing step: partition independent metadata nodes and user table nodes, used to store metadata and user tables respectively, and use a plurality of metadata nodes to store multiple replicas of the metadata, forming a primary replica node and secondary replica nodes that all store the same metadata;
Checking step: perform data checks on the primary replica node and the secondary replica nodes to ensure that the metadata they store is consistent;
Repairing step: use ZooKeeper to build a monitoring ring based on the primary replica node and the secondary replica nodes; when the monitoring ring detects that the primary replica node or a secondary replica node has gone down, it triggers switching between the primary replica node and a secondary replica node to repair the failed node.
On the basis of the above technical scheme, the present invention can be further improved as follows.
Further, the storing step also comprises expanding the metadata nodes in a dynamic mode or a static mode;
The dynamic mode comprises: adding an empty metadata node and, once the empty metadata node has been discovered, transferring metadata to it by means of a check;
The static mode comprises: shutting down all metadata nodes, then adding a new metadata node, and modifying its configuration when the newly added metadata node starts.
Further, the data check on the primary replica node and the secondary replica nodes uses a lightweight data check, which comprises: when a metadata node starts, sending a request to all metadata nodes to obtain the record count of each metadata table shard on each metadata node; if the record counts differ, the data is inconsistent, so the data shard service on the non-conforming metadata node is shut down, the data of that shard is deleted, and a replica repair operation is triggered.
Further, the data check on the primary replica node and the secondary replica nodes uses a periodic data shard file check, which comprises: each metadata node periodically checks whether the data files of the data shards it maintains have been lost; if a loss is found, it stops the data service of that shard on the current node, deletes the data of that shard, and immediately triggers a replica repair operation.
Further, the data check on the primary replica node and the secondary replica nodes uses a periodic check between the different replicas of a data shard: the primary replica node obtains its own chunking basis and sends the chunking basis together with a check request to the secondary replica nodes; the primary replica node and the secondary replica nodes each compute md5 values according to this chunking basis and store them in a check_map; each secondary replica node returns its check_map to the primary replica node, which compares the received check_map with its own; if the data of all secondary replica nodes matches, the data is considered consistent, otherwise the primary replica's data prevails.
Further, in the repairing step, whether the primary replica node or a secondary replica node has gone down is determined by judging whether the session between the metadata node and ZooKeeper has expired: if the session has expired, the node is down; otherwise it is not.
Corresponding to the above distributed metadata management method, the technical scheme of the present invention also comprises a distributed metadata management system, comprising the following modules:
Storage module, for partitioning independent metadata nodes and user table nodes, used to store metadata and user tables respectively, and for using a plurality of metadata nodes to store multiple replicas of the metadata, forming a primary replica node and secondary replica nodes that all store the same metadata;
Check module, for performing data checks on the primary replica node and the secondary replica nodes to ensure that the metadata they store is consistent;
Repair module, for using ZooKeeper to build a monitoring ring based on the primary replica node and the secondary replica nodes; when the monitoring ring detects that the primary replica node or a secondary replica node has gone down, it triggers switching between the primary replica node and a secondary replica node to repair the failed node.
Further, the storage module is also used to expand the metadata nodes in a dynamic mode or a static mode;
The dynamic mode comprises: adding an empty metadata node and, once the empty metadata node has been discovered, transferring metadata to it by means of a check;
The static mode comprises: shutting down all metadata nodes, then adding a new metadata node, and modifying its configuration when the newly added metadata node starts.
Further, the check module comprises a lightweight data check module, a periodic data shard file check module, and a periodic data shard replica check module;
The lightweight data check module is used to: when a metadata node starts, send a request to all metadata nodes to obtain the record count of each metadata table shard on each metadata node; if the record counts differ, the data is inconsistent, so the data shard service on the non-conforming metadata node is shut down, the data of that shard is deleted, and a replica repair operation is triggered.
The periodic data shard file check module is used to: have each metadata node periodically check whether the data files of the data shards it maintains have been lost; if a loss is found, stop the data service of that shard on the current node, delete the data of that shard, and immediately trigger a replica repair operation.
The periodic data shard replica check module is used to: have the primary replica node obtain its own chunking basis and send the chunking basis together with a check request to the secondary replica nodes; the primary replica node and the secondary replica nodes each compute md5 values according to this chunking basis and store them in a check_map; each secondary replica node returns its check_map to the primary replica node, which compares the received check_map with its own; if the data of all secondary replica nodes matches, the data is considered consistent, otherwise the primary replica's data prevails.
Further, in the repair module, whether the primary replica node or a secondary replica node has gone down is determined by judging whether the session between the metadata node and ZooKeeper has expired: if the session has expired, the node is down; otherwise it is not.
The beneficial effects of the invention are as follows. Metadata is stored on different nodes from the user tables, so a high load on the user table nodes does not affect metadata reads and writes, which improves the stability and efficiency of metadata access. The invention also supports dynamic expansion of metadata nodes and multi-replica storage of metadata, which reduces the risk of metadata node downtime. The invention includes a metadata check stage that keeps the metadata stored on each replica node consistent and keeps the performance of the metadata cluster stable. In addition, because multiple replicas are stored, when an abnormal metadata node is killed the other available metadata nodes can quickly complete promotion and repair, which avoids the single point of failure that readily occurs in metadata management.
Brief Description of the Drawings
Fig. 1 is a schematic flow chart of the distributed metadata management method of the present invention;
Fig. 2 is a schematic diagram of dynamically adding a metadata node in an embodiment of the present invention;
Fig. 3 is a schematic diagram of a metadata node triggering replica repair in an embodiment of the present invention;
Fig. 4 is a schematic flow chart of the periodic data check between the different replicas of a data shard in an embodiment of the present invention;
Fig. 5 is a schematic diagram of the monitoring ring promotion and repair process in an embodiment of the present invention;
Fig. 6 is a schematic diagram of the distributed metadata management system of the present invention.
Detailed Description
The principles and features of the present invention are described below with reference to the accompanying drawings. The examples serve only to explain the present invention and are not intended to limit its scope.
As shown in Fig. 1, this embodiment provides a distributed metadata management method, comprising the following steps:
Storing step: partition independent metadata nodes and user table nodes, used to store metadata and user tables respectively, and use a plurality of metadata nodes to store multiple replicas of the metadata, forming a primary replica node and secondary replica nodes that all store the same metadata;
Checking step: perform data checks on the primary replica node and the secondary replica nodes to ensure that the metadata they store is consistent;
Repairing step: use ZooKeeper to build a monitoring ring based on the primary replica node and the secondary replica nodes; when the monitoring ring detects that the primary replica node or a secondary replica node has gone down, it triggers switching between the primary replica node and a secondary replica node to repair the failed node.
Based on these three steps, the implementation of this embodiment is divided into the following three parts.
I. Metadata storage and metadata node expansion
Metadata and user table data are stored separately, and the metadata nodes support both dynamic and static expansion. Because metadata and user tables are stored on different nodes, a high load on the user table nodes does not affect metadata reads and writes, which improves the stability and efficiency of metadata access.
During metadata storage, a plurality of metadata nodes store multiple replicas of the metadata to ensure fault tolerance. A primary/secondary replica mechanism is introduced to reduce the workload of the Master node and to increase the cluster scale of the mass storage system. In addition, metadata nodes may need to be expanded according to actual conditions; expansion can be performed in a dynamic mode or a static mode. The dynamic mode comprises: adding an empty metadata node and, once the empty metadata node has been discovered, transferring metadata to it by means of a check. The static mode comprises: shutting down all metadata nodes, then adding a new metadata node, and modifying its configuration when the newly added metadata node starts.
Fig. 2 shows the processing flow when a metadata node is added dynamically. META_RS_01 and META_RS_02 denote metadata nodes that hold data, META_RS_03 denotes a newly added metadata node that holds no data, and Master is the management process responsible for all data nodes. When the new metadata node META_RS_03 starts, it actively registers with the Master node as a connection object, and the Master saves the data structure of this connection object. The Master periodically scans the connection-object data structures it has saved to determine whether any node has registered recently. If so, it performs the following operations: it takes out the connection object, registers the new metadata node with ZK (ZK is short for ZooKeeper), and updates the ring configuration of the metadata nodes maintained by ZK (that is, the monitoring ring built through ZooKeeper). As shown in Fig. 3, a periodically triggered thread in the Master takes out the saved connection-object data structure, triggers replica repair, and sends the connection-object data structure to the primary replica node. The primary replica node performs the replica repair operation: it transfers the data of the data shards of all its metadata tables to the new metadata node and starts the corresponding replica service, so that the new metadata node becomes a secondary replica node.
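The registration and periodic-scan flow above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: the `Master` class, its field names, and the representation of a connection object as a plain node name are hypothetical, and appending to `ring` merely stands in for registering the node with ZooKeeper.

```python
from dataclasses import dataclass, field


@dataclass
class Master:
    """Toy stand-in for the Master management process (hypothetical API)."""
    primary: str
    ring: list = field(default_factory=list)     # nodes in the monitoring ring
    pending: list = field(default_factory=list)  # saved connection objects
    repairs: list = field(default_factory=list)  # (primary, new_node) repair requests

    def register(self, node: str) -> None:
        # A newly started metadata node registers itself with the Master.
        self.pending.append(node)

    def scan(self) -> None:
        # Periodic scan: process every node that registered since the last scan.
        while self.pending:
            node = self.pending.pop(0)
            self.ring.append(node)                     # stands in for registering with ZK
            self.repairs.append((self.primary, node))  # ask the primary to copy its shards


master = Master(primary="META_RS_01", ring=["META_RS_01", "META_RS_02"])
master.register("META_RS_03")
master.scan()
```

After the scan, `master.repairs` holds one request asking META_RS_01 to transfer its shard data to META_RS_03, which then acts as a secondary replica node.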
Thus, metadata nodes are expanded to meet the need for multi-replica storage of metadata and to further prevent a metadata node from going down because it stores too much data. The newly added metadata node serves as a secondary replica node, and the node that originally stored the metadata serves as the primary replica node, which helps in subsequently handling metadata node failures.
II. Data check
Once multi-replica storage is adopted, the consistency of the data on the primary replica node and the secondary replica nodes must be considered, so data checks are performed on the primary replica node and the secondary replica nodes. No external read/write service is provided while a data check is in progress. The check performed when a metadata node restarts is a lightweight check, and the periodic checks performed during operation are memory-level checks.
This embodiment mainly uses three kinds of checks:
First, the lightweight data check, which comprises: when a metadata node starts, sending a request to all metadata nodes to obtain the record count of each metadata table shard on each metadata node; if the record counts differ, the data is inconsistent, so the data shard service on the non-conforming metadata node is shut down, the data of that shard is deleted, and a replica repair operation is triggered. The lightweight check on restart also serves to confirm the state after the restart.
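A minimal sketch of the record-count comparison follows. The text does not say which node counts as non-conforming when counts differ; this sketch assumes the primary replica's counts are the reference, in line with the rule used for the replica check that the primary's data prevails. The function name and the dictionary layout are hypothetical.

```python
def lightweight_check(primary, counts_by_node):
    """Return {node: [shards to close, delete, and repair]}.

    counts_by_node maps node name -> {shard id: record count}.
    Assumption: the primary's record counts are taken as the reference.
    """
    reference = counts_by_node[primary]
    to_repair = {}
    for node, counts in counts_by_node.items():
        if node == primary:
            continue
        # A shard is non-conforming if its count is missing or differs.
        bad = sorted(s for s, n in reference.items() if counts.get(s) != n)
        if bad:
            to_repair[node] = bad
    return to_repair


counts = {
    "META_RS_01": {"shard_0": 10, "shard_1": 7},
    "META_RS_02": {"shard_0": 10, "shard_1": 7},
    "META_RS_03": {"shard_0": 10, "shard_1": 5},
}
result = lightweight_check("META_RS_01", counts)
```

Comparing counts only is cheap, which is why it suits node startup, but it cannot detect replicas that hold the same number of records with different contents; the md5-based check covers that case.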
Second, the periodic data shard file check, which comprises: each metadata node periodically checks whether the data files of the data shards it maintains have been lost; if a loss is found, it stops the data service of that shard on the current node, deletes the data of that shard, and immediately triggers a replica repair operation.
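The file check can be sketched as a simple existence test over the files a node believes it owns. The mapping from shard id to file paths and the function name are assumptions made for illustration; the patent does not describe the on-disk layout.

```python
import tempfile
from pathlib import Path


def missing_shards(shard_files):
    """Return the sorted ids of shards with at least one missing data file.

    shard_files maps shard id -> list of file paths (hypothetical layout).
    """
    return sorted(
        shard for shard, files in shard_files.items()
        if not all(Path(f).is_file() for f in files)
    )


with tempfile.TemporaryDirectory() as data_dir:
    present = Path(data_dir) / "shard_0.dat"
    present.write_bytes(b"records")
    files = {
        "shard_0": [present],
        "shard_1": [Path(data_dir) / "shard_1.dat"],  # never created, so "lost"
    }
    lost = missing_shards(files)
```

A node would run such a test on a timer and, for every shard id returned, stop serving that shard, delete its remaining data, and request replica repair.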
Third, the periodic data check between the different replicas of a data shard. As shown in Fig. 4, there are three replica nodes: one primary replica node and two secondary replica nodes. The primary replica node obtains its own check_set, which stores the chunking basis, and sends the check_set together with a check request to the secondary replica nodes. The primary replica node and the secondary replica nodes each compute md5 values according to this chunking basis and store them in a check_map. Each secondary replica node returns its check_map to the primary replica node, which compares the received check_map with its own, completing the check across the three replica nodes. If the data of all secondary replica nodes matches, the data is considered consistent; otherwise the primary replica's data prevails. check_map is a variable with a mapping structure: its key identifies the current data shard, and its value is the md5 value of that data shard.
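The check_map exchange can be illustrated as follows. In this sketch each shard is represented as a byte string and the check_set is taken to be a list of shard ids; both are simplifying assumptions, and the function names are hypothetical. Only `hashlib.md5` from the Python standard library is used.

```python
import hashlib


def build_check_map(shards, check_set):
    """Map each shard id in check_set to the md5 hex digest of its data."""
    return {sid: hashlib.md5(shards.get(sid, b"")).hexdigest() for sid in check_set}


def find_mismatches(primary_map, secondary_maps):
    """Return {secondary node: [shard ids whose md5 differs from the primary]}."""
    mismatches = {}
    for node, check_map in secondary_maps.items():
        bad = sorted(sid for sid, digest in primary_map.items()
                     if check_map.get(sid) != digest)
        if bad:
            mismatches[node] = bad
    return mismatches


check_set = ["shard_0", "shard_1"]
primary_data = {"shard_0": b"alpha", "shard_1": b"beta"}
secondary_data = {
    "META_RS_02": {"shard_0": b"alpha", "shard_1": b"beta"},
    "META_RS_03": {"shard_0": b"alpha", "shard_1": b"stale"},
}

primary_map = build_check_map(primary_data, check_set)
secondary_maps = {node: build_check_map(data, check_set)
                  for node, data in secondary_data.items()}
mismatches = find_mismatches(primary_map, secondary_maps)
```

Because the primary's data prevails, every shard listed in `mismatches` would be overwritten on that secondary replica node with the primary's copy.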
III. Repair and promotion
Once the data on the primary replica node and the secondary replica nodes is consistent, the primary replica node and the secondary replica nodes are used to handle metadata node failures.
This embodiment supports promotion and repair of secondary replica nodes. At startup, a monitoring ring is built through ZooKeeper; when a metadata process dies abnormally, promotion and repair are triggered quickly according to the role and number of the dead processes. ZooKeeper is a reliable coordination system for large-scale distributed systems. Whether the primary replica node or a secondary replica node has gone down is determined by judging whether the session between the metadata node and ZooKeeper has expired: if the session has expired, the node is down; otherwise it is not.
When the primary replica node of the metadata goes down, the first secondary replica node is selected to take over the work of the primary replica node, so that external service of the metadata tables is not interrupted. ZK (short for ZooKeeper) is used to monitor the nodes and to trigger the switch from the metadata primary replica to a secondary replica. The main steps are as follows:
1) In ZK, create the directory structure /root node/parent node/ephemeral node.
2) Each META replica establishes a session with ZK and creates an ephemeral node under the parent node, writing its own agent address into it; if the session expires, this ephemeral node disappears.
3) As an example, as shown in Fig. 5, META_RS_01 is the primary replica node, and META_RS_02, META_RS_03, and META_RS_04 are secondary replica nodes. META_RS_02 monitors whether the ephemeral node of META_RS_01 exists; if the session of the primary replica node expires, the ephemeral node corresponding to the primary replica node disappears. At that point, the first secondary replica node, META_RS_02, is promoted to primary replica node. After being promoted, META_RS_02 switches to monitoring the last secondary replica node, META_RS_04, as shown by the dashed line in the figure.
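The promotion behaviour in step 3 can be sketched without a ZooKeeper server by simulating the ring in memory. The text states only that META_RS_02 watches META_RS_01 and, once promoted, watches META_RS_04; that every node watches its predecessor in the ring is an assumption made here to generalize those two facts. The class and method names are hypothetical, and `session_expired` stands in for the disappearance of a node's ephemeral node.

```python
class MonitorRing:
    """In-memory model of the monitoring ring; nodes[0] is the primary replica."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    @property
    def primary(self):
        return self.nodes[0]

    def watched_by(self, node):
        # Assumption: each node watches its predecessor; the primary wraps
        # around and watches the last secondary replica node.
        index = self.nodes.index(node)
        return self.nodes[index - 1]

    def session_expired(self, node):
        # Stands in for the ephemeral node vanishing when the ZK session expires.
        # Removing the primary makes the first secondary the new primary.
        self.nodes.remove(node)


ring = MonitorRing(["META_RS_01", "META_RS_02", "META_RS_03", "META_RS_04"])
before = ring.watched_by("META_RS_02")   # the primary, META_RS_01
ring.session_expired("META_RS_01")
after = ring.watched_by("META_RS_02")    # now the last secondary, META_RS_04
```

In a real deployment the watch would be a ZooKeeper watch on the predecessor's ephemeral node, so each failure notifies exactly one surviving node rather than the whole cluster.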
The method of this embodiment is effective. Without it, monitoring shows that after a metadata node is killed, loading cannot complete: it stops at about 89% and errors begin to be reported, and subsequent query, insert, update, and delete operations on user data and metadata cannot be performed. With it, after a metadata node is killed, as long as one metadata node remains, loading resumes after a brief pause (about 30 seconds in the test) and continues to 100%, and subsequent operations on user data and metadata work normally.
As shown in Fig. 6, corresponding to the above distributed metadata management method, the technical scheme of the present invention also comprises a distributed metadata management system, comprising the following modules:
Storage module, for partitioning independent metadata nodes and user table nodes, used to store metadata and user tables respectively, and for using a plurality of metadata nodes to store multiple replicas of the metadata, forming a primary replica node and secondary replica nodes that all store the same metadata;
Check module, for performing data checks on the primary replica node and the secondary replica nodes to ensure that the metadata they store is consistent;
Repair module, for using ZooKeeper to build a monitoring ring based on the primary replica node and the secondary replica nodes; when the monitoring ring detects that the primary replica node or a secondary replica node has gone down, it triggers switching between the primary replica node and a secondary replica node to repair the failed node.
In this embodiment, the storage module is also used to expand the metadata nodes in a dynamic mode or a static mode. The dynamic mode comprises: adding an empty metadata node and, once the empty metadata node has been discovered, transferring metadata to it by means of a check. The static mode comprises: shutting down all metadata nodes, then adding a new metadata node, and modifying its configuration when the newly added metadata node starts.
Also as shown in Fig. 6, the check module comprises a lightweight data check module, a periodic data shard file check module, and a periodic data shard replica check module;
The lightweight data check module is used to: when a metadata node starts, send a request to all metadata nodes to obtain the record count of each metadata table shard on each metadata node; if the record counts differ, the data is inconsistent, so the data shard service on the non-conforming metadata node is shut down, the data of that shard is deleted, and a replica repair operation is triggered.
The periodic data shard file check module is used to: have each metadata node periodically check whether the data files of the data shards it maintains have been lost; if a loss is found, stop the data service of that shard on the current node, delete the data of that shard, and immediately trigger a replica repair operation.
The periodic data shard replica check module is used to: have the primary replica node obtain its own chunking basis and send the chunking basis together with a check request to the secondary replica nodes; the primary replica node and the secondary replica nodes each compute md5 values according to this chunking basis and store them in a check_map; each secondary replica node returns its check_map to the primary replica node, which compares the received check_map with its own; if the data of all secondary replica nodes matches, the data is considered consistent, otherwise the primary replica's data prevails.
In addition, in the repair module, whether the primary replica node or a secondary replica node has gone down is determined by judging whether the session between the metadata node and ZooKeeper has expired: if the session has expired, the node is down; otherwise it is not.
The specific implementation of the distributed metadata management system is consistent with the distributed metadata management method described above and is not repeated here.
The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. a distributed meta-data management method, is characterized in that, specifically comprises the following steps:
Storing step: divide independently metadata node and subscriber's meter node, be respectively used to storing metadata and subscriber's meter, and adopt many copies of a plurality of metadata node storing metadatas, form all for storing the primary copy node of same metadata and from replica node;
Checking procedure: carry out data check to primary copy node with from replica node, with the consistance of the metadata that guarantees primary copy node and store from replica node;
Repair step: adopt ZooKeeper technology to set up based on primary copy node with from the supervision ring of replica node, when monitoring that ring has monitored primary copy node or delayed machine from replica node, it triggers primary copy node and from the switching between replica node, realizes the reparation to the machine node of delaying.
2. distributed meta-data management method according to claim 1, is characterized in that, described storing step also comprises employing dynamical fashion or static mode extended metadata node;
Dynamical fashion specifically comprises: increase the empty node of metadata, find after the empty node of metadata, to the empty node transmission unit of the metadata of finding data by verification;
Static mode specifically comprises: after all metadata node shutdown, then increase new metadata node, and revise its configuration when this newly-increased metadata node starts.
3. distributed meta-data management method according to claim 1, it is characterized in that, described to primary copy node with carry out data check from replica node and adopt lightweight data check mode, specifically comprise: when metadata node starts, to all metadata node, send request, obtain the number that records of each metadata table burst in each metadata node, if it is inconsistent to record number, illustrated that data are inconsistent, close the data fragmentation service in ineligible metadata node, delete the data of this data fragmentation simultaneously, and trigger copy reparation operation.
4. The distributed metadata management method according to claim 1, characterized in that the data check on the primary copy node and the secondary copy nodes uses a periodic shard file check, specifically comprising: each metadata node periodically checks whether the data files of the shards it maintains have been lost; if a loss is found, the data service for that shard on the current node is stopped, the data of that shard is deleted, and a replica repair operation is triggered immediately.
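One pass of the periodic shard file check might look like the following Python sketch. It assumes that each shard maps to a known list of data file paths and that a set tracks which shards the node is currently serving; neither detail is given in the claim, and `find_lost_shards`, `check_once`, and the file names are hypothetical. A real node would run the pass on a timer and hand the returned shard ids to its replica repair routine.

```python
import os
import tempfile


def find_lost_shards(shard_files):
    """Return ids of shards for which at least one data file is missing."""
    return sorted(
        shard
        for shard, paths in shard_files.items()
        if any(not os.path.isfile(path) for path in paths)
    )


def check_once(shard_files, serving):
    """Run one check pass and return the shards that need replica repair."""
    lost = find_lost_shards(shard_files)
    for shard in lost:
        serving.discard(shard)            # stop serving the shard on this node
        for path in shard_files[shard]:   # delete whatever data remains
            if os.path.isfile(path):
                os.remove(path)
    return lost


# Demonstration on a temporary directory with made-up file names.
with tempfile.TemporaryDirectory() as data_dir:
    shard_files = {
        "shard-0": [
            os.path.join(data_dir, "shard-0.dat"),
            os.path.join(data_dir, "shard-0.idx"),
        ],
        "shard-1": [os.path.join(data_dir, "shard-1.dat")],
    }
    for paths in shard_files.values():
        for path in paths:
            with open(path, "w") as handle:
                handle.write("metadata")
    os.remove(shard_files["shard-0"][1])  # simulate a lost data file
    serving = {"shard-0", "shard-1"}
    lost = check_once(shard_files, serving)
    leftovers = [p for p in shard_files["shard-0"] if os.path.exists(p)]
```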
5. The distributed metadata management method according to claim 1, characterized in that the data check on the primary copy node and the secondary copy nodes uses a periodic check between the different copies of a data shard: the primary copy node obtains its own chunking basis and sends the chunking basis together with a check request to the secondary copy nodes; the primary copy node and the secondary copy nodes each compute MD5 values according to this chunking basis and store the MD5 values in a check_map; each secondary copy node returns its check_map to the primary copy node; the primary copy node receives the check_map from each secondary copy and compares it with its own check_map; if the data of all secondary copy nodes is consistent, the data is considered consistent; otherwise the primary copy's data prevails.
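The check_map comparison described in this claim can be sketched in Python as follows. The claim does not define the chunking basis, so the sketch assumes it is simply a block size counted in records, applied to the shard's records in sorted order. `build_check_map`, `compare_with_replicas`, and the sample records are hypothetical; only the name check_map and the use of MD5 come from the claim.

```python
import hashlib


def build_check_map(records, chunk_size):
    """Hash the sorted records in blocks; return {block index: md5 hex digest}."""
    ordered = sorted(records)
    check_map = {}
    for start in range(0, len(ordered), chunk_size):
        block = "\n".join(ordered[start:start + chunk_size]).encode("utf-8")
        check_map[start // chunk_size] = hashlib.md5(block).hexdigest()
    return check_map


def compare_with_replicas(primary_map, replica_maps):
    """Return {replica name: sorted block indices that differ from the primary}."""
    diffs = {}
    for name, replica_map in replica_maps.items():
        blocks = set(primary_map) | set(replica_map)
        bad = sorted(b for b in blocks if primary_map.get(b) != replica_map.get(b))
        if bad:
            diffs[name] = bad
    return diffs


chunk_size = 2  # the "chunking basis" sent along with the check request (assumed)
primary_records = ["/a", "/b", "/c", "/d", "/e"]
replica_records = {
    "replica-1": ["/a", "/b", "/c", "/d", "/e"],
    "replica-2": ["/a", "/b", "/c", "/d-stale", "/e"],
}

primary_map = build_check_map(primary_records, chunk_size)
replica_maps = {
    name: build_check_map(records, chunk_size)
    for name, records in replica_records.items()
}
diffs = compare_with_replicas(primary_map, replica_maps)
```

An empty result means all copies agree. Any block listed for a replica would be overwritten from the primary, since the claim states that the primary copy's data prevails. Position-based blocks are a simplification: an inserted or deleted record shifts every later block, which is one reason a real system might chunk by key range instead.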
6. The distributed metadata management method according to claim 1, characterized in that, in the repair step, whether the primary copy node or a secondary copy node has gone down is determined by judging whether the session between the metadata node and ZooKeeper has expired: if the session has expired, the node is down; otherwise the node is not down.
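The session-expiry rule and the resulting switchover can be modelled with a small, self-contained Python simulation. It does not use a ZooKeeper client: the session timeout, the heartbeat bookkeeping, and the choice of the first live replica as the new primary are all assumptions made for illustration, and `ReplicaGroup` is a hypothetical name. In an actual deployment the ZooKeeper server decides when a session has expired, and ephemeral znodes are a common way to make that visible to the other nodes.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaGroup:
    primary: str
    secondaries: list
    session_timeout: float = 10.0
    last_heartbeat: dict = field(default_factory=dict)

    def heartbeat(self, node, now):
        """Record that the node's session was alive at time `now`."""
        self.last_heartbeat[node] = now

    def session_expired(self, node, now):
        """A node with no heartbeat inside the timeout is judged to be down."""
        last = self.last_heartbeat.get(node)
        return last is None or now - last > self.session_timeout

    def check(self, now):
        """Return the nodes judged down; switch primary if the primary is down."""
        members = [self.primary] + self.secondaries
        down = [node for node in members if self.session_expired(node, now)]
        if self.primary in down:
            live = [node for node in self.secondaries if node not in down]
            if live:
                new_primary = live[0]
                self.secondaries.remove(new_primary)
                # The failed primary rejoins as a replica awaiting repair.
                self.secondaries.append(self.primary)
                self.primary = new_primary
        return down


group = ReplicaGroup(primary="meta-1", secondaries=["meta-2", "meta-3"])
for node in ("meta-1", "meta-2", "meta-3"):
    group.heartbeat(node, now=0.0)
group.heartbeat("meta-2", now=8.0)
group.heartbeat("meta-3", now=8.0)

down = group.check(now=12.0)  # meta-1 has been silent for 12 s, beyond the 10 s timeout
```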
7. A distributed metadata management system, characterized in that it specifically comprises the following modules:
A storage module, configured to partition metadata nodes and user-table nodes independently, for storing metadata and user tables respectively, and to store multiple copies of the metadata on multiple metadata nodes, forming a primary copy node and secondary copy nodes that all store the same metadata;
A check module, configured to perform data checks on the primary copy node and the secondary copy nodes to ensure that the metadata they store is consistent;
A repair module, configured to use ZooKeeper to establish a monitoring ring over the primary copy node and the secondary copy nodes; when the monitoring ring detects that the primary copy node or a secondary copy node has gone down, it triggers a switchover between the primary copy node and a secondary copy node, thereby repairing the failed node.
8. The distributed metadata management system according to claim 7, characterized in that the storage module is further configured to expand the metadata nodes in a dynamic mode or a static mode;
The dynamic mode specifically comprises: adding an empty metadata node and, once the empty metadata node has been discovered, transferring metadata to it by means of the check procedure;
The static mode specifically comprises: after all metadata nodes have been shut down, adding a new metadata node, and modifying its configuration when the newly added metadata node starts.
9. The distributed metadata management system according to claim 7, characterized in that the check module comprises a lightweight data check module, a periodic shard file check module, and a periodic shard copy check module;
The lightweight data check module is configured to: when a metadata node starts, send a request to all metadata nodes to obtain the record count of each metadata table shard on each metadata node; if the record counts are inconsistent, the data is inconsistent, in which case the shard service on the non-conforming metadata node is shut down, the data of that shard is deleted, and a replica repair operation is triggered;
The periodic shard file check module is configured to: have each metadata node periodically check whether the data files of the shards it maintains have been lost; if a loss is found, the data service for that shard on the current node is stopped, the data of that shard is deleted, and a replica repair operation is triggered immediately;
The periodic shard copy check module is configured to: have the primary copy node obtain its own chunking basis and send the chunking basis together with a check request to the secondary copy nodes; the primary copy node and the secondary copy nodes each compute MD5 values according to this chunking basis and store the MD5 values in a check_map; each secondary copy node returns its check_map to the primary copy node; the primary copy node receives the check_map from each secondary copy and compares it with its own check_map; if the data of all secondary copy nodes is consistent, the data is considered consistent; otherwise the primary copy's data prevails.
10. The distributed metadata management system according to claim 7, characterized in that, in the repair module, whether the primary copy node or a secondary copy node has gone down is determined by judging whether the session between the metadata node and ZooKeeper has expired: if the session has expired, the node is down; otherwise the node is not down.
CN201310741599.XA 2013-12-27 2013-12-27 Distributed metadata management method and system Pending CN103729436A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310741599.XA CN103729436A (en) 2013-12-27 2013-12-27 Distributed metadata management method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310741599.XA CN103729436A (en) 2013-12-27 2013-12-27 Distributed metadata management method and system

Publications (1)

Publication Number Publication Date
CN103729436A true CN103729436A (en) 2014-04-16

Family

ID=50453510

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310741599.XA Pending CN103729436A (en) 2013-12-27 2013-12-27 Distributed metadata management method and system

Country Status (1)

Country Link
CN (1) CN103729436A (en)

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104468569A (en) * 2014-12-04 2015-03-25 北京国双科技有限公司 Integrity detection method and device of distributed data
CN105243125A (en) * 2015-09-29 2016-01-13 北京京东尚科信息技术有限公司 PrestoDB cluster running method and apparatus, cluster and data query method and apparatus
CN105550230A (en) * 2015-12-07 2016-05-04 北京奇虎科技有限公司 Method and device for detecting failure of node of distributed storage system
CN105550229A (en) * 2015-12-07 2016-05-04 北京奇虎科技有限公司 Method and device for repairing data of distributed storage system
CN105589887A (en) * 2014-10-24 2016-05-18 中兴通讯股份有限公司 Data processing method for distributed file system and distributed file system
CN105610903A (en) * 2015-12-17 2016-05-25 北京奇虎科技有限公司 Data node upgrading method and device for distributed system
CN105681401A (en) * 2015-12-31 2016-06-15 深圳前海微众银行股份有限公司 Distributed architecture
CN105892954A (en) * 2016-04-25 2016-08-24 乐视控股(北京)有限公司 Data storage method and device based on multiple copies
CN106293980A (en) * 2016-07-26 2017-01-04 乐视控股(北京)有限公司 Data recovery method and system for distributed storage cluster
CN106945691A (en) * 2017-04-10 2017-07-14 湖南中车时代通信信号有限公司 The real-time hot standby switch device of server multicenter of automatic train monitor
CN107219997A (en) * 2016-03-21 2017-09-29 阿里巴巴集团控股有限公司 A kind of method and device for being used to verify data consistency
WO2017219678A1 (en) * 2016-06-22 2017-12-28 杭州海康威视数字技术股份有限公司 Data recovery method and device, and cloud storage system
CN108259543A (en) * 2016-12-29 2018-07-06 广东中科遥感技术有限公司 Distributed cloud storage database and its be deployed in the method for multiple data centers
CN109407977A (en) * 2018-09-25 2019-03-01 佛山科学技术学院 A kind of big data distributed storage management method and system
CN109614037A (en) * 2018-11-16 2019-04-12 新华三技术有限公司成都分公司 Data routing inspection method, apparatus and distributed memory system
CN109614164A (en) * 2018-11-29 2019-04-12 深圳前海微众银行股份有限公司 Realize plug-in unit configurable method, apparatus, equipment and readable storage medium storing program for executing
CN109947730A (en) * 2017-07-25 2019-06-28 中兴通讯股份有限公司 Metadata restoration methods, device, distributed file system and readable storage medium storing program for executing
CN110471934A (en) * 2019-08-19 2019-11-19 泰康保险集团股份有限公司 Method of calibration, device, medium and the electronic equipment of business datum
CN111124301A (en) * 2019-12-18 2020-05-08 深圳供电局有限公司 Data consistency storage method and system of object storage device
CN111241011A (en) * 2019-12-31 2020-06-05 清华大学 Global address space management method of distributed persistent memory
CN111695018A (en) * 2019-03-13 2020-09-22 阿里巴巴集团控股有限公司 Data processing method and device, distributed network system and computer equipment
CN111949210A (en) * 2017-06-28 2020-11-17 华为技术有限公司 Metadata storage method, system and storage medium in distributed storage system
WO2020232859A1 (en) * 2019-05-20 2020-11-26 平安科技(深圳)有限公司 Distributed storage system, data writing method, device, and storage medium
CN112711376A (en) * 2019-10-25 2021-04-27 北京金山云网络技术有限公司 Method and device for determining object master copy file in object storage system
CN113239013A (en) * 2021-05-17 2021-08-10 北京青云科技股份有限公司 Distributed system and storage medium
CN113297173A (en) * 2021-05-24 2021-08-24 阿里巴巴新加坡控股有限公司 Distributed database cluster management method and device and electronic equipment
CN113391767A (en) * 2021-06-30 2021-09-14 北京百度网讯科技有限公司 Data consistency checking method and device, electronic equipment and readable storage medium
CN113704359A (en) * 2021-09-03 2021-11-26 优刻得科技股份有限公司 Synchronization method, system and server for multiple data copies of time sequence database

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101334797A (en) * 2008-08-04 2008-12-31 中兴通讯股份有限公司 Distributed file systems and its data block consistency managing method
CN102419766A (en) * 2011-11-01 2012-04-18 西安电子科技大学 Data redundancy and file operation methods based on Hadoop distributed file system (HDFS)
CN103383689A (en) * 2012-05-03 2013-11-06 阿里巴巴集团控股有限公司 Service process fault detection method, device and service node

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
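李宽: "基于HDFS的分布式Namenode节点模型的研究" [Li Kuan: "Research on a Distributed Namenode Node Model Based on HDFS"], 《中国优秀硕士学位论文全文数据库 信息科技辑》 [China Master's Theses Full-text Database, Information Science and Technology] *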

Cited By (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105589887B (en) * 2014-10-24 2020-04-03 中兴通讯股份有限公司 Data processing method of distributed file system and distributed file system
CN105589887A (en) * 2014-10-24 2016-05-18 中兴通讯股份有限公司 Data processing method for distributed file system and distributed file system
CN104468569B (en) * 2014-12-04 2017-12-22 北京国双科技有限公司 The integrality detection method and device of distributed data
CN104468569A (en) * 2014-12-04 2015-03-25 北京国双科技有限公司 Integrity detection method and device of distributed data
CN105243125A (en) * 2015-09-29 2016-01-13 北京京东尚科信息技术有限公司 PrestoDB cluster running method and apparatus, cluster and data query method and apparatus
CN105243125B (en) * 2015-09-29 2018-07-06 北京京东尚科信息技术有限公司 Operation method, device, cluster and the inquiry data method and device of PrestoDB clusters
CN105550230B (en) * 2015-12-07 2019-07-23 北京奇虎科技有限公司 The method for detecting and device of distributed memory system node failure
CN105550230A (en) * 2015-12-07 2016-05-04 北京奇虎科技有限公司 Method and device for detecting failure of node of distributed storage system
CN105550229A (en) * 2015-12-07 2016-05-04 北京奇虎科技有限公司 Method and device for repairing data of distributed storage system
CN105550229B (en) * 2015-12-07 2019-05-03 北京奇虎科技有限公司 The method and apparatus of distributed memory system data reparation
CN105610903A (en) * 2015-12-17 2016-05-25 北京奇虎科技有限公司 Data node upgrading method and device for distributed system
CN105681401A (en) * 2015-12-31 2016-06-15 深圳前海微众银行股份有限公司 Distributed architecture
CN107219997B (en) * 2016-03-21 2020-08-18 阿里巴巴集团控股有限公司 Method and device for verifying data consistency
CN107219997A (en) * 2016-03-21 2017-09-29 阿里巴巴集团控股有限公司 A kind of method and device for being used to verify data consistency
CN105892954A (en) * 2016-04-25 2016-08-24 乐视控股(北京)有限公司 Data storage method and device based on multiple copies
WO2017219678A1 (en) * 2016-06-22 2017-12-28 杭州海康威视数字技术股份有限公司 Data recovery method and device, and cloud storage system
CN107528872A (en) * 2016-06-22 2017-12-29 杭州海康威视数字技术股份有限公司 A kind of data reconstruction method, device and cloud storage system
CN107528872B (en) * 2016-06-22 2020-07-24 杭州海康威视数字技术股份有限公司 Data recovery method and device and cloud storage system
US10824372B2 (en) 2016-06-22 2020-11-03 Hangzhou Hikvision Digital Technology Co., Ltd. Data recovery method and device, and cloud storage system
CN106293980A (en) * 2016-07-26 2017-01-04 乐视控股(北京)有限公司 Data recovery method and system for distributed storage cluster
CN108259543A (en) * 2016-12-29 2018-07-06 广东中科遥感技术有限公司 Distributed cloud storage database and its be deployed in the method for multiple data centers
CN108259543B (en) * 2016-12-29 2021-07-06 广东中科遥感技术有限公司 Distributed cloud storage database and method for deploying same in multiple data centers
CN106945691B (en) * 2017-04-10 2019-06-21 湖南中车时代通信信号有限公司 The real-time hot standby switch device of the server multicenter of automatic train monitor
CN106945691A (en) * 2017-04-10 2017-07-14 湖南中车时代通信信号有限公司 The real-time hot standby switch device of server multicenter of automatic train monitor
CN111949210A (en) * 2017-06-28 2020-11-17 华为技术有限公司 Metadata storage method, system and storage medium in distributed storage system
CN109947730B (en) * 2017-07-25 2024-02-02 中兴通讯股份有限公司 Metadata recovery method, device, distributed file system and readable storage medium
CN109947730A (en) * 2017-07-25 2019-06-28 中兴通讯股份有限公司 Metadata restoration methods, device, distributed file system and readable storage medium storing program for executing
CN109407977A (en) * 2018-09-25 2019-03-01 佛山科学技术学院 A kind of big data distributed storage management method and system
CN109407977B (en) * 2018-09-25 2021-08-31 佛山科学技术学院 Big data distributed storage management method and system
CN109614037A (en) * 2018-11-16 2019-04-12 新华三技术有限公司成都分公司 Data routing inspection method, apparatus and distributed memory system
CN109614164A (en) * 2018-11-29 2019-04-12 深圳前海微众银行股份有限公司 Realize plug-in unit configurable method, apparatus, equipment and readable storage medium storing program for executing
CN111695018B (en) * 2019-03-13 2023-05-30 阿里云计算有限公司 Data processing method and device, distributed network system and computer equipment
CN111695018A (en) * 2019-03-13 2020-09-22 阿里巴巴集团控股有限公司 Data processing method and device, distributed network system and computer equipment
WO2020232859A1 (en) * 2019-05-20 2020-11-26 平安科技(深圳)有限公司 Distributed storage system, data writing method, device, and storage medium
CN110471934A (en) * 2019-08-19 2019-11-19 泰康保险集团股份有限公司 Method of calibration, device, medium and the electronic equipment of business datum
CN112711376A (en) * 2019-10-25 2021-04-27 北京金山云网络技术有限公司 Method and device for determining object master copy file in object storage system
CN111124301A (en) * 2019-12-18 2020-05-08 深圳供电局有限公司 Data consistency storage method and system of object storage device
CN111124301B (en) * 2019-12-18 2024-02-23 深圳供电局有限公司 Data consistency storage method and system of object storage device
CN111241011B (en) * 2019-12-31 2022-04-15 清华大学 Global address space management method of distributed persistent memory
CN111241011A (en) * 2019-12-31 2020-06-05 清华大学 Global address space management method of distributed persistent memory
CN113239013A (en) * 2021-05-17 2021-08-10 北京青云科技股份有限公司 Distributed system and storage medium
CN113239013B (en) * 2021-05-17 2024-04-09 北京青云科技股份有限公司 Distributed system and storage medium
CN113297173B (en) * 2021-05-24 2023-10-31 阿里巴巴新加坡控股有限公司 Distributed database cluster management method and device and electronic equipment
CN113297173A (en) * 2021-05-24 2021-08-24 阿里巴巴新加坡控股有限公司 Distributed database cluster management method and device and electronic equipment
CN113391767A (en) * 2021-06-30 2021-09-14 北京百度网讯科技有限公司 Data consistency checking method and device, electronic equipment and readable storage medium
CN113704359A (en) * 2021-09-03 2021-11-26 优刻得科技股份有限公司 Synchronization method, system and server for multiple data copies of time sequence database

Similar Documents

Publication Publication Date Title
CN103729436A (en) Distributed metadata management method and system
CN102591970B (en) Distributed key-value query method and query engine system
US10169169B1 (en) Highly available transaction logs for storing multi-tenant data sets on shared hybrid storage pools
US10817478B2 (en) System and method for supporting persistent store versioning and integrity in a distributed data grid
WO2019154394A1 (en) Distributed database cluster system, data synchronization method and storage medium
ES2881606T3 (en) Geographically distributed file system using coordinated namespace replication
RU2449358C1 (en) Distributed file system and data block consistency managing method thereof
US7653668B1 (en) Fault tolerant multi-stage data replication with relaxed coherency guarantees
US10489412B2 (en) Highly available search index with storage node addition and removal
CN105550229B (en) The method and apparatus of distributed memory system data reparation
US8229893B2 (en) Metadata management for fixed content distributed data storage
US7440977B2 (en) Recovery method using extendible hashing-based cluster logs in shared-nothing spatial database cluster
US20160306822A1 (en) Load balancing of queries in replication enabled ssd storage
CN104050249A (en) Distributed query engine system and method and metadata server
US20150363319A1 (en) Fast warm-up of host flash cache after node failover
CN104050250A (en) Distributed key-value query method and query engine system
US9659078B2 (en) System and method for supporting failover during synchronization between clusters in a distributed data grid
GB2484086A (en) Reliability and performance modes in a distributed storage system
US8090683B2 (en) Managing workflow communication in a distributed storage system
CN105069152B (en) data processing method and device
CN112559637B (en) Data processing method, device, equipment and medium based on distributed storage
US11003550B2 (en) Methods and systems of operating a database management system DBMS in a strong consistency mode
CN113010496B (en) Data migration method, device, equipment and storage medium
US8145598B2 (en) Methods and systems for single instance storage of asset parts
CN107943615B (en) Data processing method and system based on distributed cluster

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20140416
