CN104735110A - Metadata management method and system - Google Patents

Metadata management method and system


Publication number
CN104735110A
Authority
CN
China
Prior art keywords
metadata
node
host node
data
backup
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310716441.7A
Other languages
Chinese (zh)
Other versions
CN104735110B (en)
Inventor
谢朝阳
高旸
冯明
广小明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianyi Cloud Technology Co Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd
Priority to CN201310716441.7A
Publication of CN104735110A
Application granted
Publication of CN104735110B
Legal status: Active
Anticipated expiration


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004 Server selection for load balancing
    • H04L67/1008 Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1029 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers using data related to the state of servers by a load balancer

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a metadata management method and system. In the method, when a metadata master node that needs to back up data backs up specified data, it sends load query requests to the other metadata master nodes in the metadata cluster. Each metadata master node that receives a load query request collects statistics on its own load and reports the load statistics to the metadata master node that needs to back up data. The metadata master node that needs to back up data determines target nodes from the reported load statistics and sends the specified data to each metadata master node serving as a target node for backup. Because metadata is managed by a server cluster, classified according to its heat, and then backed up according to a network data replication algorithm, the data access and disk read/write pressure of any single node is dispersed, the performance and throughput of the whole system are improved, and the computing and storage capability of the whole system is improved as well.

Description

Metadata management method and system
Technical field
The present invention relates to the field of communications, and in particular to a metadata management method and system.
Background art
As an important component of an operating system, a file system abstracts the storage space managed by the operating system and provides users with an object-based, unified access interface, which avoids direct manipulation of physical devices and serves the purpose of resource management. File systems could at first manage only local disk space. Files were later shared between computers through the File Transfer Protocol (FTP), but FTP does not provide an access interface or object model consistent with the local file system.
As computers came into wide use in all aspects of production and daily life, and because the storage and computing capability of a single computer is very limited, the demand for managing files cooperatively across multiple computers kept growing, and distributed file systems emerged. Distributed file systems have developed through several stages, including Network File System (NFS), Andrew File System (AFS), storage area network (SAN), network-attached storage (NAS), and Direct Access File System (DAFS). The relatively mature and widely used systems at present are GFS (Google File System) and HDFS (Hadoop Distributed File System).
HDFS is an open-source implementation of GFS. It is a Java-based distributed file system that supports data-intensive applications. HDFS is highly fault tolerant, is suitable for deployment on inexpensive machines, provides high-throughput data access, and is well suited to managing large-scale data.
HDFS adopts a master-slave architecture, as shown in Fig. 1. An HDFS cluster consists of a client (Client), one metadata node (NameNode), and multiple data nodes (DataNode). The client can access both the metadata node and the data nodes. The metadata node is a central master server responsible for maintaining the file system namespace and for managing client access to the file system. The data nodes are multiple servers that manage the storage space on the storage nodes and execute instructions from the metadata node as well as read and write requests from the client.
The way HDFS manages metadata has the following shortcomings:
1. Metadata is accessed frequently, but different data differs in heat: some data is accessed often and some rarely. With a single metadata node, when some data is very hot, the disk read/write bottleneck seriously degrades the efficiency of HDFS.
2. An important application scenario of HDFS is handling large-scale parallel data access requests. The single-metadata-node structure of HDFS limits its throughput, so it cannot serve large batches of simultaneous read and write operations.
3. An important function of HDFS is to support MapReduce parallel computation in the upper layer. The storage and computing capability of a single metadata node is very limited and not scalable, so it is difficult to meet the processing demands of massive data.
4. The metadata node is the central server of HDFS and handles client requests to access the file system. If the metadata node goes down because of a fault, the whole HDFS system is paralyzed, causing heavy losses.
The longer HDFS runs, the more files it holds, and the metadata can grow very large. Because the storage and computing capability of a single server is very limited, a single metadata node may become a performance bottleneck when the total amount of data grows quickly, reducing the performance of the whole system.
Having only one metadata node simplifies the HDFS architecture. However, HDFS is designed on the premise that hardware failure is the norm, so a core design goal of HDFS is to recover automatically and quickly to a normal operating state after a fault occurs.
Metadata is data that describes data, and its high availability is vital. The core design goal of HDFS can be reached only by optimizing the metadata management method so that metadata is read and written efficiently and either does not fail or can be recovered rapidly after a failure.
Summary of the invention
Embodiments of the present invention provide a metadata management method and system. Metadata is managed by a server cluster, classified according to its heat, and then backed up according to a network data replication strategy. This disperses the data access and disk read/write pressure of any single node, improves the performance and throughput of the whole system, and also improves its computing and storage capability.
According to one aspect of the present invention, a metadata management method is provided, comprising:
when a metadata master node that needs to back up data backs up specified data, sending a load query request to the other metadata master nodes in the metadata cluster;
each metadata master node that receives the load query request collecting statistics on its own load and reporting the load statistics to the metadata master node that needs to back up data;
the metadata master node that needs to back up data determining target nodes from the reported load statistics, wherein the target nodes comprise the m most lightly loaded metadata master nodes, m is a positive integer, and the value of m is associated with the heat of the specified data;
the metadata master node that needs to back up data sending the specified data to each metadata master node serving as a target node;
each metadata master node serving as a target node, after receiving the specified data, backing up the specified data locally and, once the backup succeeds, sending a backup success message to the metadata master node that needs to back up data.
In one embodiment, after the metadata master node that needs to back up data sends the specified data to each metadata master node serving as a target node, the method further comprises:
determining whether backup success messages have been received from all metadata master nodes serving as target nodes within a predetermined time;
if backup success messages have not been received from all of them within the predetermined time, re-designating as target nodes those metadata master nodes among the target nodes that did not send a backup success message;
and then performing again the step in which the metadata master node that needs to back up data sends the specified data to each metadata master node serving as a target node.
In one embodiment, when a metadata master node saves data locally, it sends the saved data to its associated metadata slave node for storage, so that the metadata slave node stays synchronized with the metadata master node.
In one embodiment, when a metadata slave node determines that its associated metadata master node has failed, it changes its own network parameters to metadata master node parameters and serves as a new metadata master node in the metadata cluster.
In one embodiment, after the failed metadata master node recovers, it changes its own network parameters to metadata slave node parameters and serves as a new metadata slave node in the metadata cluster, wherein the new metadata slave node is associated with the new metadata master node;
the new metadata slave node sends its own data snapshot to the new metadata master node, so that the new metadata master node, according to the data snapshot, sends the data that the new metadata slave node lacks to the new metadata slave node for storage, whereby the new metadata slave node is synchronized with the new metadata master node.
According to another aspect of the present invention, a metadata management system is provided, comprising multiple metadata master nodes, wherein:
the metadata master node that needs to back up data is configured to: when backing up specified data, send a load query request to the other metadata master nodes in the metadata cluster; determine target nodes from the reported load statistics, wherein the target nodes comprise the m most lightly loaded metadata master nodes, m is a positive integer, and the value of m is associated with the heat of the specified data; and send the specified data to each metadata master node serving as a target node;
each metadata master node that receives the load query request is configured to collect statistics on its own load and report the load statistics to the metadata master node that needs to back up data;
each metadata master node serving as a target node is configured to, after receiving the specified data, back up the specified data locally and, once the backup succeeds, send a backup success message to the metadata master node that needs to back up data.
In one embodiment, the metadata master node that needs to back up data is further configured to, after sending the specified data to each metadata master node serving as a target node, determine whether backup success messages have been received from all metadata master nodes serving as target nodes within a predetermined time; and, if they have not all been received within the predetermined time, re-designate as target nodes those metadata master nodes among the target nodes that did not send a backup success message, and then perform again the operation of sending the specified data to each metadata master node serving as a target node.
In one embodiment, the system further comprises multiple metadata slave nodes, each of which is associated with one metadata master node;
the metadata master node is further configured to, when saving data locally, send the saved data to its associated metadata slave node for storage, so that the metadata slave node stays synchronized with the metadata master node.
In one embodiment, the metadata slave node is further configured to, when it determines that its associated metadata master node has failed, change its own network parameters to metadata master node parameters and serve as a new metadata master node in the metadata cluster.
In one embodiment, the failed metadata master node is further configured to, after recovering from the fault, change its own network parameters to metadata slave node parameters and serve as a new metadata slave node in the metadata cluster, wherein the new metadata slave node is associated with the new metadata master node; and to send its own data snapshot to the new metadata master node, so that the new metadata master node, according to the data snapshot, sends the data that the new metadata slave node lacks to the new metadata slave node for storage, whereby the new metadata slave node is synchronized with the new metadata master node.
The present invention overcomes the shortcomings of the single-metadata-node structure of existing HDFS: it disperses the data access and disk read/write pressure of any single node, improves the performance and throughput of the whole system, and also improves its computing and storage capability. It gives the HDFS system higher availability and better support for practical applications.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for the description are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of an embodiment of an HDFS system architecture in the prior art.
Fig. 2 is a schematic diagram of an embodiment of the metadata management method of the present invention.
Fig. 3 is a schematic diagram of another embodiment of the metadata management method of the present invention.
Fig. 4 is a schematic diagram of yet another embodiment of the metadata management method of the present invention.
Fig. 5 is a schematic diagram of an embodiment of the HDFS system architecture of the present invention.
Fig. 6 is a schematic diagram of an embodiment of the metadata cluster structure of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. The following description of at least one exemplary embodiment is merely illustrative and in no way limits the present invention or its application or use. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Unless specifically stated otherwise, the relative arrangement of the components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present invention.
It should also be understood that, for convenience of description, the parts shown in the drawings are not drawn to actual scale.
Techniques, methods, and devices known to a person of ordinary skill in the relevant art may not be discussed in detail, but where appropriate they should be regarded as part of the granted specification.
In all examples shown and discussed here, any specific value should be interpreted as merely exemplary and not as a limitation. Other examples of the exemplary embodiments may therefore have different values.
Note that similar reference numerals and letters denote similar items in the following drawings, so once an item is defined in one drawing, it need not be discussed further in subsequent drawings.
Fig. 2 is a schematic diagram of an embodiment of the metadata management method of the present invention. As shown in Fig. 2, the method of this embodiment comprises the following steps:
Step 201: when a metadata master node that needs to back up data backs up specified data, it sends a load query request to the other metadata master nodes in the metadata cluster.
Step 202: each metadata master node that receives the load query request collects statistics on its own load and reports the load statistics to the metadata master node that needs to back up data.
Step 203: the metadata master node that needs to back up data determines target nodes from the reported load statistics.
The target nodes comprise the m most lightly loaded metadata master nodes, where m is a positive integer and the value of m is associated with the heat of the specified data.
For example, suppose a data block contains a number of bytes, so that its size can be expressed as V bytes, and suppose the block is accessed (read or written) several times per second, so that its access frequency can be expressed as I accesses/second. Data heat can then be defined as the number of accesses per second per byte of data, that is, heat T = I/V (accesses/second/byte).
When metadata is backed up between master nodes, the heat of the data is first calculated with the above formula and the data is sorted by heat from low to high. More backup copies are then made of data with higher heat and fewer of data with lower heat.
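The heat formula and the "hotter data gets more backups" rule can be sketched as follows. This is only an illustration: the patent states that the backup count grows with heat but gives no concrete mapping, so the linear rank-to-count rule, the names `BlockStats`, `heat`, and `backup_counts`, and the sample numbers are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class BlockStats:
    block_id: str
    size_bytes: int             # V in the text
    accesses_per_second: float  # I in the text


def heat(stats: BlockStats) -> float:
    """Heat T = I / V, in accesses per second per byte."""
    if stats.size_bytes <= 0:
        raise ValueError("block size must be positive")
    return stats.accesses_per_second / stats.size_bytes


def backup_counts(blocks, cluster_size):
    """Sort blocks by heat (low to high) and give hotter blocks more backups.

    Assumed mapping: the coldest block gets 0 backups on other master nodes,
    the hottest gets cluster_size - 1 (one on every other master node), and
    the blocks in between are spread linearly by rank.
    """
    ranked = sorted(blocks, key=heat)
    last_rank = max(len(ranked) - 1, 1)
    return {
        b.block_id: rank * (cluster_size - 1) // last_rank
        for rank, b in enumerate(ranked)
    }


if __name__ == "__main__":
    blocks = [
        BlockStats("cold", 1000, 10.0),
        BlockStats("warm", 1000, 100.0),
        BlockStats("hot", 1000, 1000.0),
    ]
    print(backup_counts(blocks, cluster_size=4))
```

Ranking by heat rather than using absolute thresholds keeps the sketch independent of the workload's scale; a real deployment might prefer fixed heat bands.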
Step 204: the metadata master node that needs to back up data sends the specified data to each metadata master node serving as a target node.
Step 205: each metadata master node serving as a target node, after receiving the specified data, backs up the specified data locally and, once the backup succeeds, sends a backup success message to the metadata master node that needs to back up data.
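Steps 201 to 205 can be simulated in a single process as a rough sketch. The class `MasterNode`, the numeric load value, and the helper names are hypothetical; the patent does not say what the load statistics contain or how messages are transported, so plain method calls stand in for network messages here.

```python
class MasterNode:
    """In-memory stand-in for a metadata master node (hypothetical model)."""

    def __init__(self, name, load):
        self.name = name
        self.load = load      # assumed to be a single comparable number
        self.backups = {}

    def report_load(self):
        # Step 202: collect and report this node's own load statistics.
        return self.load

    def store_backup(self, key, value):
        # Step 205: back up locally, then answer with a success message.
        self.backups[key] = value
        return True


def choose_targets(requester, cluster, m):
    """Steps 201-203: query every other master and keep the m lightest."""
    reports = [
        (peer.report_load(), peer.name, peer)
        for peer in cluster
        if peer is not requester
    ]
    reports.sort(key=lambda r: (r[0], r[1]))  # tie-break on name for determinism
    return [peer for _, _, peer in reports[:m]]


def back_up(requester, cluster, key, value, m):
    """Steps 204-205: send the data to each target and gather the replies."""
    targets = choose_targets(requester, cluster, m)
    return {t.name: t.store_backup(key, value) for t in targets}


if __name__ == "__main__":
    nodes = [MasterNode("A", 5), MasterNode("B", 3), MasterNode("C", 9), MasterNode("D", 1)]
    print(back_up(nodes[0], nodes, "block-7", b"metadata", m=2))
```

The requester is excluded from the candidates because it already holds the data; the value of m would come from the heat of the data being backed up.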
With the metadata management method provided by the above embodiment of the present invention, metadata is classified according to its heat and then backed up according to a network data replication strategy. This disperses the data access and disk read/write pressure of any single node, improves the performance and throughput of the whole system, and also improves its computing and storage capability.
Fig. 3 is a schematic diagram of another embodiment of the metadata management method of the present invention. Compared with the embodiment shown in Fig. 2, the embodiment shown in Fig. 3 additionally takes action depending on whether the data backup succeeded:
Step 301: when a metadata master node that needs to back up data backs up specified data, it sends a load query request to the other metadata master nodes in the metadata cluster.
Step 302: each metadata master node that receives the load query request collects statistics on its own load and reports the load statistics to the metadata master node that needs to back up data.
Step 303: the metadata master node that needs to back up data determines target nodes from the reported load statistics.
The target nodes comprise the m most lightly loaded metadata master nodes, where m is a positive integer and the value of m is associated with the heat of the specified data.
Step 304: the metadata master node that needs to back up data sends the specified data to each metadata master node serving as a target node.
Step 305: each metadata master node serving as a target node, after receiving the specified data, backs up the specified data locally and, once the backup succeeds, sends a backup success message to the metadata master node that needs to back up data.
Step 306: the metadata master node that needs to back up data determines whether backup success messages have been received from all metadata master nodes serving as target nodes within a predetermined time.
Step 307: if backup success messages have not been received from all of them within the predetermined time, the metadata master nodes among the target nodes that did not send a backup success message are re-designated as target nodes, and step 304 is performed again.
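The acknowledgement check and resend loop of steps 304 to 307 might look like the sketch below. The `send` callable is a hypothetical stand-in that returns True only if a backup success message arrives within the predetermined time. The patent does not bound the number of resend rounds; `max_rounds` is an added assumption so that the loop always terminates.

```python
def back_up_with_retry(targets, send, max_rounds=3):
    """Resend to every target that has not acknowledged (steps 304-307).

    targets    -- names of the metadata master nodes chosen as target nodes
    send       -- hypothetical callable: send(node) -> True if the backup
                  success message arrived within the predetermined time
    max_rounds -- assumed safety limit, not part of the described method

    Returns (nodes still unacknowledged, number of rounds used).
    """
    pending = list(targets)
    rounds = 0
    while pending and rounds < max_rounds:
        rounds += 1
        # Step 304/305: send to each pending target; step 306: check the reply.
        # Step 307: targets without a success message stay in the target list.
        pending = [node for node in pending if not send(node)]
    return pending, rounds


if __name__ == "__main__":
    seen = {}

    def flaky_send(node):
        # Simulated transport: "n2" misses its first deadline, then succeeds.
        seen[node] = seen.get(node, 0) + 1
        return node != "n2" or seen[node] >= 2

    print(back_up_with_retry(["n1", "n2", "n3"], flaky_send))
```

Only the unacknowledged nodes are retried, so nodes that already stored the data are not sent it again.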
In one embodiment, when a metadata master node saves data locally, it sends the saved data to its associated metadata slave node for storage, so that the metadata slave node stays synchronized with the metadata master node.
Fig. 4 is a schematic diagram of yet another embodiment of the metadata management method of the present invention. As shown in Fig. 4, when a metadata master node fails, its associated metadata slave node can take the place of the failed metadata master node:
Step 401: when a metadata slave node determines that its associated metadata master node has failed, it changes its own network parameters to metadata master node parameters and serves as a new metadata master node in the metadata cluster.
Step 402: after the failed metadata master node recovers, it changes its own network parameters to metadata slave node parameters and serves as a new metadata slave node in the metadata cluster, wherein the new metadata slave node is associated with the new metadata master node.
Step 403: the new metadata slave node sends its own data snapshot to the new metadata master node.
Step 404: according to the data snapshot, the new metadata master node sends the data that the new metadata slave node lacks to the new metadata slave node for storage, whereby the new metadata slave node is synchronized with the new metadata master node.
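The role swap and snapshot-based catch-up of steps 401 to 404 can be illustrated with a small in-memory model. Everything here is a simplifying assumption: a role string stands in for the "network parameters", and the snapshot is taken to be just the set of keys a node holds, which means entries that changed value (rather than being newly added) while the old master was down would not be detected.

```python
class MetadataNode:
    """Hypothetical in-memory model of a metadata master or slave node."""

    def __init__(self, name, role):
        self.name = name
        self.role = role   # "master" or "slave"; stands in for network parameters
        self.data = {}

    def snapshot(self):
        # Assumed snapshot format: the set of keys currently stored.
        return set(self.data)


def promote(slave):
    """Step 401: the slave takes over as the new master."""
    slave.role = "master"


def rejoin_as_slave(recovered, new_master):
    """Steps 402-404: the recovered node rejoins as slave and catches up."""
    recovered.role = "slave"                                   # step 402
    snap = recovered.snapshot()                                # step 403
    missing = {k: v for k, v in new_master.data.items() if k not in snap}
    recovered.data.update(missing)                             # step 404
    return sorted(missing)


if __name__ == "__main__":
    old_master = MetadataNode("m1", "master")
    slave = MetadataNode("s1", "slave")
    old_master.data["a"] = 1
    slave.data["a"] = 1            # kept in sync before the failure
    promote(slave)                 # m1 fails, s1 takes over
    slave.data["b"] = 2            # written while m1 is down
    print(rejoin_as_slave(old_master, slave))
```

Sending only the missing entries, rather than a full copy, is what the snapshot exchange in steps 403 and 404 makes possible.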
In the present invention, the HDFS cluster still uses a master-slave structure. As shown in Fig. 5, it comprises a client, multiple metadata nodes, and multiple data nodes. The master side is the client together with a metadata cluster made up of multiple servers, and the slave side is the data node cluster. The client can access every metadata node in the metadata cluster and every data node in the data cluster. The metadata cluster as a whole is responsible for maintaining the file system namespace and for managing client access to the file system, and each metadata node can access every data node in the data cluster.
The structure of the metadata cluster is shown in Fig. 6. It consists of a number of metadata master nodes and an equal number of metadata slave nodes. All metadata is stored in a distributed manner across the cluster, which improves the storage capability of the whole system. In addition, distributed computing across multiple cooperating nodes improves the computing capability of the whole system.
Within the cluster, the metadata master nodes are all interconnected, and they are equal and identical in function; there is no master/slave distinction among them. The data in each metadata master node can be sorted by data heat (that is, data access frequency: the more often data is accessed, the higher its heat, and the less often, the lower its heat) and stored in blocks, and the data in different blocks is backed up to a different number of other metadata master nodes depending on its heat. For example, Fig. 6 shows seven kinds of data blocks whose heat increases from No. 1 to No. 7. Blocks No. 1 and No. 2 have the lowest heat; because they are accessed so rarely, they are kept only in the current metadata master node and are not backed up in other metadata master nodes. Blocks No. 3 and No. 4 have relatively low heat; besides being kept in the current metadata master node, they have only one backup in another metadata master node. Blocks No. 5 and No. 6 have relatively high heat and are kept in three metadata master nodes in the cluster in total. Block No. 7 has the highest heat, and a copy is kept in every metadata master node. This has two benefits. First, accesses to hot metadata are spread across different master nodes, which reduces the disk read/write pressure on any single node and improves the performance of the whole system. Second, the throughput of the whole system is improved.
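The Fig. 6 example can be restated as a small lookup from block number to total copy count. The function name `total_copies` and the choice of four master nodes in the usage lines are assumptions made for illustration; the text does not say how many master nodes Fig. 6 contains.

```python
def total_copies(block_no, num_masters):
    """Total copies of a Fig. 6 data block across the metadata master nodes.

    Blocks 1-2: current master only.  Blocks 3-4: current master plus one
    backup.  Blocks 5-6: three masters in total.  Block 7: every master.
    Counts are capped at num_masters for very small clusters (an assumption).
    """
    if block_no in (1, 2):
        return 1
    if block_no in (3, 4):
        return min(2, num_masters)
    if block_no in (5, 6):
        return min(3, num_masters)
    if block_no == 7:
        return num_masters
    raise ValueError("Fig. 6 only defines blocks 1 to 7")


if __name__ == "__main__":
    masters = 4  # hypothetical cluster size
    for block in range(1, 8):
        copies = total_copies(block, masters)
        # If reads are spread evenly, each copy serves 1/copies of the load.
        print(block, copies, round(1 / copies, 2))
```

The last column shows why the hottest block benefits most: with a copy on every master node, each node serves only a fraction of its reads.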
In an actual production environment, metadata can first be saved in one master node and then backed up to other master nodes by the network data replication strategy. This algorithm ensures eventual consistency of the data during backup, and a metadata master node does not keep multiple copies of the same data.
Each metadata master node in the cluster has one slave node, and the slave node's data is kept consistent with its master node's data in real time. When a master node fails, its slave node can quickly take over the master node's work according to a master-slave replacement algorithm; the former slave node becomes the new master node, which ensures that the whole system keeps running normally. Meanwhile, the system recovers the failed former master node and brings its data into real-time consistency with the new master node, and the recovered former master node becomes the new slave node. For the other metadata master nodes in the cluster, when a metadata master node fails and its slave node switches over to become the new master node, data backup on the other master nodes is not affected.
To further reduce the risk posed by server failures, in an actual production environment the master node and its slave node should be physically located as far apart as possible, for example in different racks or machine rooms, or in virtual machines on different physical machines.
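One way to express the "keep master and slave physically apart" guideline is a placement helper that prefers a different machine room first and a different rack second. The host dictionaries with `name`, `rack`, and `room` keys, and the preference order itself, are assumptions for illustration; the text only says the two nodes should not be adjacent.

```python
def pick_slave_host(master_host, candidates):
    """Pick the candidate host that is physically farthest from the master.

    Hosts are hypothetical dicts with "name", "rack", and "room" keys.
    Preference order (assumed): different machine room, then different rack.
    """
    eligible = [h for h in candidates if h["name"] != master_host["name"]]
    if not eligible:
        raise ValueError("no candidate host other than the master's own host")

    def separation(host):
        # Tuples of booleans compare element by element, so a different
        # room always outranks a different rack in the same room.
        return (host["room"] != master_host["room"],
                host["rack"] != master_host["rack"])

    return max(eligible, key=separation)


if __name__ == "__main__":
    master = {"name": "h0", "rack": "k1", "room": "r1"}
    hosts = [
        {"name": "h1", "rack": "k1", "room": "r1"},
        {"name": "h2", "rack": "k2", "room": "r1"},
        {"name": "h3", "rack": "k9", "room": "r2"},
    ]
    print(pick_slave_host(master, hosts)["name"])
```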
The metadata cluster structure has a further advantage: when the storage and computing capability of the whole cluster is insufficient, new nodes can be added to the cluster at any time to increase its processing capability.
In one embodiment, a metadata management system comprises multiple metadata master nodes, wherein:
the metadata master node that needs to back up data is configured to: when backing up specified data, send a load query request to the other metadata master nodes in the metadata cluster; determine target nodes from the reported load statistics, wherein the target nodes comprise the m most lightly loaded metadata master nodes, m is a positive integer, and the value of m is associated with the heat of the specified data; and send the specified data to each metadata master node serving as a target node;
each metadata master node that receives the load query request is configured to collect statistics on its own load and report the load statistics to the metadata master node that needs to back up data;
each metadata master node serving as a target node is configured to, after receiving the specified data, back up the specified data locally and, once the backup succeeds, send a backup success message to the metadata master node that needs to back up data.
With the metadata management system provided by the above embodiment of the present invention, metadata is classified according to its heat and then backed up according to a network data replication strategy. This disperses the data access and disk read/write pressure of any single node, improves the performance and throughput of the whole system, and also improves its computing and storage capability.
In one embodiment, the metadata master node that needs to back up data is further configured to, after sending the specified data to each metadata master node serving as a target node, determine whether backup success messages have been received from all metadata master nodes serving as target nodes within a predetermined time; and, if they have not all been received within the predetermined time, re-designate as target nodes those metadata master nodes among the target nodes that did not send a backup success message, and then perform again the operation of sending the specified data to each metadata master node serving as a target node.
Preferably, the system further comprises multiple metadata slave nodes, each of which is associated with one metadata master node;
the metadata master node is further configured to, when saving data locally, send the saved data to its associated metadata slave node for storage, so that the metadata slave node stays synchronized with the metadata master node.
In one embodiment, the metadata slave node is further configured to, when it determines that its associated metadata master node has failed, change its own network parameters to metadata master node parameters and serve as a new metadata master node in the metadata cluster.
In one embodiment, the failed metadata master node is further configured to, after recovering from the fault, change its own network parameters to metadata slave node parameters and serve as a new metadata slave node in the metadata cluster, wherein the new metadata slave node is associated with the new metadata master node; and to send its own data snapshot to the new metadata master node, so that the new metadata master node, according to the data snapshot, sends the data that the new metadata slave node lacks to the new metadata slave node for storage, whereby the new metadata slave node is synchronized with the new metadata master node.
By implementing the present invention, the following beneficial effects can be obtained:
1. Metadata is managed by a server cluster and stored in the cluster in a distributed manner, which improves the storage capacity of the whole system.
2. Within the metadata cluster, multiple nodes compute cooperatively, which improves the computing capability of the whole system.
3. Metadata is backed up on other nodes in the cluster according to its temperature: hotter data has more backups and colder data has fewer. This spreads the data access and disk read/write pressure of any individual node, improving both the performance and the throughput of the whole system.
4. The metadata cluster combines master nodes and slave nodes for backup, which avoids the single point of failure of a single metadata node and improves the system's resilience.
5. When the storage or computing capacity of the metadata cluster becomes insufficient, new nodes can be added to the cluster at any time to increase the processing capacity of the whole cluster.
Those of ordinary skill in the art will appreciate that all or some of the steps of the above embodiments may be implemented in hardware, or by a program that instructs the relevant hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
The description of the present invention is given for the purposes of illustration and explanation and is not intended to be exhaustive or to limit the invention to the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen and described to better explain the principles and practical application of the invention, and to enable those of ordinary skill in the art to understand the invention and design various embodiments, with various modifications, suited to particular uses.

Claims (10)

1. A metadata management method, characterized by comprising:
when a metadata master node that needs to back up data backs up specified data, sending, by that node, a load query request to the other metadata master nodes in a metadata cluster;
collecting, by each metadata master node that receives the load query request, statistics on its own load status, and reporting the load statistics to the metadata master node that needs to back up data;
determining, by the metadata master node that needs to back up data, target nodes using the reported load statistics, wherein the target nodes comprise the m most lightly loaded metadata master nodes, m is a positive integer greater than 0, and the value of m is associated with the data temperature of the specified data;
sending, by the metadata master node that needs to back up data, the specified data to each metadata master node serving as a target node; and
after receiving the specified data, backing up, by each metadata master node serving as a target node, the specified data locally and, once the backup succeeds, sending a backup success message to the metadata master node that needs to back up data.
2. The method according to claim 1, characterized in that,
after the metadata master node that needs to back up data sends the specified data to each metadata master node serving as a target node, the method further comprises:
determining whether backup success messages have been received from all metadata master nodes serving as target nodes within a predetermined time;
if backup success messages have not been received from all metadata master nodes serving as target nodes within the predetermined time, re-designating as the target nodes those metadata master nodes among the target nodes that have not sent a backup success message; and
then performing again the step in which the metadata master node that needs to back up data sends the specified data to each metadata master node serving as a target node.
3. method according to claim 1 and 2, is characterized in that,
Metadata host node preserves data during in this locality, the metadata be associated with self is sent to store from node the data of preservation, so that metadata realizes data syn-chronization from node and metadata host node.
4. method according to claim 1 and 2, is characterized in that,
The network parameter of self when judging that metadata host node associated with it breaks down, is revised as metadata host node parameter from node by metadata, using metadata host node new in metadata cluster.
5. method according to claim 4, is characterized in that,
After the metadata host node fault recovery of breaking down, the network parameter of self is revised as metadata from node parameter, using metadata new in metadata cluster from node, wherein new metadata is associated from node with new metadata host node;
New metadata sends self data snapshot from node to new metadata host node, so that new metadata host node is according to data snapshot, send to new metadata to store from node from the data that node lacks new metadata, thus new metadata realize data syn-chronization from node and new metadata host node.
6. A metadata management system, characterized by comprising multiple metadata master nodes, wherein:
a metadata master node that needs to back up data is configured to, when backing up specified data, send a load query request to the other metadata master nodes in a metadata cluster; determine target nodes using the reported load statistics, wherein the target nodes comprise the m most lightly loaded metadata master nodes, m is a positive integer greater than 0, and the value of m is associated with the data temperature of the specified data; and send the specified data to each metadata master node serving as a target node;
a metadata master node that receives the load query request is configured to collect statistics on its own load status and report the load statistics to the metadata master node that needs to back up data; and
a metadata master node serving as a target node is configured to, after receiving the specified data, back up the specified data locally and, once the backup succeeds, send a backup success message to the metadata master node that needs to back up data.
7. The system according to claim 6, characterized in that,
the metadata master node that needs to back up data is further configured to, after sending the specified data to each metadata master node serving as a target node, determine whether backup success messages have been received from all metadata master nodes serving as target nodes within a predetermined time; and, if not, re-designate as the target nodes those metadata master nodes among the target nodes that have not sent a backup success message, and then perform again the operation of sending the specified data to each metadata master node serving as a target node.
8. The system according to claim 6 or 7, characterized by further comprising multiple metadata slave nodes, wherein each metadata slave node is associated with one metadata master node;
the metadata master node is further configured to, when saving data locally, send the saved data to its associated metadata slave node for storage, so that the metadata slave node is synchronized with the metadata master node.
9. The system according to claim 8, characterized in that,
the metadata slave node is further configured to, upon determining that its associated metadata master node has failed, change its own network parameters to metadata master node parameters and thereby act as the new metadata master node in the metadata cluster.
10. The system according to claim 9, characterized in that,
the failed metadata master node is further configured to, after recovering from the failure, change its own network parameters to metadata slave node parameters and thereby act as a new metadata slave node in the metadata cluster, wherein the new metadata slave node is associated with the new metadata master node; and to send its own data snapshot to the new metadata master node, so that the new metadata master node, based on the data snapshot, sends the data that the new metadata slave node lacks to the new metadata slave node for storage, whereby the new metadata slave node is synchronized with the new metadata master node.
CN201310716441.7A 2013-12-23 2013-12-23 Metadata management method and system Active CN104735110B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310716441.7A CN104735110B (en) 2013-12-23 2013-12-23 Metadata management method and system

Publications (2)

Publication Number Publication Date
CN104735110A (en) 2015-06-24
CN104735110B (en) 2019-03-26

Family

ID=53458543

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310716441.7A Active CN104735110B (en) 2013-12-23 2013-12-23 Metadata management method and system

Country Status (1)

Country Link
CN (1) CN104735110B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7249281B2 (en) * 2003-07-28 2007-07-24 Microsoft Corporation Method and system for backing up and restoring data of a node in a distributed system
CN101668046A (en) * 2009-10-13 2010-03-10 成都市华为赛门铁克科技有限公司 Resource caching method, resource obtaining method, device and system thereof
CN102523105A (en) * 2011-11-30 2012-06-27 广东电子工业研究院有限公司 Failure recovery method of data storage and applied data distribution framework thereof
CN102571772A (en) * 2011-12-26 2012-07-11 华中科技大学 Hot spot balancing method for metadata server
CN102693168A (en) * 2011-03-22 2012-09-26 中兴通讯股份有限公司 A method, a system and a service node for data backup recovery
CN103235748A (en) * 2013-04-24 2013-08-07 曙光信息产业(北京)有限公司 Method and system for managing metadata
CN103294167A (en) * 2013-05-21 2013-09-11 暨南大学 Data behavior based low-energy consumption cluster storage replication device and method

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105589733B (en) * 2015-11-27 2018-12-25 新华三技术有限公司 A kind of data processing method and device
CN105589733A (en) * 2015-11-27 2016-05-18 杭州华三通信技术有限公司 Data processing method and device
CN106897279A (en) * 2015-12-17 2017-06-27 阿里巴巴集团控股有限公司 For the method and apparatus of distributed document treatment
CN106407045A (en) * 2016-09-29 2017-02-15 郑州云海信息技术有限公司 Data disaster recovery method and system, and server virtualization system
CN106407045B (en) * 2016-09-29 2019-09-24 郑州云海信息技术有限公司 A kind of data disaster restoration methods, system and server virtualization system
CN106776151A (en) * 2017-01-14 2017-05-31 郑州云海信息技术有限公司 SAMBA cluster TDB data-base recordings backup method, apparatus and system
CN107315547A (en) * 2017-07-18 2017-11-03 郑州云海信息技术有限公司 A kind of method and device for reading distributed meta data file
CN108829787A (en) * 2018-05-31 2018-11-16 郑州云海信息技术有限公司 A kind of meta-data distribution formula system
CN111338843A (en) * 2018-12-19 2020-06-26 中国移动通信集团云南有限公司 Data backup method and device for production system
CN111338843B (en) * 2018-12-19 2023-08-15 中国移动通信集团云南有限公司 Data backup method and device for production system
CN111506253B (en) * 2019-01-31 2023-06-20 阿里巴巴集团控股有限公司 Distributed storage system and storage method thereof
CN111506253A (en) * 2019-01-31 2020-08-07 阿里巴巴集团控股有限公司 Distributed storage system and storage method thereof
CN110909076A (en) * 2019-10-31 2020-03-24 北京浪潮数据技术有限公司 Storage cluster data synchronization method, device, equipment and storage medium
CN113923222B (en) * 2021-12-13 2022-05-31 云和恩墨(北京)信息技术有限公司 Data processing method and device
CN113923222A (en) * 2021-12-13 2022-01-11 云和恩墨(北京)信息技术有限公司 Data processing method and device
CN114356848B (en) * 2022-03-11 2022-06-07 中国信息通信研究院 Metadata management method, computer storage medium and electronic device
CN114356848A (en) * 2022-03-11 2022-04-15 中国信息通信研究院 Metadata management method, computer storage medium and electronic device
CN116521744A (en) * 2023-06-30 2023-08-01 杭州拓数派科技发展有限公司 Full duplex metadata transmission method, device, system and computer equipment
CN116521744B (en) * 2023-06-30 2023-09-12 杭州拓数派科技发展有限公司 Full duplex metadata transmission method, device, system and computer equipment

Also Published As

Publication number Publication date
CN104735110B (en) 2019-03-26

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220127

Address after: 100007 room 205-32, floor 2, building 2, No. 1 and No. 3, qinglonghutong a, Dongcheng District, Beijing

Patentee after: Tianyiyun Technology Co.,Ltd.

Address before: No.31, Financial Street, Xicheng District, Beijing, 100033

Patentee before: CHINA TELECOM Corp.,Ltd.

TR01 Transfer of patent right