CN104142871B - Data backup method and device and distributed file system - Google Patents

Publication number
CN104142871B
Authority
CN
China
Prior art keywords
cost
back end
node
backup
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310170578.7A
Other languages
Chinese (zh)
Other versions
CN104142871A (en)
Inventor
姚玉凤
冯明
丁圣勇
唐宏
金华敏
刘健民
于玉海
贾嫚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd
Priority to CN201310170578.7A
Publication of CN104142871A
Application granted
Publication of CN104142871B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention discloses a data backup method, a data backup device and a distributed file system. The data backup method comprises: when a data backup request sent by a data node is received, sending a query request to a cost server to query the cost matrix information in the distributed system that is associated with the data node; receiving response information sent by the cost server, wherein the response information comprises the cost matrix information associated with the data node; computing the backup cost between the data node and each other candidate node from the cost matrix information; selecting the candidate node with the lowest backup cost as the target node; and sending the target node information to the data node so that the data node backs up its data to the target node. Because the node with the lowest backup cost is selected according to a cost matrix, excessive backup cost and unbalanced node load can be effectively avoided.

Description

Method, device and distributed file system for data backup
Technical field
The present invention relates to the communications field, and more particularly to a data backup method, a data backup device and a distributed file system.
Background
A cloud computing distributed file system uses multiple servers to provide a large-capacity, highly reliable file service. The cluster comprises data servers and a directory server (metadata server). The directory server maintains the distribution of file data blocks across the data servers (the metadata), and the data servers store the actual file data. Compared with other storage technologies, distributed file systems have gained wide acceptance in the industry thanks to advantages such as strong scalability, high cost-effectiveness and good fault tolerance. However, how to satisfy all the requirements of a distributed file system, such as scalability, availability, reliability, security and efficiency, remains a problem to be solved.
In a distributed system, the replica mechanism is an important way to improve availability and performance. Under the replica mechanism, when a data block is written to the file system, the data is written not only on a primary node but also, at the same time, on several other nodes allocated for that purpose. How many nodes are allocated depends on the reliability requirements of the system. Replicas compensate for problems such as single points of failure of stored objects, poor fault tolerance and low access performance. However, introducing the replica mechanism inevitably raises problems of its own: replica consistency, load balancing, the hardware and communication cost of creating replicas, and the cost for cloud computing tasks of accessing replicas.
Although the replica mechanism can effectively improve the availability of a distributed system, existing replica backup schemes for distributed file system data have the following problems:
1. When a data replica is created, costs such as storage and communication are not taken into account, which easily makes the backup cost excessive.
2. Because replica locations are chosen at random, node load may become unbalanced when a large number of replicas concentrate on the same node.
Summary of the invention
The technical problem to be solved by the present invention is to provide a data backup method, a data backup device and a distributed file system. By introducing a cost matrix and selecting, according to that matrix, the node with the lowest backup cost for data backup, excessive backup cost and unbalanced node load are effectively avoided.
According to one aspect of the present invention, a data backup method is provided, comprising:
When a data backup request sent by a data node is received, sending a query request to a cost server to query the cost matrix information in the distributed system that is associated with the data node, wherein the data node is the primary node to which the data is written, and the cost matrix information associated with the data node represents the storage cost between the data node and each other candidate node in the distributed system;
Receiving the response information sent by the cost server, wherein the response information includes the cost matrix information associated with the data node;
Calculating the backup cost between the data node and each other candidate node using the cost matrix information;
Selecting the candidate node with the lowest backup cost as the target node;
Sending the target node information to the data node, so that the data node backs up the data to the target node.
Preferably, the step of calculating the backup cost between the data node and each other candidate node using the cost matrix information includes:
The backup cost Cost(i, j) between data node i and candidate node j is:
$$\mathrm{Cost}(i,j) = \sum_{l=1}^{K} W[l]\, CM(i,j)[l]$$
where CM(i, j)[l] is the storage cost of the l-th dimension between data node i and candidate node j, W[l] is the cost weight of the l-th dimension, and K is the total number of dimensions.
Preferably, the step of selecting the candidate node with the lowest backup cost as the target node includes:
Taking the candidate node j that satisfies $\arg\min_{j} \mathrm{Cost}(i,j)$ as the target node of data node i.
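As a small worked illustration of the weighted sum and the arg min selection, assume K = 2 dimensions, weights W = [0.7, 0.3], and two candidate nodes a and b. All of these numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```latex
\begin{aligned}
CM(i,a) &= [0.2,\ 0.6], \qquad CM(i,b) = [0.5,\ 0.1] \\
\mathrm{Cost}(i,a) &= \sum_{l=1}^{2} W[l]\, CM(i,a)[l] = 0.7 \cdot 0.2 + 0.3 \cdot 0.6 = 0.32 \\
\mathrm{Cost}(i,b) &= \sum_{l=1}^{2} W[l]\, CM(i,b)[l] = 0.7 \cdot 0.5 + 0.3 \cdot 0.1 = 0.38 \\
\arg\min_{j \in \{a,b\}} \mathrm{Cost}(i,j) &= a
\end{aligned}
```

Under these assumed values, node a would be chosen as the target node even though node b is cheaper in the second dimension, because the first dimension carries the larger weight.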
Preferably, the network state between any two nodes in the distributed system is detected at predetermined intervals;
An update request is sent to the cost server according to the network state, so as to update the cost matrix information.
Preferably, the step of detecting the network state between any two nodes in the distributed system at predetermined intervals includes:
Detecting the link congestion degree between any two nodes in the distributed system at predetermined intervals.
According to another aspect of the present invention, a data backup method is provided, comprising:
When a data node acts as the primary node to which data is written, sending a data backup request to a master node, so that the master node obtains, according to the data backup request, the cost matrix information associated with the data node from a cost server, calculates the backup cost between the data node and each other candidate node using the cost matrix information, and selects the candidate node with the lowest backup cost as the target node; wherein the cost matrix information associated with the data node represents the storage cost between the data node and each other candidate node in the distributed system;
Receiving the target node information sent by the master node;
Backing up the data to the target node.
According to another aspect of the present invention, a master node for data backup is provided, comprising:
A first receiving unit, configured to receive a data backup request sent by a data node and, when the data backup request is received, to instruct a first sending unit to send a query request to a cost server, wherein the data node is the primary node to which the data is written;
The first sending unit, configured to send, as instructed by the first receiving unit, the query request to the cost server to query the cost matrix information in the distributed system that is associated with the data node, wherein the cost matrix information associated with the data node represents the storage cost between the data node and each other candidate node in the distributed system;
A second receiving unit, configured to receive the response information sent by the cost server, wherein the response information includes the cost matrix information associated with the data node;
A computing unit, configured to calculate the backup cost between the data node and each other candidate node using the cost matrix information;
A selecting unit, configured to select the candidate node with the lowest backup cost as the target node;
A second sending unit, configured to send the target node information to the data node, so that the data node backs up the data to the target node.
Preferably, the computing unit calculates the backup cost Cost(i, j) between data node i and candidate node j using the following formula:
$$\mathrm{Cost}(i,j) = \sum_{l=1}^{K} W[l]\, CM(i,j)[l]$$
where CM(i, j)[l] is the storage cost of the l-th dimension between data node i and candidate node j, W[l] is the cost weight of the l-th dimension, and K is the total number of dimensions.
Preferably, the selecting unit takes the candidate node j that satisfies $\arg\min_{j} \mathrm{Cost}(i,j)$ as the target node of data node i.
Preferably, the master node further includes a detection unit, wherein:
The detection unit is configured to detect the network state between any two nodes in the distributed system at predetermined intervals;
The first sending unit is further configured to send an update request to the cost server according to the network state, so as to update the cost matrix information.
Preferably, the detection unit detects the link congestion degree between any two nodes in the distributed system at predetermined intervals.
According to another aspect of the present invention, a data node for data backup is provided, comprising:
A third sending unit, configured to send, when the data node acts as the primary node to which data is written, a data backup request to a master node, so that the master node obtains, according to the data backup request, the cost matrix information associated with the data node from a cost server, calculates the backup cost between the data node and each other candidate node using the cost matrix information, and selects the candidate node with the lowest backup cost as the target node; wherein the cost matrix information associated with the data node represents the storage cost between the data node and each other candidate node in the distributed system;
A third receiving unit, configured to receive the target node information sent by the master node;
A backup unit, configured to back up the data to the target node.
According to another aspect of the present invention, a distributed file system for data backup is provided, comprising a master node and a data node, wherein the master node is the master node of any of the above embodiments, and the data node is the data node of any of the above embodiments.
By introducing a cost matrix and selecting, according to that matrix, the node with the lowest backup cost for data backup, the present invention effectively avoids excessive backup cost and unbalanced node load.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of one embodiment of the data backup method of the present invention.
Fig. 2 is a schematic diagram of another embodiment of the data backup method of the present invention.
Fig. 3 is a schematic diagram of one embodiment of the master node for data backup of the present invention.
Fig. 4 is a schematic diagram of another embodiment of the master node for data backup of the present invention.
Fig. 5 is a schematic diagram of one embodiment of the data node for data backup of the present invention.
Fig. 6 is a schematic diagram of one embodiment of the distributed file system for data backup of the present invention.
Fig. 7 is a network diagram of the distributed file system of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. The following description of at least one exemplary embodiment is merely illustrative and in no way limits the present invention or its application or use. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Unless specifically stated otherwise, the relative arrangement of the components and steps, the numerical expressions and the numerical values set forth in these embodiments do not limit the scope of the present invention.
It should also be understood that, for ease of description, the sizes of the parts shown in the drawings are not drawn to actual scale.
Techniques, methods and devices known to a person of ordinary skill in the relevant art may not be discussed in detail, but where appropriate they should be regarded as part of the granted specification.
In all examples shown and discussed here, any specific value should be interpreted as merely exemplary and not as a limitation. Other examples of the exemplary embodiments may therefore have different values.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; once an item has been defined in one drawing, it need not be discussed further in subsequent drawings.
Fig. 1 is a schematic diagram of one embodiment of the data backup method of the present invention. Preferably, the steps of this embodiment are performed by the master node (NameNode) in the distributed system.
Step 101: when a data backup request sent by a data node is received, send a query request to the cost server to query the cost matrix information in the distributed system that is associated with the data node.
Here the data node is the primary node to which the data is written, and the cost matrix information associated with the data node represents the storage cost between the data node and each other candidate node in the distributed system.
Step 102: receive the response information sent by the cost server, wherein the response information includes the cost matrix information associated with the data node.
Step 103: calculate the backup cost between the data node and each other candidate node using the cost matrix information.
Step 104: select the candidate node with the lowest backup cost as the target node.
Step 105: send the target node information to the data node, so that the data node backs up the data to the target node.
In the data backup method provided by the above embodiment of the present invention, a cost matrix is introduced and the node with the lowest backup cost is selected according to it for data backup, so that excessive backup cost and unbalanced node load are effectively avoided.
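Steps 101 to 105 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: `CostServerStub`, `handle_backup_request`, the node names and the cost values are all hypothetical, and the cost server is replaced by an in-memory dictionary.

```python
from typing import Dict, List, Sequence


class CostServerStub:
    """Hypothetical in-memory stand-in for the cost server."""

    def __init__(self, matrix: Dict[str, Dict[str, List[float]]]):
        self._matrix = matrix

    def query(self, data_node: str) -> Dict[str, List[float]]:
        # Steps 101-102: return the cost matrix rows associated with data_node.
        return self._matrix[data_node]


def handle_backup_request(data_node: str, cost_server: CostServerStub,
                          weights: Sequence[float]) -> str:
    """Master-node side: pick the cheapest candidate for data_node."""
    rows = cost_server.query(data_node)                      # steps 101-102
    costs = {j: sum(w * c for w, c in zip(weights, cm))      # step 103
             for j, cm in rows.items() if j != data_node}
    target = min(costs, key=costs.get)                       # step 104
    return target                                            # step 105: reply to the data node


# Assumed example data: two candidates, two cost dimensions.
server = CostServerStub({"dn1": {"dn2": [0.5, 0.1], "dn3": [0.2, 0.6]}})
chosen = handle_backup_request("dn1", server, weights=[0.7, 0.3])
```

In a real deployment the query and the reply would be network messages; the sketch keeps them as function calls so that the order of the five steps stays visible.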
The cost matrix can be designed according to the actual conditions of the distributed system. Table 1 shows a sample cost matrix. A cost matrix can be roughly divided into two parts: node numbers and cost quantities. The node numbers comprise the numbers of the source node (the first node to which the data is written) and the target node. The cost quantities comprise storage cost (consistency maintenance cost, load, throughput, etc.), communication cost (geographic distance, transmission bandwidth, number of nodes traversed, link load, etc.) and others. This is a simplified example that does not limit the concrete scheme in any way; the administrator can customize the cost information included as needed.
Table 1
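As a rough illustration of the two-part layout described above (node numbers plus cost quantities), a cost matrix entry might be modelled as follows. The class names, field names and values are hypothetical and are not taken from Table 1.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class CostEntry:
    source: str                      # node number of the node the data is first written to
    target: str                      # node number of the candidate target node
    storage: Dict[str, float] = field(default_factory=dict)        # e.g. load, throughput
    communication: Dict[str, float] = field(default_factory=dict)  # e.g. bandwidth, link load


class CostMatrix:
    """Entries keyed by (source, target) pair."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], CostEntry] = {}

    def put(self, entry: CostEntry) -> None:
        self._entries[(entry.source, entry.target)] = entry

    def row(self, source: str) -> Dict[str, CostEntry]:
        """All entries whose source node is `source`, keyed by target node."""
        return {t: e for (s, t), e in self._entries.items() if s == source}


matrix = CostMatrix()
matrix.put(CostEntry("dn1", "dn2", storage={"load": 0.4}, communication={"link_load": 0.3}))
matrix.put(CostEntry("dn1", "dn3", storage={"load": 0.1}, communication={"link_load": 0.7}))
matrix.put(CostEntry("dn2", "dn3", storage={"load": 0.2}, communication={"link_load": 0.2}))
row_dn1 = matrix.row("dn1")
```

Keeping the cost quantities in named dictionaries reflects the statement that the administrator can customize which cost information is included.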
Preferably, the above step of calculating the backup cost between the data node and each other candidate node using the cost matrix information includes:
The backup cost Cost(i, j) between data node i and candidate node j is:
$$\mathrm{Cost}(i,j) = \sum_{l=1}^{K} W[l]\, CM(i,j)[l]$$
where CM(i, j)[l] is the storage cost of the l-th dimension between data node i and candidate node j, W[l] is the cost weight of the l-th dimension, and K is the total number of dimensions.
The specific dimensions can be determined according to actual requirements; in the simplest case a single dimension (for example, congestion degree) can be used directly as the measure.
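The weighted sum can be written as a short function. The sketch below assumes that the per-dimension costs CM(i, j) and the weights W are plain sequences of equal length K; the function name and the sample values are illustrative only.

```python
from typing import Sequence


def backup_cost(cm_ij: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of per-dimension costs: Cost(i, j) = sum_l W[l] * CM(i, j)[l]."""
    if len(cm_ij) != len(weights):
        raise ValueError("cost vector and weight vector must have the same length K")
    return sum(w * c for w, c in zip(weights, cm_ij))


# One-dimensional case: congestion degree only, weight 1.0 (illustrative values).
single = backup_cost([0.4], [1.0])

# Two-dimensional case: hypothetical storage and communication dimensions.
multi = backup_cost([0.2, 0.6], [0.7, 0.3])
```

With a single dimension and weight 1.0 the backup cost is simply the measured value, which matches the one-dimensional case mentioned in the text.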
Preferably, the above step of selecting the candidate node with the lowest backup cost as the target node includes:
Taking the candidate node j that satisfies $\arg\min_{j} \mathrm{Cost}(i,j)$ as the target node of data node i.
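The arg min selection can be sketched as below, assuming the backup costs from data node i have already been computed and are held in a dictionary keyed by candidate name. Excluding the source node and the tie-breaking behaviour are assumptions of this sketch, not details stated in the source.

```python
from typing import Dict, Hashable


def select_target(costs: Dict[Hashable, float], source: Hashable) -> Hashable:
    """Return the candidate j minimising Cost(source, j), excluding the source itself."""
    candidates = {j: c for j, c in costs.items() if j != source}
    if not candidates:
        raise ValueError("no candidate nodes available")
    # min() with key= implements arg min; ties resolve to the first key in insertion order.
    return min(candidates, key=candidates.get)


# Illustrative costs from data node "dn1" to every node, including itself.
costs_from_dn1 = {"dn1": 0.0, "dn2": 0.38, "dn3": 0.32, "dn4": 0.55}
target = select_target(costs_from_dn1, source="dn1")
```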
In addition, because the cost matrix changes dynamically as data backup operations and resource conditions change, the network state needs to be detected so that the cost matrix can be updated in time. Preferably, the network state between any two nodes in the distributed system is detected at predetermined intervals, and an update request is sent to the cost server according to the network state, so as to update the cost matrix information.
Preferably, the above step of detecting the network state between any two nodes in the distributed system at predetermined intervals includes:
Detecting the link congestion degree between any two nodes in the distributed system at predetermined intervals.
That is, a specific update strategy can be implemented by monitoring the state of the server nodes and the network link load. One simple approach is to monitor the link congestion degree between data nodes and use it as the cost value in the matrix. The monitoring can use the standard SNMP (Simple Network Management Protocol) management interface to obtain the current traffic of each link every 5 minutes; the congestion degree is then defined as the ratio of the current traffic to the link bandwidth, and a larger value indicates a higher degree of congestion. In this way a relatively fair storage backup scheduling mechanism can be established automatically inside the distributed file system.
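The congestion-based update can be sketched as follows. The sketch makes no real SNMP calls: `read_traffic` is a hypothetical callable standing in for whatever mechanism returns the current traffic of a link, and the bandwidth and traffic figures are invented for illustration.

```python
from typing import Callable, Dict, Tuple

POLL_INTERVAL_SECONDS = 5 * 60  # the 5-minute interval mentioned in the text

Link = Tuple[str, str]


def congestion_degree(current_traffic: float, link_bandwidth: float) -> float:
    """Ratio of current traffic to link bandwidth; larger means more congested."""
    if link_bandwidth <= 0:
        raise ValueError("link bandwidth must be positive")
    return current_traffic / link_bandwidth


def refresh_cost_matrix(bandwidths: Dict[Link, float],
                        read_traffic: Callable[[Link], float]) -> Dict[Link, float]:
    """Build one-dimensional cost values for every monitored link."""
    return {link: congestion_degree(read_traffic(link), bw)
            for link, bw in bandwidths.items()}


# Illustrative values in bits per second; read_traffic stands in for an SNMP poll.
bandwidths = {("dn1", "dn2"): 1_000_000_000.0, ("dn1", "dn3"): 1_000_000_000.0}
samples = {("dn1", "dn2"): 250_000_000.0, ("dn1", "dn3"): 800_000_000.0}
updated = refresh_cost_matrix(bandwidths, samples.__getitem__)
```

A scheduler would call `refresh_cost_matrix` once per `POLL_INTERVAL_SECONDS` and forward the result to the cost server as an update request; that loop is omitted here to keep the example deterministic.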
Moreover, besides the data backup scheduling mechanism and the cloud computing subtask load mechanism, the cost matrix can serve as an important reference for other scheduling mechanisms, such as the recovery mechanism and the replica consistency maintenance mechanism.
Fig. 2 is a schematic diagram of another embodiment of the data backup method of the present invention. Preferably, the steps of this embodiment are performed by the primary node to which the data is written.
Step 201: when the data node acts as the primary node to which data is written, send a data backup request to the master node, so that the master node obtains, according to the data backup request, the cost matrix information associated with the data node from the cost server, calculates the backup cost between the data node and each other candidate node using the cost matrix information, and selects the candidate node with the lowest backup cost as the target node; wherein the cost matrix information associated with the data node represents the storage cost between the data node and each other candidate node in the distributed system.
Step 202: receive the target node information sent by the master node.
Step 203: back up the data to the target node.
In the data backup method provided by the above embodiment of the present invention, a cost matrix is introduced and the node with the lowest backup cost is selected according to it for data backup, so that excessive backup cost and unbalanced node load are effectively avoided.
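From the data node's point of view, steps 201 to 203 can be sketched as below. `MasterStub`, `DataNode`, `write_with_backup` and the cost values are hypothetical; the master node is reduced to a lookup over precomputed backup costs so that only the data-node flow is shown.

```python
from typing import Dict, List


class MasterStub:
    """Hypothetical master node that already holds per-pair backup costs."""

    def __init__(self, costs: Dict[str, Dict[str, float]]):
        self._costs = costs

    def request_backup_target(self, data_node: str) -> str:
        row = self._costs[data_node]
        return min(row, key=row.get)


class DataNode:
    def __init__(self, name: str):
        self.name = name
        self.blocks: List[bytes] = []

    def write_with_backup(self, block: bytes, master: MasterStub,
                          peers: Dict[str, "DataNode"]) -> str:
        self.blocks.append(block)                          # write on the primary node
        target = master.request_backup_target(self.name)   # steps 201-202
        peers[target].blocks.append(block)                 # step 203
        return target


nodes = {n: DataNode(n) for n in ("dn1", "dn2", "dn3")}
master = MasterStub({"dn1": {"dn2": 0.38, "dn3": 0.32}})
target = nodes["dn1"].write_with_backup(b"block-0", master, nodes)
```

The data node never sees the cost matrix itself; it only receives the name of the chosen target, which keeps the selection policy on the master node.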
Fig. 3 is a schematic diagram of one embodiment of the master node for data backup of the present invention. As shown in Fig. 3, the master node includes:
A first receiving unit 301, configured to receive a data backup request sent by a data node and, when the data backup request is received, to instruct a first sending unit 302 to send a query request to the cost server, wherein the data node is the primary node to which the data is written.
The first sending unit 302, configured to send, as instructed by the first receiving unit 301, the query request to the cost server to query the cost matrix information in the distributed system that is associated with the data node, wherein the cost matrix information associated with the data node represents the storage cost between the data node and each other candidate node in the distributed system.
A second receiving unit 303, configured to receive the response information sent by the cost server, wherein the response information includes the cost matrix information associated with the data node.
A computing unit 304, configured to calculate the backup cost between the data node and each other candidate node using the cost matrix information.
A selecting unit 305, configured to select the candidate node with the lowest backup cost as the target node.
A second sending unit 306, configured to send the target node information to the data node, so that the data node backs up the data to the target node.
In the master node provided by the above embodiment of the present invention, a cost matrix is introduced and the node with the lowest backup cost is selected according to it for data backup, so that excessive backup cost and unbalanced node load are effectively avoided.
Preferably, the computing unit 304 calculates the backup cost Cost(i, j) between data node i and candidate node j using the following formula:
$$\mathrm{Cost}(i,j) = \sum_{l=1}^{K} W[l]\, CM(i,j)[l]$$
where CM(i, j)[l] is the storage cost of the l-th dimension between data node i and candidate node j, W[l] is the cost weight of the l-th dimension, and K is the total number of dimensions.
Preferably, the selecting unit 305 takes the candidate node j that satisfies $\arg\min_{j} \mathrm{Cost}(i,j)$ as the target node of data node i.
Fig. 4 is a schematic diagram of another embodiment of the master node for data backup of the present invention. Compared with the embodiment shown in Fig. 3, in the embodiment shown in Fig. 4 the master node further includes a detection unit 401, wherein:
The detection unit 401 is configured to detect the network state between any two nodes in the distributed system at predetermined intervals.
The first sending unit 302 is further configured to send an update request to the cost server according to the network state, so as to update the cost matrix information.
Preferably, the detection unit detects the link congestion degree between any two nodes in the distributed system at predetermined intervals.
Fig. 5 is a schematic diagram of one embodiment of the data node for data backup of the present invention. As shown in Fig. 5, the data node includes:
A third sending unit 501, configured to send, when the data node acts as the primary node to which data is written, a data backup request to the master node, so that the master node obtains, according to the data backup request, the cost matrix information associated with the data node from the cost server, calculates the backup cost between the data node and each other candidate node using the cost matrix information, and selects the candidate node with the lowest backup cost as the target node; wherein the cost matrix information associated with the data node represents the storage cost between the data node and each other candidate node in the distributed system.
A third receiving unit 502, configured to receive the target node information sent by the master node.
A backup unit 503, configured to back up the data to the target node.
In the data node provided by the above embodiment of the present invention, a cost matrix is introduced and the node with the lowest backup cost is selected according to it for data backup, so that excessive backup cost and unbalanced node load are effectively avoided.
Fig. 6 is a schematic diagram of one embodiment of the distributed file system for data backup of the present invention. In the embodiment shown in Fig. 6, the distributed file system includes a master node 601 and a data node 602, wherein:
The master node is the master node of any of the embodiments in Figs. 3-4, and the data node is the data node of any of the embodiments in Fig. 5.
For brevity, only one data node is shown in Fig. 6. A person skilled in the art will understand, however, that the system may contain multiple data nodes. Fig. 7 is a network diagram of the distributed file system of the present invention.
Compared with existing data backup schemes for distributed file systems, the technical solution proposed by the present invention has the following advantages:
1. It solves the three main problems described above: excessive backup cost, unbalanced node load, and excessive replica access cost when executing cloud computing tasks.
2. Besides the data backup scheduling mechanism and the task load mechanism, the cost matrix can serve as an important reference for other scheduling mechanisms inside the distributed file system.
3. The cost-matrix-based data backup scheme proposed in this patent places no special requirements on the software or hardware of the distributed file system. The user only needs to deploy one additional cost server in the distributed file system to implement the described data backup scheme.
A person of ordinary skill in the art will understand that all or some of the steps of the above embodiments can be implemented by hardware, or by a program instructing the relevant hardware; the program can be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc or the like.
The description of the present invention is given for the purposes of illustration and description and is not intended to be exhaustive or to limit the invention to the form disclosed. Many modifications and variations will be obvious to a person of ordinary skill in the art. The embodiments were chosen and described in order to better explain the principles and practical application of the invention, and to enable a person of ordinary skill in the art to understand the invention and so design various embodiments, with various modifications, suited to particular uses.

Claims (11)

1. A data backup method, characterized by comprising:
when a data backup request sent by a data node is received, sending a query request to a cost server to query the cost matrix information in the distributed system that is associated with the data node, wherein the data node is the primary node to which the data is written, and the cost matrix information associated with the data node represents the storage cost between the data node and each other candidate node in the distributed system;
receiving the response information sent by the cost server, wherein the response information includes the cost matrix information associated with the data node;
calculating the backup cost between the data node and each other candidate node using the cost matrix information;
selecting the candidate node with the lowest backup cost as the target node;
sending the target node information to the data node, so that the data node backs up the data to the target node;
wherein the step of calculating the backup cost between the data node and each other candidate node using the cost matrix information includes:
the backup cost Cost(i, j) between data node i and candidate node j is:
$$\mathrm{Cost}(i,j) = \sum_{l=1}^{K} W[l]\, CM(i,j)[l];$$
where CM(i, j)[l] is the storage cost of the l-th dimension between data node i and candidate node j, W[l] is the cost weight of the l-th dimension, and K is the total number of dimensions.
2. The method according to claim 1, characterized in that
the step of selecting the candidate node with the lowest backup cost as the target node includes:
taking the candidate node j that satisfies $\arg\min_{j} \mathrm{Cost}(i,j)$ as the target node of data node i.
3. The method according to any one of claims 1-2, characterized in that
the network state between any two nodes in the distributed system is detected at predetermined intervals;
an update request is sent to the cost server according to the network state, so as to update the cost matrix information.
4. The method according to claim 3, characterized in that
the step of detecting the network state between any two nodes in the distributed system at predetermined intervals includes:
detecting the link congestion degree between any two nodes in the distributed system at predetermined intervals.
5. A data backup method, characterized by comprising:
when a data node acts as the primary node to which data is written, sending a data backup request to a master node, so that the master node obtains, according to the data backup request, the cost matrix information associated with the data node from a cost server, calculates the backup cost between the data node and each other candidate node using the cost matrix information, and selects the candidate node with the lowest backup cost as the target node; wherein the cost matrix information associated with the data node represents the storage cost between the data node and each other candidate node in the distributed system;
receiving the target node information sent by the master node;
backing up the data to the target node;
wherein the backup cost Cost(i, j) between data node i and candidate node j is:
$$\mathrm{Cost}(i,j) = \sum_{l=1}^{K} W[l]\, CM(i,j)[l];$$
where CM(i, j)[l] is the storage cost of the l-th dimension between data node i and candidate node j, W[l] is the cost weight of the l-th dimension, and K is the total number of dimensions.
6. A host node for data backup, characterized by comprising:
A first receiving unit, configured to receive a data backup request sent by a data node and, upon receiving the data backup request sent by the data node, to instruct a first sending unit to send a query request to a cost server, wherein the data node is the primary node for writing data;
A first sending unit, configured to send, according to the instruction of the first receiving unit, a query request to the cost server for querying the cost matrix information associated with the data node in the distributed system, wherein the cost matrix information associated with the data node represents the carrying cost between the data node and any other candidate node in the distributed system;
A second receiving unit, configured to receive the response message sent by the cost server, wherein the response message includes the cost matrix information associated with the data node;
A computing unit, configured to calculate the backup cost between the data node and each other candidate node using the cost matrix information;
A selecting unit, configured to select the candidate node with the lowest backup cost as the destination node;
A second sending unit, configured to send the destination node information to the data node, so that the data node backs up the data to the destination node;
Wherein the computing unit specifically calculates the backup cost Cost(i, j) between data node i and candidate node j using the following formula:
$$\mathrm{Cost}(i,j) = \sum_{l=1}^{K} W[l]\,CM(i,j)[l];$$
Wherein CM(i, j)[l] is the carrying cost in the l-th dimension between data node i and candidate node j, W[l] is the cost weight of the l-th dimension, and K is the total number of dimensions.
7. The host node according to claim 6, characterized in that
The selecting unit specifically takes the candidate node j that satisfies arg min Cost(i, j) as the destination node of the data node i.
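
The arg min selection in claim 7 might look like the sketch below. The function name `select_destination` and the row layout (candidate node id mapped to its K per-dimension costs relative to data node i) are assumptions for illustration. The claims do not say how ties are broken; this sketch simply keeps the first minimal candidate in iteration order.

```python
from typing import Dict, Sequence


def select_destination(
    weights: Sequence[float],
    cm_row: Dict[str, Sequence[float]],
) -> str:
    """Return the candidate j that minimizes Cost(i, j) for a fixed i.

    `cm_row` is a hypothetical view of the cost matrix row for data
    node i: each candidate node id maps to CM(i, j)[1..K].
    """
    if not cm_row:
        raise ValueError("no candidate nodes to choose from")

    def cost(j: str) -> float:
        # Weighted sum over the K cost dimensions.
        return sum(w * c for w, c in zip(weights, cm_row[j]))

    # min() returns the first candidate with the smallest cost.
    return min(cm_row, key=cost)
```

With weights (1, 2) and candidates n2 = (4, 1), n3 = (1, 1), n4 = (0, 3), the costs are 6, 3 and 6, so n3 is selected.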
8. The host node according to any one of claims 6-7, characterized in that the host node further comprises a detection unit, wherein:
The detection unit is configured to detect the network state between any two nodes in the distributed system at predetermined intervals;
The first sending unit is further configured to send an update request to the cost server according to the network state, so as to update the cost matrix information.
9. The host node according to claim 8, characterized in that
The detection unit specifically detects the link congestion degree between any two nodes in the distributed system at predetermined intervals.
10. A data node for data backup, characterized by comprising:
A third sending unit, configured to send a data backup request to a host node when the data node serves as the primary node for writing data, so that the host node obtains, from a cost server according to the data backup request, the cost matrix information associated with the data node, calculates the backup cost between the data node and each other candidate node using the cost matrix information, and selects the candidate node with the lowest backup cost as the destination node; wherein the cost matrix information associated with the data node represents the carrying cost between the data node and any other candidate node in the distributed system;
A third receiving unit, configured to receive the destination node information sent by the host node;
A backup unit, configured to back up the data to the destination node;
Wherein the backup cost Cost(i, j) between data node i and candidate node j is:
$$\mathrm{Cost}(i,j) = \sum_{l=1}^{K} W[l]\,CM(i,j)[l];$$
Wherein CM(i, j)[l] is the carrying cost in the l-th dimension between data node i and candidate node j, W[l] is the cost weight of the l-th dimension, and K is the total number of dimensions.
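
The data node side of claim 10 reduces to three steps: request, receive, back up. The sketch below shows that order only. Both callables are hypothetical transport hooks standing in for the third sending and receiving units and the backup unit, and the check that the destination differs from the requester is an added assumption.

```python
from typing import Callable


def backup_as_primary(
    node_id: str,
    data: bytes,
    request_destination: Callable[[str], str],
    write_replica: Callable[[str, bytes], None],
) -> str:
    """Back up `data` from the primary data node `node_id`.

    `request_destination` asks the host node for a destination node;
    `write_replica` copies the data to that node. Both are hypothetical.
    """
    # Send the backup request and receive the destination node information.
    destination = request_destination(node_id)
    if destination == node_id:
        # Assumed sanity check: the claim speaks of "other" candidate nodes.
        raise ValueError("destination must differ from the requesting node")
    # Back up the data to the destination chosen by the host node.
    write_replica(destination, data)
    return destination
```

Passing the transport as callables keeps the sketch independent of any particular RPC layer, which the claims leave unspecified.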
11. A distributed file system for data backup, characterized by comprising a host node and a data node, wherein:
The host node is the host node according to any one of claims 6-9;
The data node is the data node according to claim 10.
CN201310170578.7A 2013-05-10 2013-05-10 Data backup method and device and distributed file system Active CN104142871B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310170578.7A CN104142871B (en) 2013-05-10 2013-05-10 Data backup method and device and distributed file system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310170578.7A CN104142871B (en) 2013-05-10 2013-05-10 Data backup method and device and distributed file system

Publications (2)

Publication Number Publication Date
CN104142871A CN104142871A (en) 2014-11-12
CN104142871B true CN104142871B (en) 2017-05-24

Family

ID=51852052

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310170578.7A Active CN104142871B (en) 2013-05-10 2013-05-10 Data backup method and device and distributed file system

Country Status (1)

Country Link
CN (1) CN104142871B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106648970A (en) * 2016-11-04 2017-05-10 北京华为数字技术有限公司 File backup method and distributed file system
CN108023967B (en) * 2017-12-20 2021-05-18 联想(北京)有限公司 Data balancing method and device and management equipment in distributed storage system
CN108628706B (en) * 2018-05-02 2021-08-17 北京新桥信通科技股份有限公司 Data backup method, device, system and storage medium
CN108875035B (en) * 2018-06-25 2022-02-18 郑州云海信息技术有限公司 Data storage method of distributed file system and related equipment
CN112241319A (en) * 2019-07-19 2021-01-19 伊姆西Ip控股有限责任公司 Method, electronic device and computer program product for balancing load
CN112306962B (en) * 2019-07-26 2024-02-23 杭州海康威视数字技术股份有限公司 File copying method, device and storage medium in computer cluster system
CN110597659B (en) 2019-08-28 2024-06-04 华为技术有限公司 Backup processing method and server

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102082830A (en) * 2011-01-18 2011-06-01 浙江大学 Unstable network-oriented distributed file storage method based on quality perception
CN102880531A (en) * 2012-09-27 2013-01-16 新浪网技术(中国)有限公司 Database backup system and backup method and slave database server of database backup system
CN103078936A (en) * 2012-12-31 2013-05-01 网宿科技股份有限公司 Metadata hierarchical storage method and system for Global file system (GFS)-based distributed file system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7664731B2 (en) * 2002-03-21 2010-02-16 United States Postal Service Method and system for storing and retrieving data using hash-accessed multiple data stores

Also Published As

Publication number Publication date
CN104142871A (en) 2014-11-12

Similar Documents

Publication Publication Date Title
CN104142871B (en) Data backup method and device and distributed file system
CN102546782B (en) Distribution system and data operation method thereof
EP2904763B1 (en) Load-balancing access to replicated databases
US20190373065A1 (en) Method and Apparatus for Virtualized Network Function Chaining Management
JP5998206B2 (en) Scalable centralized dynamic resource distribution in cluster data grids
KR101242458B1 (en) Intelligent virtual storage service system and method thereof
CN102629268B (en) Data synchronization method, system and date access device
CN106843745A (en) Capacity expansion method and device
CN102137133B (en) Method and system for distributing contents and scheduling server
US8140791B1 (en) Techniques for backing up distributed data
CN109358971B (en) Rapid and load-balancing service function chain deployment method in dynamic network environment
KR20120072907A (en) Distribution storage system of distributively storing objects based on position of plural data nodes, position-based object distributive storing method thereof, and computer-readable recording medium
CN106034160A (en) Distributed computing system and method
CN102609446A (en) Distributed Bloom filter system and application method thereof
US20230308511A1 (en) Multichannel virtual internet protocol address affinity
CN104850394A (en) Management method of distributed application program and distributed system
CN112492022A (en) Cluster, method, system and storage medium for improving database availability
CN104715044A (en) Distributed system and data manipulation method thereof
CN109426439A (en) The method and device of dilatation is carried out to distributed memory system
US9544371B1 (en) Method to discover multiple paths to disk devices cluster wide
CN109587185B (en) Cloud storage system and object processing method in cloud storage system
CN105740091B (en) Data backup, restoration methods and equipment
CN107295032B (en) Data synchronization method and equipment for data center
CN102868594B (en) Method and device for message processing
CN109947593B (en) Data disaster tolerance method, system, strategy arbitration device and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant