CN106776952A

CN106776952A - Data storage method in a distributed system

Info

Publication number: CN106776952A
Application number: CN201611097372.6A
Authority: CN
Inventors: 刘斌; 吴方才; 楚涌泉
Original assignee: Space Star Technology (beijing) Co Ltd
Current assignee: Space Star Technology (beijing) Co Ltd
Priority date: 2016-12-02
Filing date: 2016-12-02
Publication date: 2017-05-31
Anticipated expiration: 2036-12-02
Also published as: CN106776952B

Abstract

The present invention relates to a data storage method in a distributed system. A storage risk value is calculated for each node from its storage capacity, operating load, and failure frequency, and the nodes are divided into groups such that each group contains one node with a higher storage risk value and one node with a lower storage risk value. The average storage risk value of each group is thereby kept relatively balanced, which avoids the situation in which the nodes holding copies of the same redundant data are all failure-prone. Distributing redundant data across such risk-balanced groups facilitates data maintenance and reduces the risk of data loss.

Description

Data storage method in a distributed system

【Technical field】

The invention belongs to the field of data storage, and in particular relates to a data storage method in a distributed system environment.

【Background technology】

In general, the maximum external throughput that a single machine can provide is only about 200 MB/s. With conventional machine mirroring, in which the data on several machines is kept completely identical, repairing 12 TB of data takes more than 20 hours; once normal service load is taken into account, the repair time can reach tens of hours.

The prior art proposes a distributed storage approach in which data is split into multiple shards and each shard is replicated into several redundant copies; the copies of the same shard are stored at different locations on different machines, which improves repair speed. However, the prior art distributes redundant data across machines at random, without selecting among the machines. As a result, the machines holding copies of the same data may all be failure-prone, which makes data maintenance difficult and increases the risk of data loss.

In view of the above problems, a new storage method for distributed systems is urgently needed, in which nodes are grouped according to their storage risk values so that the average storage risk value of each group is relatively balanced, and redundant data is distributed within each group, thereby facilitating data maintenance and reducing the risk of data loss.

【Summary of the invention】

To solve the above problems in the prior art, the present invention proposes a data storage method in a distributed system.

The technical solution adopted by the present invention is as follows:

A data storage method in a distributed system, the method comprising the following steps:

(1) For each of the m nodes i in the distributed system, a storage risk value R_i is calculated from the node's storage capacity, operating load, and failure frequency according to the following formula (a):

R_i = S_i × P_i + 1/F_i × Q_i + G_i × T_i    (a);

where S_i denotes the storage capacity of node i, P_i denotes the weight corresponding to storage capacity, F_i denotes the highest system operating load in the operating history of node i, Q_i denotes the weight corresponding to operating load, G_i denotes the failure frequency in the operating history of node i, T_i denotes the weight corresponding to failure frequency, and P_i, Q_i, G_i > 1;

(2) The m nodes are arranged into an ordered queue in ascending order of storage risk value. The first node at the head of the queue and the last node at the tail are taken out to form the first storage group; the ordered queue formed by the remaining nodes is processed in the same way to form subsequent storage groups, until only 2 or 3 nodes remain in the ordered queue, whereupon those 2 or 3 nodes form one storage group. The m nodes are thus divided into k storage groups;

(3) When the system receives a data storage request, the data is split into k data fragments and each data fragment is replicated, yielding k data fragment groups, each consisting of a data fragment and its corresponding replica fragment;

(4) The data fragment and the corresponding replica fragment of one data fragment group are stored separately in two nodes of one storage group, until the data fragments and replica fragments of all k data fragment groups have been stored in the k storage groups;

(5) When a node in a storage group fails, the data fragment or replica fragment stored on another node of that storage group is used to repair the failed node;

(6) A one-to-one port is assigned to each node; when a node fails, the port corresponding to that node is closed automatically, and after the failed node has been successfully repaired, the port is reopened automatically.

The beneficial effects of the present invention include the following: the nodes are divided into groups, each containing one node with a higher storage risk value and one node with a lower storage risk value, so that the average storage risk value of each group is relatively balanced. This avoids the situation in which the nodes holding copies of the same redundant data are all failure-prone. Distributing redundant data across such risk-balanced groups facilitates data maintenance and reduces the risk of data loss.

【Brief description of the drawings】

The accompanying drawings described here are provided for further understanding of the present invention and form part of this application, but do not constitute an undue limitation of the invention. In the drawings:

Fig. 1 is a structural diagram of the distributed system of the present invention.

Fig. 2 is a flowchart of the data storage method in a distributed system of the present invention.

【Specific embodiment】

The present invention is described in detail below with reference to the drawings and specific embodiments; the illustrative examples and descriptions therein serve only to explain the invention and do not limit it.

Referring to Fig. 1, which shows the distributed system to which the present invention is applied, the system includes multiple computing nodes.

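Referring to Fig. 2, a data storage method in a distributed system comprises the following steps:

(1) For each of the m nodes i in the distributed system, a storage risk value R_i is calculated from the node's storage capacity, operating load, and failure frequency according to the following formula (a):

R_i = S_i × P_i + 1/F_i × Q_i + G_i × T_i    (a);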

Storage capacity, operating load, and failure frequency are distinct factors that each affect a node's storage risk value. Storage capacity represents the storage capability of a node: the larger the storage capacity, the lower the risk of failure caused by data storage pressure, and vice versa. The larger the operating load of a node, the higher the risk of failure caused by overload, and vice versa. The higher the failure frequency of a node during its historical operating period, the more likely the node is to fail in subsequent periods, and vice versa.

In one embodiment, storage capacity, operating load, and failure frequency are recorded in a table. The storage capacity of each node may be the storage capacity of its computer hard disk, and is recorded in the table. The system resources occupied by each of the multiple runs of a node within a predetermined time period are monitored, and the largest amount of system resources occupied by a single run is recorded in the table as the highest system operating load. The number of failures of each node within the predetermined time period is monitored and recorded in the table as the failure frequency.

When the storage risk value R_i of a node is calculated, the storage capacity, operating load, and failure frequency of that node are read from the table and the calculation is performed according to formula (a) above.
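The table lookup and the evaluation of formula (a) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `NodeRecord`, `storage_risk`, and `risk_table` are hypothetical, the term 1/F_i × Q_i is read as (1/F_i) × Q_i under ordinary left-to-right precedence, and a single set of weights is shared by all nodes for brevity even though the formula indexes the weights by node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeRecord:
    """One row of the monitoring table (field names are illustrative)."""
    node_id: str
    capacity: float      # S_i: storage capacity of the node
    peak_load: float     # F_i: highest system operating load in its history
    failure_freq: float  # G_i: failure frequency in its history


def storage_risk(record, p, q, t):
    """Evaluate formula (a) as printed: R_i = S_i*P_i + (1/F_i)*Q_i + G_i*T_i."""
    if record.peak_load <= 0:
        raise ValueError("peak_load must be positive to evaluate 1/F_i")
    return (record.capacity * p
            + (1.0 / record.peak_load) * q
            + record.failure_freq * t)


def risk_table(records, p, q, t):
    """Map each node id to its storage risk value R_i."""
    return {r.node_id: storage_risk(r, p, q, t) for r in records}
```

Calling `risk_table` on the rows read from the table gives the per-node values that the grouping step sorts on; the units of each field are left open here because the source does not specify them.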

(2) The m nodes are arranged into an ordered queue in ascending order of storage risk value. The first node at the head of the queue and the last node at the tail are taken out to form the first storage group; the ordered queue formed by the remaining nodes is processed in the same way to form subsequent storage groups, until only 2 or 3 nodes remain in the ordered queue (corresponding to m being even or odd, respectively), whereupon those 2 or 3 nodes form one storage group. The m nodes are thus divided into k storage groups;

Because each group contains one node with a higher storage risk value and one node with a lower storage risk value, the average storage risk value of each group is relatively balanced. This avoids the situation in which the nodes holding copies of the same redundant data are all failure-prone. Distributing redundant data across such risk-balanced groups facilitates data maintenance and reduces the risk of data loss.
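The head-and-tail pairing of step (2) can be sketched as below. The function name `group_by_risk` and the dictionary input are assumptions made for illustration; the pairing rule itself follows the description above.

```python
def group_by_risk(risks):
    """Pair lowest-risk with highest-risk nodes until 2 or 3 nodes remain.

    `risks` maps a node id to its storage risk value; the result is a list
    of k storage groups, each a tuple of node ids.
    """
    if len(risks) < 2:
        raise ValueError("at least two nodes are needed to form a group")
    queue = sorted(risks, key=risks.get)  # ascending storage risk value
    groups = []
    while len(queue) > 3:
        groups.append((queue[0], queue[-1]))  # head of queue with tail of queue
        queue = queue[1:-1]
    groups.append(tuple(queue))  # the last 2 (m even) or 3 (m odd) nodes
    return groups
```

For seven nodes with risk values 1 through 7, this yields the groups (1, 7), (2, 6), and (3, 4, 5), whose mean risk values are all 4, which illustrates the balancing effect described above. Real risk values will generally balance only approximately.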

This greatly increases the speed of data repair and shortens the repair time. When multiple nodes fail, they are repaired in parallel. The data fragment and the corresponding replica fragment of one data fragment group are stored at random in the two nodes of one storage group. When the number of machines exceeds the number of shards on the failed machine, the entire repair process usually takes only tens of minutes, which solves the problem of efficient automatic data repair.
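The splitting, random placement, and repair described in the method can be sketched together as follows. All function names are hypothetical, nodes are simulated as in-memory dictionaries, and equal-size contiguous splitting is an assumption since the source does not say how the data is cut. For a three-node storage group, two of the three nodes are chosen at random, which is one possible reading of "two nodes of one storage group".

```python
import random


def split_into_fragments(data, k):
    """Cut `data` (bytes) into k contiguous fragments; trailing ones may be short."""
    if k < 1:
        raise ValueError("k must be at least 1")
    size = -(-len(data) // k)  # ceiling division
    return [data[i * size:(i + 1) * size] for i in range(k)]


def store_fragments(fragments, groups, rng=None):
    """Store fragment j and its replica on two randomly chosen nodes of group j.

    Returns (stores, placement): `stores` maps node -> {fragment index: bytes},
    `placement` maps fragment index -> the two nodes that hold it.
    """
    if len(fragments) != len(groups):
        raise ValueError("need exactly one storage group per fragment")
    rng = rng or random.Random()
    stores = {node: {} for group in groups for node in group}
    placement = {}
    for j, (fragment, group) in enumerate(zip(fragments, groups)):
        first, second = rng.sample(list(group), 2)
        stores[first][j] = fragment
        stores[second][j] = bytes(fragment)  # the replica fragment
        placement[j] = (first, second)
    return stores, placement


def repair_node(failed, stores, placement):
    """Rebuild a failed node's fragments from the surviving holder of each one."""
    rebuilt = {}
    for j, (first, second) in placement.items():
        if failed in (first, second):
            peer = second if failed == first else first
            rebuilt[j] = stores[peer][j]
    stores[failed] = rebuilt
    return rebuilt
```

Because each failed node is rebuilt only from peers in its own storage group, repairs of nodes in different groups touch disjoint data and could be run concurrently; this sketch performs them one call at a time.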

(6) A one-to-one port is assigned to each node; when a node fails, the port corresponding to that node is closed automatically, and after the failed node has been successfully repaired, the port is reopened automatically. This ensures that data is read correctly and avoids the problem of accessing ports through an incorrect list.
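The port behaviour of step (6) amounts to a small state table, sketched below. The class `PortTable` and its method names are hypothetical, and the port is modelled as a number with an open or closed flag rather than a real network socket.

```python
class PortTable:
    """Tracks one dedicated port per node and whether it is currently open."""

    def __init__(self, ports):
        # `ports` maps node id -> port number; the mapping must be one-to-one.
        self._ports = dict(ports)
        if len(set(self._ports.values())) != len(self._ports):
            raise ValueError("each node needs its own distinct port")
        self._open = {node: True for node in self._ports}

    def on_failure(self, node):
        """Close the node's port automatically when the node fails."""
        self._open[node] = False

    def on_repaired(self, node):
        """Reopen the node's port once the node has been repaired."""
        self._open[node] = True

    def readable_port(self, node):
        """Return the node's port if it is open, otherwise None."""
        return self._ports[node] if self._open[node] else None
```

A reader that consults `readable_port` before each access never contacts a failed node, which is the effect the step describes.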

By the above method, the present invention divides the nodes into groups, each containing one node with a higher storage risk value and one node with a lower storage risk value, so that the average storage risk value of each group is relatively balanced. This avoids the situation in which the nodes holding copies of the same redundant data are all failure-prone. Distributing redundant data across such risk-balanced groups facilitates data maintenance and reduces the risk of data loss.

The above is only a preferred embodiment of the present invention; all equivalent changes or modifications made according to the constructions, features, and principles described within the scope of this patent application are therefore included within the scope of this patent application.

Claims

1. A data storage method in a distributed system, characterized in that the method comprises the following steps:

(1) For each of the m nodes i in the distributed system, a storage risk value R_i is calculated from the node's storage capacity, operating load, and failure frequency according to the following formula (a):

R_i = S_i × P_i + 1/F_i × Q_i + G_i × T_i    (a);

where S_i denotes the storage capacity of node i, P_i denotes the weight corresponding to storage capacity, F_i denotes the highest system operating load in the operating history of node i, Q_i denotes the weight corresponding to operating load, G_i denotes the failure frequency in the operating history of node i, T_i denotes the weight corresponding to failure frequency, and P_i, Q_i, G_i > 1;

(2) The m nodes are arranged into an ordered queue in ascending order of storage risk value. The first node at the head of the queue and the last node at the tail are taken out to form the first storage group; the ordered queue formed by the remaining nodes is processed in the same way to form subsequent storage groups, until only 2 or 3 nodes remain in the ordered queue, whereupon those 2 or 3 nodes form one storage group. The m nodes are thus divided into k storage groups;

(3) When the system receives a data storage request, the data is split into k data fragments and each data fragment is replicated, yielding k data fragment groups, each consisting of a data fragment and its corresponding replica fragment;

(4) The data fragment and the corresponding replica fragment of one data fragment group are stored separately in two nodes of one storage group, until the data fragments and replica fragments of all k data fragment groups have been stored in the k storage groups;

(5) When a node in a storage group fails, the data fragment or replica fragment stored on another node of that storage group is used to repair the failed node;

(6) A one-to-one port is assigned to each node; when a node fails, the port corresponding to that node is closed automatically, and after the failed node has been successfully repaired, the port is reopened automatically.

2. The data storage method in a distributed system according to claim 1, characterized in that when multiple nodes fail, the multiple nodes are repaired in parallel.

3. The data storage method in a distributed system according to claim 1, characterized in that the data fragment and the corresponding replica fragment of one data fragment group are stored at random in the two nodes of one storage group.