CN111131457B

CN111131457B - Capacity and bandwidth compromise method and system for heterogeneous distributed storage

Info

Publication number: CN111131457B
Application number: CN201911355800.4A
Authority: CN
Inventors: 骆源; 王旌兆; 顾振兴
Original assignee: Shanghai Jiaotong University
Current assignee: Shanghai Jiaotong University
Priority date: 2019-12-25
Filing date: 2019-12-25
Publication date: 2021-11-30
Anticipated expiration: 2039-12-25
Also published as: CN111131457A

Abstract

The invention provides a capacity and bandwidth compromise method and system for heterogeneous distributed storage, comprising a client module, a repair sequence generation module and a compromise curve drawing module. Parameter information of the storage system is input through the input module of the client module and passed to the repair sequence generation module to obtain a repair sequence; the repair sequence is output to the compromise curve drawing module to obtain a compromise curve, and the compromise curve is output to the output module of the client module. The repair sequence generation module: for any bandwidth and capacity, analyzes the influence of the repair sequence on the minimum cut and generates the repair sequence whose information flow graph has the smallest minimum cut. The compromise curve drawing module: draws the compromise curve of the storage capacity and the repair bandwidth of the storage system. For heterogeneous distributed storage systems, the invention provides a method for drawing the compromise curve of storage capacity and repair bandwidth by analyzing the upper bound on the storable file size achievable under different repair schemes.

Description

Capacity and bandwidth compromise method and system for heterogeneous distributed storage

Technical Field

The invention relates to the field of data storage, in particular to a method and a system for compromising capacity and bandwidth of heterogeneous distributed storage, and more particularly to calculation of capacity of a heterogeneous distributed storage system and drawing of a compromise curve of the storage capacity and repair bandwidth.

Background

In recent years, with the rapid development of internet technology and of the information industry as a whole, information is being generated, transmitted, processed and stored in large quantities, and its volume is growing exponentially. To meet the storage requirements of massive data, distributed storage systems are widely used because of their low cost, strong scalability, high access speed, high reliability and support for high concurrent access.

Erasure codes can greatly reduce data redundancy while ensuring high data reliability, and are therefore widely applied in distributed storage systems. The working principle of erasure codes is as follows: an erasure code generally encodes a file using a linear code; the original data is divided into k blocks, encoded into n blocks, and stored on n nodes. If any k of the n blocks suffice to recover the original data, the erasure code is said to satisfy the MDS (Maximum Distance Separable) property. Linear codes that satisfy the MDS property are referred to as MDS codes. MDS codes are a very storage-efficient class of coding schemes. While MDS codes are optimal in terms of the tradeoff between redundancy and reliability, repairing a single node still requires access to k other intact nodes. If extra check information is added for some (fewer than k) information blocks, then when one of these nodes is damaged only the nodes related to that check need to be accessed, and there is no need to access k nodes. Adding extra parity in this way reduces the storage efficiency to some extent, but can save a large amount of repair bandwidth.
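As a non-limiting illustration of the cost described above, the following minimal sketch computes the storage and naive repair cost of an (n, k) MDS code. The function name `mds_costs` and the example numbers are assumptions introduced here for illustration; they do not come from the invention.

```python
from fractions import Fraction


def mds_costs(file_size, n, k):
    """Storage and naive repair cost of an (n, k) MDS code.

    Illustrative helper (hypothetical name). The file is split into k blocks
    of size file_size / k and encoded into n blocks, one per node. Naive
    repair of one node reads k surviving blocks and re-encodes.
    """
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    per_node = Fraction(file_size, k)
    return {
        "per_node_storage": per_node,
        "total_storage": n * per_node,
        "storage_overhead": Fraction(n, k),
        # k blocks are downloaded to rebuild a single lost block
        "repair_download": k * per_node,
    }


costs = mds_costs(file_size=12, n=6, k=4)
```

In this example each node stores 3 units, yet rebuilding one lost node downloads 12 units, that is, the whole file. This is the repair bandwidth problem that motivates the compromise discussed in the following paragraphs.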

As described above, conventional erasure codes require a large amount of network bandwidth to repair damaged nodes, while adding extra parity reduces the storage efficiency. In order to balance storage capacity against repair bandwidth, an information flow graph is introduced to model the distributed storage system, the system capacity is defined using network coding methods, and the compromise relationship between the storage capacity of a node and the repair bandwidth of a node is described accordingly. The construction of regenerating codes is mainly based on the minimum storage point and the minimum bandwidth point on the optimal compromise curve, which correspond to Minimum Storage Regenerating (MSR) codes and Minimum Bandwidth Regenerating (MBR) codes respectively.
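For orientation only, the two end points of the compromise curve have well-known closed forms in the homogeneous case. The sketch below evaluates them. It covers only the homogeneous background setting, not the heterogeneous method of the invention, and the symbol d (number of helper nodes contacted during repair) and the function names are assumptions introduced here.

```python
from fractions import Fraction


def msr_point(file_size, k, d):
    """Minimum-storage point (alpha, gamma) for homogeneous regenerating codes.

    alpha = M / k, gamma = M * d / (k * (d - k + 1)).
    """
    if not 0 < k <= d:
        raise ValueError("need 0 < k <= d")
    alpha = Fraction(file_size, k)
    gamma = Fraction(file_size * d, k * (d - k + 1))
    return alpha, gamma


def mbr_point(file_size, k, d):
    """Minimum-bandwidth point: alpha = gamma = 2 * M * d / (k * (2d - k + 1))."""
    if not 0 < k <= d:
        raise ValueError("need 0 < k <= d")
    value = Fraction(2 * file_size * d, k * (2 * d - k + 1))
    return value, value


msr = msr_point(file_size=1, k=5, d=9)
mbr = mbr_point(file_size=1, k=5, d=9)
```

With a unit-size file, k = 5 and d = 9, the MSR point stores 1/5 per node and downloads 9/25 per repair, while the MBR point stores and downloads 9/35: more storage per node, less repair traffic.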

The above studies of erasure-coded data recovery all assume that the nodes in a distributed storage system are identical. In a practical distributed system, the system tends to be heterogeneous, i.e. the amount of data stored by each node and the amount downloaded from each helper node differ. In this case, calculating the compromise between the capacity and the repair bandwidth of the heterogeneous distributed storage system is very important, because the construction of regenerating codes needs to be based on the compromise relationship between storage capacity and repair bandwidth.

Disclosure of Invention

Aiming at the defects in the prior art, the invention aims to provide a capacity and bandwidth compromise method and system for heterogeneous distributed storage.

The capacity and bandwidth compromise system for the heterogeneous distributed storage provided by the invention comprises the following components:

module M1: inputting parameter information of a storage system through a client module;

module M2: inputting the parameter information of the storage system into a repair sequence generation module to obtain a repair sequence;

module M3: calculating, through the repair sequence, the relation between the size of the file that the storage system can correctly store and the storage capacity and bandwidth of the system; drawing the compromise curve from this relation in the compromise curve drawing step;

the client module is used as a user interface;

the repair sequence generation module: for any bandwidth and capacity, analyzing the influence of the repair sequence on the minimum cut to generate the repair sequence with the minimum cut of the information flow graph;

the compromise curve drawing module: and drawing a compromise curve of the storage capacity and the repair bandwidth of the storage system.

Preferably, said module M1 comprises: acquiring parameter information of the heterogeneous distributed storage system through the built heterogeneous distributed storage system;

the parameter information includes: the number of clusters $L$ of the heterogeneous distributed storage system, the number of storage nodes $R$ in each cluster, the number of scattered storage points $E$ and/or the total number of storage points $n$, where $n = LR + E$; the erasure code parameters $(n, k)$ adopted by the user; the intra-cluster node transmission bandwidth $\beta_I$; and the cross-cluster transmission bandwidth $\beta_C$.
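As a non-limiting sketch, the parameter information can be held in a small validated record. The class name `StorageParams` and its field names are hypothetical; only the symbols L, R, E, n, k and the two bandwidths come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageParams:
    """Parameter information of the storage system (illustrative container)."""
    L: int          # number of clusters
    R: int          # storage nodes per cluster
    E: int          # scattered storage points outside any cluster
    k: int          # erasure code parameter k of the (n, k) code
    beta_i: float   # intra-cluster node transmission bandwidth
    beta_c: float   # cross-cluster transmission bandwidth

    def __post_init__(self):
        if min(self.L, self.R, self.E) < 0:
            raise ValueError("L, R and E must be non-negative")
        if not 0 < self.k <= self.n:
            raise ValueError("k must satisfy 0 < k <= n")
        if self.beta_i < 0 or self.beta_c < 0:
            raise ValueError("bandwidths must be non-negative")

    @property
    def n(self):
        # total number of storage points, n = L * R + E
        return self.L * self.R + self.E


params = StorageParams(L=3, R=4, E=2, k=5, beta_i=2.0, beta_c=1.0)
```

Deriving n from L, R and E rather than accepting it as a separate input keeps the relation n = LR + E from being violated by inconsistent user input.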

Preferably, said module M2 comprises:

module M2.1: the node cluster source sequence generation module takes the parameter information of the heterogeneous distributed storage system as input and generates a node cluster source sequence; among all cluster source sequences, the generated one gives the information flow graph its smallest minimum cut;

module M2.2: the cluster position sequence generation module takes the node cluster source sequence generated by the node cluster source sequence generation module as input and generates a cluster position sequence; among all cluster position sequences for the current cluster source sequence, the generated one gives the information flow graph its smallest minimum cut;

the minimum cut value of the information flow graph is the maximum size of a file that can be stored, and thus gives the relation between the storable file size and the capacity and bandwidth.

Preferably, said module M2.1 comprises: the node cluster source $p = (p_0, p_1, \ldots, p_i, \ldots, p_L)$; the node cluster source represents the number of help nodes in each cluster, i.e. $p_i$ nodes of the $i$-th cluster are used as help nodes; $p_0$ represents the number of scatter points used as help nodes;

module M2.1.1: determining the number of scattered points as help nodes, and selecting the scattered points as the help nodes;

module M2.1.2: the remaining help nodes are selected cluster by cluster, in ascending order of cluster number in the node cluster source; modules M2.1.1 to M2.1.2 are repeated until all selected nodes have been taken;

said module M2.2 comprises: the node position order $q = (q_1, q_2, \ldots, q_i, \ldots, q_k)$; the node position order describes the number of the cluster to which each node in a repair sequence belongs, i.e. the $i$-th repair node comes from cluster $q_i$;

module M2.2.1: starting from the cluster 1, preferentially selecting nodes from low to high according to the cluster number;

module M2.2.2: when the cluster with the largest number has been reached, or the current cluster has no node left to select, selection starts again from cluster 1; modules M2.2.1 to M2.2.2 are repeated until all selected nodes have been taken;

module M2.2.3: and selecting all scatter points as the help nodes.

Preferably, said module M3 comprises:

module M3.1: sequentially calculating the incoming-edge weight coefficients $a_i$ and $b_i$ of the k selected nodes in the information flow graph;

module M3.2: combining the incoming-edge weight coefficients $a_i$ and $b_i$ with the intra-cluster node transmission bandwidth $\beta_I$ and the cross-cluster transmission bandwidth $\beta_C$ to calculate the edge weights $w_i$;

module M3.3: separately calculating the compromise relationship between each edge weight $w_i$ and $\beta_C$, integrating the compromise relationships of the k selected nodes, and drawing the compromise curve by an iterative method.
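The text does not state how the coefficients $a_i$ and $b_i$ enter the edge weight. The following minimal sketch assumes a linear form, $w_i = a_i \beta_I + b_i \beta_C$, and assumes that the minimum cut for a node storage capacity $\alpha$ is $\sum_i \min(\alpha, w_i)$, following the usual information flow graph argument. Both formulas, the symbol $\alpha$, the function names and the coefficient values are assumptions made for illustration, not the patented computation.

```python
from fractions import Fraction


def edge_weights(a, b, beta_i, beta_c):
    """ASSUMPTION: w_i = a_i * beta_I + b_i * beta_C (linear form not given in the text)."""
    if len(a) != len(b):
        raise ValueError("a and b need one coefficient per selected node")
    return [ai * beta_i + bi * beta_c for ai, bi in zip(a, b)]


def capacity(alpha, weights):
    """ASSUMPTION: min-cut value = sum over the k selected nodes of min(alpha, w_i)."""
    return sum(min(alpha, w) for w in weights)


# Hypothetical coefficients for k = 3 selected nodes (illustration only).
a = [2, 1, 0]   # intra-cluster helpers counted for each selected node
b = [3, 3, 3]   # cross-cluster helpers counted for each selected node
w = edge_weights(a, b, beta_i=Fraction(2), beta_c=Fraction(1))
cap = capacity(Fraction(4), w)
```

Under these assumptions the weights are 7, 5 and 3, and a per-node capacity of 4 gives a storable file size of 4 + 4 + 3 = 11.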

The invention provides a method for compromising capacity and bandwidth of heterogeneous distributed storage, which comprises the following steps:

step M1: inputting parameter information of a storage system through a client module;

step M2: inputting the parameter information of the storage system into a repair sequence generation module to obtain a repair sequence;

step M3: calculating, through the repair sequence, the relation between the size of the file that the storage system can correctly store and the storage capacity and bandwidth of the system; drawing the compromise curve from this relation in the compromise curve drawing step;

the client module is used as a user interface;

Preferably, the step M1 includes: acquiring parameter information of the heterogeneous distributed storage system through the built heterogeneous distributed storage system;

Preferably, the step M2 includes:

step M2.1: the node cluster source sequence generation module takes the parameter information of the heterogeneous distributed storage system as input and generates a node cluster source sequence; among all cluster source sequences, the generated one gives the information flow graph its smallest minimum cut;

step M2.2: the cluster position sequence generation module takes the node cluster source sequence generated by the node cluster source sequence generation module as input and generates a cluster position sequence; among all cluster position sequences for the current cluster source sequence, the generated one gives the information flow graph its smallest minimum cut;

Preferably, said step M2.1 comprises: the node cluster source $p = (p_0, p_1, \ldots, p_i, \ldots, p_L)$; the node cluster source represents the number of help nodes in each cluster, i.e. $p_i$ nodes of the $i$-th cluster are used as help nodes; $p_0$ represents the number of scatter points used as help nodes;

step M2.1.1: determining the number of scattered points as help nodes, and selecting the scattered points as the help nodes;

step M2.1.2: for the selection of the rest of the help nodes, the cluster numbers in the node cluster source are sequentially selected from small to large; repeating the steps M2.1.1 to M2.1.2 until all the selected nodes are obtained;

said step M2.2 comprises: the node position order $q = (q_1, q_2, \ldots, q_i, \ldots, q_k)$; the node position order describes the number of the cluster to which each node in a repair sequence belongs, i.e. the $i$-th repair node comes from cluster $q_i$;

step M2.2.1: starting from the cluster 1, preferentially selecting nodes from low to high according to the cluster number;

step M2.2.2: when the cluster with the largest number is obtained or the current cluster has no node selection, the cluster 1 is obtained again; repeating the steps M2.2.1 to M2.2.2 until all the selected nodes are obtained;

step M2.2.3: and selecting all scatter points as the help nodes.

Preferably, the step M3 includes:

step M3.1: sequentially calculating the incoming-edge weight coefficients $a_i$ and $b_i$ of the k selected nodes in the information flow graph;

step M3.2: combining the incoming-edge weight coefficients $a_i$ and $b_i$ with the intra-cluster node transmission bandwidth $\beta_I$ and the cross-cluster transmission bandwidth $\beta_C$ to calculate the edge weights $w_i$;

step M3.3: separately calculating the compromise relationship between each edge weight $w_i$ and $\beta_C$, integrating the compromise relationships of the k selected nodes, and drawing the compromise curve by an iterative method.

Compared with the prior art, the invention has the following beneficial effects: existing research considers the trade-off between capacity and bandwidth only for homogeneous distributed storage systems, where homogeneous means that the transmission bandwidth between all nodes is the same; for heterogeneous distributed storage systems, the invention provides a method for drawing the compromise curve of storage capacity and repair bandwidth by analyzing the upper bound on the storable file size achievable under different repair schemes.

Drawings

Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:

FIG. 1 is a general block diagram of the process;

FIG. 2 shows Algorithm 0-1, which generates the repair scheme;

FIG. 3 is the compromise curve of storage capacity and repair bandwidth of the cluster, drawn using Algorithm 0-2.

Detailed Description

The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that those skilled in the art can make various changes and modifications without departing from the spirit of the invention, all of which fall within the scope of the present invention.

The invention aims to provide an efficient and feasible method for calculating the system capacity of a heterogeneous distributed storage system, together with a method for drawing the compromise curve of storage capacity and repair bandwidth. For a heterogeneous distributed storage system with given parameters $(n, k, L, R, E = 0)$, the repair scheme whose information flow graph has the smallest minimum cut is generated from among all feasible repair sequences, the system capacity is calculated, and the compromise curve is then drawn from the system capacity.

The distributed storage system is a cloud storage service with low cost, strong scalability, high access speed and high reliability. In a heterogeneous distributed storage system, calculating the system capacity and calculating the compromise between storage capacity and repair bandwidth are very important problems. Through research on the repair sequence used by a heterogeneous distributed storage system when nodes are damaged, the invention provides a method for generating the repair node sequence whose information flow graph has the smallest minimum cut, a method for drawing the compromise curve of storage capacity and repair bandwidth of the heterogeneous distributed storage system from that repair sequence, and a practical calculation method. The method is suitable for capacity calculation and compromise curve drawing for the heterogeneous distributed storage systems in common use at present.

The capacity and bandwidth compromise system for the heterogeneous distributed storage comprises a client module, a repair sequence generation module and a compromise curve drawing module;

specifically, the module M1 includes: acquiring parameter information of the heterogeneous distributed storage system through the built heterogeneous distributed storage system;

specifically, the module M2 includes:

module M2.1: the node cluster source sequence generation module takes the parameter information of the storage system as input and generates a node cluster source sequence; among all cluster source sequences, the generated one gives the information flow graph its smallest minimum cut;

the minimum cut represents the maximum size of a file that the system can correctly repair, i.e. it gives the relation between the storable file size and the capacity and bandwidth; a file that exceeds this size may not be repaired correctly. Only files whose size is less than or equal to the minimum cut are guaranteed to be repaired correctly.
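As a non-limiting illustration of the minimum cut of an information flow graph, the sketch below builds a small toy graph and computes its maximum flow, which equals the minimum cut by the max-flow/min-cut theorem. The graph layout (three surviving nodes x1 to x3, one newcomer x5, a data collector DC reading x1 and x5), the symbols alpha and beta, and all function names are assumptions chosen for illustration; this is a generic Edmonds-Karp routine, not Algorithm 0-1 of the invention.

```python
from collections import deque

INF = 10**9  # stands in for an infinite-capacity edge


def max_flow(capacity, source, sink):
    """Edmonds-Karp max flow on a dict-of-dicts graph; equals the minimum cut."""
    residual = {source: {}, sink: {}}
    for u, edges in capacity.items():
        for v, cap in edges.items():
            residual.setdefault(u, {})[v] = residual.get(u, {}).get(v, 0) + cap
            residual.setdefault(v, {}).setdefault(u, 0)  # reverse edge
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Collect the path edges, find the bottleneck, then augment.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        flow += bottleneck


def toy_graph(alpha, beta):
    """Toy information flow graph: newcomer x5 downloads beta from x1, x2, x3."""
    g = {"S": {}}
    for name in ("x1", "x2", "x3"):
        g["S"][name + "_in"] = INF
        g[name + "_in"] = {name + "_out": alpha}   # node stores alpha
        g[name + "_out"] = {"x5_in": beta}         # repair download
    g["x5_in"] = {"x5_out": alpha}
    g["x5_out"] = {"DC": INF}
    g["x1_out"]["DC"] = INF                        # DC also reads x1
    return g


small_beta = max_flow(toy_graph(alpha=4, beta=1), "S", "DC")
large_beta = max_flow(toy_graph(alpha=4, beta=4), "S", "DC")
```

With a small repair download the cut is limited by the repair edges, and with a large one it is limited by the node storage, which is the storage versus bandwidth tension that the compromise curve captures.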

In particular, said module M2.1 comprises: the node cluster source $p = (p_0, p_1, \ldots, p_i, \ldots, p_L)$; the node cluster source represents the number of help nodes in each cluster, i.e. $p_i$ nodes of the $i$-th cluster are used as help nodes; $p_0$ represents the number of scatter points used as help nodes;

module M2.2.2: when the cluster with the largest number has been reached, or no node can be selected in the current cluster, selection starts again from cluster 1; modules M2.2.1 to M2.2.2 are repeated until all selected nodes have been taken.

Module M2.2.3: and selecting all scatter points as the help nodes.

Module M3: calculating, through the repair sequence, the relation between the size of the file that the storage system can correctly store and the storage capacity and bandwidth of the system; drawing the compromise curve from this relation in the compromise curve drawing step: a plurality of bandwidth points are enumerated, the minimum storage capacity corresponding to each bandwidth point is calculated, and the resulting points are fitted to form the compromise curve;

specifically, the module M3 includes:

The client module: as a user interface; the user inputs the parameter information of the storage system through the client module. And the compromise curve is returned to the user by the client module after the completion of the drawing.

The repair sequence generation module: for any bandwidth and capacity, analyzing the influence of the repair sequence on the minimum cut to generate the repair sequence with the minimum cut of the information flow graph; consider the impact of the cluster origin and cluster location of the helper node on the minimum cut of the information flow graph. The generated node cluster source sequence has the smallest minimal cut in the information flow graph among all possible cluster source sequences.

The invention provides a capacity and bandwidth compromise method for heterogeneous distributed storage, which comprises a client step, a repair sequence generation step and a compromise curve drawing step;

specifically, the step M1 includes: acquiring parameter information of the heterogeneous distributed storage system through the built heterogeneous distributed storage system;

the parameter information includes: the number of clusters $L$ of the storage system, the number of storage nodes $R$ in each cluster, the number of scattered storage points $E$ and/or the total number of storage points $n$, where $n = LR + E$; the erasure code parameters $(n, k)$ adopted by the user; the intra-cluster node transmission bandwidth $\beta_I$; and the cross-cluster transmission bandwidth $\beta_C$.

specifically, the step M2 includes:

step M2.1: the node cluster source sequence generation module takes the parameter information of the storage system as input and generates a node cluster source sequence; among all cluster source sequences, the generated one gives the information flow graph its smallest minimum cut;

In particular, said step M2.1 comprises: the node cluster source $p = (p_0, p_1, \ldots, p_i, \ldots, p_L)$; the node cluster source represents the number of help nodes in each cluster, i.e. $p_i$ nodes of the $i$-th cluster are used as help nodes; $p_0$ represents the number of scatter points used as help nodes;

step M2.1.2: the remaining help nodes are selected cluster by cluster, in ascending order of cluster number in the node cluster source; steps M2.1.1 to M2.1.2 are repeated until all selected nodes have been taken;

step M2.2.2: when the cluster with the largest number has been reached, or no node can be selected in the current cluster, selection starts again from cluster 1; steps M2.2.1 to M2.2.2 are repeated until all selected nodes have been taken.

Step M2.2.3: and selecting all scatter points as the help nodes.

Step M3: calculating, through the repair sequence, the relation between the size of the file that the storage system can correctly store and the storage capacity and bandwidth of the system; drawing the compromise curve from this relation in the compromise curve drawing step: a plurality of bandwidth points are enumerated, the minimum storage capacity corresponding to each bandwidth point is calculated, and the resulting points are fitted to form the compromise curve;
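The enumeration of bandwidth points described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Algorithm 0-2: it assumes the edge weights take the linear form $w_i = a_i \beta_I + b_i \beta_C$ and that the storable file size for node capacity $\alpha$ is $\sum_i \min(\alpha, w_i)$. The function names, the symbol $\alpha$, and the coefficient values are hypothetical.

```python
from fractions import Fraction


def min_storage(file_size, weights):
    """Smallest alpha with sum(min(alpha, w_i)) >= file_size, or None if none exists.

    ASSUMPTION: the min-cut has the form sum_i min(alpha, w_i).
    """
    ws = sorted(weights)
    k = len(ws)
    prefix = Fraction(0)  # sum of the weights already known to be below alpha
    for j, w in enumerate(ws):
        alpha = (file_size - prefix) / (k - j)
        if alpha <= w:
            return alpha
        prefix += w
    return None  # even unlimited storage cannot reach file_size


def tradeoff_curve(file_size, a, b, beta_i, beta_c_points):
    """Enumerate cross-cluster bandwidth points and pair each with its minimum alpha."""
    points = []
    for beta_c in beta_c_points:
        # ASSUMPTION: linear edge weights w_i = a_i * beta_I + b_i * beta_C
        weights = [ai * beta_i + bi * beta_c for ai, bi in zip(a, b)]
        alpha = min_storage(file_size, weights)
        if alpha is not None:  # skip infeasible bandwidth points
            points.append((beta_c, alpha))
    return points


curve = tradeoff_curve(
    file_size=12,
    a=[2, 1, 0],            # hypothetical coefficients, illustration only
    b=[3, 3, 3],
    beta_i=Fraction(1),
    beta_c_points=[Fraction(1, 2), Fraction(1), Fraction(2), Fraction(3)],
)
```

In this example the point at half a unit of cross-cluster bandwidth is infeasible and is dropped, and the required storage falls from 5 to 4 as the bandwidth grows and then stays flat. Plotting or fitting the returned points is left to the caller.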

specifically, the step M3 includes:

The repair sequence generation module: for any bandwidth and capacity, analyzing the influence of the repair sequence on the minimum cut to generate the repair sequence with the minimum cut of the information flow graph; the impact of the cluster source and the cluster location of the help node on the minimum cut of the information flow graph is considered. The generated node cluster source sequence has the smallest minimal cut in the information flow graph among all possible cluster source sequences.

The present invention is further described in detail by the following preferred examples:

The implementation scheme comprises three parts: the implementation of the client module, the implementation of the repair sequence generation module, and the implementation of the compromise curve drawing module.

Client module implementation

The client module consists of an input module and an output module. The input module receives parameters input by a user and transmits the parameters to the repair sequence generation module. And the output module outputs the image drawn by the compromise curve drawing module in a form specified by a user.

Repair sequence generation module implementation

A repair node sequence is described by two quantities: the node cluster source $p = (p_0, p_1, \ldots, p_L)$ and the node position order $q = (q_1, q_2, \ldots, q_k)$. The node cluster source represents the number of help nodes in each cluster, i.e. $p_i$ nodes of the $i$-th cluster are used as help nodes, and $p_0$ is the number of scatter points used as help nodes. The node position order describes the number of the cluster to which each node in a repair scheme belongs, i.e. the $i$-th repair node comes from cluster $q_i$. Both the node cluster source and the node position order affect the size of the minimum cut of the information flow graph corresponding to the repair scheme.

The invention employs the following algorithm to determine the node cluster source.

First, the number of scatter points used as help nodes is determined: all scatter points are selected as help nodes, so that $p_0 = E$.

For the remaining help node selections, preference is given to selecting from the cluster with the smaller cluster number. That is, a node is preferentially selected from the cluster 1, and when all of the R nodes in the cluster 1 are selected, the node is selected from the cluster 2, and when all of the R nodes in the cluster 2 are selected, the node is selected from the cluster 3. And so on.

It can be proved that, among all possible node cluster source sequences, the one generated by the above algorithm gives the information flow graph its smallest minimum cut.
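The selection rule for the node cluster source can be sketched as follows. This is a minimal illustration, not the pseudo code of Algorithm 0-1: the function name is hypothetical, and it assumes that exactly k nodes are selected in total, so that at most k scatter points are used when E exceeds k.

```python
def node_cluster_source(k, L, R, E):
    """Return p = [p_0, p_1, ..., p_L] for k selected nodes.

    ASSUMPTION: k nodes are selected in total. Scatter points are taken
    first (p_0 = E when E <= k); the rest fill cluster 1, then cluster 2,
    and so on, with at most R nodes per cluster.
    """
    if not 0 < k <= L * R + E:
        raise ValueError("need 0 < k <= L * R + E")
    p = [0] * (L + 1)
    p[0] = min(E, k)
    left = k - p[0]
    for i in range(1, L + 1):
        p[i] = min(R, left)   # exhaust cluster i before moving to i + 1
        left -= p[i]
    return p


p_no_scatter = node_cluster_source(k=5, L=3, R=2, E=0)
p_with_scatter = node_cluster_source(k=4, L=2, R=3, E=1)
```

With three clusters of two nodes and no scatter points, five selected nodes give p = (0, 2, 2, 1); with one scatter point and two clusters of three nodes, four selected nodes give p = (1, 3, 0).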

After determining the cluster source of the node, the present invention determines the node cluster location by the following method.

Starting from cluster 1, nodes are preferably selected from low to high according to the cluster number.

When the cluster with the largest number is taken, or no node can be selected in the current cluster, the taking is started again from the cluster 1.

And repeating the steps above until all cluster nodes are completely taken.

And finally, sequentially selecting each scatter point as a help node.

It can be proved that, among all possible node cluster position orders, the one determined by the above algorithm gives the information flow graph its smallest minimum cut.
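The position rule above amounts to taking one node per cluster in round-robin order and placing the scatter points last. The sketch below is an illustration only, not the pseudo code of Algorithm 0-1: the function name is hypothetical, scatter points are labelled as cluster 0 by assumption, and empty clusters are simply skipped, which matches restarting from cluster 1 whenever the counts in p are non-increasing.

```python
def node_position_order(p):
    """Return q, the cluster number of each selected node in repair order.

    p = [p_0, p_1, ..., p_L] is the node cluster source. Cluster nodes are
    taken one per cluster from cluster 1 upward, wrapping back to cluster 1,
    until every p_i is used up. ASSUMPTION: scatter points carry label 0.
    """
    remaining = list(p[1:])          # helpers still to place, clusters 1..L
    order = []
    while any(remaining):
        for idx in range(len(remaining)):
            if remaining[idx] > 0:   # skip clusters with nothing left
                order.append(idx + 1)
                remaining[idx] -= 1
    order.extend([0] * p[0])         # scatter points come last
    return order


q_clusters_only = node_position_order([0, 3, 2])
q_with_scatter = node_position_order([1, 2, 1])
```

For p = (0, 3, 2) the order is (1, 2, 1, 2, 1), and for p = (1, 2, 1) it is (1, 2, 1, 0) with the scatter point at the end.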

The pseudo code of the algorithm is given as Algorithm 0-1 in FIG. 2. Once the repair sequence of the nodes has been obtained, the system capacity can be obtained by calculating the minimum cut of the corresponding information flow graph.

Compromise curve drawing module

The following parameters are required to draw the compromise curve: the size $M$ of the file to be stored and the repair bandwidths. When a node in a cluster is damaged, the new node downloads $\beta_I$ of data from a node in the same cluster (referred to as the intra-cluster repair bandwidth) and $\beta_C$ of data from a node outside the cluster (referred to as the cross-cluster repair bandwidth). Once the repair scheme whose information flow graph has the smallest minimum cut is known, Algorithm 0-2 is used to draw the compromise curve of the storage capacity and the repair bandwidth of the cluster, as shown in FIG. 3.
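As a small worked illustration of the two bandwidths, the total amount downloaded by one new node can be tallied as below. The helper counts `intra_helpers` and `cross_helpers` and the function name are assumptions introduced here, as is the premise that every helper contributes exactly $\beta_I$ or $\beta_C$; the text does not fix how many helpers of each kind take part.

```python
def total_repair_download(intra_helpers, cross_helpers, beta_i, beta_c):
    """Total data downloaded by one new node.

    ASSUMPTION: each helper in the same cluster sends beta_i and each
    helper outside the cluster sends beta_c.
    """
    if intra_helpers < 0 or cross_helpers < 0:
        raise ValueError("helper counts must be non-negative")
    return intra_helpers * beta_i + cross_helpers * beta_c


# Hypothetical case: 3 helpers in the same cluster, 2 helpers outside it.
gamma = total_repair_download(3, 2, beta_i=2, beta_c=1)
```

Here the new node downloads 3 × 2 + 2 × 1 = 8 units in total.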

Those skilled in the art will appreciate that, in addition to implementing the systems, apparatus, and various modules thereof provided by the present invention in purely computer readable program code, the same procedures can be implemented entirely by logically programming method steps such that the systems, apparatus, and various modules thereof are provided in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system, the device and the modules thereof provided by the present invention can be considered as a hardware component, and the modules included in the system, the device and the modules thereof for implementing various programs can also be considered as structures in the hardware component; modules for performing various functions may also be considered to be both software programs for performing the methods and structures within hardware components.

The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.

Claims

1. A capacity and bandwidth tradeoff system for heterogeneous distributed storage, comprising:

the client module is used as a user interface;

the compromise curve drawing module: drawing a compromise curve of the storage capacity and the repair bandwidth of the storage system;

the module M2 includes:

2. The capacity and bandwidth tradeoff system for heterogeneous distributed storage according to claim 1, wherein the module M1 comprises: acquiring parameter information of the heterogeneous distributed storage system through the built heterogeneous distributed storage system;

3. The capacity and bandwidth tradeoff system for heterogeneous distributed storage according to claim 1, wherein said module M2.1 comprises: the node cluster source $p = (p_0, p_1, \ldots, p_i, \ldots, p_L)$; the node cluster source represents the number of help nodes in each cluster, i.e. $p_i$ nodes of the $i$-th cluster are used as help nodes; $p_0$ represents the number of scatter points used as help nodes;

module M2.2.3: and selecting all scatter points as the help nodes.

4. The capacity and bandwidth tradeoff system for heterogeneous distributed storage according to claim 1, wherein the module M3 comprises:

5. A capacity and bandwidth compromise method for heterogeneous distributed storage is characterized by comprising the following steps:

the client module is used as a user interface;

the step M2 includes:

6. The capacity and bandwidth trade-off method for the heterogeneous distributed storage according to claim 5, wherein the step M1 comprises: acquiring parameter information of the heterogeneous distributed storage system through the built heterogeneous distributed storage system;

7. The method of claim 5, wherein the step M2.1 comprises: the node cluster source $p = (p_0, p_1, \ldots, p_i, \ldots, p_L)$; the node cluster source represents the number of help nodes in each cluster, i.e. $p_i$ nodes of the $i$-th cluster are used as help nodes; $p_0$ represents the number of scatter points used as help nodes;

step M2.2.3: and selecting all scatter points as the help nodes.

8. The capacity and bandwidth trade-off method for heterogeneous distributed storage according to claim 5, wherein said step M3 comprises: