CN108512553B

CN108512553B - Truncated regeneration code construction method for reducing bandwidth consumption

Info

Publication number: CN108512553B
Application number: CN201810194923.3A
Authority: CN
Inventors: 何荣祥; 顾术实; 李月; 李娟�; 张钦宇; 王野
Original assignee: Shenzhen Graduate School Harbin Institute of Technology
Current assignee: Shenzhen Graduate School Harbin Institute of Technology
Priority date: 2018-03-09
Filing date: 2018-03-09
Publication date: 2022-09-27
Anticipated expiration: 2038-03-09
Also published as: CN108512553A

Abstract

The invention relates to a method for constructing a truncated regeneration code that reduces bandwidth consumption. Starting from an (n, k) mother code, t information bits are deleted to obtain an (n-t, k-t) truncated subcode; redundancy is added, and the values of the redundancy quantities are solved so that the data stored on t nodes after encoding are all 0. During decoding and repair, the coefficient vectors of the t truncated nodes are appended to the matrix formed by the coefficient vectors of the connected nodes to form a new k × d coefficient matrix, t rows of zero vectors are appended to the coding matrix formed by the data downloaded from the nodes to form a new coding matrix, and the completed coefficient matrix and the received data matrix are decoded or repaired in the same way as the MSR mother code. The invention reduces the computational complexity of the truncated regeneration code, addresses the limited parameter choice and poor adaptability of regeneration codes when network nodes and bandwidth resources are limited, and yields a regeneration code construction with low complexity and low bandwidth overhead.

Description

Truncated regeneration code construction method for reducing bandwidth consumption

Technical Field

The invention relates to the technical field of distributed storage, in particular to a method for constructing a truncated regeneration code capable of reducing bandwidth consumption.

Background

With the rapid development of the Internet, the arrival of the big-data era, the rapid rollout of 5G mobile communication and the explosive growth of global data traffic, higher demands are placed on the data storage capacity of systems. Compared with traditional centralized storage systems, distributed storage systems offer low cost, large storage capacity, strong scalability and high parallel processing speed, and have gradually attracted wide attention from academia and industry. To ensure overall reliability and stability, a distributed storage system generally adopts a redundancy strategy: systems such as GFS (Google File System) use replication, and systems such as OceanStore use erasure coding. However, these schemes cannot avoid wasted storage resources and excessive bandwidth overhead, which seriously affects the working efficiency of the system.

Network coding gives intermediate nodes computing capability so that they can perform coding operations on data, which improves the bandwidth utilization and throughput of the whole system. Dimakis et al. proposed regeneration codes (Regenerating Codes) based on network coding, which achieve the optimal storage-bandwidth tradeoff. However, under complex network conditions, especially when bandwidth resources are limited, regeneration codes adapt poorly, and further reducing their bandwidth requirements is an urgent problem.

Rashmi et al. proposed constructions of MBR and MSR codes based on the Product-Matrix framework; although MBR codes with arbitrary parameters can be constructed, the MSR construction is still subject to some limitations. Goparaju et al. proposed a new MSR construction applicable to any n, k, d parameters based on the idea of interference alignment. Li et al. proposed a generic MSR framework based on invariant subspaces with two check structures, capable of achieving minimal I/O reads. Kamath et al. proposed a construction of locally regenerating codes, which obtains local repair by transforming conventional MSR and MBR codes. Papailiopoulos et al. proposed simple regenerating codes, which achieve local repair by combining MDS codes with XOR operations. Shah et al. studied the security of regeneration codes and realized regeneration codes with secrecy in a scenario where an eavesdropper can access the data of storage nodes and can download repair data.

The above analysis shows that the computational cost of regeneration codes is still high, which increases the time cost of data download and repair and therefore reduces working efficiency in an actual system.

Disclosure of Invention

In view of the shortcomings of the prior art, the technical problem to be solved by the invention is to provide a method for constructing a truncated regeneration code with reduced bandwidth consumption.

The technical scheme is as follows:

A method for constructing a truncated regeneration code that reduces bandwidth consumption deletes t information bits from an (n, k) mother code to obtain an (n-t, k-t) truncated subcode, adds redundancy, and solves the values of the redundancy quantities so that the data stored on t nodes after encoding are all 0;

during decoding and repair, the coefficient vectors of the t truncated nodes are appended to the matrix formed by the coefficient vectors of the connected nodes to form a new k × d coefficient matrix, t rows of zero vectors are appended to the coding matrix formed by the data downloaded from the nodes to form a new coding matrix, and the completed coefficient matrix and the received data matrix are decoded or repaired in the decoding or repair manner of the MSR mother code.

As a further improvement of the invention, the method comprises the following steps:

(1) taking an MSR code with parameters (n, k, d)[α, γ, B] as the mother code, t nodes are truncated to obtain a truncated subcode with parameters (n_s, k_s, d_s)[α_s, γ_s, B_s]; the parameters of the MSR mother code and the truncated subcode are related as follows:

n_s = n-t, k_s = k-t, d_s = d-t, α_s = α, γ_s = γ-t, B_s = B-t·α

(2) under the product-matrix framework, the information bits of t nodes are deleted and redundancy is added to obtain

in which the first α data are known redundancy quantities. The d × α message matrix M, built from two α × α symmetric matrices S_1 and S_2, is:

M = [S_1; S_2] (S_1 stacked above S_2)

(3) the coding matrix C = Ψ·M is constructed, all elements of the first t rows of C are set equal to 0, and the values of the redundancy quantities are solved, where the coefficient matrix Ψ is a Vandermonde matrix.

As a further improvement of the invention, in decoding, any k_s nodes are connected, and the corresponding node coefficient vectors form a k_s × d coefficient matrix denoted Ψ_{s_DC}. The coefficient vectors of the t truncated nodes are appended to Ψ_{s_DC} to form a new k × d coefficient matrix Ψ_DC. The data downloaded from the k_s nodes form the k_s × α coding matrix Ψ_{s_DC}M, to which t rows of zero vectors are appended to form a new k × α coding matrix Ψ_DC M. The completed coefficient matrix and the completed coding matrix can then be decoded using the decoding method of the product-matrix MSR code.

The specific steps of decoding include:

(1) the data collector connects any k_s nodes, and the corresponding node coefficient vectors form the k_s × d coefficient matrix Ψ_{s_DC}; the coefficient vectors of the t truncated nodes are appended to Ψ_{s_DC} to form a new k × d coefficient matrix Ψ_DC, obtaining: Ψ_DC = [Φ_DC  Λ_DC Φ_DC];

the data downloaded from the k_s nodes form the k_s × α coding matrix Ψ_{s_DC}M, to which t rows of zero vectors are appended to form a new k × α coding matrix Ψ_DC M, obtaining:

(2) the collected data Ψ_DC M is right-multiplied by Φ_DC^T, expressed as:

Ψ_DC M Φ_DC^T = Φ_DC S_1 Φ_DC^T + Λ_DC Φ_DC S_2 Φ_DC^T = P + Λ_DC Q

P and Q are intermediate variables and are both symmetric matrices, where P = Φ_DC S_1 Φ_DC^T and Q = Φ_DC S_2 Φ_DC^T;

(3) consider the (i, j) matrix elements. When i ≠ j, the (i, j) element of P + Λ_DC Q is P_ij + λ_i Q_ij and the (j, i) element is P_ji + λ_j Q_ji, where λ_i is the i-th diagonal entry of Λ_DC. Because P and Q are symmetric, P_ij = P_ji and Q_ij = Q_ji, so these two elements give two equations from which P_ij and Q_ij are solved for every i ≠ j;

when i = j, only the i-th (diagonal) element of row i of P is unknown and the remaining elements of that row are known, so S_1 can be solved; S_2 is solved from Q in the same way, which completes the decoding. Here φ_i denotes the i-th column of Φ_DC.

As a further improvement of the invention, in repair, d_s helper nodes are connected, denoted as

The helper node coefficient vectors form a d_s × d coefficient matrix denoted Ψ_{s_repair}; the coefficient vectors of the t truncated nodes are appended to Ψ_{s_repair} to form a new d × d coefficient matrix Ψ_repair. β data symbols are downloaded from each helper node to form a d_s × β data matrix, to which t rows of zero vectors are appended to form a new d × β data matrix. The completed coefficient matrix and the received data matrix are then repaired using the repair method of the product-matrix MSR code.

The specific steps of repairing include:

(1) the coefficient vector of the failed node f is expressed as

ψ_f^T = [φ_f^T  λ_f φ_f^T]

and the lost data is calculated as

ψ_f^T M = φ_f^T S_1 + λ_f φ_f^T S_2

(2) d_s helper nodes are connected, denoted as

Their coefficient vectors form the d_s × d coefficient matrix Ψ_{s_repair}; the coefficient vectors of the t truncated nodes are appended to Ψ_{s_repair} to form a new d × d coefficient matrix Ψ_repair, obtaining:

(3) β data symbols are downloaded from each helper node to form a d_s × β data matrix, to which t rows of zero vectors are appended to form a new d × β data matrix. Each helper node right-multiplies its stored data by φ_f and transmits the result to the new node, so the new node receives the data Ψ_repair M φ_f from the d_s helper nodes and left-multiplies it by Ψ_repair^-1, expressed as:

Ψ_repair^-1 Ψ_repair M φ_f = M φ_f = [S_1 φ_f; S_2 φ_f]

Since S_1 and S_2 are symmetric, the lost data can be repaired by transposition:

φ_f^T S_1 + λ_f φ_f^T S_2

As a further improvement of the invention, the complexity overhead of the method is analyzed with the following definitions:

(1) the XOR counts of the basic operations in the finite field GF(2^w) with primitive polynomial g(z) are defined: addition is a bitwise XOR and requires w XOR operations; multiplication is a polynomial multiplication followed by reduction with the primitive polynomial g(z) and requires μw XORs, where μ ═ 1) + | g (z) | m ₀ ；

(2) the solution of the equation Ax = B is defined, where A is an n × n matrix, the unknown x is an n × 1 matrix and B is an n × 1 matrix: A^-1 must first be computed at some XOR cost, and the unknown matrix x is then obtained as the matrix product A^-1 B.

As a further improvement of the invention, the encoding process of the method comprises the following steps:

(1) the tα completed redundancy quantities are solved from tα equations, which is converted into the mathematical model Ax = B, where A is a tα × tα matrix, the unknown x is a tα × 1 matrix and B is a tα × 1 matrix, so that solving them requires

XOR operations;

(2) multiplying the coefficient matrix of size (n+t) × (2k+2t-2) by the message matrix of size (2k+2t-2) × α requires α(n+t)(2k+2t-2) multiplications and α(n+t)(2k+2t-3) additions; each multiplication requires μw XORs and each addition requires w XORs, for a total of α(n+t)w·[(2k+2t-2)μ + (2k+2t-3)] XORs. Averaging over each bit of original data gives the XOR count of encoding

As a further improvement of the invention, the decoding process of the method comprises the following steps:

(1) the completion operation of the coefficient matrix and the collected data does not involve an exclusive or operation;

(2) data of size (k+t) × α is right-multiplied by the α × (k+t) matrix Φ_DC^T, which requires α(k+t)² multiplications and (α-1)(k+t)² additions;

(3) by

an equation of the form Ax = B, where A is a 2 × 2 matrix, the elements of P and Q at positions i ≠ j can be solved, with an XOR count of 4(k+t)²μw;

(4) since it is known that

the original data S_1 can be decoded by solving α sets of equations, and S_2 is solved from Q in the same way.

As a further improvement of the invention, the repair process of the method comprises the following steps:

(1) in a helper node, the 1 × α matrix stored by the node is multiplied by the α × β coefficient matrix of the failed node to generate 1 × β new data;

(2) the new node multiplies the (2k+t+2) × β matrix received from the helper nodes by the inverse of the (2k+t+2) × (2k+t+2) coefficient matrix

which takes β(2k+t+2)[(2k+t+2)μw + (2k+t+1)w] XORs; the transposition and addition then require (2k+t+2)βμ XORs. Averaged over each bit of original data,

XORs are performed to complete the repair.

The invention has the beneficial effects that:

To reduce the bandwidth consumption of regeneration codes in a system and improve their applicability in bandwidth-limited networks, the invention, on the premise of constructing the regeneration code with a product matrix, truncates part of the information bits of the mother code, thereby reducing the number of storage nodes and the bandwidth overhead. It further introduces Binary Addition and Shift arithmetic (BASIC) to reduce the computational complexity of the truncated regeneration code. This addresses the limited parameter choice and poor adaptability of regeneration codes when network nodes and bandwidth resources are limited, and yields a regeneration code construction method with low complexity and low bandwidth overhead.

Drawings

Fig. 1a is a performance analysis of the unit bandwidth overhead of the MSR code and the truncated MSR code when t = 1 according to the present invention;

Fig. 1b is a performance analysis of the unit bandwidth overhead of the MSR code and the truncated MSR code when t = 2 according to the present invention;

Fig. 2a is a bandwidth overhead comparison of the RS, MBR, MSR, truncated MSR (t = 1) and truncated MSR (t = 2) schemes when n = 8 and k = 3 according to the present invention;

Fig. 2b is a bandwidth overhead comparison of the RS, MBR, MSR, truncated MSR (t = 1) and truncated MSR (t = 2) schemes when n = 9 and k = 3 according to the present invention;

Fig. 2c is a bandwidth overhead comparison of the RS, MBR, MSR, truncated MSR (t = 1) and truncated MSR (t = 2) schemes when n = 16 and k = 6 according to the present invention;

FIG. 3a is a graph comparing the encoding complexity of the present invention using MSRs, truncated MSRs, and BASIC truncated MSRs;

FIG. 3b is a comparison of decoding complexity using MSRs, truncated MSRs, and BASIC truncated MSRs according to the present invention;

FIG. 3c is a graph comparing the repair complexity of the present invention using MSRs, truncated MSRs, and BASIC truncated MSRs.

Detailed Description

The invention is further described with reference to the following description and embodiments in conjunction with the accompanying drawings.

Description of the principles of the invention:

Aiming at the problems that, when network nodes and bandwidth resources are limited, regeneration codes offer few parameter choices and adapt poorly, a method for constructing a truncated regeneration code that reduces bandwidth consumption is provided; its implementation principle and performance analysis are detailed below.

Consider a regeneration code with parameters (n, k, d)[α, β, B], where n is the total number of nodes, k is the number of nodes that must be connected for decoding, d is the number of nodes that must be connected for repair, α is the storage capacity of each node, β is the amount of data downloaded from one node during repair, and B is the number of original symbols that can be encoded at one time. The existing indexes of a regeneration code, the storage cost α and the bandwidth cost γ, describe per-node performance and do not reflect performance in an actual system. For example, in Table 1 the link bandwidth is limited to β = 1 and the total data amount is M = 6. Judged only by α and γ, the (4,2,2)[1,2] MSR code has the minimum overhead and would be considered the best. In an actual system, however, because of the bandwidth limit and the different number of stripes, the (4,2,2)[1,2] MSR code does not have the lowest actual storage occupation and bandwidth consumption, whereas the (6,3,4)[2,4] MSR code, with larger α and γ, performs better.

Table 1 Parameter comparison for three MSR code schemes (M = 6, β = 1)

To compare the performance of regeneration codes in an actual system fairly and intuitively under different parameters, two new indexes are defined: the unit storage overhead USC (unit storage cost), i.e. the hard disk space occupied by each unit data block during storage; and the unit bandwidth overhead URB (unit bandwidth), i.e. the transmission bandwidth consumed to repair each unit data block. The USC and URB of the MSR code and the MBR code are respectively expressed as

And

then: ,

The invention applies the idea of truncation to regeneration codes and uses the two indexes, unit storage overhead USC and unit bandwidth overhead URB, to judge their performance. Truncation deletes t information bits from the (n, k) mother code to obtain an (n-t, k-t) truncated subcode, thereby shortening the code. The core technique of the invention is that, when information bits are deleted, redundancy is added so that the data stored by t nodes after encoding are all 0; these nodes therefore do not need to be stored, and the cost in an actual system is lower.

In the present invention, since the shortened MBR code cannot improve the performance, the present invention mainly considers the shortened MSR code.

The steps during encoding are as follows:

(1) assume that an MSR mother code with parameters (n, k, d)[α, γ, B] is truncated by t nodes to obtain a truncated subcode with parameters (n_s, k_s, d_s)[α_s, γ_s, B_s]; the parameters of the mother code and the subcode are related as follows:

n_s = n-t

k_s = k-t

d_s = d-t

α_s = α

γ_s = γ-t

B_s = B-t·α
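The parameter relations above can be sketched as a small helper. This is only an illustration: the `CodeParams` class and the `truncate` function are hypothetical names introduced here, and the (6, 3, 4)[2, 4] mother code with B = 6 is taken from the Table 1 discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeParams:
    n: int      # total number of nodes
    k: int      # nodes contacted for decoding
    d: int      # helper nodes contacted for repair
    alpha: int  # symbols stored per node
    gamma: int  # total repair bandwidth
    B: int      # original symbols encoded at one time


def truncate(mother: CodeParams, t: int) -> CodeParams:
    """Apply the mother-code / subcode relations for t truncated nodes."""
    if not 0 <= t < mother.k:
        raise ValueError("t must satisfy 0 <= t < k")
    return CodeParams(
        n=mother.n - t,
        k=mother.k - t,
        d=mother.d - t,
        alpha=mother.alpha,            # per-node storage is unchanged
        gamma=mother.gamma - t,
        B=mother.B - t * mother.alpha,
    )


mother = CodeParams(n=6, k=3, d=4, alpha=2, gamma=4, B=6)
sub = truncate(mother, t=1)
print(sub)
```

With t = 1 the subcode keeps α = 2 but stores only B_s = 4 original symbols per stripe on 5 nodes.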

(2) Because the MSR mother code is constructed on the product matrix and its parameters are constrained by the product-matrix framework, the truncated MSR construction of the invention is preferably an improvement under the product-matrix framework. In the encoding process, after information bits are deleted and redundancy is added, the following is obtained:

where the first α data are known redundancy quantities, each a linear combination of the remaining elements. The coefficient matrix Ψ is a Vandermonde matrix of the form:

and the message matrix M is a d × α matrix formed from two α × α symmetric matrices S_1 and S_2, of the form:

M = [S_1; S_2] (S_1 stacked above S_2)

(3) The coding matrix is C = Ψ·M. Requiring all elements of the first t rows of C to equal 0 determines the linear relation between the α redundancy quantities and the original data, so that the values of the redundancy quantities can be solved.
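The encoding steps can be illustrated with a minimal sketch for a (6, 3, 4) mother code with α = 2 and t = 1. Several choices here are assumptions made for readability and are not taken from the patent: arithmetic is done in the prime field GF(257) instead of GF(2^w), the Vandermonde points are 1..6, the redundancy symbols are placed as the first two entries of S_1, and all function names are hypothetical.

```python
MOD = 257   # assumption: small prime field instead of GF(2^w)
ALPHA = 2   # mother code (n, k, d) = (6, 3, 4), so alpha = k - 1 = 2


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) % MOD for col in zip(*B)] for row in A]


def solve_mod(A, b):
    """Solve A x = b over GF(MOD) by Gauss-Jordan elimination (A square, invertible)."""
    n = len(A)
    aug = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] % MOD)
        aug[col], aug[piv] = aug[piv], aug[col]
        inv = pow(aug[col][col], -1, MOD)
        aug[col] = [v * inv % MOD for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                fac = aug[r][col]
                aug[r] = [(v - fac * w) % MOD for v, w in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def message_matrix(sym):
    """M = [S1; S2] with S1, S2 symmetric; sym = [a, b, c, e, f, g]."""
    a, b, c, e, f, g = sym
    return [[a, b], [b, c], [e, f], [f, g]]


def encode_truncated(data, Psi, t=1):
    """Choose the first t*alpha symbols so that the first t rows of C = Psi M are zero."""
    n_red = t * ALPHA

    def zero_rows(red):
        C = matmul(Psi[:t], message_matrix(red + data))
        return [v for row in C for v in row]

    base = zero_rows([0] * n_red)          # constant part of the affine map
    cols = []
    for i in range(n_red):                 # linear part, one redundancy symbol at a time
        unit = [0] * n_red
        unit[i] = 1
        cols.append([(u - b0) % MOD for u, b0 in zip(zero_rows(unit), base)])
    A = [list(r) for r in zip(*cols)]
    red = solve_mod(A, [(-b0) % MOD for b0 in base])
    M = message_matrix(red + data)
    return M, matmul(Psi, M)


xs = [1, 2, 3, 4, 5, 6]
Psi = [[pow(x, j, MOD) for j in range(2 * ALPHA)] for x in xs]   # Vandermonde rows
M, C = encode_truncated([5, 7, 11, 13], Psi)
print(M, C[0])
```

Because row 0 of C is all zero, node 0 never has to be stored, which is the truncation the text describes; the remaining five rows are the contents of the stored nodes.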

In decoding, any k_s nodes are connected, and the corresponding node coefficient vectors form a k_s × d coefficient matrix denoted Ψ_{s_DC}. The coefficient vectors of the t truncated nodes are appended to Ψ_{s_DC} to form a new k × d coefficient matrix Ψ_DC. The data downloaded from the k_s nodes form the k_s × α coding matrix Ψ_{s_DC}M, to which t rows of zero vectors are appended to form a new k × α coding matrix Ψ_DC M. The completed coefficient matrix and the completed coding matrix are then decoded using the decoding method of the product-matrix MSR code.

The specific decoding process comprises the following steps:

(1) the data collector connects any k_s nodes, and the corresponding node coefficient vectors form a k_s × d coefficient matrix denoted Ψ_{s_DC}. The coefficient vectors of the t truncated nodes are appended to Ψ_{s_DC} to form a new k × d coefficient matrix Ψ_DC, given by:

Ψ_DC = [Φ_DC  Λ_DC Φ_DC]

(2) the data downloaded from the k_s nodes form the k_s × α coding matrix Ψ_{s_DC}M; t rows of zero vectors are appended to it to form a new k × α coding matrix Ψ_DC M, formulated as:

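First, the collected data Ψ_DC M is right-multiplied by Φ_DC^T, which is formulated as:

Ψ_DC M Φ_DC^T = [Φ_DC  Λ_DC Φ_DC] [S_1; S_2] Φ_DC^T = Φ_DC S_1 Φ_DC^T + Λ_DC Φ_DC S_2 Φ_DC^T

Let P and Q be intermediate variables:

P = Φ_DC S_1 Φ_DC^T,  Q = Φ_DC S_2 Φ_DC^T

Since S_1 and S_2 are both symmetric matrices, P and Q are both symmetric matrices. The expression therefore simplifies, in terms of P and Q, to

Ψ_DC M Φ_DC^T = P + Λ_DC Q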

In addition, due to P, Q, Λ _DC Are all symmetric matrices, then

Also a symmetric matrix.

Second, consider the (i, j) matrix elements. When i ≠ j, the (i, j) element of P + Λ_DC Q is P_ij + λ_i Q_ij and the (j, i) element is P_ji + λ_j Q_ji, where λ_i is the i-th diagonal entry of Λ_DC. Because P and Q are symmetric, P_ij = P_ji and Q_ij = Q_ji, so these two elements give two equations from which P_ij and Q_ij are solved for every i ≠ j. When i = j, only the i-th (diagonal) element of row i of P is unknown and the remaining elements of that row are known, so S_1 can be solved. S_2 is solved from Q in the same way, which completes the decoding. Here φ_i denotes the i-th column of Φ_DC.
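The decoding procedure can be sketched as follows for a (6, 3, 4) mother code with α = 2 truncated by t = 1. This is an illustration under stated assumptions, not the patent's implementation: arithmetic is in the prime field GF(257) rather than GF(2^w), the Vandermonde points are 1..6, the message matrix M is a hand-picked example whose first coded row is zero, and the helper names are hypothetical.

```python
MOD = 257   # assumption: small prime field instead of GF(2^w)
ALPHA = 2


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) % MOD for col in zip(*B)] for row in A]


def solve_mod(A, b):
    """Solve A x = b over GF(MOD) by Gauss-Jordan elimination (A square, invertible)."""
    n = len(A)
    aug = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] % MOD)
        aug[col], aug[piv] = aug[piv], aug[col]
        inv = pow(aug[col][col], -1, MOD)
        aug[col] = [v * inv % MOD for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                fac = aug[r][col]
                aug[r] = [(v - fac * w) % MOD for v, w in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def recover_symmetric(G, Phi):
    """Recover symmetric S from the off-diagonal entries of G = Phi S Phi^T."""
    k = len(Phi)
    alpha = k - 1
    V = []                                   # V[i] = phi_i^T S
    for i in range(alpha):
        others = [j for j in range(k) if j != i]
        V.append(solve_mod([Phi[j] for j in others], [G[i][j] for j in others]))
    cols = [solve_mod(Phi[:alpha], [V[i][c] for i in range(alpha)]) for c in range(alpha)]
    return [list(r) for r in zip(*cols)]


xs = [1, 2, 3, 4, 5, 6]
Psi = [[pow(x, j, MOD) for j in range(2 * ALPHA)] for x in xs]
M = [[11, 228], [228, 5], [7, 11], [11, 13]]   # example M = [S1; S2]
C = matmul(Psi, M)                              # row 0 is zero: node 0 is truncated

connected, truncated = [2, 4], [0]              # k_s = 2 real nodes, t = 1 padded node
rows = connected + truncated
Psi_DC = [Psi[i] for i in rows]
data = [C[i] for i in connected] + [[0] * ALPHA for _ in truncated]   # zero padding
Phi = [r[:ALPHA] for r in Psi_DC]
lam = [r[ALPHA] for r in Psi_DC]                # lambda_i = x_i^alpha
R = matmul(data, [list(c) for c in zip(*Phi)])  # R = P + Lambda Q

k = len(rows)
Pm = [[0] * k for _ in range(k)]
Qm = [[0] * k for _ in range(k)]
for i in range(k):
    for j in range(i + 1, k):
        q = (R[i][j] - R[j][i]) * pow((lam[i] - lam[j]) % MOD, -1, MOD) % MOD
        p = (R[i][j] - lam[i] * q) % MOD
        Pm[i][j] = Pm[j][i] = p
        Qm[i][j] = Qm[j][i] = q

S1, S2 = recover_symmetric(Pm, Phi), recover_symmetric(Qm, Phi)
print(S1, S2)
```

Only two stored nodes are read; the third row needed by the mother-code decoder comes for free from the truncated node, whose data is known to be zero.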

In repair, d_s helper nodes are connected, denoted as

The helper node coefficient vectors form a d_s × d coefficient matrix denoted Ψ_{s_repair}; the coefficient vectors of the t truncated nodes are appended to Ψ_{s_repair} to form a new d × d coefficient matrix Ψ_repair. β data symbols are downloaded from each helper node to form a d_s × β data matrix, to which t rows of zero vectors are appended to form a new d × β data matrix. The completed coefficient matrix and the received data matrix are then repaired using the repair method of the product-matrix MSR code.

The specific repairing process comprises the following steps:

(1) the coefficient vector of the failed node f is expressed as

ψ_f^T = [φ_f^T  λ_f φ_f^T]

The lost data is calculated as:

ψ_f^T M = φ_f^T S_1 + λ_f φ_f^T S_2

(2) d_s helper nodes are connected, denoted as

These helper node coefficient vectors form a d_s × d coefficient matrix denoted Ψ_{s_repair}. The coefficient vectors of the t truncated nodes are appended to Ψ_{s_repair} to form a new d × d coefficient matrix Ψ_repair, giving the following formula:

(3) β data symbols are downloaded from each helper node to form a d_s × β data matrix, to which t rows of zero vectors are appended to form a new d × β data matrix. Each helper node right-multiplies its stored data by φ_f and transmits the result to the new node, so the new node receives the data Ψ_repair M φ_f from the d_s helper nodes.

First, the new node left-multiplies the received data by Ψ_repair^-1, which is formulated as:

Ψ_repair^-1 Ψ_repair M φ_f = M φ_f = [S_1 φ_f; S_2 φ_f]

Second, since S_1 and S_2 are symmetric, the lost data can be repaired by transposition:

φ_f^T S_1 + λ_f φ_f^T S_2

This completes the repair.
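The repair procedure can be sketched in the same toy setting: a (6, 3, 4) mother code with α = 2, β = 1 and t = 1. The assumptions are that arithmetic is in the prime field GF(257) rather than GF(2^w), that the Vandermonde points are 1..6, and that M is a hand-picked example message matrix whose first coded row is zero; the function names are hypothetical.

```python
MOD = 257   # assumption: small prime field instead of GF(2^w)
ALPHA = 2


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) % MOD for col in zip(*B)] for row in A]


def solve_mod(A, b):
    """Solve A x = b over GF(MOD) by Gauss-Jordan elimination (A square, invertible)."""
    n = len(A)
    aug = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] % MOD)
        aug[col], aug[piv] = aug[piv], aug[col]
        inv = pow(aug[col][col], -1, MOD)
        aug[col] = [v * inv % MOD for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                fac = aug[r][col]
                aug[r] = [(v - fac * w) % MOD for v, w in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


xs = [1, 2, 3, 4, 5, 6]
Psi = [[pow(x, j, MOD) for j in range(2 * ALPHA)] for x in xs]
M = [[11, 228], [228, 5], [7, 11], [11, 13]]   # example M = [S1; S2]
C = matmul(Psi, M)                              # row 0 is zero: node 0 is truncated

failed, helpers, truncated = 3, [1, 2, 5], [0]  # d_s = 3 real helpers, t = 1 padded node
phi_f = Psi[failed][:ALPHA]
lam_f = Psi[failed][ALPHA]                      # lambda_f = x_f^alpha

# Each helper h sends one symbol: (its stored row) times phi_f.
received = [sum(c * p for c, p in zip(C[h], phi_f)) % MOD for h in helpers]
received += [0] * len(truncated)                # zero padding for the truncated node
Psi_repair = [Psi[h] for h in helpers + truncated]

# Solving Psi_repair y = received gives y = M phi_f = [S1 phi_f; S2 phi_f].
y = solve_mod(Psi_repair, received)

# By symmetry of S1 and S2, the lost row is (S1 phi_f)^T + lambda_f (S2 phi_f)^T.
repaired = [(y[c] + lam_f * y[ALPHA + c]) % MOD for c in range(ALPHA)]
print(repaired, C[failed])
```

Three symbols are downloaded instead of the four the mother code would need, which corresponds to γ_s = γ - t.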

In the present invention, the truncated MSR code is constructed on the product matrix, so the constraint d = 2k-2 applies to the mother code. A truncated MSR code with parameters (n, k, 2k+t-2) is obtained by truncating t bits from the (n+t, k+t, 2k+2t-2) MSR mother code, and the USC and URB of the truncated MSR code are then respectively expressed as:

Compared with the MSR code with the same n and k, the truncated MSR code reduces the unit bandwidth overhead while keeping the same unit storage overhead, as shown by the simulation results in Figs. 1a and 1b.

Figs. 1a and 1b show that, for the same number of truncated bits, the gap in unit bandwidth overhead between the truncated MSR code and the MSR code narrows as k increases, because the truncated bits account for a shrinking share of the total number of nodes. The larger the number of truncated bits t, the larger the share of truncated nodes and the more pronounced the reduction in unit bandwidth overhead. The reduction in unit bandwidth overhead achieved by the truncated MSR code is

When k = 5, truncating one bit reduces the unit bandwidth overhead by 10%; once k exceeds 10, the reduction from truncating one bit is less than 5%. When k ≤ 9, the reduction from truncating two bits reaches 10%; once k exceeds 19, the reduction from truncating two bits is less than 5%.
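The percentages quoted above can be checked with a short calculation. The formulas used here are an assumed reading of the definitions, since the original expressions are not reproduced in this text: USC is taken as nα/B (total stored symbols per original symbol) and URB as γ/B (repair bandwidth per original symbol), with the product-matrix MSR parameters d = 2k-2, α = k-1, β = 1. Under this reading the relative URB reduction works out to t / (2(k+t-1)), which matches the quoted figures.

```python
from fractions import Fraction


def msr(n, k):
    """Product-matrix MSR parameters: d = 2k - 2, alpha = k - 1, beta = 1."""
    alpha = k - 1
    return {"n": n, "alpha": alpha, "gamma": 2 * k - 2, "B": k * alpha}


def truncated_msr(n, k, t):
    """(n, k, 2k+t-2) truncated code from the (n+t, k+t, 2k+2t-2) mother code."""
    m = msr(n + t, k + t)
    return {"n": n, "alpha": m["alpha"], "gamma": m["gamma"] - t,
            "B": m["B"] - t * m["alpha"]}


def usc(p):
    # Assumed reading: stored symbols per original symbol.
    return Fraction(p["n"] * p["alpha"], p["B"])


def urb(p):
    # Assumed reading: repair bandwidth per original symbol.
    return Fraction(p["gamma"], p["B"])


def urb_reduction(k, t):
    n = 2 * k + t   # any valid n works; URB does not depend on n
    return 1 - urb(truncated_msr(n, k, t)) / urb(msr(n, k))


for k, t in [(5, 1), (11, 1), (9, 2), (20, 2)]:
    print(f"k={k}, t={t}: URB reduced by {float(urb_reduction(k, t)):.1%}")
```

The same function gives 1/6 (about 16.7%) for k = 3, t = 1, which agrees with the one-bit figure reported for the k = 3 experiments.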

In practical experiments, original files of 4 KB, 8 KB, 12 KB, 16 KB and 20 KB were selected, and the bandwidth overheads of the RS code, MBR, MSR and truncated MSR schemes in an actual system were compared under the parameters n = 8, k = 3 (Fig. 2a), n = 9, k = 3 (Fig. 2b) and n = 16, k = 6 (Fig. 2c); the results are shown in Figs. 2a, 2b and 2c.

Consider Fig. 2a first: the truncated MSR code reduces bandwidth overhead compared with the MSR code, and the bandwidth overhead decreases as the number of truncated bits increases. Next, compare the bandwidth overhead at different redundancies with the same k, as in Figs. 2a and 2b: although the larger n increases redundancy, the bandwidth overhead of every scheme is unchanged, and so is the reduction obtained by truncation, which confirms that the performance of the truncated MSR code is independent of the redundancy. With the same redundancy and different k, as in Figs. 2a and 2c, the RS code still has the largest bandwidth overhead, MBR remains the smallest, and MSR and the two truncated MSR codes lie in between. When k = 3, truncating one bit reduces bandwidth overhead by 16.7% relative to MSR and truncating two bits reduces it by 24.6%; when k = 6, truncating one bit reduces it by only 4.4% and truncating two bits by 12.1%. Truncation therefore still reduces bandwidth overhead, but the reduction becomes less pronounced as k increases, and the reduction in bandwidth overhead obtained by truncation in the actual system is basically consistent with the trend described above.

The complexity overhead of the truncated regeneration code constructed by the method of the invention is analyzed as follows:

(1) First, the XOR counts of the basic operations in the finite field GF(2^w) with primitive polynomial g(z) are defined. Addition is a bitwise XOR and requires w XOR operations; multiplication is a polynomial multiplication followed by reduction with the primitive polynomial g(z) and requires μw XORs, where μ ═ 1) + | g (z) | ₀ ；

(2) Second, the XOR count of solving an equation is defined and used as a mathematical model in the subsequent complexity analysis of the truncated MSR code. Consider the equation Ax = B, where A is an n × n matrix, the unknown x is an n × 1 matrix and B is an n × 1 matrix. A^-1 must first be computed at some XOR cost, and the unknown matrix x is then obtained as the matrix product A^-1 B, whose XOR count is n²μw + n(n-1)w. Because the XOR count of the matrix inversion is far greater than that of the matrix product, for simplicity the total XOR count of solving the equation is set to

Assume the original data comprise Bm bits in total, divided into B original code blocks of m bits each; if the finite field is GF(2^w), then w = m.

The encoding process is divided into two steps, specifically:

(1) First, the tα completed redundancy quantities are solved from tα equations, which is converted into the mathematical model Ax = B, where A is a tα × tα matrix, the unknown x is a tα × 1 matrix and B is a tα × 1 matrix; solving it requires

XOR operations.

(2) Second, multiplying the coefficient matrix of size (n+t) × (2k+2t-2) by the message matrix of size (2k+2t-2) × α requires α(n+t)(2k+2t-2) multiplications and α(n+t)(2k+2t-3) additions. Each multiplication requires μw XORs and each addition requires w XORs, so this step requires α(n+t)w·[(2k+2t-2)μ + (2k+2t-3)] XORs. Averaged over each bit of original data, the XOR count of encoding is
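The multiplication counts used in this analysis all follow one pattern: an r × m matrix times an m × c matrix needs r·c·m field multiplications and r·c·(m-1) field additions. A minimal sketch of that counting model is given below; the function names are hypothetical, and μ is passed in as a parameter because its closed form is not legible in this text (the value μ = 3 in the example is purely illustrative).

```python
def matmul_xor_count(rows, inner, cols, mu, w):
    """XORs for a (rows x inner) by (inner x cols) product over GF(2^w).

    Each field multiplication is charged mu*w XORs and each addition w XORs.
    """
    mults = rows * cols * inner
    adds = rows * cols * (inner - 1)
    return mults * mu * w + adds * w


def encode_matmul_xor_count(n, k, t, alpha, mu, w):
    """Step (2) of encoding: (n+t) x (2k+2t-2) coefficient matrix times
    (2k+2t-2) x alpha message matrix."""
    return matmul_xor_count(n + t, 2 * k + 2 * t - 2, alpha, mu, w)


def solve_product_xor_count(n, mu, w):
    """The A^-1 B product for an n x n system: n^2 * mu * w + n(n-1) * w."""
    return matmul_xor_count(n, n, 1, mu, w)


# Illustrative numbers only: n=5, k=2, t=1, alpha=2, mu=3, w=8.
print(encode_matmul_xor_count(5, 2, 1, alpha=2, mu=3, w=8))
```

The same helper reproduces the decoding count (k+t)²w·[αμ + α-1] with rows = cols = k+t and inner = α, and the per-helper repair count β[αμw + (α-1)w] with rows = 1, inner = α, cols = β.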

The decoding process is divided into four steps:

(1) First, completing the coefficient matrix and the collected data involves no XOR operation. Then, data of size (k+t) × α is right-multiplied by the α × (k+t) matrix Φ_DC^T, which requires α(k+t)² multiplications and (α-1)(k+t)² additions, totalling (k+t)²w·[αμ+α-1] XORs;

(2) secondly, by

(3) Finally, since it is known that

the original data S_1 can be decoded by solving α sets of equations. Each equation has the form AxB = C, where A is a 1 × α matrix, x is an α × α matrix and C is a 1 × α matrix, and can be rewritten as B^T·(x^T A^T) = C^T. Solving this equation requires

XORs and yields x^T A^T = D; with α sets of equations in total, the XOR count is

The equations of the form x^T A^T = D are then regrouped as E S_1 = F, where E is an α × α matrix, S_1 is an α × 1 matrix and F is an α × 1 matrix; solving for S_1 requires

XORs. S_2 is solved from Q in the same way. Averaged over each bit of original data, decoding requires

XORs.

The repair process is divided into two steps:

(1) First, in each helper node, the 1 × α matrix stored locally is multiplied by the α × β coefficient matrix of the failed node to generate 1 × β new data. This takes βα multiplications and β(α-1) additions, i.e. β[αμw + (α-1)w] XORs. Since there are 2k+t+2 helper nodes in total, the total XOR count is β(2k+t+2)[αμw + (α-1)w].

(2) Second, the new node multiplies the (2k+t+2) × β matrix received from the helper nodes by the inverse of the (2k+t+2) × (2k+t+2) coefficient matrix

which takes β(2k+t+2)[(2k+t+2)μw + (2k+t+1)w] XORs. The transposition and addition then require (2k+t+2)βμ XORs. Averaged over each bit of original data, repair requires

XORs.

The complexity of the invention was analyzed:

The complexity of the truncated MSR code of the invention is compared with that of conventional erasure codes, MSR codes and partial-repair regeneration codes in Table 2. The truncated MSR code must compute, in the encoding stage, the redundancy placed before the original data, and must pad with zeros before decoding and repairing according to the rules of the MSR mother code. Its complexity is therefore slightly higher than that of the MSR code: this coding scheme reduces bandwidth overhead at the expense of computational overhead.

TABLE 2 complexity comparison

The complexity of every scheme involves the factor μ introduced by finite-field operations. Hou et al. proposed the BASIC operation as a replacement for conventional finite-field computation, which reduces computational complexity. BASIC operations can be applied to the truncated MSR code; the resulting code, called the BASIC_ssmsr code, reduces the computation overhead in each of the encoding, decoding, and repair stages, as shown in Figs. 3a, 3b, and 3c.

The foregoing is a more detailed description of the invention in connection with specific preferred embodiments and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.

Claims

1. A method for constructing a truncated regeneration code that reduces bandwidth consumption, characterized in that t information bits are deleted from a mother code with parameters (n, k) to obtain an (n-t, k-t) truncated subcode; redundancy is added such that the data stored in t nodes after encoding are all 0, and the value of the redundancy is solved;

during decoding and repair, the matrix formed by the coefficient vectors of the connected nodes is supplemented with the coefficient vectors of the t truncated nodes to form a new k × d coefficient matrix, the coding matrix formed by the data downloaded from the nodes is supplemented with t rows of zero vectors to form a new coding matrix, and the supplemented coefficient matrix and the received data matrix are decoded or repaired in the decoding or repair manner of the MSR mother code;

the method comprises the following steps:

(1) an MSR code with parameters (n, k, d)[α, γ, B] is taken as the mother code, where n is the total number of nodes, k is the number of nodes that must be connected for decoding, d is the number of nodes that must be connected for repair, α is the storage capacity of each node, β is the amount of data downloaded from one node during repair, B is the number of original symbols that can be encoded at one time, and γ is the total bandwidth required to repair a single file coding block of a failed node; t nodes are truncated to obtain a truncated subcode with parameters (n_s, k_s, d_s)[α_s, γ_s, B_s], and the following relationship holds between the parameters of the MSR mother code and those of the truncated subcode:

[formula not reproduced]

wherein the first α data are known redundancy, and a d × α matrix built from two α × α symmetric matrices forms the message matrix M:

[formula not reproduced]

(3) the coding matrix C = Ψ·M is constructed, all elements in the first t rows of C are set equal to 0, and the values of the redundancy are solved, where Ψ is the coefficient matrix, which is a Vandermonde matrix;

in decoding, any k_s nodes are connected, and the corresponding node coefficient vectors form a k_s × d coefficient matrix, denoted Ψ_{s_DC}; Ψ_{s_DC} is supplemented with the coefficient vectors of the t truncated nodes to form a new k × d coefficient matrix Ψ_DC; the data downloaded from the k_s nodes form the k_s × α coding matrix Ψ_{s_DC}M, which is supplemented with t rows of zero vectors to form a new k × α coding matrix Ψ_DC M; the supplemented coefficient matrix and the supplemented coding matrix are decoded in the decoding manner of the MSR code built by the product-matrix construction;

the steps in decoding include:

(1) the data collector connects any k_s nodes, and the corresponding node coefficient vectors form a k_s × d coefficient matrix Ψ_{s_DC}; Ψ_{s_DC} is supplemented with the coefficient vectors of the t truncated nodes to form a new k × d coefficient matrix Ψ_DC, expressed as: Ψ_DC = [Φ_DC  Λ_DC Φ_DC];

the data downloaded from the k_s nodes form the k_s × α coding matrix Ψ_{s_DC}M, and t rows of zero vectors are supplemented to this coding matrix to form a new k × α coding matrix Ψ_DC M, giving:

[formula not reproduced]

(2) the collected data Ψ_DC M is right-multiplied by

[formula not reproduced]

to obtain

[formula not reproduced]

where Ψ_DC is the k × d coefficient submatrix of the nodes connected for decoding, and S₁ and S₂ are the symmetric submatrices of the message matrix;

(3) the (i, j) matrix elements are considered: when i ≠ j, since the (i, j) element of a symmetric matrix is identical to its (j, i) element, P_{i,j} and Q_{i,j} for i ≠ j are solved from the pair P_ij + λ_i Q_ij and P_ji + λ_j Q_ji;

when i = j, all terms other than the unknown i-th element,

[formula not reproduced]

are known, so S₁ can be solved; S₂ is solved in the same way, completing the decoding, where each φ denotes a column vector of Φ_DC, and φ_i is the i-th column vector of the matrix Φ_DC.
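The zero-padding used before decoding can be illustrated with a few lines of code. The following is a minimal sketch, assuming matrices are plain lists of rows and that the rows of the truncated nodes are placed ahead of the rows of the connected nodes (the claim does not fix an ordering; any ordering works as long as coefficient rows and data rows are ordered the same way). The function name `pad_for_decoding` and the toy values are hypothetical.

```python
def pad_for_decoding(psi, truncated, connected, downloaded):
    """Build the padded coefficient matrix and padded data matrix.

    psi        : full n x d coefficient matrix of the mother code (list of rows)
    truncated  : indices of the t truncated nodes (their stored data are all 0)
    connected  : indices of the k_s nodes contacted by the data collector
    downloaded : k_s rows of length alpha, one per connected node
    """
    if len(connected) != len(downloaded):
        raise ValueError("one downloaded row is needed per connected node")
    alpha = len(downloaded[0])
    # Coefficient rows: the t truncated nodes first, then the k_s connected nodes.
    psi_dc = [list(psi[i]) for i in truncated] + [list(psi[i]) for i in connected]
    # Data rows: truncated nodes contribute zero vectors by construction.
    data_dc = [[0] * alpha for _ in truncated] + [list(row) for row in downloaded]
    return psi_dc, data_dc


# Toy example with arbitrary values: n = 6 nodes, d = 4, alpha = 2, t = 1.
psi = [[1, x, x * x, x ** 3] for x in range(1, 7)]  # Vandermonde rows
psi_dc, data_dc = pad_for_decoding(
    psi, truncated=[0], connected=[2, 4], downloaded=[[7, 3], [5, 9]]
)
print(len(psi_dc), len(data_dc))  # both have k = k_s + t rows
```

After padding, the pair (`psi_dc`, `data_dc`) has the same shape as the input expected by the mother code's decoder, which is why no separate decoder is needed for the truncated subcode.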

2. The method of claim 1, wherein, for repair, d_s helper nodes are connected, denoted as

[formula not reproduced]

the helper-node coefficient vectors form a d_s × d coefficient matrix, denoted Ψ_{s_repair}; Ψ_{s_repair} is supplemented with the coefficient vectors of the t truncated nodes to form a new d × d coefficient matrix Ψ_repair; β data are downloaded from each helper node to form a d_s × β data matrix, which is supplemented with t rows of zero vectors to form a new d × β data matrix; and the supplemented coefficient matrix and the received data matrix are repaired in the repair manner of the MSR code built by the product-matrix construction.

3. The method of claim 2, wherein the repairing step comprises:

(1) the coefficient vector of the failed node is denoted f, with f

[formula not reproduced]

and the lost data are computed as

[formula not reproduced]

(2) d_s helper nodes are connected, denoted as

[formula not reproduced]

forming a d_s × d coefficient matrix Ψ_{s_repair}; Ψ_{s_repair} is supplemented with the coefficient vectors of the t truncated nodes to form a new d × d coefficient matrix Ψ_repair, giving:

[formula not reproduced]

from which is obtained:

[formula not reproduced]

the lost data can then be repaired by transposition.

4. The method of claim 1, wherein the complexity overhead of the method is determined by the following steps:

(1) a finite field GF(2^w) with primitive polynomial g(z) is defined; addition is bitwise XOR and requires w XOR operations; multiplication is polynomial multiplication followed by reduction modulo the primitive polynomial g(z) and requires μw XOR operations, where μ ═ 1) + |g(z)|₀;

(2) the solving of the equation Ax = B is defined, where A is an n × n matrix, the unknown x is an n × 1 matrix, and B is an n × 1 matrix; computing A^-1 requires [formula not reproduced] XOR operations, and the unknown matrix x is obtained as the product of the two matrices A^-1 and B.
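The finite-field arithmetic described in step (1) can be made concrete as follows. This is a minimal sketch, assuming field elements are stored as integers whose bit i is the coefficient of z^i and that g(z) is passed with its z^w term included; the function names `gf_add` and `gf_mul` and the choice of 0x11D as an example polynomial are assumptions, and the sketch does not reproduce the μw XOR accounting of the claim.

```python
def gf_add(a: int, b: int) -> int:
    """Addition in GF(2^w): bitwise XOR of the coefficient vectors."""
    return a ^ b


def gf_mul(a: int, b: int, g: int, w: int) -> int:
    """Multiply a and b in GF(2^w), reducing modulo the polynomial g(z).

    Elements are ints whose bit i is the coefficient of z^i. g includes the
    z^w term, e.g. 0x11D = z^8 + z^4 + z^3 + z^2 + 1 for w = 8.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a   # add the current shifted copy of a
        b >>= 1
        a <<= 1           # multiply a by z
        if a >> w:        # degree reached w: subtract (XOR) g(z)
            a ^= g
    return result


if __name__ == "__main__":
    # z * z^7 = z^8, which reduces to z^4 + z^3 + z^2 + 1 under 0x11D.
    print(hex(gf_mul(0x02, 0x80, 0x11D, 8)))
```

Interleaving the reduction with the shift keeps every intermediate value below 2^w, so no separate reduction pass over a double-width product is needed.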

5. The method of claim 4, wherein the encoding process comprises the steps of:

(1) the tα supplemented redundancy values are solved from tα sets of equations, which are converted into the mathematical model Ax = B, where A is a tα × tα matrix, the unknown x is a tα × 1 matrix, and B is a tα × 1 matrix, so that solving for the redundancy requires

[formula not reproduced]

XOR operations, where α is the storage capacity of each node, t is the number of supplemented zero-vector rows, and μw is the number of XOR operations per polynomial multiplication;
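The idea of solving Ax = B for the redundancy so that a truncated node stores all zeros can be sketched in code. The sketch below rests on several loudly stated assumptions: it works over the prime field GF(257) instead of GF(2^w) to stay short, it handles only t = 1, and it places the redundancy in row and column 0 of the first symmetric block, which is a hypothetical layout because the message-matrix formula is not reproduced in the text. All function names are illustrative.

```python
P = 257  # small prime; GF(P) stands in for GF(2^w) in this sketch


def solve_mod(A, b, p=P):
    """Solve A x = b over GF(p) by Gauss-Jordan elimination (A must be invertible)."""
    n = len(A)
    aug = [[v % p for v in A[i]] + [b[i] % p] for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col])
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, p)  # modular inverse (Python 3.8+)
        aug[col] = [v * inv % p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(v - f * c) % p for v, c in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def message_matrix(u, data, alpha):
    """Stack two symmetric alpha x alpha blocks S1, S2 into a 2*alpha x alpha matrix.

    Hypothetical layout: redundancy u fills row/column 0 of S1; user data fill
    the remaining upper triangle of S1 and the upper triangle of S2.
    """
    it = iter(data)
    S1 = [[0] * alpha for _ in range(alpha)]
    S2 = [[0] * alpha for _ in range(alpha)]
    for j in range(alpha):
        S1[0][j] = S1[j][0] = u[j]
    for i in range(1, alpha):
        for j in range(i, alpha):
            S1[i][j] = S1[j][i] = next(it)
    for i in range(alpha):
        for j in range(i, alpha):
            S2[i][j] = S2[j][i] = next(it)
    return S1 + S2


def first_row(psi0, M, p=P):
    """Row of C = Psi * M that belongs to the truncated node."""
    return [sum(psi0[r] * M[r][c] for r in range(len(M))) % p
            for c in range(len(M[0]))]


def solve_redundancy(psi0, data, alpha, p=P):
    """Find u such that the truncated node's encoded row is all zeros."""
    base = first_row(psi0, message_matrix([0] * alpha, data, alpha), p)
    A = [[0] * alpha for _ in range(alpha)]
    for j in range(alpha):  # probe with unit vectors to extract column j of A
        e = [1 if i == j else 0 for i in range(alpha)]
        col = first_row(psi0, message_matrix(e, data, alpha), p)
        for i in range(alpha):
            A[i][j] = (col[i] - base[i]) % p
    return solve_mod(A, [(-v) % p for v in base], p)


alpha = 2
psi0 = [1, 2, 4, 8]    # Vandermonde row (x = 2) of the truncated node, d = 2 * alpha
data = [5, 7, 11, 13]  # alpha**2 user symbols, arbitrary values
u = solve_redundancy(psi0, data, alpha)
print(u, first_row(psi0, message_matrix(u, data, alpha)))
```

The encoded row is an affine function of the unknown redundancy, so probing it with unit vectors recovers A without writing the equations out by hand; for t = 1 this yields the α × α instance of the tα × tα system named in the claim.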

6. The method as claimed in claim 4, wherein the decoding process comprises the following steps:

(2) the data of size (k+t) × α is right-multiplied by the α × (k+t) matrix

[formula not reproduced]

which takes α(k+t)² multiplications and (α-1)(k+t)² additions;

(3) from

[formula not reproduced]

equations of the form Ax = B, where A is a 2 × 2 matrix, are solved for the elements at positions i ≠ j of the P matrix and the Q matrix; the XOR count for this step is 4(k+t)²μw;

(4) since

[formula not reproduced]

the original data S₁ can be decoded by solving α equations; S₂ is then solved from Q in the same way.
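The 2 × 2 system of step (3) can be illustrated for a single off-diagonal position. The sketch below assumes that the two known entries are P_ij + λ_i Q_ij and P_ji + λ_j Q_ji, that P and Q are symmetric, and that λ_i ≠ λ_j; it works over the prime field GF(257) rather than GF(2^w) for brevity. The function name `solve_offdiag` and the sample values are hypothetical.

```python
def solve_offdiag(a: int, b: int, lam_i: int, lam_j: int, p: int = 257):
    """Recover (P_ij, Q_ij) over GF(p) from two known entries.

    a = P_ij + lam_i * Q_ij   (entry (i, j))
    b = P_ij + lam_j * Q_ij   (entry (j, i), using P_ji = P_ij and Q_ji = Q_ij)
    """
    # Subtracting the equations eliminates P_ij.
    # pow(..., -1, p) needs Python 3.8+ and raises ValueError if lam_i == lam_j.
    q_ij = (a - b) * pow((lam_i - lam_j) % p, -1, p) % p
    p_ij = (a - lam_i * q_ij) % p
    return p_ij, q_ij


if __name__ == "__main__":
    # Arbitrary sample: P_ij = 10, Q_ij = 20, lam_i = 3, lam_j = 5.
    a = (10 + 3 * 20) % 257
    b = (10 + 5 * 20) % 257
    print(solve_offdiag(a, b, 3, 5))
```

Each off-diagonal pair is solved independently of the others, which is consistent with the per-position cost being counted once for every (i, j) position in the claim.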

7. The method of claim 4, wherein the repair process comprises the steps of:

[...] multiplication, where Ψ_repair denotes the d × d coefficient submatrix of the repair helper nodes, with an XOR count of β(2k+t+2)[(2k+t+2)μw + (2k+t+1)w]; a transposition and addition follow, requiring (2k+t+2)βμ XOR operations; averaged over each bit of original data,

[formula not reproduced]

XOR operations are performed to complete the repair.