WO2020029418A1 - Method for constructing repair binary code generator matrix and repair method

Info

Publication number: WO2020029418A1
Application number: PCT/CN2018/110067
Authority: WO
Inventors: 侯韩旭; 韩永祥; 李挥; 周清峰; 李勇; 周丰丰; 范立生
Original assignee: 东莞理工学院
Priority date: 2018-08-09
Filing date: 2018-10-12
Publication date: 2020-02-13
Also published as: CN109257050A

Abstract

Provided is a method for constructing a repair binary code generator matrix, applicable to the field of improving digital processing techniques. The method comprises: denoting the constructed code by $C_1(k, r, d, p)$, where $\eta = d-k+1$, $k \ge 3$, $r \ge 3$ is an odd number, $d = k + (r-1)/2$ and $\tau = (d-k+1)^{k-2}$, and constructing a matrix $P_{k \times r}$ whose calculation formula is expressed as formula (I). The product-matrix construction of regenerating codes still works over the quotient ring, the computational complexity is low, and the repair bandwidth is reduced while providing greater fault tolerance.

Description

Method for constructing a repair binary code generator matrix, and repair method

Technical field

The invention belongs to the field of improvements in digital processing technology, and in particular relates to a method for constructing a repair binary code generator matrix and to a repair method.

Background art

Modern distributed storage systems deploy erasure codes to maintain data availability in the presence of storage node failures. Binary maximum distance separable (MDS) array codes are a special class of erasure codes that achieve minimal storage redundancy with low computational complexity. Specifically, a binary array code consists of k + r columns, each of which has L bits. Among the k + r columns, k information columns store information bits and r parity columns store redundant bits. The L bits of each column are stored in the same storage node. We refer to a disk as a column or a storage node interchangeably, and to an entry of the array as a bit. When a node fails, the corresponding column of the array code is considered an erasure. If any k of the k + r columns suffice to reconstruct all k information columns (i.e., the code tolerates the failure of any r columns), the code is called an MDS code. Examples of binary MDS array codes include double-fault-tolerant codes (i.e., r = 2) such as X-code [2], RDP codes [3] and EVENODD codes [4], and triple-fault-tolerant codes (i.e., r = 3) such as STAR codes [5], generalized RDP codes [6], and TIP codes [7].

When a node fails in a distributed storage system, the failed node should be repaired by downloading bits from d surviving nodes, where k ≤ d ≤ k + r − 1. Minimizing the repair bandwidth, defined as the number of bits downloaded during the repair process, is crucial for speeding up the repair operation and minimizing the window of vulnerability, especially in distributed storage, where network transmission is the bottleneck. The repair problem was formulated by Dimakis et al. [8] based on the concept of information flow graphs. The minimum repair bandwidth at minimum storage redundancy, also known as the minimum storage regenerating (MSR) point, is given in [8] by the following formula:

$$\frac{dL}{d-k+1}. \tag{1}$$

Although the minimum repair bandwidth is achievable over a sufficiently large finite field, how to construct a binary MDS array code to achieve the minimum repair bandwidth is still a challenge.
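As a quick numerical illustration of the MSR point, the following minimal Python sketch evaluates the standard expression from [8], dL/(d − k + 1), for one failed column of L bits repaired from d helper columns, and compares it with reading k whole columns. The function names are illustrative and are not part of the source.

```python
from fractions import Fraction


def msr_repair_bandwidth(k: int, d: int, column_bits: int) -> Fraction:
    """Minimum repair bandwidth at the MSR point, d*L/(d-k+1), in bits."""
    if d < k:
        raise ValueError("repair needs at least k helper columns")
    return Fraction(d * column_bits, d - k + 1)


def conventional_repair_bandwidth(k: int, column_bits: int) -> int:
    """Bits downloaded when k whole surviving columns are read."""
    return k * column_bits


# Parameters of the worked example used later in this document:
# k = 4 information columns, d = 5 helpers, L = 8 bits per column.
print(msr_repair_bandwidth(4, 5, 8))        # 20
print(conventional_repair_bandwidth(4, 8))  # 32
```

With d = k the expression reduces to kL, the cost of the conventional repair.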

A conventional repair method is to download all the bits from any k surviving columns to regenerate the bits of the failed column. The total number of bits downloaded to repair the failed column is therefore k times the number of failed bits. Several studies have reduced the repair bandwidth of a single failed column in binary MDS array codes. Some methods minimize disk reads for RDP codes [10] and X-code [11] with d = k + 1, but their repair bandwidth is sub-optimal, being 50% larger than the minimum value in (1) when d = k + 1. MDR codes [12], [13] and ButterFly codes [14], [15] are binary MDS array codes that achieve optimal repair; however, they only provide double fault tolerance (i.e., r = 2). How to construct binary MDS array codes with optimal repair and higher fault tolerance (i.e., r > 2) is still an open question. Such a construction would help maintain data availability in failure-prone distributed storage systems.

Based on the patent [Binary Array Code Coding Framework], this document proposes a new method for designing binary MDS array codes by selecting a suitable generator matrix. The resulting codes can tolerate r ≥ 3 disk failures. We show that, when d is large enough, the minimum repair bandwidth (1) for any single information column failure can be achieved asymptotically. By working over a quotient ring with a cyclic structure and choosing a carefully designed encoding matrix, our construction minimizes the repair bandwidth by making the bits accessed during the repair operation overlap as much as possible.

The repair bandwidth of most existing binary MDS array codes [2], [3], [5], [6] is sub-optimal. Some binary MDS array codes constructed in [12]-[15] have optimal repair bandwidth but only provide double fault tolerance (i.e., r = 2). As far as we know, the proposed code is the first binary MDS array code that has asymptotically optimal repair bandwidth and a fault tolerance greater than 2. The key differences between the proposed code and existing binary MDS array codes are as follows. First, unlike existing constructions such as [2], [3], [5], [6], the redundant bits in the check columns (except the first check column) are generated from bits that correspond to specific polygonal lines in the array. Second, in the proposed code, the number of rows of the array is an exponential function of k. These two properties are essential for reducing the repair bandwidth. The difference between the proposed construction and the double-fault-tolerant optimal-repair constructions of [12]-[15] is that the proposed construction uses a quotient ring with a cyclic structure, whereas [12]-[15] do not. By using the quotient ring, we can choose a well-designed encoding matrix (or check matrix) and achieve asymptotically optimal repair bandwidth with greater fault tolerance.

Previous studies [16], [17] also used similar techniques to reduce the computational complexity of regenerating codes. In this work, we prove that when τ (a parameter introduced later) is large enough and meets certain conditions, we can find binary MDS array code constructions that achieve asymptotically optimal repair. The rings of [16] and [17] can be regarded as special cases of the proposed ring with τ = 1. In addition, the main results of [16], [17] differ from those of this document. The results of [16], [17] show that the fundamental tradeoff curve between storage and repair bandwidth of functional-repair regenerating codes can also be achieved over the quotient ring, and that the existing product-matrix construction of regenerating codes still works over the quotient ring with low computational complexity. In this document, we use a more general ring to construct a new binary MDS array code, and choose a well-designed generator matrix to obtain asymptotically optimal repair bandwidth. Although both the proposed binary MDS array codes and the high-rate MSR code constructions of [9], [18]-[24] are based on designing generator matrices or check matrices, the proposed codes are constructed over the binary field, and their encoding matrices or check matrices are chosen in a ring with a cyclic structure.

Summary of the invention

An object of the present invention is to provide a method for constructing a repair binary code generation matrix and a repair method, which aim to solve the above technical problems.

The present invention is implemented as follows: a method for constructing a repair binary code generator matrix, comprising: denoting the constructed code by $C_1(k, r, d, p)$, where $\eta = d-k+1$, $k \ge 3$, $r \ge 3$ is an odd number, $d = k + (r-1)/2$ and $\tau = (d-k+1)^{k-2}$, and constructing the matrix $P_{k \times r}$, whose calculation formula is formula (12) of the detailed description.

A further technical solution of the present invention is: for $j = k+1, k+2, \ldots, k+r$, each coding polynomial $s_j(x)$ is in the ring $C_{p\tau}$; let $(i:j) = \{i, i+1, \ldots, j\}$, and let $P_{k \times r}(i:j)$ denote the sub-matrix of $P_{k \times r}$ formed by the columns with indices in $(i:j)$; in $P_{k \times r}$, the sub-matrix $P_{k \times r}(\eta+1 : 2\eta-1)$ is obtained by rotating the sub-matrix $P_{k \times r}(2:\eta)$ by 180 degrees; the last row of $P_{k \times r}(2:\eta)$ is the all-ones vector, and the exponent of the entry in row $i$ and column $j$ of $P_{k \times r}(2:\eta)$ is $\eta^{i-1}$ times the exponent of the entry in row 1 and column $j$, where $i = 2, 3, \ldots, k-1$ and $j = 1, 2, \ldots, d-k$.

A further technical solution of the present invention is that, in the method for constructing a repair binary code generator matrix, the extra bits calculated from the information bits do not need to be stored and are used only to calculate the redundant bits.

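Another object of the present invention is to provide a repair method for a repair binary code generator matrix, comprising:

for $0 \le l \le p\tau-1$ and $j = 1, 2, \ldots, r$, the $l$-th check set of the $j$-th check column is defined as follows. For the first check column,

$$P_{l,1} = \{s_{l,1}, s_{l,2}, \ldots, s_{l,k}\},$$

$$P_{l,j} = \{s_{l-(j-1),1},\ s_{l-(j-1)\eta,2},\ \ldots,\ s_{l-(j-1)\eta^{k-2},k-1},\ s_{l,k}\},$$

where $2 \le j \le d-k+1$, and

$$P_{l,j} = \{s_{l,1},\ s_{l-(r+1-j)\eta^{k-2},2},\ \ldots,\ s_{l-(r+1-j)\eta,k-1},\ s_{l-(r+1-j),k}\},$$

where $d-k+2 \le j \le r$;

assuming that the $f$-th information column has failed: if $1 \le f \le \lceil k/2 \rceil$, the bits $s_{l,f}$ with $l \bmod \eta^f \in \{0, 1, 2, \ldots, \eta^{f-1}-1\}$ are repaired with the first check column, and for $t = 1, 2, \ldots, d-k$ the bits $s_{l,f}$ with $l \bmod \eta^f \in \{t\eta^{f-1}, t\eta^{f-1}+1, \ldots, (t+1)\eta^{f-1}-1\}$ are repaired with the $(d-k-t+2)$-th check column; if $f > \lceil k/2 \rceil$, the bits $s_{l,f}$ with $l \bmod \eta^{k+1-f} \in \{0, 1, 2, \ldots, \eta^{k-f}-1\}$ are repaired with the first check column, and for $t = 1, 2, \ldots, d-k$ the bits $s_{l,f}$ with $l \bmod \eta^{k+1-f} \in \{t\eta^{k-f}, t\eta^{k-f}+1, \ldots, (t+1)\eta^{k-f}-1\}$ are repaired with the $(d-k+t+1)$-th check column.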

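A further technical solution of the present invention is: the bits $s_{l,f}$ with $l \bmod \eta^f \in \{0, 1, 2, \ldots, \eta^{f-1}-1\}$ and $l < (p-1)\tau$ can be repaired by the check sets $P_{l,1}$ of the first check column. This requires downloading $(p-1)\eta^{k-3}$ bits $s_{l,i}$ from each of the remaining $k-1$ information columns, where $i \in \{1, 2, \ldots, f-1, f+1, \ldots, k\}$ and $l \bmod \eta^f \in \{0, 1, 2, \ldots, \eta^{f-1}-1\}$, and $(p-1)\eta^{k-3}$ redundant bits $s_{l,k+1}$ with $l \bmod \eta^f \in \{0, 1, 2, \ldots, \eta^{f-1}-1\}$ from the first check column, so that a total of $k(p-1)\eta^{k-3}$ bits need to be downloaded.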

A further technical solution of the present invention is that, in the repair method, the check sets of the first check column are the same as those of the first parity column of the RDP and EVENODD codes.

A further technical solution of the present invention is that, in the repair method, the parity bits of the other check columns are computed not from bits lying on a straight line in the array but from bits lying on a polygonal line, and the number of rows of the proposed code is divisible by $\eta^{k-2}$.

The beneficial effect of the invention is that the product matrix structure of the regenerative code still works under the quotient ring, the calculation complexity is low, and the repair bandwidth is reduced with a greater fault tolerance.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram illustrating an example of storage encoding of three check columns according to an embodiment of the present invention.

Detailed description

Consider a binary MDS array code with k ≥ 2 information columns and r ≥ 3 check columns. Each column of the array code stores L = (p−1)τ bits, where p is a prime number such that 2 is a primitive element of the finite field $\mathbb{F}_p$, and the value of τ will be specified later. Consider a file of size k(p−1)τ bits, represented by the information bits $s_{0,i}, s_{1,i}, \ldots, s_{(p-1)\tau-1,i}$ for $i = 1, 2, \ldots, k$. These information bits are used to generate r(p−1)τ check bits $s_{0,j}, s_{1,j}, \ldots, s_{(p-1)\tau-1,j}$ for $j = k+1, k+2, \ldots, k+r$.

The bits $s_{0,i}, s_{1,i}, \ldots, s_{(p-1)\tau-1,i}$ ($i = 1, 2, \ldots, k$) are stored in the $i$-th information column, and the (p−1)τ bits $s_{0,j}, s_{1,j}, \ldots, s_{(p-1)\tau-1,j}$ ($j = k+1, k+2, \ldots, k+r$) are stored in the $(j-k)$-th check column.

For $i = 1, 2, \ldots, k$ and $\mu = 0, 1, \ldots, \tau-1$, we define the following shorthand:

$$s_{(p-1)\tau+\mu,i} := \sum_{l=0}^{p-2} s_{l\tau+\mu,i}. \tag{2}$$

We call $s_{(p-1)\tau+\mu,i}$ the extra bit of $s_{\mu,i}, s_{\tau+\mu,i}, \ldots, s_{(p-2)\tau+\mu,i}$. For example, when p = 3, k = 4, and τ = 4, the extra bit of $s_{\mu,i}$ and $s_{4+\mu,i}$ is $s_{8+\mu,i} = s_{\mu,i} + s_{4+\mu,i}$. For $j = k+1, k+2, \ldots, k+r$, the τ extra bits $s_{(p-1)\tau,j}, s_{(p-1)\tau+1,j}, \ldots, s_{p\tau-1,j}$ are appended to the $(j-k)$-th check column. It will become clear later that the extra bits $s_{(p-1)\tau+\mu,j}$ of the $(j-k)$-th check column also satisfy (2) for $j = k+1, k+2, \ldots, k+r$ and $\mu = 0, 1, \ldots, \tau-1$.
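The extra-bit rule can be sketched in a few lines of Python. This is only an illustration: the function name `append_extra_bits` and the list-of-bits representation are assumptions made here, with position l of the list holding bit s_{l,i} and addition over the binary field written as XOR.

```python
def append_extra_bits(column, p, tau):
    """Return the stored column followed by its tau extra bits.

    Extra bit mu is the XOR of the bits at positions mu, tau + mu, ...,
    (p - 2) * tau + mu of the stored column.
    """
    if len(column) != (p - 1) * tau:
        raise ValueError("a stored column must hold (p-1)*tau bits")
    extra = []
    for mu in range(tau):
        bit = 0
        for l in range(p - 1):
            bit ^= column[l * tau + mu]  # addition over F_2 is XOR
        extra.append(bit)
    return list(column) + extra


# p = 3, tau = 4: extra bit s_{8+mu} equals s_{mu} + s_{4+mu}.
stored = [1, 0, 1, 1, 0, 0, 1, 0]
print(append_extra_bits(stored, p=3, tau=4))
# [1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
```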

For $\ell = 1, 2, \ldots, k+r$, we use a polynomial $s_\ell(x)$ in the ring $\mathbb{F}_2[x]/(1+x^{p\tau})$ to represent the $(p-1)\tau$ bits of column $\ell$ together with its $\tau$ extra bits, i.e.,

$$s_\ell(x) = s_{0,\ell} + s_{1,\ell}x + \cdots + s_{p\tau-1,\ell}x^{p\tau-1}. \tag{3}$$

The polynomial $s_i(x)$ corresponding to the $i$-th information column ($i = 1, 2, \ldots, k$) is called an information polynomial, and the polynomial $s_j(x)$ corresponding to the $(j-k)$-th check column ($j = k+1, k+2, \ldots, k+r$) is called a coding polynomial. We write the k information polynomials and the r coding polynomials as a row vector:

$$[s_1(x), s_2(x), \ldots, s_{k+r}(x)]. \tag{4}$$

This vector is computed over the ring $\mathbb{F}_2[x]/(1+x^{p\tau})$ as follows:

$$[s_1(x), s_2(x), \ldots, s_{k+r}(x)] = [s_1(x), s_2(x), \ldots, s_k(x)] \cdot G_{k \times (k+r)}. \tag{5}$$

The $k \times (k+r)$ generator matrix G consists of the $k \times k$ identity matrix I and the $k \times r$ encoding matrix P:

$$G_{k \times (k+r)} = [\,I_{k \times k} \;\; P_{k \times r}\,]. \tag{6}$$

The proposed code can equivalently be described by a check matrix $H_{(k+r) \times r}$. Considering (4), we have

$$[s_1(x), s_2(x), \ldots, s_{k+r}(x)] \cdot H_{(k+r) \times r} = 0. \tag{7}$$

Let $R_{p\tau}$ denote the ring $\mathbb{F}_2[x]/(1+x^{p\tau})$. An element $a(x)$ of $R_{p\tau}$ can be written as $a(x) = a_{p\tau-1}x^{p\tau-1} + \cdots + a_1 x + a_0$, with coefficients in the finite field $\mathbb{F}_2$. Addition is the usual term-wise addition, and multiplication is performed modulo $x^{p\tau}+1$. In $R_{p\tau}$, multiplication by x can be interpreted as a cyclic shift. This is critical to reducing the repair bandwidth of a single column failure. Note that the extra bits need not be stored on disk; they are used only for notational convenience.
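The cyclic-shift interpretation can be made concrete with a short Python sketch. The coefficient-list representation (entry i is the coefficient of x^i) and the helper name `multiply_by_x_power` are assumptions chosen for illustration.

```python
def multiply_by_x_power(coeffs, shift):
    """Multiply a(x) by x^shift in F_2[x]/(x^n + 1), with n = len(coeffs).

    Because x^n = 1 in this ring, the product is a cyclic shift of the
    coefficient vector by `shift` positions.
    """
    n = len(coeffs)
    shift %= n
    return [coeffs[(i - shift) % n] for i in range(n)]


a = [1, 1, 0, 0, 0, 0]            # a(x) = 1 + x, with p * tau = 6
print(multiply_by_x_power(a, 1))  # [0, 1, 1, 0, 0, 0], i.e. x + x^2
print(multiply_by_x_power(a, 5))  # [1, 0, 0, 0, 0, 1], i.e. x^5 + x^6 = x^5 + 1
```

No bit operations other than re-indexing are needed, which is why encoding and repair reduce to XORs of shifted columns.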

Consider the sub-ring $C_{p\tau}$ of $R_{p\tau}$ consisting of the polynomials of $R_{p\tau}$ that have $x^\tau+1$ as a factor:

$$C_{p\tau} = \{\, a(x)(1+x^\tau) \bmod (1+x^{p\tau}) \;:\; a(x) \in R_{p\tau} \,\}. \tag{8}$$

In fact, $C_{p\tau}$ is an ideal, because the product of any element of $R_{p\tau}$ with an element of $C_{p\tau}$ is again a multiple of $1+x^\tau$.

We can verify that the product of $h(x) = x^{(p-1)\tau} + x^{(p-2)\tau} + \cdots + x^\tau + 1$ and any polynomial in $C_{p\tau}$ is 0; $h(x)$ is called the check polynomial of $C_{p\tau}$. The multiplicative identity of $C_{p\tau}$ is

$$e(x) = 1 + h(x) = x^{(p-1)\tau} + x^{(p-2)\tau} + \cdots + x^\tau = (1+x^\tau)(x^{(p-2)\tau} + \cdots + x^{3\tau} + x^\tau),$$

since for every $b(x) \in C_{p\tau}$

$$e(x)b(x) = (1+h(x))b(x) = b(x) \bmod (1+x^{p\tau}). \tag{9}$$
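The two properties just stated, h(x)c(x) = 0 and e(x)c(x) = c(x) for every c(x) in the ideal, can be checked exhaustively for small parameters. The sketch below is an illustration only; it assumes polynomials are stored as Python integers whose bit i is the coefficient of x^i, and the helper names are hypothetical.

```python
def mul_mod(a: int, b: int, n: int) -> int:
    """Product of a(x) and b(x) in F_2[x]/(x^n + 1); a, b < 2**n."""
    mask = (1 << n) - 1
    result = 0
    for i in range(n):
        if (b >> i) & 1:
            # x^i * a(x) is a cyclic left shift of a by i positions
            result ^= ((a << i) | (a >> (n - i))) & mask
    return result


def identity_holds(p: int, tau: int) -> bool:
    """Check h*c = 0 and e*c = c for every c in the ideal (small n only)."""
    n = p * tau
    g = 1 | (1 << tau)                         # 1 + x^tau
    h = sum(1 << (i * tau) for i in range(p))  # check polynomial h(x)
    e = h ^ 1                                  # e(x) = 1 + h(x)
    for a in range(1 << n):
        c = mul_mod(a, g, n)                   # an element of the ideal
        if mul_mod(c, h, n) != 0 or mul_mod(c, e, n) != c:
            return False
    return True


print(identity_holds(3, 2), identity_holds(5, 2))  # True True
```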

Theorem 1: The coefficients of the polynomial $s_i(x)$ satisfy (2) if and only if $s_i(x)$ is in $C_{p\tau}$.

Proof: Assume that the coefficients of the polynomial $s_i(x)$ satisfy (2). Substituting (2) into (3) and collecting terms gives

$$s_i(x) = \sum_{\mu=0}^{\tau-1} \sum_{l=0}^{p-2} s_{l\tau+\mu,i}\left(x^{l\tau+\mu} + x^{(p-1)\tau+\mu}\right).$$

It therefore suffices to prove that, for $i = 0, 1, \ldots, p-2$ and $j = 0, 1, \ldots, \tau-1$, the polynomial $x^{i\tau+j} + x^{(p-1)\tau+j}$ is a multiple of $x^\tau+1$. This holds because

$$x^{i\tau+j} + x^{(p-1)\tau+j} = x^{i\tau+j}\left(1 + x^{(p-i-1)\tau}\right) = x^{i\tau+j}(1+x^\tau)\left(1 + x^\tau + x^{2\tau} + \cdots + x^{(p-i-2)\tau}\right).$$

This proves that the polynomial $s_i(x)$ is in the ring $C_{p\tau}$.

Conversely, suppose that $s_i(x)$ is in the ring $C_{p\tau}$. According to (8), $s_i(x)$ can be written as

$$s_i(x) = a(x)(1+x^\tau) \bmod (1+x^{p\tau}) = (a_0 + a_{(p-1)\tau}) + (a_1 + a_{(p-1)\tau+1})x + \cdots + (a_{p\tau-1} + a_{(p-1)\tau-1})x^{p\tau-1}.$$

Therefore, for $\mu = 0, 1, \ldots, \tau-1$, we obtain

$$s_{\mu,i} = a_\mu + a_{(p-1)\tau+\mu},\quad s_{\tau+\mu,i} = a_{\tau+\mu} + a_\mu,\quad \ldots,\quad s_{(p-1)\tau+\mu,i} = a_{(p-1)\tau+\mu} + a_{(p-2)\tau+\mu},$$

and we can verify that

$$s_{\mu,i} + s_{\tau+\mu,i} + \cdots + s_{(p-2)\tau+\mu,i} = (a_\mu + a_{(p-1)\tau+\mu}) + (a_{\tau+\mu} + a_\mu) + \cdots + (a_{(p-2)\tau+\mu} + a_{(p-3)\tau+\mu}) = a_{(p-1)\tau+\mu} + a_{(p-2)\tau+\mu} = s_{(p-1)\tau+\mu,i}.$$

Therefore, the coefficients of the polynomial $s_i(x)$ satisfy (2).
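Theorem 1 can also be checked by brute force for small parameters: the set of polynomials whose coefficients satisfy (2) should coincide with the set of multiples of 1 + x^τ modulo x^{pτ} + 1. The sketch below is illustrative; the integer bit-vector representation and all function names are assumptions made for this example.

```python
def mul_mod(a: int, b: int, n: int) -> int:
    """Product of a(x) and b(x) in F_2[x]/(x^n + 1); bit i is the x^i term."""
    mask = (1 << n) - 1
    result = 0
    for i in range(n):
        if (b >> i) & 1:
            result ^= ((a << i) | (a >> (n - i))) & mask
    return result


def satisfies_eq2(s: int, p: int, tau: int) -> bool:
    """True if each extra bit equals the XOR of the p-1 bits it covers."""
    for mu in range(tau):
        parity = 0
        for l in range(p):  # the p-1 stored bits plus the extra bit itself
            parity ^= (s >> (l * tau + mu)) & 1
        if parity:
            return False
    return True


def ideal_elements(p: int, tau: int) -> set:
    """All a(x)(1 + x^tau) mod (x^(p*tau) + 1), following definition (8)."""
    n = p * tau
    g = 1 | (1 << tau)
    return {mul_mod(a, g, n) for a in range(1 << n)}


def eq2_elements(p: int, tau: int) -> set:
    """All polynomials of degree < p*tau whose coefficients satisfy (2)."""
    return {s for s in range(1 << (p * tau)) if satisfies_eq2(s, p, tau)}


print(ideal_elements(3, 2) == eq2_elements(3, 2))  # True
print(len(ideal_elements(3, 2)))                   # 16 = 2**((p-1)*tau)
```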

Since there exist two polynomials, 1 and $x^\tau + x^{3\tau} + \cdots + x^{(p-2)\tau}$, such that the equation

$$(1+x^\tau)(x^\tau + x^{3\tau} + \cdots + x^{(p-2)\tau}) + 1 \cdot h(x) = 1$$

holds in $\mathbb{F}_2[x]$, the polynomial $1+x^{p\tau}$ factors into the product of the two coprime factors $1+x^\tau$ and $h(x)$. The next lemma shows that the rings $R_{p\tau}$ and $\mathbb{F}_2[x]/(1+x^\tau) \times \mathbb{F}_2[x]/(h(x))$ are isomorphic.

Lemma 2: The rings $R_{p\tau}$ and $\mathbb{F}_2[x]/(1+x^\tau) \times \mathbb{F}_2[x]/(h(x))$ are isomorphic.

Proof: We need to find an isomorphism between $R_{p\tau}$ and $\mathbb{F}_2[x]/(1+x^\tau) \times \mathbb{F}_2[x]/(h(x))$. To this end, we define a mapping θ by

$$\theta(f(x)) := \big(f(x) \bmod (x^\tau+1),\; f(x) \bmod h(x)\big).$$

The mapping θ is a ring homomorphism and a bijection, because it has an inverse function $\varphi(a(x), b(x))$ given by

$$\varphi(a(x), b(x)) = [a(x)h(x) + b(x)e(x)] \bmod (x^{p\tau}+1).$$

We show below that the composition $\varphi \circ \theta$ is the identity map of the ring $R_{p\tau}$.

For any polynomial $f(x) \in R_{p\tau}$, there are two polynomials $g_1(x), g_2(x) \in R_{p\tau}$ such that

$$f(x) = g_1(x)(1+x^\tau) + \big(f(x) \bmod (1+x^\tau)\big), \qquad f(x) = g_2(x)h(x) + \big(f(x) \bmod h(x)\big).$$

Then we have

$$\begin{aligned}
\varphi(\theta(f(x))) &= \big[h(x)\big(f(x) \bmod (1+x^\tau)\big) + e(x)\big(f(x) \bmod h(x)\big)\big] \bmod (x^{p\tau}+1)\\
&= \big[h(x)\big(f(x) - g_1(x)(1+x^\tau)\big) + (1+h(x))\big(f(x) - g_2(x)h(x)\big)\big] \bmod (x^{p\tau}+1)\\
&= \big[h(x)f(x) - h(x)g_1(x)(1+x^\tau) + f(x) + f(x)h(x) - e(x)g_2(x)h(x)\big] \bmod (x^{p\tau}+1)\\
&= \big[f(x) - h(x)g_1(x)(1+x^\tau) - e(x)g_2(x)h(x)\big] \bmod (x^{p\tau}+1)\\
&= \big[f(x) - (1+x^\tau)(x^\tau + x^{3\tau} + \cdots + x^{(p-2)\tau})g_2(x)h(x)\big] \bmod (x^{p\tau}+1)\\
&= f(x),
\end{aligned}$$

where the fourth equality uses $h(x)f(x) + f(x)h(x) = 0$ over $\mathbb{F}_2$, and the last two equalities use $(1+x^\tau)h(x) = 1+x^{p\tau}$.

Hence $\varphi \circ \theta$ is the identity mapping of the ring $R_{p\tau}$, which proves the lemma.

By Lemma 2, the ring $C_{p\tau}$ is isomorphic to $\mathbb{F}_2[x]/(h(x))$, as stated in the next lemma.

Lemma 3: The rings $C_{p\tau}$ and $\mathbb{F}_2[x]/(h(x))$ are isomorphic.

For example, when p = 5 and τ = 2, the rings $C_{10}$ and $\mathbb{F}_2[x]/(1+x^2+x^4+x^6+x^8)$ are isomorphic. The element $1+x^8$ of $C_{10}$ is mapped to

$$(1+x^8) \bmod (1+x^2+x^4+x^6+x^8) = x^2+x^4+x^6.$$

Applying the function φ to $x^2+x^4+x^6$ recovers the original element:

$$\varphi(0, x^2+x^4+x^6) = (x^2+x^4+x^6)(x^2+x^4+x^6+x^8) = 1+x^8 \bmod (1+x^{10}).$$
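The p = 5, τ = 2 example, and the fact that φ undoes θ on the whole ring, can be reproduced with the short Python sketch below. It is an illustration under stated assumptions: polynomials are Python integers (bit i is the coefficient of x^i), and the names `mul_mod`, `poly_mod`, `theta` and `phi` are chosen here for readability.

```python
def mul_mod(a: int, b: int, n: int) -> int:
    """Product of a(x) and b(x) in F_2[x]/(x^n + 1)."""
    mask = (1 << n) - 1
    result = 0
    for i in range(n):
        if (b >> i) & 1:
            result ^= ((a << i) | (a >> (n - i))) & mask
    return result


def poly_mod(a: int, m: int) -> int:
    """Remainder of a(x) divided by m(x) over F_2."""
    deg_m = m.bit_length() - 1
    while a.bit_length() - 1 >= deg_m:
        a ^= m << (a.bit_length() - 1 - deg_m)
    return a


PRIME, TAU = 5, 2
N = PRIME * TAU
G = 1 | (1 << TAU)                             # 1 + x^2
H = sum(1 << (i * TAU) for i in range(PRIME))  # 1 + x^2 + x^4 + x^6 + x^8
E = H ^ 1                                      # x^2 + x^4 + x^6 + x^8


def theta(f: int):
    """f(x) -> (f mod (1 + x^tau), f mod h)."""
    return poly_mod(f, G), poly_mod(f, H)


def phi(a: int, b: int) -> int:
    """(a, b) -> a*h + b*e mod (x^(p*tau) + 1)."""
    return mul_mod(a, H, N) ^ mul_mod(b, E, N)


f = 1 | (1 << 8)                               # 1 + x^8
print(theta(f))                                # (0, 84); 84 = 0b1010100 is x^2 + x^4 + x^6
print(phi(0, 84) == f)                         # True
print(all(phi(*theta(g)) == g for g in range(1 << N)))  # True
```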

When τ = 1, the ring $C_p$ is discussed in [16], [17] and applied to regenerating codes with low complexity. Note that $C_{p\tau}$ is isomorphic to the finite field $\mathbb{F}_{2^{(p-1)\tau}}$ if and only if 2 is a primitive element in $\mathbb{Z}_p$ and $\tau = p^i$ for some non-negative integer i [25].

Before introducing the explicit construction of the proposed array code, we need to define the e(x)-inverse.

Definition 1: A polynomial $f(x) \in R_{p\tau}$ is called e(x)-invertible if there exists a polynomial $\bar{f}(x) \in R_{p\tau}$ such that

$$f(x)\bar{f}(x) = e(x) \bmod (1+x^{p\tau}).$$

The polynomial $\bar{f}(x)$ is called the e(x)-inverse of the polynomial $f(x)$. The next lemma shows that $1+x^b$ is e(x)-invertible in the ring $R_{p\tau}$.

Lemma 4: Let b (1 ≤ b ≤ pτ) be an integer such that gcd(b, p) = 1, and let gcd(b, τ) = a. The e(x)-inverse of $1+x^b$ in the ring $R_{p\tau}$ is

Proof: In the ring R _pτ , we can verify

It was simplified to prove that the above equation is equal to e (x), for example,

Consider a ring of integer modulus pτ and express it as

in

In there is a collection

Now for i ∈ {1, 2, ..., p-1}, we consider

therefore,

Next, for $i \neq j \in \{1, 2, \ldots, p-1\}$, we want to prove that $i\tau b/a \neq j\tau b/a \bmod p\tau$.

Suppose that $i\tau b/a \bmod p\tau = j\tau b/a \bmod p\tau$, so that there is an integer

Make

The above formula can be further reduced to

Since gcd(b, p) = 1, we have gcd(b/a, p) = 1 and therefore p | (i − j). However, this is impossible because 1 ≤ j < i ≤ p − 1. Similarly, we can prove that $i\tau b/a \bmod p\tau \neq 0$ when 1 ≤ i ≤ p − 1.

So we obtain

$$\{\tau b/a,\ 2\tau b/a,\ \ldots,\ (p-1)\tau b/a\} = \{\tau,\ 2\tau,\ \ldots,\ (p-1)\tau\} \bmod p\tau.$$

So (11) holds.
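The set equality used in the proof can be checked numerically, and it also yields one explicit polynomial g(x) with (1 + x^b)g(x) = e(x). The sketch below is an illustration only: the closed form in `candidate_e_inverse` is derived here from that set equality, namely g(x) = (1 + x^b + ... + x^{(τ/a − 1)b})(x^c + x^{3c} + ... + x^{(p−2)c}) with c = τb/a, and is an assumption of this example rather than the expression stated in the lemma.

```python
from math import gcd


def mul_mod(a: int, b: int, n: int) -> int:
    """Product of a(x) and b(x) in F_2[x]/(x^n + 1); bit i is the x^i term."""
    mask = (1 << n) - 1
    result = 0
    for i in range(n):
        if (b >> i) & 1:
            result ^= ((a << i) | (a >> (n - i))) & mask
    return result


def scaled_multiples(b: int, p: int, tau: int) -> set:
    """The set {i * tau * b / a mod p*tau : i = 1..p-1}, a = gcd(b, tau)."""
    a = gcd(b, tau)
    return {(i * tau * b // a) % (p * tau) for i in range(1, p)}


def candidate_e_inverse(b: int, p: int, tau: int) -> int:
    """A polynomial g(x) with (1 + x^b) g(x) = e(x) (derived for this sketch)."""
    n = p * tau
    a = gcd(b, tau)
    c = tau * b // a
    first = 0
    for j in range(tau // a):
        first ^= 1 << ((j * b) % n)    # 1 + x^b + ... + x^((tau/a - 1) b)
    second = 0
    for i in range(1, p - 1, 2):
        second ^= 1 << ((i * c) % n)   # x^c + x^(3c) + ... + x^((p-2) c)
    return mul_mod(first, second, n)


p, tau = 3, 4
n = p * tau
e = sum(1 << (i * tau) for i in range(1, p))   # e(x) = x^4 + x^8
for b in range(1, n):
    if gcd(b, p) == 1:
        assert scaled_multiples(b, p, tau) == {i * tau for i in range(1, p)}
        assert mul_mod(1 ^ (1 << b), candidate_e_inverse(b, p, tau), n) == e
print("checked all b coprime to p for p = 3, tau = 4")
```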

By Theorem 1, for $i = 1, 2, \ldots, k$, we have $s_i(x) \in C_{p\tau}$. Let $f(x)$ be any entry of the generator matrix or check matrix. If $f(x)$ is not in $C_{p\tau}$, then $(f(x)e(x) \bmod (1+x^{p\tau})) \in C_{p\tau}$ can be used in place of $f(x)$ without changing the result. This is because $s_i(x)e(x) = s_i(x) \bmod (1+x^{p\tau})$ for $1 \le i \le k$. Therefore, after replacing every entry $f(x)$ of the generator matrix or check matrix with $(f(x)e(x) \bmod (1+x^{p\tau})) \in C_{p\tau}$, we obtain an equivalent generator matrix or check matrix, so that the coding polynomials in (4) can be computed over the ring $C_{p\tau}$ by (5) or (7).

The encoding process can be described by the following polynomial operations. Given k(p−1)τ information bits, append τ extra bits to each group of (p−1)τ information bits to form k information polynomials in $C_{p\tau}$, as in (3). After obtaining the vector (4) by choosing a specific encoding matrix or check matrix, store the coefficients of degrees 0 to (p−1)τ−1 of each coding polynomial and discard the remaining coefficients. The proposed array code can be regarded as a systematic linear code over $C_{p\tau}$.

The purpose of this document is to find a suitable encoding matrix $P_{k \times r}$ such that the corresponding code is an MDS code and the repair bandwidth of a single failure is asymptotically optimal. In the rest of this section, we give the encoding matrix construction for binary MDS array codes. The repair bandwidth of the proposed binary array code is asymptotically optimal for any single information column failure.

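A. First construction: encoding matrix

The constructed code is denoted by $C_1(k, r, d, p)$, and the encoding matrix $P_{k \times r}$ is given by (12), where $\eta = d-k+1$, $k \ge 3$, $r \ge 3$ is an odd number, $d = k + (r-1)/2$ and $\tau = (d-k+1)^{k-2}$:

$$P_{k \times r} = \begin{bmatrix}
1 & x & x^{2} & \cdots & x^{d-k} & 1 & \cdots & 1 & 1\\
1 & x^{\eta} & x^{2\eta} & \cdots & x^{(d-k)\eta} & x^{(d-k)\eta^{k-2}} & \cdots & x^{2\eta^{k-2}} & x^{\eta^{k-2}}\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots & \vdots\\
1 & x^{\eta^{k-2}} & x^{2\eta^{k-2}} & \cdots & x^{(d-k)\eta^{k-2}} & x^{(d-k)\eta} & \cdots & x^{2\eta} & x^{\eta}\\
1 & 1 & 1 & \cdots & 1 & x^{d-k} & \cdots & x^{2} & x
\end{bmatrix}. \tag{12}$$

Because each information polynomial is in the ring $C_{p\tau}$ and $C_{p\tau}$ is an ideal, we have the following lemma.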

Figure 1 shows an example of the storage code with three check columns. When the first information column fails, the bits in the solid-line boxes are used to repair the information bits $s_{0,1}, s_{2,1}, s_{4,1}, s_{6,1}$, and the bits in the dashed-line boxes are used to repair the information bits $s_{1,1}, s_{3,1}, s_{5,1}, s_{7,1}$.

Lemma 5: For $j = k+1, k+2, \ldots, k+r$, each coding polynomial $s_j(x)$ is in the ring $C_{p\tau}$.

According to Theorem 1 and Lemma 5, the coefficients of the coding polynomials satisfy (2) with i replaced by j. Let $(i:j) = \{i, i+1, \ldots, j\}$, and let $P_{k \times r}(i:j)$ denote the sub-matrix of $P_{k \times r}$ formed by the columns with indices in $(i:j)$. In $P_{k \times r}$, the sub-matrix $P_{k \times r}(\eta+1 : 2\eta-1)$ is obtained by rotating the sub-matrix $P_{k \times r}(2:\eta)$ by 180 degrees. The last row of $P_{k \times r}(2:\eta)$ is the all-ones vector, and the exponent of the entry in row i and column j of $P_{k \times r}(2:\eta)$ is $\eta^{i-1}$ times the exponent of the entry in row 1 and column j, where $i = 2, 3, \ldots, k-1$ and $j = 1, 2, \ldots, d-k$.

Example 1: Consider k = 4, p = 3, and r = 3. Then d = 4 + 1 = 5, η = 2, and τ = 4, and the 32 information bits are denoted by $s_{0,i}, s_{1,i}, \ldots, s_{7,i}$ (i = 1, 2, 3, 4). The encoding matrix for this example is:

$$P_{4 \times 3} = \begin{bmatrix} 1 & x & 1\\ 1 & x^{2} & x^{4}\\ 1 & x^{4} & x^{2}\\ 1 & 1 & x \end{bmatrix}.$$

Example 1 is illustrated in Figure 1. Note that the extra bits calculated from the information bits need not be stored; they are used only to calculate the redundant bits.
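The structure described above can be turned into a small encoder sketch. This is an illustration only: the matrix is rebuilt from the verbal description (first column all ones, the block P(2:η), and its 180-degree rotation), columns are lists of bits with position l holding s_{l,i}, and the names `exponent_matrix` and `encode` are assumptions of this example.

```python
def exponent_matrix(k: int, r: int):
    """Exponents of x in the k x r encoding matrix (0 means the entry 1)."""
    if k < 3 or r < 3 or r % 2 == 0:
        raise ValueError("expects k >= 3 and odd r >= 3")
    m = (r - 1) // 2                  # d - k
    eta = m + 1
    # Block P(2:eta): row i+1 holds exponents j * eta**i; last row is all ones.
    left = [[j * eta ** i for j in range(1, m + 1)] for i in range(k - 1)]
    left.append([0] * m)
    # Block P(eta+1:2*eta-1): the left block rotated by 180 degrees.
    right = [row[::-1] for row in left[::-1]]
    return [[0] + left[i] + right[i] for i in range(k)]


def encode(info, p: int, r: int):
    """Return the r stored check columns for the k information columns."""
    k = len(info)
    eta = (r - 1) // 2 + 1
    tau = eta ** (k - 2)
    n = p * tau
    extended = []
    for col in info:
        if len(col) != (p - 1) * tau:
            raise ValueError("each column must hold (p-1)*tau bits")
        extra = [0] * tau
        for l, bit in enumerate(col):
            extra[l % tau] ^= bit     # extra bits as in (2)
        extended.append(list(col) + extra)
    exps = exponent_matrix(k, r)
    checks = []
    for j in range(r):
        col = [0] * n
        for i in range(k):
            for l in range(n):
                # multiplying by x^e shifts coefficients cyclically by e
                col[l] ^= extended[i][(l - exps[i][j]) % n]
        checks.append(col[:(p - 1) * tau])   # extra bits are not stored
    return checks


print(exponent_matrix(4, 3))
# [[0, 1, 0], [0, 2, 4], [0, 4, 2], [0, 0, 1]]
```

For k = 4, r = 3 the exponents match the matrix of Example 1, and bit 2 of the second check column equals s_{1,1} + s_{0,2} + s_{10,3} + s_{2,4}, the parity used in the repair example of this document.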

B. Second construction: check matrix

4. Efficient repair of a single column failure

In this section, we show how to recover the bits stored in any information column of the binary array code proposed in this document with asymptotically optimal repair bandwidth.

Throughout this section, we assume that column f is erased, where f can be any value from 1 to k. We want to recover the bits $s_{0,f}, s_{1,f}, \ldots, s_{(p-1)\tau-1,f}$ stored in information column f by accessing bits from the other k − 1 information columns and from d − k + 1 check columns, with asymptotically optimal repair bandwidth. Recall that the extra bits can be calculated by (2). For convenience, in this section we refer to the pτ bits of column i as $s_{0,i}, s_{1,i}, \ldots, s_{p\tau-1,i}$. Before giving the repair algorithm, we formally define the following check sets.

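Definition 2: For $0 \le l \le p\tau-1$ and $j = 1, 2, \ldots, r$, the $l$-th check set of the $j$-th check column is defined as follows. For the first check column,

$$P_{l,1} = \{s_{l,1}, s_{l,2}, \ldots, s_{l,k}\},$$

$$P_{l,j} = \{s_{l-(j-1),1},\ s_{l-(j-1)\eta,2},\ \ldots,\ s_{l-(j-1)\eta^{k-2},k-1},\ s_{l,k}\},$$

where $2 \le j \le d-k+1$, and

$$P_{l,j} = \{s_{l,1},\ s_{l-(r+1-j)\eta^{k-2},2},\ \ldots,\ s_{l-(r+1-j)\eta,k-1},\ s_{l-(r+1-j),k}\},$$

where $d-k+2 \le j \le r$.

Note that in Definition 2 and throughout this document, all indices are taken modulo pτ. The check set $P_{l,j}$ consists of the information bits used to generate the redundant bit $s_{l,k+j}$. When we say that an information bit is repaired by a check column, we mean that we access the redundant bit of that check column whose check set contains the information bit, together with all the information bits in this check set except the erased bit. Consider the example in Figure 1. Suppose the first column is erased. We access the information bits $s_{0,2}, s_{0,3}, s_{0,4}$ and the redundant bit $s_{0,1} + s_{0,2} + s_{0,3} + s_{0,4}$, and reconstruct $s_{0,1}$ as $s_{0,2} + s_{0,3} + s_{0,4} + (s_{0,1} + s_{0,2} + s_{0,3} + s_{0,4})$.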

The repair algorithm is described in Algorithm 1. We use the example given in Figure 1 to illustrate the repair process in detail. In this example, k = 4, d = 5, η = 2, and τ = 4. Suppose the first information column (i.e., node 1, f = 1) fails. By steps 2 and 3 of Algorithm 1, the bits $s_{l,1}$ with $l \bmod 2 = 0$ are repaired by the first check column. More specifically, the bits $s_{0,1}, s_{2,1}, s_{4,1}, s_{6,1}$ can be reconstructed by the following calculations.

$$s_{0,1} = s_{0,2} + s_{0,3} + s_{0,4} + (s_{0,1} + s_{0,2} + s_{0,3} + s_{0,4})$$

$$s_{2,1} = s_{2,2} + s_{2,3} + s_{2,4} + (s_{2,1} + s_{2,2} + s_{2,3} + s_{2,4})$$

$$s_{4,1} = s_{4,2} + s_{4,3} + s_{4,4} + (s_{4,1} + s_{4,2} + s_{4,3} + s_{4,4})$$

$$s_{6,1} = s_{6,2} + s_{6,3} + s_{6,4} + (s_{6,1} + s_{6,2} + s_{6,3} + s_{6,4})$$

Since f = 1 ∈ {1, 2}, the remaining bits $s_{l,1}$ (with $l \bmod 2 = 1$ and $l < 8$) are repaired by the second check column. Therefore, the bits $s_{1,1}, s_{3,1}, s_{5,1}, s_{7,1}$ can be reconstructed by the following calculations.

$$s_{1,1} = s_{0,2} + s_{10,3} + s_{2,4} + (s_{1,1} + s_{0,2} + s_{10,3} + s_{2,4})$$

$$s_{3,1} = s_{2,2} + s_{0,3} + s_{4,4} + (s_{3,1} + s_{2,2} + s_{0,3} + s_{4,4})$$

$$s_{5,1} = s_{4,2} + s_{2,3} + s_{6,4} + (s_{5,1} + s_{4,2} + s_{2,3} + s_{6,4})$$

$$s_{7,1} = s_{6,2} + s_{4,3} + s_{8,4} + (s_{11,1} + s_{10,2} + s_{8,3} + s_{0,4}) + (s_{3,1} + s_{2,2} + s_{0,3} + s_{4,4})$$

Because $s_{10,3}$ and $s_{8,4}$ can be calculated as $s_{6,3} + s_{2,3}$ and $s_{4,4} + s_{0,4}$, respectively, we do not need to download these two bits. Therefore, we only need to download four bits from each of the three information columns and the two check columns, a total of 20 bits, to repair the first information column. That is, only half of the data stored in the five columns needs to be downloaded. In FIG. 1, the bits in the solid-line boxes are downloaded to repair the information bits $s_{0,1}, s_{2,1}, s_{4,1}, s_{6,1}$, and the bits in the dashed-line boxes are used to repair the information bits $s_{1,1}, s_{3,1}, s_{5,1}, s_{7,1}$.
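The 20-bit repair of the first information column can be replayed in code. The sketch below hard-codes the parameters of this example (p = 3, k = 4, τ = 4) and the exponents (1, 2, 4, 0) of the second check column implied by the repair equations; the function names and the dictionary-based bookkeeping of downloaded bits are assumptions made for illustration, and Algorithm 1 itself is not reproduced.

```python
TAU, N, ROWS = 4, 12, (0, 2, 4, 6)   # p = 3, k = 4, eta = 2


def extend(col):
    """Append the 4 extra bits s_{8+mu} = s_{mu} + s_{4+mu} (case p = 3)."""
    return list(col) + [col[mu] ^ col[TAU + mu] for mu in range(TAU)]


def check_columns(info):
    """Stored bits (8 each) of the first two check columns of the example."""
    s = [extend(c) for c in info]
    shifts = (1, 2, 4, 0)            # exponents of x in the second check column
    first = [s[0][l] ^ s[1][l] ^ s[2][l] ^ s[3][l] for l in range(8)]
    second = []
    for l in range(8):
        bit = 0
        for i in range(4):
            bit ^= s[i][(l - shifts[i]) % N]
        second.append(bit)
    return first, second


def repair_first_column(info):
    """Rebuild information column 1 from 4 bits of each of 5 helper columns."""
    first, second = check_columns(info)
    helpers = [info[1], info[2], info[3], first, second]
    # Only rows 0, 2, 4, 6 of every helper column are downloaded.
    c2, c3, c4, p1, p2 = ({l: col[l] for l in ROWS} for col in helpers)
    downloaded = sum(len(h) for h in (c2, c3, c4, p1, p2))
    s = [0] * 8
    for l in ROWS:                   # even rows: first check column
        s[l] = c2[l] ^ c3[l] ^ c4[l] ^ p1[l]
    s10_3 = c3[2] ^ c3[6]            # extra bit s_{10,3}
    s8_4 = c4[0] ^ c4[4]             # extra bit s_{8,4}
    p2_8 = p2[0] ^ p2[4]             # extra bit 8 of the second check column
    s[1] = c2[0] ^ s10_3 ^ c4[2] ^ p2[2]
    s[3] = c2[2] ^ c3[0] ^ c4[4] ^ p2[4]
    s[5] = c2[4] ^ c3[2] ^ c4[6] ^ p2[6]
    s[7] = c2[6] ^ c3[4] ^ s8_4 ^ p2_8
    return s, downloaded


data = [[1, 0, 1, 1, 0, 0, 1, 0], [0, 1, 1, 0, 1, 0, 0, 1],
        [1, 1, 0, 0, 0, 1, 0, 1], [0, 0, 1, 0, 1, 1, 1, 0]]
recovered, bits_read = repair_first_column(data)
print(recovered == data[0], bits_read)   # True 20
```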

Suppose the second information column (i.e., node 2, f = 2) fails. By steps 2 and 3 of Algorithm 1, the bits $s_{0,2}, s_{1,2}, s_{4,2}, s_{5,2}$ can be reconstructed by the following calculations.

$$s_{0,2} = s_{0,1} + s_{0,3} + s_{0,4} + (s_{0,1} + s_{0,2} + s_{0,3} + s_{0,4})$$

$$s_{1,2} = s_{1,1} + s_{1,3} + s_{1,4} + (s_{1,1} + s_{1,2} + s_{1,3} + s_{1,4})$$

$$s_{4,2} = s_{4,1} + s_{4,3} + s_{4,4} + (s_{4,1} + s_{4,2} + s_{4,3} + s_{4,4})$$

$$s_{5,2} = s_{5,1} + s_{5,3} + s_{5,4} + (s_{5,1} + s_{5,2} + s_{5,3} + s_{5,4})$$

Similarly, the bits $s_{2,2}, s_{3,2}, s_{6,2}, s_{7,2}$ can be reconstructed by the following calculations.

$$s_{2,2} = s_{3,1} + s_{0,3} + s_{4,4} + (s_{3,1} + s_{2,2} + s_{0,3} + s_{4,4})$$

$$s_{3,2} = s_{4,1} + s_{1,3} + s_{5,4} + (s_{4,1} + s_{3,2} + s_{1,3} + s_{5,4})$$

$$s_{6,2} = s_{7,1} + s_{4,3} + s_{0,4} + s_{4,4} + (s_{11,1} + s_{10,2} + s_{8,3} + s_{0,4}) + (s_{3,1} + s_{2,2} + s_{0,3} + s_{4,4})$$

$$s_{7,2} = s_{0,1} + s_{4,1} + s_{5,3} + s_{1,4} + s_{5,4} + (s_{0,1} + s_{11,2} + s_{9,3} + s_{1,4}) + (s_{4,1} + s_{3,2} + s_{1,3} + s_{5,4})$$

Thus the 8 bits of the second information column can be recovered by downloading 6 bits from the first information column and 4 bits from each of the third information column, the fourth information column, the first check column and the second check column. A total of 22 bits are downloaded during the repair process. It can be verified that, for the example shown in FIG. 1, the third information column and the last information column can be reconstructed by downloading 22 bits and 20 bits, respectively, from 5 columns.

The key idea of Algorithm 1 is that, for each erased information column, the accessed check sets have a large intersection, which results in a small number of accessed bits. Clearly, the choice of the encoding matrix is crucial if we want to guarantee both the MDS property and efficient repair. The next theorem shows that the repair bandwidth of an information column is asymptotically optimal.

Theorem 11: When $1 \le f \le \lceil k/2 \rceil$, the repair bandwidth of the f-th information column under Algorithm 1 is

$$(p-1)\left((d+1)\eta^{k-3} - \eta^{k-f-2}\right). \tag{26}$$

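Proof: By Algorithm 1, the bits $s_{l,f}$ with $l \bmod \eta^f \in \{0, 1, 2, \ldots, \eta^{f-1}-1\}$ and $l < (p-1)\tau$ can be repaired by the check sets $P_{l,1}$ of the first check column. Therefore, we need to download $(p-1)\eta^{k-3}$ bits $s_{l,i}$ from each of the remaining k − 1 information columns, where $i \in \{1, 2, \ldots, f-1, f+1, \ldots, k\}$ and $l \bmod \eta^f \in \{0, 1, 2, \ldots, \eta^{f-1}-1\}$, and $(p-1)\eta^{k-3}$ redundant bits $s_{l,k+1}$ with $l \bmod \eta^f \in \{0, 1, 2, \ldots, \eta^{f-1}-1\}$ from the first check column. Therefore, a total of $k(p-1)\eta^{k-3}$ bits need to be downloaded.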

For t = 1, 2, ..., dk,

Bit

Checkable set

Repair where δ = dk-t + 1. because

So we need to get (p-1) η ^k-3 redundant bits from the check column δ + 1

For column i ∈ {1,2, ..., f-1}, and for

In the set {0, 1, ..., η ^f-1 -δη ^i-1 -1, η ^f -δη ^i-1 , η ^f -δη ^i-1 +1, ..., η ^f -1}

All values in, require (p-1) η ^k-3 bits

And for the column i ∈ {f + 1, f + 2, ..., k}, and for

All values in the set {0,1,2, ..., η ^f-1 -1} require (p-1) η ^k-3 bits

Note that for

with

Bits

In the repair, it has been downloaded by the first check column, so only the column δ + 1 (δ = 1, 2, ..., dk) needs to be downloaded (dk) (p-1) η ^k-3 redundant bits Download (dk) (p-1) η ^{k + if-3} bits from column i (i = 1, 2, ..., f-1).

We can calculate the number of bits downloaded from the d = k + (r-1) / 2 column to repair the information column f.

When $f > \lceil k/2 \rceil$, according to the encoding matrix in (12) and Algorithm 1, the repair bandwidth of column k + 1 − f is the same as that of column f. So we only need to consider the case $1 \le f \le \lceil k/2 \rceil$. According to Theorem 11, the repair bandwidth increases as f increases. When f = 1, the repair bandwidth is $d(p-1)\eta^{k-3}$, which attains the optimal value in (1). Even in the worst case $f = \lceil k/2 \rceil$, the repair bandwidth is

$$(p-1)\left((d+1)\eta^{k-3} - \eta^{k-\lceil k/2 \rceil-2}\right),$$

which is strictly smaller than (d + 1)/d times the value in (1). Therefore, the repair bandwidth of any single information column failure asymptotically approaches the optimal repair bandwidth (1).
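The closed-form repair bandwidth can be evaluated for the running example to confirm the 20 and 22 bit counts. The sketch below is illustrative: the function names are hypothetical, the symmetry between columns f and k + 1 − f stated above is used to cover every column, and the guard on the exponent k − f − 2 is an assumption added here so that the expression stays an integer.

```python
def repair_bandwidth(p: int, k: int, d: int, f: int) -> int:
    """Value of (p-1)((d+1) eta^(k-3) - eta^(k-f-2)) for information column f."""
    if not 1 <= f <= k:
        raise ValueError("f must be an information column index in 1..k")
    eta = d - k + 1
    f = min(f, k + 1 - f)   # column k+1-f costs the same as column f
    if k - f - 2 < 0:
        raise ValueError("this sketch needs k - f - 2 >= 0")
    return (p - 1) * ((d + 1) * eta ** (k - 3) - eta ** (k - f - 2))


def msr_value(p: int, k: int, d: int) -> int:
    """d*L/(d-k+1) with L = (p-1) eta^(k-2), i.e. d (p-1) eta^(k-3)."""
    eta = d - k + 1
    return d * (p - 1) * eta ** (k - 3)


p, k, d = 3, 4, 5
print([repair_bandwidth(p, k, d, f) for f in range(1, k + 1)])  # [20, 22, 22, 20]
print(msr_value(p, k, d))                                       # 20
# Every column stays strictly below (d + 1)/d times the MSR value.
print(all(d * repair_bandwidth(p, k, d, f) < (d + 1) * msr_value(p, k, d)
          for f in range(1, k + 1)))                            # True
```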

It should be noted that, in the proposed code, the check sets of the first check column are the same as those of the first parity column in the RDP and EVENODD codes. The key difference between the proposed code and existing binary MDS array codes lies in the construction of the other check columns. First, unlike existing binary MDS array codes, the parity bits of the other check columns in the proposed code are computed not from bits lying on straight lines in the array but from bits lying on polygonal lines. Second, the number of rows of the proposed code is divisible by $\eta^{k-2}$. These two properties are essential for reducing the repair bandwidth.

The above description covers only the preferred embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A method for constructing a repair binary code generator matrix, characterized in that the method comprises: denoting the constructed code by $C_1(k, r, d, p)$, where $\eta = d-k+1$, $k \ge 3$, $r \ge 3$ is an odd number, $d = k + (r-1)/2$ and $\tau = (d-k+1)^{k-2}$, and constructing the matrix $P_{k \times r}$, whose calculation formula is:

$$P_{k \times r} = \begin{bmatrix}
1 & x & x^{2} & \cdots & x^{d-k} & 1 & \cdots & 1 & 1\\
1 & x^{\eta} & x^{2\eta} & \cdots & x^{(d-k)\eta} & x^{(d-k)\eta^{k-2}} & \cdots & x^{2\eta^{k-2}} & x^{\eta^{k-2}}\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots & \vdots\\
1 & x^{\eta^{k-2}} & x^{2\eta^{k-2}} & \cdots & x^{(d-k)\eta^{k-2}} & x^{(d-k)\eta} & \cdots & x^{2\eta} & x^{\eta}\\
1 & 1 & 1 & \cdots & 1 & x^{d-k} & \cdots & x^{2} & x
\end{bmatrix}.$$

2. The method for constructing a repair binary code generator matrix according to claim 1, wherein, for $j = k+1, k+2, \ldots, k+r$, each coding polynomial $s_j(x)$ is in the ring $C_{p\tau}$; let $(i:j) = \{i, i+1, \ldots, j\}$, and let $P_{k \times r}(i:j)$ denote the sub-matrix of $P_{k \times r}$ formed by the columns with indices in $(i:j)$; in $P_{k \times r}$, the sub-matrix $P_{k \times r}(\eta+1 : 2\eta-1)$ is obtained by rotating the sub-matrix $P_{k \times r}(2:\eta)$ by 180 degrees; the last row of $P_{k \times r}(2:\eta)$ is the all-ones vector, and the exponent of the entry in row i and column j of $P_{k \times r}(2:\eta)$ is $\eta^{i-1}$ times the exponent of the entry in row 1 and column j, where $i = 2, 3, \ldots, k-1$ and $j = 1, 2, \ldots, d-k$.

3. The method for constructing a repair binary code generator matrix according to claim 1, wherein the extra bits calculated from the information bits do not need to be stored and are used only to calculate the redundant bits.

4. A repair method for a repair binary code generator matrix, characterized in that the repair method comprises: for $0 \le l \le p\tau-1$ and $j = 1, 2, \ldots, r$, the $l$-th check set of the $j$-th check column is defined as $P_{l,1} = \{s_{l,1}, s_{l,2}, \ldots, s_{l,k}\}$ for the first check column, as $P_{l,j} = \{s_{l-(j-1),1}, s_{l-(j-1)\eta,2}, \ldots, s_{l-(j-1)\eta^{k-2},k-1}, s_{l,k}\}$ where $2 \le j \le d-k+1$, and as $P_{l,j} = \{s_{l,1}, s_{l-(r+1-j)\eta^{k-2},2}, \ldots, s_{l-(r+1-j)\eta,k-1}, s_{l-(r+1-j),k}\}$ where $d-k+2 \le j \le r$; assuming that the f-th information column has failed: if $1 \le f \le \lceil k/2 \rceil$, the bits $s_{l,f}$ with $l \bmod \eta^f \in \{0, 1, 2, \ldots, \eta^{f-1}-1\}$ are repaired with the first check column, and for $t = 1, 2, \ldots, d-k$ the bits $s_{l,f}$ with $l \bmod \eta^f \in \{t\eta^{f-1}, t\eta^{f-1}+1, \ldots, (t+1)\eta^{f-1}-1\}$ are repaired with the $(d-k-t+2)$-th check column; if $f > \lceil k/2 \rceil$, the bits $s_{l,f}$ with $l \bmod \eta^{k+1-f} \in \{0, 1, 2, \ldots, \eta^{k-f}-1\}$ are repaired with the first check column, and for $t = 1, 2, \ldots, d-k$ the bits $s_{l,f}$ with $l \bmod \eta^{k+1-f} \in \{t\eta^{k-f}, t\eta^{k-f}+1, \ldots, (t+1)\eta^{k-f}-1\}$ are repaired with the $(d-k+t+1)$-th check column.

5. The repair method for a repair binary code generator matrix according to claim 4, characterized in that the bits $s_{l,f}$ with $l \bmod \eta^f \in \{0, 1, 2, \ldots, \eta^{f-1}-1\}$ and $l < (p-1)\tau$ can be repaired by the check sets $P_{l,1}$ of the first check column; $(p-1)\eta^{k-3}$ bits $s_{l,i}$ need to be downloaded from each of the remaining k − 1 information columns, where $i \in \{1, 2, \ldots, f-1, f+1, \ldots, k\}$ and $l \bmod \eta^f \in \{0, 1, 2, \ldots, \eta^{f-1}-1\}$, and $(p-1)\eta^{k-3}$ redundant bits $s_{l,k+1}$ with $l \bmod \eta^f \in \{0, 1, 2, \ldots, \eta^{f-1}-1\}$ need to be downloaded from the first check column, so that a total of $k(p-1)\eta^{k-3}$ bits need to be downloaded.

6. The repair method for a repair binary code generator matrix according to claim 4, wherein the check sets of the first check column are the same as those of the first parity column of the RDP and EVENODD codes.

7. The repair method for a repair binary code generator matrix according to claim 4, wherein the parity bits of the other check columns are computed not from bits lying on straight lines in the array but from bits lying on polygonal lines, and the number of rows of the proposed code is divisible by $\eta^{k-2}$.