CN115858230A

CN115858230A - Maximum distance separable code construction, repair method and related device

Info

Publication number: CN115858230A
Application number: CN202211157394.2A
Authority: CN
Inventors: 芮佳依; 侯韩旭; 黄勤; 张弓
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2021-09-27
Filing date: 2022-09-22
Publication date: 2023-03-28

Abstract

The application provides a construction method and a repair method for maximum distance separable (MDS) codes, and a related device. The construction method comprises the following steps: acquiring n, k and data codes, where n and k indicate that k of the n nodes store data codes and r nodes store check codes, r = n-k, n > 1, k > 0, r > 0, and n, k and r are integers; constructing a check matrix H of the MDS code based on n and k, where the matrix H has r² rows and r²·s columns, s = n/r, the matrix H comprises the check matrix of each of the n nodes, the check matrix of each node has r² rows and r columns, and the r² × r² matrix formed by the check matrices of any r nodes in the matrix H is invertible over the binary polynomial ring F₂[x] mod (1 + x + … + x^(p-1)); calculating check codes based on the data codes and the matrix H; the calculated check codes and the data codes form the MDS code. The method and the device enable storage code construction and repair with low computational complexity.

Description

Maximum distance separable code construction, repair method and related device

Technical Field

The present invention relates to the field of data storage technologies, and in particular, to a method for constructing and repairing maximum distance separable codes and a related apparatus.

Background

In cloud storage systems, data is stored in storage nodes, and maintaining data integrity is a major challenge. The number of storage nodes is huge, and a large number of faults can occur every day. In a distributed storage coding system, maximum distance separable (MDS) codes can improve data reliability. However, in existing schemes, for n storage nodes (where n = k + r, k is the number of nodes storing data codes, and r is the number of nodes storing check codes), if MDS codes are used, the repair bandwidth of a single failed node is k times the amount of data stored on the failed node, which wastes a great deal of the storage system's network bandwidth.

In existing research, the regenerating codes proposed by A. Dimakis et al. give the minimum repair bandwidth of a single failed node under certain conditions, so that the repair bandwidth of a node can be effectively reduced. An MDS code whose repair bandwidth reaches the theoretical minimum is called a minimum storage regenerating (MSR) code. The construction of MSR codes has been widely studied. The epsilon-MSR code proposed by Ankit Singh Rawat et al. in the paper "MDS Code Constructions with Small Sub-packetization and Near-optimal Repair Bandwidth" can achieve a smaller repair bandwidth; however, epsilon-MSR requires a sufficiently large finite field, which results in higher computational complexity.

In view of the above, how to implement a storage coding processing scheme with low computational complexity while satisfying smaller repair bandwidth is an urgent problem to be solved by those skilled in the art.

Disclosure of Invention

The application discloses a construction method and a repair method of maximum distance separable codes and a related device, which can realize the construction and repair of storage codes with low computation complexity under the condition of meeting smaller repair bandwidth.

In a first aspect, the present application provides a method of constructing maximum distance separable MDS codes, the method comprising:

acquiring a parameter n, a parameter k and a data code to be stored; the parameter n and the parameter k indicate that the data codes are stored in k nodes of n storage nodes and indicate that check codes of the data codes are stored in r nodes of the n storage nodes; r = n-k, n is an integer greater than 1, and k and r are both integers greater than 0;

constructing a check matrix H of the MDS code based on the parameter n and the parameter k; the check matrix H is a matrix of r² rows and r²·s columns, where s = n/r; the check matrix H comprises a check matrix of each node of the n storage nodes, and the check matrix of each node has r² rows and r columns; the r² × r² matrix formed by the check matrices of any r nodes in the check matrix H is invertible over the binary polynomial ring F₂[x] mod (1 + x + … + x^(p-1));

calculating the check code based on the data code and the check matrix H; the check code and the data code obtained by calculation constitute the MDS code.
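As a rough illustration of the dimensions stated above, the following Python sketch derives r, s and the stated matrix shapes from n and k. The function name `check_matrix_shapes` and the requirement that r divide n (so that s = n/r is an integer) are assumptions made for this example; the sketch does not construct the entries of H.

```python
def check_matrix_shapes(n: int, k: int) -> dict:
    """Return the matrix shapes implied by n and k (illustrative helper)."""
    r = n - k
    if n <= 1 or k <= 0 or r <= 0:
        raise ValueError("need n > 1, k > 0 and r = n - k > 0")
    if n % r != 0:
        # Assumption of this sketch: s = n / r must be an integer.
        raise ValueError("this sketch assumes that r divides n")
    s = n // r
    return {
        "r": r,
        "s": s,
        "H": (r * r, r * r * s),        # r^2 rows, r^2 * s columns
        "per_node": (r * r, r),         # check matrix of one node
        "any_r_nodes": (r * r, r * r),  # square matrix required to be invertible
    }


if __name__ == "__main__":
    print(check_matrix_shapes(6, 4))
```

Since each of the n nodes contributes r columns, the column count r²·s equals n·r, which is also the length of the codeword vector.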

Optionally, p is greater than

Compared with existing MDS code construction methods, the binary polynomial MDS code construction method has lower computational complexity. This is because conventional MDS codes are constructed over a sufficiently large finite field, which results in high computational complexity, whereas the binary polynomial MDS code provided by the embodiments of the application is constructed over a binary polynomial ring with a cyclic structure, so that only exclusive-or (XOR) operations and cyclic shift operations are involved in encoding and decoding. In addition, the construction method of the binary polynomial MDS code provided by the embodiments of the application is explicit; that is, the check matrix used in the construction is given and determined, and the MDS property can be satisfied simply by making the parameter p sufficiently large. Existing MDS code constructions (such as epsilon-MSR) are not explicit: the MDS property has to be verified by software search over a sufficiently large finite field, which adds further computation. In particular, when the parameters n and k are both large, the MDS property of epsilon-MSR is difficult to verify even by software search, so it cannot be guaranteed that the constructed code is an MDS code, nor that the codes of a failed node can be correctly repaired. Furthermore, owing to the structure of the constructed check matrix H, the codes lost in a failed node can be repaired with a smaller repair bandwidth.
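The following sketch illustrates why arithmetic in F₂[x] mod (1 + x + … + x^(p-1)) can be carried out with XOR and cyclic shifts only. It assumes, purely for illustration, that a ring element is held as a p-bit integer (bit j is the coefficient of x^j) and that multiplication by x^j is first performed modulo x^p + 1, which over F₂ is a multiple of 1 + x + … + x^(p-1). The helper names are hypothetical, and this is not the encoder of the application.

```python
def add(a: int, b: int) -> int:
    """Addition of two polynomials over F_2 is a bitwise XOR."""
    return a ^ b


def mul_x_pow(a: int, j: int, p: int) -> int:
    """Multiply by x^j modulo x^p + 1: a cyclic left shift of a p-bit word."""
    j %= p
    mask = (1 << p) - 1
    return ((a << j) | (a >> (p - j))) & mask


def reduce_mod_mp(a: int, p: int) -> int:
    """Reduce modulo M_p(x) = 1 + x + ... + x^(p-1).

    If the coefficient of x^(p-1) is set, adding M_p(x) (all p bits set)
    clears it, leaving a representative of degree below p-1.
    """
    mask = (1 << p) - 1
    return a ^ mask if (a >> (p - 1)) & 1 else a


if __name__ == "__main__":
    p = 5
    x4 = mul_x_pow(1, 4, p)              # x^4 as a 5-bit word
    print(bin(reduce_mod_mp(x4, p)))     # representative of x^4 modulo M_5(x)
```

Working modulo x^p + 1 and reducing once at the end keeps every intermediate step a shift or an XOR; no finite-field multiplication tables are needed.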

In a possible embodiment, the check matrix H is composed of s sub-check matrices S, and each sub-check matrix S is a matrix of r² rows and r² columns; each sub-check matrix S is composed of r² sub-check matrices R, and each sub-check matrix R is a matrix of r rows and r columns; the check matrix of each node consists of r sub-check matrices R;

the r² sub-check matrices R form a matrix array of r rows and r columns; the diagonal elements of each sub-check matrix R are non-zero; the r sub-check matrices R in row 0 of the matrix array are all identity matrices; each sub-check matrix R in rows 1 to r-1 of the matrix array has one non-zero element in addition to the diagonal elements, and all other elements are zero.

In one possible embodiment, a matrix R_u comprises the r sub-check matrices R in the u-th column of the matrix array, where u is an integer greater than or equal to 0 and less than r;

in each of the r sub-check matrices R of the matrix R_u, the position of the non-zero element other than the diagonal elements is determined based on the row and column of that sub-check matrix in the matrix array and on a cyclic shift rule.

The check matrix designed in the application provides a simple calculation scheme for the subsequent encoding and decoding processes, improving computational efficiency and reducing computational complexity.

In a possible embodiment, the parameter n and the parameter k further indicate that r of the data codes are stored at each of the k nodes, and r of the check codes are stored at each of the r nodes; the k·r data codes and the r² check codes form a column vector C of n·r rows, and the product of the check matrix H and the column vector C is zero;

the calculating the check code based on the data code and the check matrix H includes:

substituting k × r data codes into a formula in which the product of the check matrix H and the column vector C is zero;

converting the above formula into a system of linear equations consisting of r² equations;

calculating the r² check codes based on the system of linear equations.

In a possible embodiment, the calculating of the r² check codes based on the system of linear equations comprises:

determining, based on the system of linear equations, a coefficient matrix composed of the coefficients of the r² check codes;

splitting the coefficient matrix into r sub-coefficient matrices, wherein each sub-coefficient matrix in the r sub-coefficient matrices is a Vandermonde matrix;

r check codes are calculated based on each of the sub-coefficient matrices.

In the application, the coefficient matrix of the unknown codes is split into multiple Vandermonde matrices that are solved separately, which greatly simplifies the calculation and improves computational efficiency.
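To make the per-block solve concrete, the sketch below solves a single r × r Vandermonde system by Gauss-Jordan elimination. It deliberately swaps in a small prime field GF(q) for the binary polynomial ring used in the application, so it only illustrates the shape of one sub-problem; the function name, the evaluation points `alphas` and the modulus q are all assumptions of this example.

```python
def solve_vandermonde_mod_q(alphas, rhs, q):
    """Solve V y = rhs (mod q), where V[i][j] = alphas[j] ** i.

    q must be prime and the alphas distinct modulo q, so that V is invertible.
    """
    r = len(alphas)
    # Augmented matrix [V | rhs] with entries reduced modulo q.
    a = [[pow(alphas[j], i, q) for j in range(r)] + [rhs[i] % q]
         for i in range(r)]
    for col in range(r):
        pivot = next((row for row in range(col, r) if a[row][col]), None)
        if pivot is None:
            raise ValueError("matrix is singular modulo q")
        a[col], a[pivot] = a[pivot], a[col]
        inv = pow(a[col][col], -1, q)          # modular inverse (Python 3.8+)
        a[col] = [v * inv % q for v in a[col]]
        for row in range(r):
            if row != col and a[row][col]:
                f = a[row][col]
                a[row] = [(v - f * w) % q for v, w in zip(a[row], a[col])]
    return [a[i][r] for i in range(r)]


if __name__ == "__main__":
    q, alphas, unknowns = 257, [1, 2, 3], [5, 7, 11]
    rhs = [sum(pow(x, i, q) * y for x, y in zip(alphas, unknowns)) % q
           for i in range(3)]
    print(solve_vandermonde_mod_q(alphas, rhs, q))
```

Splitting r² unknowns into r independent systems of size r means each solve touches an r × r matrix rather than an r² × r² one, which is where the saving described above comes from.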

In a second aspect, the present application provides a method of repairing a maximum distance separable MDS code constructed by the method of any one of the first aspect above; the n storage nodes comprise a failure node; the repairing method comprises the following steps:

determining 2n-r-s repair codes based on r rows of elements in the check matrix H, wherein the repair codes comprise data codes and/or check codes;

downloading the repair codes in n-1 nodes which are not failed in the n storage nodes;

calculating the lost codes in the failed nodes based on the downloaded repair codes.

According to the repair method, the repair bandwidth for a single node failure is 2n-r-s codes, and the repair bandwidth ratio for a single node failure is (2n-r-s)/(kr). Under the same parameters and the same sub-packetization number, the binary polynomial MDS code provided by the application can reach the minimum repair bandwidth of existing MDS codes, so that network bandwidth resources can be saved. The sub-packetization number is the number of codes of the MDS code stored at each node; for example, the sub-packetization number is r in the above embodiment.

In one possible embodiment, the n storage nodes are numbered with the integers from 0 to n-1, the failed node is numbered h·r + i, h is an integer greater than or equal to 0 and less than s, and i is an integer greater than or equal to 0 and less than r; the r rows of elements comprise rows i, r + i, 2r + i, …, (r-1)r + i of the check matrix H;
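As a small worked example of these quantities, the sketch below evaluates the stated single-failure repair bandwidth and compares it with the k·r codes (k times the r codes held by the failed node) that the Background attributes to conventional MDS repair. The function name and the sample parameters are illustrative, and the sketch assumes that r divides n.

```python
from fractions import Fraction


def single_failure_bandwidth(n: int, k: int):
    """Return (codes downloaded, conventional k*r download, bandwidth ratio)."""
    r = n - k
    s = n // r                      # assumes r divides n
    downloaded = 2 * n - r - s      # repair bandwidth stated in the text
    conventional = k * r            # k nodes' worth of r codes each
    return downloaded, conventional, Fraction(downloaded, conventional)


if __name__ == "__main__":
    for n, k in [(6, 4), (12, 8)]:
        print(n, k, single_failure_bandwidth(n, k))
```

For n = 12 and k = 8 this gives 17 downloaded codes against 32, a ratio of 17/32.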

the determining of the 2n-r-s repair code based on the r row elements in the check matrix H includes:

determining the ith code in the MDS codes stored by each node of the n-1 nodes as the repair code based on the ith row;

determining, based on row g·r + i, that the ((i + g) mod r)-th code in the MDS codes stored by each target node is a repair code, wherein the target nodes are the nodes numbered i, r + i, 2r + i, …, (s-1)r + i among the n storage nodes, except the node numbered h·r + i, and g is an integer from 1 to r-1.

Based on the structural characteristics of the check matrix H, rows i, r + i, 2r + i, …, (r-1)r + i are selected to determine the repair codes. Compared with the existing scheme in which the repairing node needs to download all codes, the number of repair codes to be downloaded is greatly reduced, so that the repair bandwidth is reduced and network bandwidth resources are saved.
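The selection rule can be checked by enumeration. The sketch below lists the (node, code index) pairs to download for a failed node numbered h·r + i and confirms that their number is 2n-r-s. The function name is hypothetical, and the sketch only counts which codes are downloaded; it does not perform the repair computation itself.

```python
def single_failure_repair_set(n: int, k: int, h: int, i: int) -> set:
    """Enumerate (node, code_index) pairs downloaded to repair node h*r + i."""
    r = n - k
    s = n // r                      # assumes r divides n
    failed = h * r + i
    picks = set()
    # Row i: the i-th code of every surviving node.
    for node in range(n):
        if node != failed:
            picks.add((node, i))
    # Rows g*r + i, g = 1..r-1: the ((i + g) mod r)-th code of each target
    # node, i.e. the nodes numbered i, r+i, ..., (s-1)r+i except the failed one.
    targets = [m * r + i for m in range(s) if m != h]
    for g in range(1, r):
        for node in targets:
            picks.add((node, (i + g) % r))
    return picks


if __name__ == "__main__":
    n, k, h, i = 12, 8, 2, 3
    r, s = n - k, n // (n - k)
    picks = single_failure_repair_set(n, k, h, i)
    print(len(picks), 2 * n - r - s)
```

The count splits as (n-1) codes from row i plus (r-1)(s-1) codes from the remaining rows, and (n-1) + (r-1)(s-1) = 2n-r-s because n = r·s.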

In a possible embodiment, the calculating of the lost codes in the failed node based on the downloaded repair codes includes:

calculating the ith code in the failure node based on the ith code in the MDS codes stored by each node of the n-1 nodes;

calculating the ((i + g) mod r)-th code in the failed node based on the ((i + g) mod r)-th code in the MDS codes stored by each target node, the calculated i-th code of the failed node, and the i-th codes of those of the n-1 nodes other than the nodes numbered i, r + i, 2r + i, …, (s-1)r + i.

In a third aspect, the present application provides a method of repairing a maximum distance separable MDS code constructed by the method of any one of the first aspect above; the n storage nodes comprise a failure node and a busy node; the repairing method comprises the following steps:

determining 3n-2s-r-2 repair codes based on r +2 row elements in the check matrix H, wherein the repair codes comprise data codes and/or check codes;

downloading the repair code in n-2 nodes of the n storage nodes except the failed node and the busy node;

According to the long-tail repair method for a single busy node, the long-tail repair bandwidth of the binary (n, k) MDS code with a single busy node is 3n-r-2s-2 codes. Thus, the long-tail repair bandwidth ratio with a single busy node is (3n-r-2s-2)/(kr). Under the same parameters and the same sub-packetization number, the long-tail repair algorithm provided by the embodiments of the application has the smallest long-tail repair bandwidth among existing repair algorithms, and network bandwidth resources are saved.

In one possible embodiment, the n storage nodes are numbered as integers from 0 to n-1, the failed node is numbered as h × r + i, the busy node is numbered as z × r + j, h and z are both integers greater than or equal to 0 and less than s, i and j are both integers greater than or equal to 0 and less than r, i ≠ j;

the r + 2 rows of elements comprise rows i, r + i, 2r + i, …, (r-1)r + i, q and u·r + q of the check matrix H, where q ≠ i, (u + q) mod r = i mod r, and q and u are integers between 0 and r-1;

the aforementioned determining 3n-2s-r-2 repair codes based on the r +2 row elements in the check matrix H includes:

determining the ith code in the MDS codes stored by each node of the n-2 nodes as the repair code based on the ith row;

determining, based on row g·r + i, that the ((i + g) mod r)-th code in the MDS codes stored by each first target node is a repair code, wherein the first target nodes are the nodes numbered i, r + i, 2r + i, …, (s-1)r + i among the n storage nodes, except the node numbered h·r + i, and g is an integer from 1 to r-1;

and determining, based on row q and row u·r + q, that the q-th code in the MDS codes stored by each second target node is a repair code, wherein the second target nodes are the nodes among the n storage nodes other than the nodes numbered i, r + i, 2r + i, …, (s-1)r + i and z·r + j.

Based on the structural characteristics of the check matrix H, rows i, r + i, 2r + i, …, (r-1)r + i, q and u·r + q are selected to determine the repair codes. Compared with the existing scheme in which the repairing node needs to download all codes, the number of repair codes to be downloaded is greatly reduced, so that the repair bandwidth is reduced and network bandwidth resources are saved.
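The same kind of enumeration can be applied to the single-busy-node case. The sketch below collects the (node, code index) pairs selected by the three rules above and checks that their number is 3n-2s-r-2. The function name is hypothetical, u is derived from q and i through (u + q) mod r = i, and r is assumed to divide n; only the download set is modelled, not the decoding.

```python
def long_tail_repair_set(n, k, h, i, z, j, q):
    """Return (picks, u) for failed node h*r + i and busy node z*r + j."""
    r = n - k
    s = n // r                          # assumes r divides n
    if i == j or q == i:
        raise ValueError("the text requires i != j and q != i")
    failed, busy = h * r + i, z * r + j
    u = (i - q) % r                     # so that (u + q) mod r == i
    group_i = {m * r + i for m in range(s)}
    picks = set()
    # Row i: i-th code of the n-2 nodes that are neither failed nor busy.
    for node in range(n):
        if node not in (failed, busy):
            picks.add((node, i))
    # Rows g*r + i: ((i + g) mod r)-th code of each first target node.
    for g in range(1, r):
        for node in group_i - {failed}:
            picks.add((node, (i + g) % r))
    # Rows q and u*r + q: q-th code of each second target node.
    for node in range(n):
        if node not in group_i and node != busy:
            picks.add((node, q))
    return picks, u


if __name__ == "__main__":
    n, k = 12, 8
    r, s = n - k, n // (n - k)
    picks, u = long_tail_repair_set(n, k, h=0, i=1, z=2, j=3, q=2)
    print(len(picks), 3 * n - 2 * s - r - 2, u)
```

The three groups contribute (n-2), (r-1)(s-1) and (n-s-1) codes respectively, which sum to 3n-2s-r-2 when n = r·s.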

In one possible embodiment, r of the data codes are stored at each of the k nodes, and r of the check codes are stored at each of the r nodes; the k·r data codes and the r² check codes form a column vector C of r²·s rows, and the product of the check matrix H and the column vector C is zero;

the calculating of the lost codes in the failed node based on the downloaded repair codes comprises:

substituting the 3n-2s-r-2 repair codes into a formula with the product of the check matrix H and the column vector C being zero;

converting the above formula into a linear equation set consisting of r +2 equations;

the missing codes in the failed nodes are calculated based on the linear system of equations.

In a fourth aspect, the present application provides a method of repairing a maximum distance separable MDS code constructed by the method of any one of the first aspect above; the n storage nodes comprise a failure node and t busy nodes, wherein t is an integer which is larger than 1 and smaller than r-1; the repairing method comprises the following steps:

determining (t + 2) n-r- (t + 1) (t + s) repair codes based on r + t (t + 1) row elements in the check matrix H, wherein the repair codes comprise data codes and/or check codes;

downloading the repair code from n-t-1 nodes except the failed node and the t busy nodes in the n storage nodes;

According to the long-tail repair method with t busy nodes, the long-tail repair bandwidth of the binary (n, k) MDS code with t busy nodes is (t+2)n-r-(t+1)(t+s) codes. Thus, the long-tail repair bandwidth ratio with t busy nodes is ((t+2)n-r-(t+1)(t+s))/(kr). Under the same parameters and the same sub-packetization number, the long-tail repair algorithm provided by the embodiments of the application has the smallest long-tail repair bandwidth among existing repair algorithms, and network bandwidth resources are saved.
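The general expression can be cross-checked against the earlier aspects. In the sketch below (function names are illustrative), the formula is evaluated at t = 0 and t = 1 only as a consistency check against the second and third aspects, since the fourth aspect itself states 1 < t < r-1; r is assumed to divide n.

```python
from fractions import Fraction


def long_tail_repair_codes(n: int, k: int, t: int) -> int:
    """Number of repair codes downloaded with t busy nodes."""
    r = n - k
    s = n // r                      # assumes r divides n
    return (t + 2) * n - r - (t + 1) * (t + s)


def rows_used(r: int, t: int) -> int:
    """Number of rows of H used for the repair: r + t(t + 1)."""
    return r + t * (t + 1)


if __name__ == "__main__":
    n, k = 12, 8
    r = n - k
    for t in range(0, 3):
        codes = long_tail_repair_codes(n, k, t)
        print(t, rows_used(r, t), codes, Fraction(codes, k * r))
```

At t = 0 the expression reduces to 2n-r-s with r rows, and at t = 1 to 3n-r-2s-2 with r + 2 rows, matching the single-failure and single-busy-node cases.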

In a fifth aspect, the present application provides an apparatus for constructing a maximum distance separable MDS code, the apparatus comprising:

the acquisition unit is used for acquiring the parameter n, the parameter k and the data code to be stored; the parameter n and the parameter k indicate that the data codes are stored in k nodes of n storage nodes and indicate that check codes of the data codes are stored in r nodes of the n storage nodes; r = n-k, n is an integer greater than 1, and k and r are both integers greater than 0;

the construction unit is used for constructing a check matrix H of the MDS code based on the parameter n and the parameter k; the check matrix H is a matrix of r² rows and r²·s columns, where s = n/r; the check matrix H comprises a check matrix of each node of the n storage nodes, and the check matrix of each node has r² rows and r columns; the r² × r² matrix formed by the check matrices of any r nodes in the check matrix H is invertible over the binary polynomial ring F₂[x] mod (1 + x + … + x^(p-1));

a calculation unit for calculating the check code based on the data code and the check matrix H; the check code and the data code obtained by calculation constitute the MDS code.

In a possible embodiment, the check matrix H is composed of s sub-check matrices S, and each sub-check matrix S is a matrix of r² rows and r² columns; each sub-check matrix S is composed of r² sub-check matrices R, and each sub-check matrix R is a matrix of r rows and r columns; the check matrix of each node consists of r sub-check matrices R;

the r² sub-check matrices R form a matrix array of r rows and r columns; the diagonal elements of each sub-check matrix R are non-zero; the r sub-check matrices R in row 0 of the matrix array are all identity matrices; each sub-check matrix R in rows 1 to r-1 of the matrix array has one non-zero element in addition to the diagonal elements, and all other elements are zero.

In one possible embodiment, a matrix R_u comprises the r sub-check matrices R in the u-th column of the matrix array, where u is an integer greater than or equal to 0 and less than r;

In a possible implementation, the parameter n and the parameter k further indicate that r of the data codes are stored at each of the k nodes, and r of the check codes are stored at each of the r nodes; the k·r data codes and the r² check codes form a column vector C of n·r rows, and the product of the check matrix H and the column vector C is zero; the calculation unit is specifically configured to:

converting the above formula into a system of linear equations consisting of r² equations;

In a possible implementation, the foregoing computing unit is specifically configured to:

r check codes are calculated based on each of the sub-coefficient matrices.

In a sixth aspect, the present application provides an apparatus for repairing a maximum distance separable MDS code, the MDS code being constructed by the apparatus of any one of the fifth aspect above; the n storage nodes comprise a failed node;

the repair apparatus comprises:

a determining unit, configured to determine 2n-r-s repair codes based on r rows of elements in the check matrix H, where the repair codes include data codes and/or check codes;

a downloading unit, configured to download the repair code in n-1 nodes that are not failed in the n storage nodes;

and the computing unit is used for computing the lost codes in the failed nodes based on the downloaded repair codes.

In one possible embodiment, the n storage nodes are numbered with the integers from 0 to n-1, the failed node is numbered h·r + i, h is an integer greater than or equal to 0 and less than s, and i is an integer greater than or equal to 0 and less than r; the r rows of elements comprise rows i, r + i, 2r + i, …, (r-1)r + i of the check matrix H;

the determining unit is specifically configured to:

determining, based on row g·r + i, that the ((i + g) mod r)-th code in the MDS codes stored by each target node is a repair code, wherein the target nodes are the nodes numbered i, r + i, 2r + i, …, (s-1)r + i among the n storage nodes, except the node numbered h·r + i, and g is an integer from 1 to r-1.

In a possible implementation, the aforementioned calculating unit is specifically configured to:

calculating the ith code in the failure node based on the ith code in the MDS code stored by each node of the n-1 nodes;

In a seventh aspect, the present application provides an apparatus for repairing a maximum distance separable MDS code, the MDS code being constructed by the apparatus of any one of the fifth aspect above; the n storage nodes comprise a failed node and a busy node;

the repair apparatus comprises:

a determining unit, configured to determine 3n-2s-r-2 repair codes based on r +2 row elements in the check matrix H, where the repair codes include data codes and/or check codes;

a downloading unit, configured to download the repair code in n-2 nodes of the n storage nodes except the failed node and the busy node;

the r + 2 rows of elements comprise rows i, r + i, 2r + i, …, (r-1)r + i, q and u·r + q of the check matrix H, where q ≠ i, (u + q) mod r = i mod r, and q and u are integers from 0 to r-1;

the determining unit is specifically configured to:

determining, based on row g·r + i, that the ((i + g) mod r)-th code in the MDS codes stored by each first target node is a repair code, wherein the first target nodes are the nodes numbered i, r + i, 2r + i, …, (s-1)r + i among the n storage nodes, except the node numbered h·r + i, and g is an integer from 1 to r-1;

In one possible embodiment, r of the data codes are stored at each of the k nodes, and r of the check codes are stored at each of the r nodes; the k·r data codes and the r² check codes form a column vector C of r²·s rows, and the product of the check matrix H and the column vector C is zero;

the aforementioned calculation unit is specifically configured to:

the missing codes in the failed nodes are calculated based on the linear equation set.

In an eighth aspect, the present application provides an apparatus for repairing a maximum distance separable MDS code, the MDS code being constructed by the apparatus of any one of the fifth aspect above; the n storage nodes comprise a failed node and t busy nodes, wherein t is an integer greater than 1 and less than r-1;

the repair apparatus comprises:

a determining unit, configured to determine (t+2)n-r-(t+1)(t+s) repair codes based on r + t(t+1) rows of elements in the check matrix H, where the repair codes include data codes and/or check codes;

a downloading unit, configured to download the repair code in n-t-1 nodes, excluding the failed node and the t busy nodes, of the n storage nodes;

In a ninth aspect, the present application provides an apparatus, which may comprise a processor and a memory, for implementing the method described in the first aspect and possible embodiments thereof. The memory is coupled to the processor, and the processor, when executing the computer program stored in the memory, may cause the apparatus to perform the method according to the first aspect or any of the possible implementations of the first aspect.

The device may also include a communication interface for the device to communicate with other devices, which may be, for example, a transceiver, circuit, bus, module, or other type of communication interface. The communication interface includes a receive interface for receiving messages and a transmit interface for transmitting messages.

In one possible implementation, the apparatus may include:

a memory for storing a computer program;

the processor is used for acquiring the parameter n, the parameter k and the data code to be stored; the parameter n and the parameter k indicate that the data codes are stored in k nodes of n storage nodes and indicate that check codes of the data codes are stored in r nodes of the n storage nodes; r = n-k, n is an integer greater than 1, and k and r are both integers greater than 0; constructing a check matrix H of the MDS code based on the parameter n and the parameter k; the check matrix H is a matrix of r² rows and r²·s columns, where s = n/r; the check matrix H comprises a check matrix of each node of the n storage nodes, and the check matrix of each node has r² rows and r columns; the r² × r² matrix formed by the check matrices of any r nodes in the check matrix H is invertible over the binary polynomial ring F₂[x] mod (1 + x + … + x^(p-1)); calculating the check code based on the data code and the check matrix H; the check code and the data code obtained by calculation constitute the MDS code.

It should be noted that, in the present application, the computer program in the memory may be stored in advance, or may be downloaded from the internet and then stored when the device is used. The coupling in the embodiments of the present application is an indirect coupling or connection between devices, units or modules, which may be in an electrical, mechanical or other form, and is used for information interaction between the devices, units or modules.

In a tenth aspect, the present application provides an apparatus, which may comprise a processor and a memory, for implementing the method described in the second aspect and possible embodiments thereof. The memory is coupled to the processor, and the processor, when executing the computer program stored in the memory, may cause the apparatus to perform the method according to the second aspect or any of the possible implementations of the second aspect.

In one possible implementation, the apparatus may include:

a memory for storing a computer program;

a processor, configured to determine 2n-r-s repair codes based on r rows of elements in the check matrix H, where the repair codes include data codes and/or check codes; downloading the repair codes from the n-1 non-failed nodes of the n storage nodes; calculating the lost codes in the failed node based on the downloaded repair codes.

It should be noted that, in the present application, the computer program in the memory may be stored in advance, or may be downloaded from the internet and stored when the device is used. The coupling in the embodiments of the present application is an indirect coupling or connection between devices, units or modules, which may be in an electrical, mechanical or other form, and is used for information interaction between the devices, units or modules.

In an eleventh aspect, the present application provides an apparatus, which may comprise a processor and a memory, for implementing the method described in the third aspect and possible embodiments thereof. The memory is coupled to the processor, and the processor, when executing the computer program stored in the memory, may cause the apparatus to implement the method according to any of the possible implementations of the third aspect or the third aspect.

In one possible implementation, the apparatus may include:

a memory for storing a computer program;

the processor is used for determining 3n-2s-r-2 repair codes based on r +2 row elements in the check matrix H, wherein the repair codes comprise data codes and/or check codes; downloading the repair code in n-2 nodes of the n storage nodes except the failed node and the busy node; calculating the lost codes in the failed nodes based on the downloaded repair codes.

In a twelfth aspect, the present application provides an apparatus, which may comprise a processor and a memory, for implementing the method described in the fourth aspect above. The memory is coupled to the processor, which when executing the computer program stored in the memory, causes the apparatus to carry out the method of the fourth aspect as described above.

In one possible implementation, the apparatus may include:

a memory for storing a computer program;

a processor, configured to determine (t+2)n-r-(t+1)(t+s) repair codes based on r + t(t+1) rows of elements in the check matrix H, where the repair codes include data codes and/or check codes; downloading the repair codes from the n-t-1 nodes of the n storage nodes other than the failed node and the t busy nodes; calculating the lost codes in the failed node based on the downloaded repair codes.

In a thirteenth aspect, the present application provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the method described in any one of the first aspect and the possible implementation manners.

In a fourteenth aspect, the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of any one of the second aspect and its possible embodiments.

In a fifteenth aspect, the present application provides a computer-readable storage medium, which stores a computer program, which when executed by a processor, implements the method of any one of the third aspect and its possible embodiments.

In a sixteenth aspect, the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the method of the fourth aspect described above.

In a seventeenth aspect, the present application provides a computer program product comprising a computer program which, when executed by a processor, causes the computer to perform the method according to any of the first aspect.

In an eighteenth aspect, the present application provides a computer program product comprising a computer program which, when executed by a processor, causes the computer to perform the method of any of the above second aspects.

In a nineteenth aspect, the present application provides a computer program product comprising a computer program which, when executed by a processor, causes the computer to perform the method according to any of the above third aspects.

In a twentieth aspect, the present application provides a computer program product comprising a computer program which, when executed by a processor, causes the computer to perform the method according to the fourth aspect as described above.

It will be appreciated that the apparatus of the fifth to twelfth aspects, the computer storage medium of the thirteenth to sixteenth aspects, and the computer program product of the seventeenth to twentieth aspects provided above are each arranged to perform the method provided in the first to fourth aspects. Therefore, the beneficial effects achieved by the method can refer to the beneficial effects in the corresponding method, and are not described herein again.

Drawings

The drawings to be used in the embodiments of the present application will be described below.

FIG. 1 is a schematic diagram of a scenario of the present application;

FIG. 2 is a schematic flowchart of an MDS code construction method provided in the present application;

FIG. 3 is a schematic flowchart of an MDS code repairing method provided in the present application;

FIG. 3A is a schematic diagram of downloading repair codes during an MDS code repair process provided in the present application;

FIG. 4 is a schematic flowchart of an MDS code repairing method provided in the present application;

FIG. 4A is a schematic diagram of a node grouping provided in the present application;

FIG. 4B is a schematic diagram of a check matrix provided in the present application;

FIG. 4C is a schematic diagram of a matrix provided in the present application;

FIG. 5 is a schematic flowchart of an MDS code repairing method provided in the present application;

FIG. 6 to FIG. 9 are schematic diagrams of logical structures of the apparatuses provided in the present application;

FIG. 10 to FIG. 13 are schematic diagrams of hardware structures of the apparatuses provided in the present application.

Detailed Description

Embodiments of the present application are described below with reference to the drawings.

Referring to fig. 1, fig. 1 is a schematic diagram illustrating a possible system architecture applicable to the embodiment of the present application. The system 100 includes a server 110, a storage node 120, and a terminal device 130. Both storage node 120 and terminal device 130 may communicate with server 110. One or more terminal devices 130, one of which is illustrated in fig. 1, may be included in the system 100; likewise, one or more storage nodes 120 may be included in the system 100, illustrated as n in FIG. 1, where n is an integer greater than 1.

The terminal device 130 may transmit data to be stored to the server 110, and the server 110 encodes the received data and distributively stores the encoding into the n storage nodes 120. The terminal device 130 may also transmit a request for acquiring data to the server 110, and the server 110 acquires data from the storage node 120 based on the request and transmits the acquired data to the terminal device 130.

Illustratively, the server 110 may be a switch in a network, a controller of a distributed storage system, or a service device dedicated to distributed storage of data, etc. The storage node 120 may be a disk or a hard disk or other storage device. The terminal device 130 may be a computer, a mobile terminal, an internet of things device, and the like.

It should be noted that the system architecture shown in fig. 1 is only an example, and does not limit the embodiments of the present application.

As can be seen from the above description, in a distributed storage system the encoded data is distributed across a plurality of storage nodes, and any of these nodes may fail and cause data loss, so maintaining the integrity of the stored data is a major challenge. In existing schemes, data is encoded into maximum distance separable (MDS) codes for storage to improve reliability: when a storage node fails and the codes on that node are lost, the corresponding codes can be downloaded from the remaining non-failed nodes and the lost codes can be calculated from them. However, to repair the codes of a failed node, existing schemes either download all codes from all non-failed nodes, which results in a large repair bandwidth (the repair bandwidth is the number of codes downloaded to repair a single failed node) and wastes network bandwidth resources, or perform the calculation over a sufficiently large finite field, which results in high computational complexity and wastes computing resources. Therefore, to implement a storage coding process with low computational complexity while achieving a smaller repair bandwidth, embodiments of the present application provide a method for constructing an MDS code and a method for repairing the MDS code.

The following first describes a method for constructing an MDS code according to an embodiment of the present application. Referring to fig. 2, the method includes, but is not limited to, the steps of:

S201, acquiring a parameter n, a parameter k and a data code to be stored; the parameter n and the parameter k indicate that the data code is stored at k nodes of n storage nodes and that the check code of the data code is stored at r nodes of the n storage nodes; r = n-k, n is an integer greater than 1, and k and r are both integers greater than 0.

In a possible implementation, the server may receive the data to be stored, the parameter n, and the parameter k sent by the terminal device. The server then encodes the data to be stored to obtain the data codes. For example, based on the parameters n and k, the server may divide the data to be stored into k × r data blocks and encode each data block to obtain one data code, so that k × r data codes are obtained, where r = n-k. Illustratively, the data codes may be binary codes.

Alternatively, in another possible embodiment, the server may receive data to be stored sent by the terminal device, and then, the server may determine the parameter n and the parameter k based on the data amount of the data to be stored. Then, the data to be stored is encoded, and the specific implementation of encoding may refer to the description in the previous paragraph, which is not described herein again.

In addition, the server may know, based on the parameters n and k, that n storage nodes need to be selected for storing data, and the data codes to be stored obtained by the coding may be stored in k nodes of the n storage nodes in a distributed manner, and since there are k × r data codes, each node of the k nodes may store r data codes; and the other r nodes in the n storage nodes can be used for storing check codes of the data codes, and similarly, each node in the r nodes can store r check codes. The check code may be used to assist in repairing lost codes in a failed node when a storage node fails. The check code may be generated based on the data code and the check matrix, and specific implementation will be described in detail later, which is not described herein again.

For convenience of the following description, the k nodes storing the data codes may be referred to as data nodes, and the r nodes storing the check codes may be referred to as check nodes.

Illustratively, the server may be the server 110 shown in fig. 1 described above, and the terminal device may be the terminal device 130 shown in fig. 1 described above.

S202, constructing a check matrix H of the MDS code based on the parameter n and the parameter k; the check matrix H is a matrix of $r^2$ rows and $r^2 s$ columns, where $s = n/r$; the check matrix H comprises a check matrix of each of the n storage nodes, and the check matrix of each node has $r^2$ rows and $r$ columns; the $r^2 \times r^2$ matrix formed by the check matrices of any r nodes in the check matrix H is invertible over the binary polynomial ring $\mathbb{F}_2[x] \bmod (1+x+\cdots+x^{p-1})$.

In a specific embodiment, in order to ensure the reliability of data, that is, to repair data in a failed node in time after the node fails, a corresponding check code may be generated based on a data code to be stored, and the data code and the check code are stored in association, so that the corresponding data code and the check code can be acquired to repair the data code of the failed node after the node fails. In general, the check code may be generated by a check matrix. In the embodiment of the application, in order to improve the reliability of data, an encoding mode of an MDS code is adopted, and therefore the check matrix is the check matrix of the MDS code.

Furthermore, to reduce as far as possible the repair bandwidth after a single node fails, as well as the computation required to calculate the check codes and to repair the codes of a failed node, the embodiment of the present application generates the check matrix of the MDS code over the binary polynomial ring $\mathbb{F}_2[x] \bmod (1+x+\cdots+x^{p-1})$ and calculates the check codes corresponding to the data codes to be stored over the same ring. Here mod denotes the modulo operation, and $\mathbb{F}_2[x]$ denotes the set of polynomials in the variable x whose coefficients are 0 or 1 and whose exponents are non-negative integers.

Illustratively, in the above binary polynomial ring $\mathbb{F}_2[x] \bmod (1+x+\cdots+x^{p-1})$, p is a prime number greater than n, and 2 is a primitive element of the prime field $\mathbb{Z}_p$. Specifically, $\mathbb{Z}_p$ is the set of residues $\{0, 1, \ldots, p-1\}$ obtained by taking integers modulo p; 2 is called a primitive element of the prime field $\mathbb{Z}_p$ when $2^w \bmod p \neq 1$ for every integer w from 1 to p-2 and $2^{p-1} \bmod p = 1$.
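The condition on p can be checked mechanically. The following is a minimal sketch, assuming the standard definition of a primitive element (the powers of 2 first return to 1 at exponent p-1); the function names `is_primitive_root` and `candidate_primes` are hypothetical and do not come from the application.

```python
def is_primitive_root(g: int, p: int) -> bool:
    """Return True if g generates the multiplicative group of Z_p (p prime)."""
    value = 1
    for w in range(1, p):
        value = (value * g) % p
        if value == 1:
            # g^w = 1; g is primitive only if this first happens at w = p - 1.
            return w == p - 1
    return False


def candidate_primes(n: int, limit: int) -> list[int]:
    """Primes p with n < p <= limit for which 2 is a primitive element of Z_p."""
    def is_prime(m: int) -> bool:
        return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))

    return [p for p in range(n + 1, limit + 1)
            if is_prime(p) and is_primitive_root(2, p)]


if __name__ == "__main__":
    print(candidate_primes(9, 30))
```

For n = 9, the smallest prime returned by this sketch is 11, which is the value of p used in the worked example of this description.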

In an embodiment, n = r × s, where r is the number of check nodes and s is a positive integer. n = r × s indicates that the n storage nodes are divided into s node groups of r nodes each. For convenience of description, the n storage nodes are numbered with the integers from 0 to n-1; the node numbered 0 is referred to as node 0, the node numbered 1 as node 1, and so on, up to the node numbered n-1, which is referred to as node n-1. In addition, the s node groups are numbered with the integers from 0 to s-1; the h-th group includes the nodes numbered h·r to h·r + (r-1), where h is an integer from 0 to s-1.
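The numbering convention can be sketched as follows. This is an illustrative helper only; the names `group_of` and `nodes_in_group` are assumptions introduced here.

```python
def group_of(node: int, r: int) -> tuple[int, int]:
    """Split a node number into (group index h, position m1 inside the group)."""
    return divmod(node, r)


def nodes_in_group(h: int, r: int) -> list[int]:
    """Node numbers h*r, h*r + 1, ..., h*r + (r - 1) of the h-th group."""
    return list(range(h * r, h * r + r))


if __name__ == "__main__":
    # With r = 3, node 4 is the node at position 1 of group 1.
    print(group_of(4, 3), nodes_in_group(1, 3))
```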

Based on the above description, the check matrix of each of the s node groups may be constructed first. Taking the h-th group as an example: since the h-th group includes r storage nodes, the check matrix $H_h$ of the h-th group comprises the check matrices of these r storage nodes, one check matrix for each storage node. Let $H_{h,m_1}$ denote the check matrix of node $h r + m_1$ in the h-th group, where $m_1$ is an integer from 0 to r-1. Then $H_h = [H_{h,0}\; H_{h,1}\; \cdots\; H_{h,r-1}]$, i.e., $H_{h,0}$ is the check matrix of node $h r$, $H_{h,1}$ is the check matrix of node $h r + 1$, …, and $H_{h,r-1}$ is the check matrix of node $h r + (r-1)$.

The check matrix $H_{h,m_1}$ of node $h r + m_1$ comprises r submatrices. Let $H_{h,m_1,m_2}$ denote the $m_2$-th submatrix of $H_{h,m_1}$, where $m_2$ is an integer from 0 to r-1. Then $H_{h,m_1} = [H_{h,m_1,0}\; H_{h,m_1,1}\; \cdots\; H_{h,m_1,r-1}]^T$, i.e., the r submatrices are stacked vertically. Each $H_{h,m_1,m_2}$ is a matrix of r rows and r columns. When $m_2 = 0$, $H_{h,m_1,m_2}$ is the r × r identity matrix, i.e., $H_{h,m_1,0}$ is the identity matrix of order r. When $m_2 \neq 0$, $H_{h,m_1,m_2}$ contains r+1 identical non-zero elements, and all elements other than these r+1 non-zero elements are 0; specifically, r of the r+1 identical non-zero elements lie on the diagonal of $H_{h,m_1,m_2}$, and the (r+1)-th non-zero element lies at row $m_1$, column $(m_1+m_2) \bmod r$ of $H_{h,m_1,m_2}$. It can be seen that, across the submatrices $H_{h,m_1,1}$ to $H_{h,m_1,r-1}$ of $H_{h,m_1}$, the position of the (r+1)-th non-zero element is obtained by a clockwise cyclic shift within row $m_1$ of the submatrix. Alternatively, the position of the (r+1)-th non-zero element in the submatrices $H_{h,m_1,1}$ to $H_{h,m_1,r-1}$ may be obtained by a counterclockwise cyclic shift within row $m_1$ of the submatrix; the embodiment of the present application takes the clockwise direction as an example. In addition, over the above binary polynomial ring $\mathbb{F}_2[x] \bmod (1+x+\cdots+x^{p-1})$, the non-zero element of the matrix $H_{h,m_1,m_2}$ is $x^{m_2 (h r + m_1)}$.
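As a small worked instance of this construction (the parameter values are chosen here purely for illustration, and rows and columns are assumed to be counted from 0), take r = 3, h = 1, $m_1 = 1$ (that is, node 4) and $m_2 = 2$. The non-zero element is $x^{2 \cdot 4} = x^{8}$; three copies lie on the diagonal, and the fourth lies at row 1, column $(1+2) \bmod 3 = 0$:

$$
H_{1,1,2} =
\begin{bmatrix}
x^{8} & 0 & 0 \\
x^{8} & x^{8} & 0 \\
0 & 0 & x^{8}
\end{bmatrix}
$$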

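Illustratively, for ease of understanding, the matrix $H_h$ described above can be written in block form as the matrix $H_h$ shown below:

$$
H_h =
\begin{bmatrix}
H_{h,0,0} & H_{h,1,0} & \cdots & H_{h,r-1,0} \\
H_{h,0,1} & H_{h,1,1} & \cdots & H_{h,r-1,1} \\
\vdots & \vdots & \ddots & \vdots \\
H_{h,0,r-1} & H_{h,1,r-1} & \cdots & H_{h,r-1,r-1}
\end{bmatrix}
\quad (1)
$$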

As can be seen, the matrix $H_h$ shown in (1) above comprises the check matrices of r storage nodes; the check matrix of each storage node comprises r submatrices, each of r rows and r columns, so the check matrix of each storage node has $r^2$ rows and r columns, and the matrix $H_h$ has $r^2$ rows and $r^2$ columns.

Once the check matrix $H_h$ of each of the s node groups has been constructed as described above, the check matrices of the s node groups are combined to obtain the check matrix H of the MDS code, where $H = [H_0\; H_1\; \cdots\; H_{s-1}]$. $H_0$ is the check matrix $H_h$ with h = 0, i.e., the check matrix of group 0 of the s node groups; likewise, $H_1$ is the check matrix $H_h$ with h = 1, i.e., the check matrix of group 1 of the s node groups; and so on, so that $H_{s-1}$ is the check matrix $H_h$ with h = s-1, i.e., the check matrix of group s-1 of the s node groups. Since each matrix $H_h$ has $r^2$ rows and $r^2$ columns, the matrix H has $r^2$ rows and $r^2 s$ columns. This completes the construction of the check matrix H of the MDS code.

As can be seen from the above description, for the n storage nodes, r data codes may be stored in each of the k data nodes, and r check codes may be stored in each of the r check nodes. If the k × r data codes can be decoded from the k × r codes stored in any k of the n storage nodes, the binary polynomial (n, k) MDS code formed by the k × r data codes and the r × r check codes satisfies the MDS property. The MDS property is equivalent to the following condition: the $r^2 \times r^2$ matrix formed by the check matrices of any r storage nodes in the check matrix H is invertible over the binary polynomial ring $\mathbb{F}_2[x]/(1+x+\cdots+x^{p-1})$.

It has been verified that, when p is larger than

then the above binary polynomial (n, k) MDS code satisfies the MDS property. Although this bound on p that guarantees the MDS property is much greater than n, for specific given n and r a smaller value of p can be selected, and computer software can be used to verify whether all the $r^2 \times r^2$ submatrices are invertible over $\mathbb{F}_2[x]/(1+x+\cdots+x^{p-1})$. For example, when r = 4 and n < 105, computer software verification shows that the binary polynomial (n, k) MDS code satisfies the MDS property for p > n.
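A software check of invertibility could look like the sketch below. It is not the verification program used by the applicant. It relies on the fact that, when 2 is a primitive element of Z_p, the polynomial 1+x+…+x^(p-1) is irreducible over F_2, so the ring is a field and a square matrix is invertible exactly when Gaussian elimination finds a pivot in every column. Polynomials are represented as Python integers whose bit j is the coefficient of x^j; the names `gf_mul`, `gf_inv` and `is_invertible` are hypothetical.

```python
def gf_mul(a: int, b: int, p: int) -> int:
    """Multiply two polynomials (bitmasks, degree < p-1) modulo 1 + x + ... + x^(p-1)."""
    modulus = (1 << p) - 1            # bits 0..p-1 set
    result = 0
    while b:
        if b & 1:
            result ^= a               # addition in F_2[x] is XOR
        b >>= 1
        a <<= 1
        if (a >> (p - 1)) & 1:        # degree reached p-1: subtract the modulus
            a ^= modulus
    return result


def gf_inv(a: int, p: int) -> int:
    """Inverse of a non-zero element as a^(2^(p-1) - 2) (field of 2^(p-1) elements)."""
    result, base, e = 1, a, (1 << (p - 1)) - 2
    while e:
        if e & 1:
            result = gf_mul(result, base, p)
        base = gf_mul(base, base, p)
        e >>= 1
    return result


def is_invertible(matrix: list[list[int]], p: int) -> bool:
    """Gaussian elimination; True when the square matrix has a pivot in every column."""
    m = [row[:] for row in matrix]
    size = len(m)
    for col in range(size):
        pivot = next((i for i in range(col, size) if m[i][col]), None)
        if pivot is None:
            return False
        m[col], m[pivot] = m[pivot], m[col]
        inv = gf_inv(m[col][col], p)
        for i in range(col + 1, size):
            if m[i][col]:
                factor = gf_mul(m[i][col], inv, p)
                m[i] = [x ^ gf_mul(factor, y, p) for x, y in zip(m[i], m[col])]
    return True


if __name__ == "__main__":
    # p = 5: entries 1, x (0b10) and x^2 (0b100) form a 2 x 2 Vandermonde matrix.
    print(is_invertible([[1, 1], [0b10, 0b100]], 5))
```

To test the MDS property for given n, r and p with this sketch, one would select the $r^2$ columns belonging to each choice of r nodes from H, convert each entry $x^e$ to its reduced bitmask, and call `is_invertible` on every such submatrix.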

To facilitate understanding of the check matrix H of the MDS code, an example is given below. For example, assume n = 9, k = 6, r = 3 and s = 3, and let p = 11. Then the check matrix of the binary polynomial (n, k) = (9, 6) MDS code is:
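Because the check matrix follows mechanically from the definitions of $H_{h,m_1,m_2}$, it can also be generated by a short program. The sketch below is an illustration under stated assumptions rather than the applicant's implementation: each entry $x^e$ is stored as the integer e, a zero entry is stored as `None`, exponents are reduced modulo p (valid because $x^p = 1$ in the ring), and rows and columns are counted from 0. The name `build_check_matrix` is hypothetical.

```python
def build_check_matrix(n: int, r: int, p: int):
    """Return H as an r*r by r*n grid; an int e stands for x^e, None stands for 0."""
    if n % r:
        raise ValueError("n must be a multiple of r")
    H = [[None] * (r * n) for _ in range(r * r)]
    for node in range(n):              # node = h*r + m1 occupies columns node*r .. node*r + r-1
        m1 = node % r
        for m2 in range(r):            # submatrix m2 occupies rows m2*r .. m2*r + r-1
            e = (m2 * node) % p        # non-zero element x^(m2*(h*r+m1)), exponent taken mod p
            for d in range(r):         # r copies on the diagonal of the submatrix
                H[m2 * r + d][node * r + d] = e
            if m2:                     # (r+1)-th copy at row m1, column (m1+m2) mod r
                H[m2 * r + m1][node * r + (m1 + m2) % r] = e
    return H


if __name__ == "__main__":
    for row in build_check_matrix(9, 3, 11):
        print(" ".join("." if e is None else f"{e:x}" for e in row))
```

Running the sketch with n = 9, r = 3 and p = 11 prints a 9 × 27 grid of exponents in hexadecimal, with a dot marking each zero entry.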

S203, calculating the check code based on the data code and the check matrix H; the calculated check code and the data code form the MDS code.

Based on the above description, there are k × r data codes and r × r check codes, i.e., k × r + r × r = (k + r) × r = n × r codes in total. The k × r data codes are known, and the r × r check codes need to be calculated. The n storage nodes are numbered with the integers from 0 to n-1. Let the r codes to be stored in node y be denoted $c_{0,y}, c_{1,y}, \ldots, c_{r-1,y}$, where y is an integer from 0 to n-1, and let C denote the column vector of the n × r codes, $C = [c_{0,0}, c_{1,0}, \ldots, c_{r-1,0}, c_{0,1}, c_{1,1}, \ldots, c_{r-1,1}, \ldots, c_{0,n-1}, c_{1,n-1}, \ldots, c_{r-1,n-1}]^T$. Then H × C = 0, i.e.:

$$H \cdot [c_{0,0}, c_{1,0}, \ldots, c_{r-1,0}, c_{0,1}, c_{1,1}, \ldots, c_{r-1,1}, \ldots, c_{0,n-1}, c_{1,n-1}, \ldots, c_{r-1,n-1}]^T = 0.$$

The size of H is $r^2 \times (r^2 s)$, and the size of the column vector C is $(n r) \times 1 = (s r \cdot r) \times 1 = (r^2 s) \times 1$, so the product of H and C is a column vector of size $r^2 \times 1$. Since this column vector equals 0, each of its $r^2$ elements equals 0, which gives $r^2$ equations that constitute a system of linear equations. Since the k × r data codes in the column vector C are known, the unknowns of the system of linear equations are the r × r check codes, which are solved from the system.

In a specific embodiment, among the n storage nodes, the k data nodes for storing the data codes may be any k of the n storage nodes, and the r check nodes for storing the check codes may be any r of the n storage nodes. In the embodiment of the present application, it may be assumed that the r check nodes are numbered 0, r, 2r, …, (r-1)r; for example, if n = 9, k = 6, r = 3, the 3 check nodes are node 0, node 3 and node 6. The nodes whose numbers lie in 0 to n-1 but are not among these r numbers are data nodes. Based on this, the known k × r data codes can be substituted into the system of linear equations, and after calculation and matrix transformation the following calculation result is obtained:

It can be seen that, in the above calculation result (3), the vector on the right consists of the r × r unknown check codes, and the matrix on the left is the coefficient matrix of the unknown check codes. Because there are r × r unknowns, solving for all of them at once would require a very complex calculation. To reduce the computational complexity, in the embodiment of the present application the solution may be carried out in r rounds, with r check codes solved in each round.

In a specific embodiment, r rows may be extracted from the above result in each round. Based on the structure of the coefficient matrix, the r rows extracted in each round may be the r rows corresponding to the r check codes that have the same sequence number in the r check nodes. The new coefficient matrix formed by the elements extracted in this way is a Vandermonde matrix, so the r check codes can be calculated quickly using the LU decomposition of the Vandermonde matrix (see the article "A Unified Form of EVENODD and RDP Codes and Their Efficient Decoding", IEEE TCOM 2018).

Specifically, taking one round of the solving calculation as an example, let l be an integer from 1 to r-1. The elements of rows l, r+l, 2r+l, …, (r-1)r+l are extracted from the calculation result (3), giving the following extraction result:

As can be seen from the above extraction result (4), the vector on the right consists of the r extracted check codes $c_{l,0}, c_{l,r}, c_{l,2r}, \ldots, c_{l,(r-1)r}$, and the matrix on the left is the coefficient matrix of these r check codes, which is a Vandermonde matrix. The r check codes are then calculated using the LU decomposition of the Vandermonde matrix.

Repeating the above calculation r-1 times yields (r-1) × r check codes; the remaining r unsolved check codes are $c_{0,0}, c_{0,r}, c_{0,2r}, \ldots, c_{0,(r-1)r}$. To solve these r check codes, the (r-1) × r check codes already obtained may be substituted into the calculation result (3), and after calculation and matrix transformation the following calculation result is obtained:

As can be seen from the above calculation result (5), the vector on the right consists of the r check codes $c_{0,0}, c_{0,r}, c_{0,2r}, \ldots, c_{0,(r-1)r}$, and the matrix on the left is the coefficient matrix of these r check codes, which is a Vandermonde matrix. The r check codes are then again calculated quickly using the LU decomposition of the Vandermonde matrix.

And based on the r times of solving calculation, the r x r check codes can be calculated, so that the whole coding process of the MDS is completed. To facilitate understanding of the above-mentioned process of r solving calculations, the following description is given.

Assuming that the check matrix H is as described in (2) above, nodes numbered 0, 3, and 6 can be selected as check nodes, and the number of data codes is k × r =6 × 3= 18. Based on the above equation H × C =0, a linear equation set consisting of r × r =9 linear equations may be obtained, the 18 data codes may be substituted into the linear equation set, and after calculation and matrix transformation, the following calculation results may be obtained:

It can be seen that, in the above calculation result (6), the vector on the right consists of the r × r = 9 unknown check codes, and the matrix on the left is the coefficient matrix of the unknown check codes. Because there are 9 unknowns, solving for all of them at once would require a very complex calculation. To reduce the computational complexity, in the embodiment of the present application the solution may be carried out in r = 3 rounds, with 3 check codes solved in each round. Following the description of the solving calculation above (in this example, l takes the values 1 and 2), in the 1st solving calculation the elements of the 1st, 4th and 7th rows are extracted from the above calculation result (6), giving the following extraction result:

As can be seen from the above extraction result (7), the vector on the right consists of the 3 extracted check codes $c_{1,0}, c_{1,3}, c_{1,6}$, and the matrix on the left is the coefficient matrix of these 3 check codes, which is a Vandermonde matrix. The 3 check codes are then calculated using the LU decomposition of the Vandermonde matrix.

In the 2nd solving calculation, the elements of the 2nd, 5th and 8th rows are extracted from the above calculation result (6), giving the following extraction result:

As can be seen from the above extraction result (8), the vector on the right consists of the 3 extracted check codes $c_{2,0}, c_{2,3}, c_{2,6}$, and the matrix on the left is the coefficient matrix of these 3 check codes, which is a Vandermonde matrix. The 3 check codes are then calculated using the LU decomposition of the Vandermonde matrix.

After the check codes $c_{1,0}, c_{1,3}, c_{1,6}, c_{2,0}, c_{2,3}, c_{2,6}$ have been obtained through the 1st and 2nd solving calculations, these 6 check codes can be substituted into the calculation result (6), and after calculation and matrix transformation the following calculation result is obtained:

As can be seen from the above calculation result (9), the vector on the right consists of the 3 check codes $c_{0,0}, c_{0,3}, c_{0,6}$, and the matrix on the left is the coefficient matrix of these 3 check codes, which is a Vandermonde matrix. The 3 check codes are then again calculated quickly using the LU decomposition of the Vandermonde matrix.

In the present application, the system of equations may be solved over the polynomial ring $\mathbb{F}_2[x] \bmod (1+x+\cdots+x^{p-1})$. Optionally, to reduce the computational complexity, all operations over the polynomial ring $\mathbb{F}_2[x] \bmod (1+x+\cdots+x^{p-1})$ in the above encoding and solving process may first be performed over the polynomial ring $\mathbb{F}_2[x] \bmod (1+x^p)$, and the final result is then reduced modulo the polynomial $1+x+\cdots+x^{p-1}$.
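The reason this shortcut is valid is that $1+x^p = (1+x)(1+x+\cdots+x^{p-1})$ over $\mathbb{F}_2$, so a result computed modulo $1+x^p$ can always be reduced further modulo $1+x+\cdots+x^{p-1}$. The sketch below illustrates this under the assumption that a polynomial is stored as a p-bit integer whose bit j is the coefficient of $x^j$; multiplication by a power of x then becomes a cyclic shift and addition becomes XOR. The function names are hypothetical.

```python
def mul_by_x_power(a: int, e: int, p: int) -> int:
    """Multiply a(x) by x^e in F_2[x] mod (1 + x^p): a cyclic left shift of p bits."""
    e %= p
    mask = (1 << p) - 1
    return ((a << e) | (a >> (p - e))) & mask


def add(a: int, b: int) -> int:
    """Addition of polynomials over F_2 is a bitwise XOR."""
    return a ^ b


def reduce_final(a: int, p: int) -> int:
    """Reduce a representative mod (1 + x^p) to degree < p-1 mod (1 + x + ... + x^(p-1))."""
    if (a >> (p - 1)) & 1:
        a ^= (1 << p) - 1        # add 1 + x + ... + x^(p-1) to cancel the x^(p-1) term
    return a


if __name__ == "__main__":
    p = 5
    shifted = mul_by_x_power(0b00011, 4, p)      # (1 + x) * x^4 = x^4 + x^5 = 1 + x^4
    print(bin(shifted), bin(reduce_final(shifted, p)))
```

In this sketch only shifts and XORs are used, which matches the statement in this description that encoding and decoding involve only XOR and cyclic shift operations.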

The server thus obtains the n × r codes of the binary polynomial MDS code and stores them in the n storage nodes accordingly. Specifically, the codes $c_{0,y}, c_{1,y}, \ldots, c_{r-1,y}$ are sent to the node numbered y for storage, where y = 0, 1, 2, …, n-1.

In summary, the construction method of the binary polynomial MDS code proposed in the embodiment of the present application has lower computational complexity than existing construction methods of MDS codes. This is because conventional MDS codes are constructed over a sufficiently large finite field, which results in high computational complexity, whereas the binary polynomial MDS code proposed in the embodiment of the present application is constructed over a binary polynomial ring with a cyclic structure, so that only exclusive OR (XOR) operations and cyclic shift operations are involved in the encoding and decoding processes. In addition, the construction method of the binary polynomial MDS code provided by the embodiment of the present application is explicit, that is, the check matrix used in the construction is given and determined, and the MDS property can be satisfied simply by making the parameter p sufficiently large. Existing MDS code construction methods (such as ε-MSR) are not explicit, and the MDS property has to be verified by a software search over a sufficiently large finite field, which brings additional and more complex calculation. In particular, when the parameters n and k are both large, for ε-MSR the MDS property is difficult to verify even by a software search, so it cannot be guaranteed that the constructed code is an MDS code, and it cannot be guaranteed that the codes of a failed node can be repaired correctly.

Based on the check matrix H of the MDS code constructed above and the binary polynomial MDS code constructed from the check matrix H, the method for repairing the MDS code provided in the embodiment of the present application is described below. The repair methods comprise a repair method for the case where, after a single storage node fails, codes can be downloaded normally from all of the other n-1 nodes, and a repair method for the case where, after a single storage node fails, some of the other n-1 nodes are busy nodes from which codes cannot be downloaded, because a busy node is processing other tasks and cannot provide resources to serve the downloads needed for repair.

The repair method for the case where, after a single storage node fails, codes can be downloaded normally from all of the other n-1 nodes is described first. Referring to fig. 3, the repair method includes, but is not limited to, the following steps:

S301, determining 2n-r-s repair codes based on r rows of elements in the check matrix H, wherein the repair codes comprise data codes and/or check codes.

In a specific embodiment, the server may sense the state of each node in the n storage nodes, and if the server senses that a failed node occurs, the server may download a part of the codes from the other n-1 nodes to repair the codes in the failed node. The server determines the code for repairing the failed node as follows:

Let $c_{i,y}$ denote the i-th code stored in node y, where y is an integer from 0 to n-1 and i is an integer from 0 to r-1. For convenience of the following description, it is assumed that the failed node is numbered h·r + i, where, based on the above description, h is an integer from 0 to s-1.

In this embodiment of the present application, based on the structure of the check matrix H, the codes used to repair the lost codes of the failed node may be determined from r rows of elements in the check matrix H. A code used for repair is referred to as a repair code, and the repair codes may be data codes, check codes, or a mixture of the two. The r rows may be any r rows of the check matrix H; however, in the embodiment of the present application, rows i, r+i, 2r+i, …, (r-1)r+i of the check matrix H may be selected, based on the structure of the check matrix H, to reduce the computational complexity. The following description takes the r selected rows i, r+i, 2r+i, …, (r-1)r+i as an example.

Specifically, based on the ith row element, it may be determined that the ith code in the MDS code stored by each of the n-1 nodes is the repair code.

Then, based on the elements of row r+i, it can be determined that the ((i+1) mod r)-th code of the MDS code stored in each of the target nodes is a repair code. The target nodes are the s-1 nodes, among the nodes numbered i, r+i, 2r+i, …, (s-1)r+i of the n storage nodes, other than the failed node, which is numbered h·r+i.

Similarly, based on the elements of row 2r+i, it can be determined that the ((i+2) mod r)-th code of the MDS code stored in each of the target nodes is a repair code.

Then, for the g × r + i th row, the ((i + g) mod r) th code in the MDS codes stored by each node of the target node can be determined to be the repair code based on the g × r + i th row, where g is an integer from 1 to r-1.

Based on the above description, 2n-r-s repair codes can be determined. Specifically, n-1 repair codes are determined from the elements of row i, and s-1 repair codes are determined from the elements of each row g·r+i; as g ranges from 1 to r-1, a total of (s-1) × (r-1) repair codes are determined from these rows. Since n = r × s, the total is (n-1) + (s-1) × (r-1) = 2n-r-s.
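The selection rule for repair codes can be sketched as a short routine that lists which code is requested from which node. This is an illustrative sketch; the function name `repair_plan` and the (node, code index) pair representation are assumptions made here.

```python
def repair_plan(n: int, r: int, failed: int) -> list[tuple[int, int]]:
    """List the (node, code index) pairs to download when node `failed` is lost."""
    s = n // r
    i = failed % r                                   # failed node is numbered h*r + i
    # Row i: the i-th code of every surviving node.
    plan = [(y, i) for y in range(n) if y != failed]
    # Target nodes: i, r+i, ..., (s-1)r+i without the failed node.
    targets = [q * r + i for q in range(s) if q * r + i != failed]
    # Row g*r + i: the ((i+g) mod r)-th code of every target node.
    for g in range(1, r):
        plan += [(y, (i + g) % r) for y in targets]
    return plan


if __name__ == "__main__":
    plan = repair_plan(9, 3, 0)
    print(len(plan), sorted(plan))
```

For n = 9, r = 3 and failed node 0, the sketch requests all three codes from nodes 3 and 6 and only the 0th code from nodes 1, 2, 4, 5, 7 and 8, for 12 = 2n-r-s downloads in total.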

S302, downloading the repair codes from the n-1 non-failed nodes of the n storage nodes.

The server determines the 2n-r-s repair codes, can send a code acquisition request to the storage node where the repair codes are located, and the corresponding storage node sends the corresponding codes to the server based on the request, so that the repair codes are downloaded. For example, taking the storage node with the number 0 as an example, the server determines that the 0 th code in the node 0 is a repair code, and then sends a request for acquiring the 0 th code to the node 0, and after receiving the request, the node 0 reads the 0 th code from its own memory and sends the 0 th code to the server.

S303, calculating the lost codes of the failed node based on the downloaded repair codes.

After the server obtains the 2n-r-s repair codes, the lost codes of the failed node can be calculated using the elements of rows i, r+i, 2r+i, …, (r-1)r+i.

Specifically, based on the foregoing description:

Since H × C = 0, the product of the elements of row i and the column vector C equals 0, which gives one linear equation. Since i is an integer from 0 to r-1, and rows 0 to r-1 of the check matrix H consist of r × s, i.e., n, identity matrices of size r × r, the only codes appearing in this linear equation are the n codes $c_{i,0}, c_{i,1}, c_{i,2}, \ldots, c_{i,n-1}$. Of these, all except the i-th code $c_{i,hr+i}$ of the failed node are known, namely the n-1 downloaded codes; substituting these n-1 codes into the linear equation yields the code $c_{i,hr+i}$.
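Because row i of H contains only identity blocks, this first repair equation states that the i-th codes of all n nodes sum to zero, so the lost code is the sum of the other n-1 codes. A minimal sketch follows, assuming each code is stored as an integer bitmask of a polynomial already reduced to degree below p-1 (so that a zero sum means the XOR is exactly 0); the function name `recover_ith_code` is hypothetical.

```python
from functools import reduce
from operator import xor


def recover_ith_code(known_codes: list[int]) -> int:
    """XOR of the i-th codes of the n-1 surviving nodes gives the lost i-th code."""
    return reduce(xor, known_codes, 0)


if __name__ == "__main__":
    data = [0b0101, 0b1001, 0b0011]
    stripe = data + [recover_ith_code(data)]      # last entry makes the XOR of all entries 0
    lost = 1
    survivors = [c for y, c in enumerate(stripe) if y != lost]
    print(recover_ith_code(survivors) == stripe[lost])
```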

For the other codes of the failed node, r-1 linear equations are obtained from rows g·r+i (g from 1 to r-1) and the above equation H × C = 0. Take as an example the linear equation obtained by setting the product of the elements of row g·r+i and the column vector C equal to 0: the ((i+g) mod r)-th codes downloaded from the s-1 target nodes, the already repaired i-th code of the failed node h·r+i, and the i-th codes downloaded from the nodes $\{0, 1, \ldots, n-1\} \setminus \{i, r+i, \ldots, (s-1)r+i\}$ are substituted into the linear equation, so that the ((i+g) mod r)-th code of the failed node h·r+i can be calculated. Here $\{0, 1, \ldots, n-1\} \setminus \{i, r+i, \ldots, (s-1)r+i\}$ denotes the n-s nodes, among node 0 to node n-1, other than the s nodes i, r+i, …, (s-1)r+i.

The lost codes of the failed node h·r+i can be repaired based on the above calculations. After the server obtains the r codes of the failed node, it sends the r codes to a new storage node for storage, thereby ensuring the reliability of the data.

To facilitate understanding of the repair method described above with reference to fig. 3, the following description is given by way of example.

Exemplarily, assume that the check matrix H is as shown in (2) above. The MDS code obtained from this check matrix then comprises n × r = 9 × 3 = 27 codes, of which k × r = 6 × 3 = 18 are data codes and r × r = 3 × 3 = 9 are check codes. Based on the above description, the failed node is numbered h·r + i = 3h + i.

If the failed node 3h+i is node 0, node 3 or node 6, all 3 codes can be downloaded from each of the two nodes other than the failed node among node 0, node 3 and node 6, and the 0th code can be downloaded from each of nodes 1, 2, 4, 5, 7 and 8; the 3 codes of the failed node are then repaired using the downloaded 2×3+6 = 12 codes.

If the failed node 3h+i is node 1, node 4 or node 7, all 3 codes can be downloaded from each of the two nodes other than the failed node among node 1, node 4 and node 7, and the 1st code can be downloaded from each of nodes 0, 2, 3, 5, 6 and 8; the 3 codes of the failed node are then repaired using the downloaded 2×3+6 = 12 codes.

If the failed node 3h+i is node 2, node 5 or node 8, all 3 codes can be downloaded from each of the two nodes other than the failed node among node 2, node 5 and node 8, and the 2nd code can be downloaded from each of nodes 0, 1, 3, 4, 6 and 7; the 3 codes of the failed node are then repaired using the downloaded 2×3+6 = 12 codes.

Exemplarily, referring to fig. 3A, fig. 3A shows a case where the node 0 is a failed node, and codes respectively downloaded from other 8 nodes are needed to repair the node 0.

According to this repair method, the repair bandwidth for a single node failure is 2n-r-s codes, and the repair bandwidth ratio for a single node failure is (2n-r-s)/(kr). For example, with n = 9, k = 6, r = 3 as above, the repair bandwidth ratio is (2n-r-s)/(kr) = 12/18 = 0.667. Under the same parameters and the same sub-packetization number, the binary polynomial MDS code provided by the embodiment of the application can reach the minimum repair bandwidth of existing MDS codes. The sub-packetization number is the number of codes of the MDS code stored in each node; in the above embodiment, the sub-packetization number is r.
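For illustration only, the download rule of the example above can be enumerated in a few lines of Python. This is a minimal sketch under the numbering used in the example (nodes 0 to n-1, codes 0 to r-1 in each node); the function name `single_failure_repair_plan` and the (code index, node number) pair representation are assumptions introduced here and do not appear in the application.

```python
def single_failure_repair_plan(n, r, failed):
    """Return the (code index, node number) pairs downloaded to repair `failed`.

    Nodes are numbered 0..n-1 and each node stores r codes numbered 0..r-1,
    following the numbering used in the example above.
    """
    s = n // r
    i = failed % r
    # The s nodes i, r+i, ..., (s-1)r+i; the failed node is one of them.
    same_class = {g * r + i for g in range(s)}
    plan = set()
    for node in range(n):
        if node == failed:
            continue
        if node in same_class:
            # Every code of the other nodes in {i, r+i, ..., (s-1)r+i}.
            plan.update((x, node) for x in range(r))
        else:
            # Only the i-th code of every remaining node.
            plan.add((i, node))
    return plan


n, k, r = 9, 6, 3
plan_node0 = single_failure_repair_plan(n, r, failed=0)
bandwidth = len(plan_node0)      # number of downloaded codes
ratio = bandwidth / (k * r)      # repair bandwidth ratio
```

With n = 9, k = 6, r = 3 and failed node 0, the enumerated plan contains 2×3+6 = 12 codes, which agrees with 2n-r-s.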

The following describes repair methods for the case in which, after a single storage node fails, there are busy nodes among the other n-1 nodes and codes cannot be downloaded from the busy nodes. In the repair process of a single failed node, when some of the storage nodes are in a busy state, data cannot be downloaded from the busy nodes and can only be downloaded from the other, non-busy nodes. This repair process is referred to as long tail repair, and the long tail repair bandwidth is the number of codes that need to be downloaded in the long tail repair process.

First, the repair method for the case in which one busy node exists among the above n-1 nodes is described below. Referring to fig. 4, the method includes, but is not limited to, the following steps:

S401, determining 3n-r-2s-2 repair codes based on r+2 rows of elements in the check matrix H, wherein the repair codes comprise data codes and/or check codes.

In a specific implementation, in the normal case where codes can be downloaded from all of the other n-1 nodes among the n storage nodes apart from the failed node, the lost codes in the failed node can be repaired according to the method shown in fig. 3. However, if there is a busy node among the n-1 nodes, the repair codes that would be used cannot be downloaded from the busy node, and the lost codes in the failed node cannot be repaired in that way. It can be understood that, since these repair codes cannot be downloaded from the busy node, they are also unknown in the calculation, and solving for the lost codes in the failed node requires solving for these repair codes as well. Based on the above description of the repair method shown in fig. 3, r linear equations suffice to solve the r lost codes in the failed node; because the busy node increases the number of unknowns, more linear equations are needed. In this embodiment, the case of one busy node is taken as an example for description.

In one embodiment, assume that the failed node is still the node numbered h*r+i, and assume that the busy node is numbered z*r+j, where z is an integer from 0 to s-1, and i and j are each integers from 0 to r-1. Here i ≠ j, which indicates that the busy node z*r+j does not belong to any of the s nodes i, r+i, 2r+i, …, (s-1)r+i. Then, based on the repair method shown in fig. 3, there is only one repair code that cannot be downloaded from the busy node, and the number of unknown codes to be solved is r+1.

To solve the r +1 unknown codes, repair codes may be downloaded from n-2 nodes other than the failed node and the busy node among the n storage nodes to repair the lost codes.

Specifically, based on the corresponding description of S301 shown in fig. 3, the i-th code of the MDS code stored in each of the n-2 nodes can be determined as a repair code, and the ((i+g) mod r)-th code of the MDS code stored in each of the target nodes can be determined as a repair code, where g is an integer from 1 to r-1. The number of repair codes determined so far is 2n-r-s-1, which is not enough to solve the above r+1 unknown codes. Based on the structure of the check matrix H, further repair codes can be determined from the elements of the q-th row and the (u*r+q)-th row of the check matrix H.

The values of q and u are integers between 0 and r-1, with (u+q) mod r = i mod r and q ≠ i. Based on the elements in the q-th row and the (u*r+q)-th row, the q-th code of each of the n-(s+1) nodes other than the s+1 nodes numbered z*r+j, i, r+i, 2r+i, …, (s-1)r+i is determined as a repair code.

Based on the above description, 3n-r-2s-2 repair codes can be determined. Specifically, n-(s+1) repair codes can be determined based on the q-th row and the (u*r+q)-th row; adding the above 2n-r-s-1 repair codes gives a total of 3n-r-2s-2 repair codes.
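As a quick arithmetic check of the count above, the two contributions can be added in Python. This is only a sketch; the function name `single_busy_repair_count` is hypothetical and n is assumed to be a multiple of r so that s = n/r is an integer.

```python
def single_busy_repair_count(n, r):
    """Number of repair codes for one failed node and one busy node."""
    s = n // r
    # Codes determined from rows i, r+i, ..., (r-1)r+i:
    # the i-th code of n-2 nodes plus r-1 further codes of each of the s-1 target nodes.
    from_fig3_rows = (n - 2) + (s - 1) * (r - 1)
    # Codes determined from the q-th row and the (u*r+q)-th row.
    from_extra_rows = n - (s + 1)
    return from_fig3_rows + from_extra_rows


count_9_3 = single_busy_repair_count(9, 3)
```

For n = 9 and r = 3 the two contributions are 11 and 5, giving 16 = 3n-r-2s-2 repair codes.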

S402, downloading the repair codes in n-2 nodes except the failed node and the busy node in the n storage nodes.

The server determines the 3n-r-2s-2 repair codes, can send a code acquisition request to the storage node where the repair codes are located, and the corresponding storage node sends the corresponding codes to the server based on the request, so that the repair codes are downloaded. For example, taking the storage node with the number 0 as an example, the server determines that the 0 th code in the node 0 is a repair code, and then sends a request for acquiring the 0 th code to the node 0, and after receiving the request, the node 0 reads the 0 th code from its own memory and sends the 0 th code to the server.

And S403, calculating the lost codes in the failed node based on the downloaded repair codes.

After the server obtains the 3n-r-2s-2 repair codes, the lost codes in the failed node can be calculated by combining the elements in rows i, r+i, 2r+i, …, (r-1)r+i, q and u*r+q of the check matrix H.

Specifically, based on the foregoing description, H*C = 0. Then the product of the elements of the i-th row and the column vector C equals 0, which yields one linear equation. In addition, r-1 linear equations can be obtained by setting the product of the elements of the (g*r+i)-th row and the column vector C equal to 0, where g is an integer from 1 to r-1. That is, r linear equations can be obtained based on the elements of rows i, r+i, 2r+i, …, (r-1)r+i. Substituting the repair codes determined from these rows into the r linear equations and solving them jointly gives the following calculation result:

wherein:

It can be seen that the vector on the right in the calculation result (10) consists of the r+1 unknown codes, and the matrix on the left is the coefficient matrix of the unknown codes. To better distinguish the coefficient matrix of the lost codes in the failed node from the coefficient matrix of the unknown repair code in the busy node, the two are separated by a vertical line in (10) above, and the matrix E_r denotes the coefficient matrix of the r lost codes in the failed node; see (11) above for details.

In addition, the product of the elements of the q-th row and the column vector C equals 0, and the product of the elements of the (u*r+q)-th row and the column vector C equals 0, which yields two more linear equations, for r+2 linear equations in total. Based on the foregoing description there are r+1 unknown codes, but the linear equations obtained from the q-th row and the (u*r+q)-th row introduce one more unknown code, so there are r+2 unknown codes in total. The r+2 linear equations form a system of linear equations whose solution gives the r+2 unknown codes, and thus the r lost codes in the failed node.

In addition, the system of r+2 linear equations may be transformed to obtain an (r+2)×(r+2) determinant, which is a polynomial in the binary polynomial ring F_2[x] mod (1+x+…+x^{p-1}). The maximum degree of this polynomial can be verified to be less than

which satisfies the solvability condition of the above system of equations.

To facilitate understanding of the single busy node long tail repair method described above, the following is exemplified.

Exemplarily, assuming that the check matrix H is as shown in (2) above, the MDS code finally obtained based on the check matrix comprises n×r = 9×3 = 27 codes, of which k×r = 6×3 = 18 are data codes and r×r = 3×3 = 9 are check codes. Based on the above description, the failed node is numbered h*r+i = 3h+i and the busy node is numbered z*r+j = 3z+j.

Assuming that the failed node is node 0 and the busy node is node 1, according to the above long tail repair method for a single busy node, all 3 codes can be downloaded from each of nodes 3 and 6, and the 0th and 2nd codes can be downloaded from each of nodes 2, 4, 5, 7 and 8. From rows 0, 2, 3, 5 and 6 of the check matrix, the following calculation result can be obtained:

It can be seen that the vector on the right in the calculation result (12) consists of the 5 unknown codes, where c_{0,0}, c_{1,0}, c_{2,0} are the 3 lost codes of the failed node and c_{0,1}, c_{2,1} are the 2 codes that cannot be downloaded from the busy node; the matrix on the left is the coefficient matrix of the unknown codes.

The determinant of the matrix shown in (12) above is:

The determinant shown in (13) above is invertible in the binary polynomial ring F_2[x] mod (1+x+…+x^{10}). Thus, the 3 lost symbols c_{0,0}, c_{1,0}, c_{2,0} of the failed node and the two symbols c_{0,1}, c_{2,1} of busy node 1 can be solved. The long tail repair bandwidth ratio with a single busy node for this example is 16/18 = 0.889.
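The download rule used in this example can be sketched in Python as follows. This is an illustrative sketch only: the function name `single_busy_repair_plan`, the explicit parameter `q` and the (code index, node number) pairs are assumptions made here, and the choice q = 2 simply mirrors rows 2 and 5 used in the example.

```python
def single_busy_repair_plan(n, r, failed, busy, q):
    """Return the (code index, node number) pairs downloaded when one node is busy.

    Nodes are numbered 0..n-1, codes 0..r-1; q is the extra row index with q != i.
    """
    s = n // r
    i, j = failed % r, busy % r
    if i == j or q == i:
        raise ValueError("the method described above assumes i != j and q != i")
    same_class = {g * r + i for g in range(s)}   # nodes i, r+i, ..., (s-1)r+i
    plan = set()
    # The i-th code of the n-2 nodes other than the failed node and the busy node.
    for node in range(n):
        if node not in (failed, busy):
            plan.add((i, node))
    # The ((i+g) mod r)-th codes, g = 1..r-1, of the s-1 target nodes.
    for node in same_class - {failed}:
        for g in range(1, r):
            plan.add(((i + g) % r, node))
    # The q-th code of the n-(s+1) nodes outside the class and other than the busy node.
    for node in range(n):
        if node not in same_class and node != busy:
            plan.add((q, node))
    return plan


busy_plan = single_busy_repair_plan(9, 3, failed=0, busy=1, q=2)
busy_ratio = len(busy_plan) / (6 * 3)
```

For failed node 0 and busy node 1 the plan contains all 3 codes of nodes 3 and 6 and the 0th and 2nd codes of nodes 2, 4, 5, 7 and 8, i.e. 16 codes in total.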

In summary, according to the long tail repair method for a single busy node, the long tail repair bandwidth of the binary (n, k) MDS code with a single busy node is 3n-r-2s-2 codes. Thus, the corresponding long tail repair bandwidth ratio is (3n-r-2s-2)/(kr). Under the same parameters and the same sub-packetization number, the long tail repair algorithm provided by the embodiment of the application has the minimum long tail repair bandwidth among existing repair algorithms, and its computational complexity is lower than that of existing algorithms.

In another possible embodiment, another single-node long tail repair method is provided. The concrete implementation is as follows:

Based on the foregoing description, the MDS code is stored in n storage nodes, of which k nodes store data codes and r nodes store check codes of the data codes. Each of the n storage nodes stores r codes, i.e., the MDS code includes r×n codes. Illustratively, the MDS code may be an MDS code with a small sub-packetization level (MDS-SSL).

In the embodiment of the present application, the n storage nodes are divided into r groups, each group including s nodes. For ease of understanding, see the example in fig. 4A. It can be seen that group 1 includes the s nodes from node 1 to node s, group 2 includes the s nodes from node s+1 to node 2s, and so on, and group r includes the s nodes from node n-s+1 to node n. The ((u1-1)s+v1)-th node is the v1-th node in the u1-th group and can be written simply as node (u1, v1), where u1 is an integer not less than 1 and not more than r, and v1 is an integer not less than 1 and not more than s.
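The correspondence between the node number (u1-1)s+v1 and the pair (u1, v1) can be sketched as follows. The helper names `node_number` and `group_position` are hypothetical and are introduced only for this illustration; both use the 1-based numbering of this embodiment.

```python
def node_number(u1, v1, s):
    """Number of the v1-th node in the u1-th group (all indices start at 1)."""
    return (u1 - 1) * s + v1


def group_position(number, s):
    """Inverse mapping: node number -> (u1, v1), all indices starting at 1."""
    u1_minus_1, v1_minus_1 = divmod(number - 1, s)
    return u1_minus_1 + 1, v1_minus_1 + 1


# Example with n = 12 nodes in r = 4 groups of s = 3 nodes each.
first_node_of_group_2 = node_number(2, 1, s=3)
last_node = group_position(12, s=3)
```

With s = 3, node (2, 1) is node 4 and node 12 is node (4, 3).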

The codes stored in the above node (u1, v1) may be denoted as C_{(u1-1)s+v1}, where C_{(x;(u1,v1))} denotes the x-th code stored in the v1-th node of the u1-th group, and x is an integer greater than 0 and less than or equal to r. The codes take values in a finite field that is an extension of a finite field of size at least n+1.

The check matrix H' of the MDS code can be exemplarily expressed as follows:

The size of the check matrix H' is r² rows by r×n columns. Here I is the r×r identity matrix, and {λ_x}_{x∈[n]} are n mutually different non-zero elements of the finite field. Ψ_{(u1-1)s+v1,p} is an r×r matrix in which exactly one position holds the element ψ_{(u1-1)s+v1,p} and all other positions are 0, where u1 ∈ [r], v1 ∈ [s], p ∈ [r-1]. {ψ_{x,y}}_{x∈[n],y∈[r-1]} are elements of the finite field. For a positive integer e, [e] denotes the set {1, 2, …, e}.

The check matrix H' can be expressed as

It can be seen that the check matrix H' includes r sub-matrices, where the sub-matrix H'_i has r rows and r×n columns, and i is an integer between 0 and r-1.

The check matrix H' may be a check matrix having a different structure from the check matrix H described above in relation to fig. 2. However, the product of the check matrix H' and the column matrix C^T of the MDS code consisting of the n×r codes is still equal to zero, i.e. H'*C^T = 0. It will be appreciated that the column matrix C^T and the column vector C described above in relation to fig. 2 both represent the MDS code; the two notations are used for convenience in describing the respective embodiments.

Based on H'*C^T = 0, it can be obtained that H'_0*C^T = 0 and H'_p*C^T = 0, where p is an integer from 1 to r-1.

Based on H'_0*C^T = 0, the following linear equation can be obtained:

Based on H'_p*C^T = 0, the following linear equation can be obtained:

In the check matrix H', each column corresponds to one code, and every r columns correspond to one node. For ease of understanding, the corresponding check matrix is illustrated by taking n = 6, k = 3, r = 3 as an example, as follows:

Each column in the check matrix corresponds to one code, every 3 columns correspond to one node, and every two nodes form one group. For example, the matrix exemplarily shown above includes 18 columns; counting from left to right, the leftmost column is column 1 and the rightmost column is column 18. Then:

Columns 1, 2 and 3 form a matrix of 9 rows and 3 columns, which is the check submatrix corresponding to node (1, 1); columns 1, 2 and 3 correspond to the 1st, 2nd and 3rd codes stored in node (1, 1), respectively. In Ψ_{1,1} of the check submatrix corresponding to node (1, 1), the element at row 1, column 2 is ψ_{1,1} and the remaining elements are 0. In Ψ_{1,2} of the same check submatrix, the element at row 1, column 3 is ψ_{1,2} and the remaining elements are 0.

Columns 4, 5 and 6 form a matrix of 9 rows and 3 columns, which is the check submatrix corresponding to node (1, 2); columns 4, 5 and 6 correspond to the 1st, 2nd and 3rd codes stored in node (1, 2), respectively. In Ψ_{2,1} of the check submatrix corresponding to node (1, 2), the element at row 1, column 2 is ψ_{2,1} and the remaining elements are 0. In Ψ_{2,2} of the same check submatrix, the element at row 1, column 3 is ψ_{2,2} and the remaining elements are 0.

Columns 7, 8 and 9 form a matrix of 9 rows and 3 columns, which is the check submatrix corresponding to node (2, 1); columns 7, 8 and 9 correspond to the 1st, 2nd and 3rd codes stored in node (2, 1), respectively. In Ψ_{3,1} of the check submatrix corresponding to node (2, 1), the element at row 2, column 3 is ψ_{3,1} and the remaining elements are 0. In Ψ_{3,2} of the same check submatrix, the element at row 2, column 1 is ψ_{3,2} and the remaining elements are 0.

Columns 10, 11 and 12 form a matrix of 9 rows and 3 columns, which is the check submatrix corresponding to node (2, 2); columns 10, 11 and 12 correspond to the 1st, 2nd and 3rd codes stored in node (2, 2), respectively. In Ψ_{4,1} of the check submatrix corresponding to node (2, 2), the element at row 2, column 3 is ψ_{4,1} and the remaining elements are 0. In Ψ_{4,2} of the same check submatrix, the element at row 2, column 1 is ψ_{4,2} and the remaining elements are 0.

Columns 13, 14 and 15 form a matrix of 9 rows and 3 columns, which is the check submatrix corresponding to node (3, 1); columns 13, 14 and 15 correspond to the 1st, 2nd and 3rd codes stored in node (3, 1), respectively. In Ψ_{5,1} of the check submatrix corresponding to node (3, 1), the element at row 3, column 1 is ψ_{5,1} and the remaining elements are 0. In Ψ_{5,2} of the same check submatrix, the element at row 3, column 2 is ψ_{5,2} and the remaining elements are 0.

Columns 16, 17 and 18 form a matrix of 9 rows and 3 columns, which is the check submatrix corresponding to node (3, 2); columns 16, 17 and 18 correspond to the 1st, 2nd and 3rd codes stored in node (3, 2), respectively. In Ψ_{6,1} of the check submatrix corresponding to node (3, 2), the element at row 3, column 1 is ψ_{6,1} and the remaining elements are 0. In Ψ_{6,2} of the same check submatrix, the element at row 3, column 2 is ψ_{6,2} and the remaining elements are 0.

In a specific implementation, when a node among the n storage nodes fails so that the codes stored in that node are unavailable or lost, the r codes in the failed node (also referred to as a damaged node) may be repaired based on the check matrix H'. Assuming that the damaged node is (u1*, v1*), the r codes that need to be repaired are {C_{(1;(u1*,v1*))}, C_{(2;(u1*,v1*))}, …, C_{(r;(u1*,v1*))}}.

Illustratively, to repair the r codes, the repair codes of the failed node may first be downloaded based on the u1*-th row of each of the r matrices H'_0 to H'_{r-1} in the check matrix H'. Specifically, based on the u1*-th row of each of these r matrices, the repair codes to be downloaded can be determined as: all codes stored in all nodes of the u1*-th group other than the failed node, and the u1*-th code stored in each of the nodes outside the u1*-th group among the n storage nodes. A total of 2n-s-r repair codes therefore need to be downloaded.
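The set of repair codes described above can be enumerated with a short Python sketch. The function name `grouped_repair_plan` and the representation of a code as a pair (code index, (group, position)) are assumptions made for illustration; indices start at 1 as in this embodiment.

```python
def grouped_repair_plan(n, r, failed):
    """Return the (code index, (u1, v1)) pairs downloaded to repair node `failed`.

    `failed` is the pair (u1*, v1*); groups, positions and code indices start at 1.
    """
    s = n // r
    u_star = failed[0]
    plan = set()
    for u in range(1, r + 1):
        for v in range(1, s + 1):
            node = (u, v)
            if node == failed:
                continue
            if u == u_star:
                # All r codes of the other nodes in the u1*-th group.
                plan.update((x, node) for x in range(1, r + 1))
            else:
                # Only the u1*-th code of nodes outside the u1*-th group.
                plan.add((u_star, node))
    return plan


grouped_plan = grouped_repair_plan(12, 4, failed=(1, 1))
```

For n = 12 and r = 4 (s = 3) this gives (s-1)r + (n-s) = 8 + 9 = 17 codes, which equals 2n-s-r.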

In addition, since H'_0*C^T = 0, the product of the u1*-th row of H'_0 and C^T is also equal to 0, which gives the following equation:

Similarly, since H'_p*C^T = 0, the product of the u1*-th row of each of H'_1, H'_2, …, H'_{r-1} and C^T is also equal to 0, which gives the following equation:

In formula (4'), p is an integer between 1 and r-1; substituting each integer from 1 to r-1 into formula (4') gives r-1 equations.

That is, one equation is obtained from formula (3') above and r-1 equations are obtained from formula (4') above, for r equations in total. The unknowns in the r equations are the r codes stored in the failed node, so solving the r equations for the r unknowns yields the r codes of the failed node. However, when, in addition to the failed node, one of the n storage nodes (a node in a different group from the failed node) is in a busy state, that is, there is one busy node, this repair method is no longer applicable. Since the codes in the busy node are not available, the u1*-th code of the busy node cannot be downloaded and becomes an additional unknown, so the r equations are no longer sufficient to calculate the unknowns. Therefore, other repair codes need to be acquired to obtain more equations while introducing as few new unknowns as possible, so that the failed node can be repaired with a repair bandwidth that is as small as possible. See the description below for specific implementations.

Assuming that the busy node is (u1', v1') with u1' ≠ u1*, another part of the repair codes of the failed node may be downloaded based on the row matrix h(0, u1^) formed by the u1^-th row of the matrix H'_0 and the row matrix h(3, u1^) formed by the u1^-th row of the matrix H'_3, where the index u1^ satisfies the following relationship:

In a specific implementation, after the row matrices h(0, u1^) and h(3, u1^) are obtained, the two row matrices can be linearly combined to obtain a row matrix E. Exemplarily:

it is understood that a is a constant, and the value thereof can be any real number, such as 1,2, or 3, etc. The following description will take a =1 as an example.

Since the products of the matrix H'_0 and of the matrix H'_3 with C^T are both equal to zero, the products of the row matrices h(0, u1^) and h(3, u1^) with C^T are also equal to zero. Thus, the product of C^T and the matrix E obtained by linearly combining the row matrices h(0, u1^) and h(3, u1^) is also equal to zero, i.e. E*C^T = 0, which after expansion gives the following linear equation:

In formula (5') above, when u1 = u1' and v1 = v1', substitution into formula (5') gives:

It can be seen that formula (5') above does not require any additional code of the busy node (u1', v1'), i.e. no new unknown is introduced in formula (5'). Then the r+1 equations consisting of formula (5') together with formulas (3') and (4') above can be solved for the r codes of the failed node and the u1*-th code of the busy node, thereby realizing the repair of the failed node.

In formula (5') above, part of the symbols have already been downloaded in the non-long-tail state (i.e., when no busy node is present) and do not increase the repair bandwidth. Therefore, for the long tail bandwidth, only the bandwidth added by the remaining term of formula (5') needs to be considered. Over the finite field, the coefficient in this term can be zero for at most two different nodes in addition to the busy node, so the number of additionally downloaded codes resulting from the matrix E is at least γ_c = (r-1)s-3. This number γ_c is the additional bandwidth caused by E. Exemplarily, assume that the coefficient is zero for nodes (u_a, v_a) and (u_b, v_b), where (u_a, v_a) ≠ (u1', v1'), (u_b, v_b) ≠ (u1', v1'), u_a ≠ u1* and u_b ≠ u1*. Then the additionally downloaded codes are the u1^-th codes of all nodes in the groups other than group u1*, except for nodes (u1', v1'), (u_a, v_a) and (u_b, v_b).

In this case, the long tail repair bandwidth is: γ_1 = 2n-s-r-1+(r-1)s-3 = 3n-2s-r-4.
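The simplification in the bandwidth expression above relies on n = r×s, so that (r-1)s = n-s. A small Python sketch of the arithmetic is given below; the function name `long_tail_bandwidth` is hypothetical.

```python
def long_tail_bandwidth(n, r):
    """Long tail repair bandwidth with one busy node, following the expression above."""
    s = n // r
    without_busy = 2 * n - s - r        # repair codes when no node is busy
    still_available = without_busy - 1  # one of them sits in the busy node
    gamma_c = (r - 1) * s - 3           # additional codes caused by the row matrix E
    return still_available + gamma_c


gamma_1 = long_tail_bandwidth(12, 4)
```

For n = 12 and r = 4 this evaluates to 16 + 6 = 22 codes.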

To facilitate understanding of the single-node long-tail repair method, the following description is given by way of example. Also taking n =12,k =8,r =4 as an example, the corresponding check matrix H' is shown in fig. 4B. It is to be understood that, in order to clearly show specific data in the whole check matrix H ', the check matrix H ' is shown by being split into 4 parts, and in particular, the four parts shown in (a) and (B) in fig. 4B can be restored to a complete check matrix H '.

Illustratively, the check matrix H' shown here is merely an example and is not intended to limit the scope of the embodiments of the present application.

It can be seen that the 12 storage nodes are divided into 4 groups, each group including 3 nodes. Assume that the failed node is the 1st node (1, 1) in the 1st group, i.e. u1* = 1, v1* = 1, and that the busy node is the 1st node (2, 1) in the 2nd group, i.e. u1' = 2, v1' = 1. Then, to repair the failed node, the repair codes of the failed node may be determined based on the 1st row of each of the 4 matrices H'_0 to H'_3 in the check matrix H' and on the row matrix E. The 1st rows of the matrices H'_0 to H'_3 may be denoted h_{0,1}, h_{1,1}, h_{2,1} and h_{3,1}, respectively. Then h_{0,1}, h_{1,1}, h_{2,1}, h_{3,1} and E may form a matrix W with 5 rows and 12×4 = 48 columns.

After the matrix W is expanded, the matrix W can be obtained as shown in fig. 4C. It is to be understood that, in order to clearly show specific data in the whole matrix W, the matrix W is shown by being divided into 4 parts, and in particular, the four parts shown in fig. 4C can be restored to a complete matrix W.

Since W*C^T = 0, the codes corresponding to the all-zero columns of the matrix W need not be downloaded. In addition, in the matrix obtained above, the columns corresponding to the 2nd code of node (2, 2) and of node (2, 3) are all-zero columns. Thus, the repair codes that need to be downloaded are: all codes of nodes (1, 2) and (1, 3); the 1st code of each of nodes (2, 2), (2, 3), (3, 1), (3, 2), (3, 3), (4, 1), (4, 2) and (4, 3); and the 2nd code of each of nodes (3, 1), (3, 2), (3, 3), (4, 1), (4, 2) and (4, 3). That is, 22 codes need to be downloaded, and the repair bandwidth is 22.
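The list of downloaded codes in this example can be reproduced with the following Python sketch. It is illustrative only: the function name `long_tail_plan`, the parameter `u_hat` (the index of the additionally downloaded code, 2 in this example) and the parameter `zero_nodes` (nodes whose `u_hat`-th column in W is all zero) are assumptions introduced here, and their values are taken from the example rather than computed from the check matrix.

```python
def long_tail_plan(n, r, failed, busy, u_hat, zero_nodes):
    """Return the (code index, (u1, v1)) pairs downloaded with one busy node.

    Groups, positions and code indices start at 1. `zero_nodes` lists the nodes
    whose u_hat-th code corresponds to an all-zero column and is not downloaded.
    """
    s = n // r
    u_star = failed[0]
    plan = set()
    for u in range(1, r + 1):
        for v in range(1, s + 1):
            node = (u, v)
            if node in (failed, busy):
                continue
            if u == u_star:
                # All codes of the other nodes in the failed node's group.
                plan.update((x, node) for x in range(1, r + 1))
            else:
                # The u1*-th code, plus the u_hat-th code unless its column is zero.
                plan.add((u_star, node))
                if node not in zero_nodes:
                    plan.add((u_hat, node))
    return plan


example_plan = long_tail_plan(12, 4, failed=(1, 1), busy=(2, 1),
                              u_hat=2, zero_nodes={(2, 2), (2, 3)})
```

The resulting plan contains 8 + 8 + 6 = 22 codes, matching the list above and the value of 3n-2s-r-4 for n = 12, r = 4, s = 3.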

Based on W C ^T =0, the equation is expanded and transformed to obtain the following equation:

solving this equation can obtain the 4 encodings in the failed node (1, 1) and the 1 st encoding in the busy node (2, 1) as described above, so that repair of the failed node can be achieved.

In this embodiment, a new row is added, which provides a new check relationship between the codes and thus solves the problem that the check relationships required for repair are insufficient because the information of the busy node is unavailable. In the new row, the elements at the positions corresponding to the second symbol of nodes (2, 1), (2, 2) and (2, 3) are 0, so the check relationship defined by the new row neither introduces a new unknown (the second symbol of busy node (2, 1)) nor requires downloading the second symbol of nodes (2, 2) and (2, 3). Therefore, the proposed long tail repair scheme can achieve the optimal bandwidth.

The above describes the repair method for the case in which one busy node exists. The following describes the repair method for the lost codes in the failed node for the case in which a plurality of busy nodes exist. Assuming that t busy nodes exist, the process of repairing the r codes in the failed node is called long tail repair under t busy nodes, and the corresponding repair method is called the long tail repair method under t busy nodes. Referring to fig. 5, the method includes, but is not limited to, the following steps:

S501, determining (t+2)n-r-(t+1)(t+s) repair codes based on r+t×(t+1) rows of elements in the check matrix H, where the repair codes include data codes and/or check codes.

In a specific implementation, because there are multiple busy nodes, more repair codes cannot be downloaded from the busy nodes, so that more unknown codes exist in the process of solving the r codes of the failed node, and thus more codes need to be downloaded from normal nodes to solve the r codes. The following is an exemplary description.

In an embodiment, assume that the failed node is still the node numbered h*r+i, and assume that there are t busy nodes numbered z_1*r+j_1, z_2*r+j_2, z_3*r+j_3, …, z_t*r+j_t, where 2 ≤ t ≤ r-2, z_1, z_2, z_3, …, z_t are all integers between 0 and s-1, and i, j_1, j_2, j_3, …, j_t are all integers from 0 to r-1. Here i ≠ j_1 ≠ j_2 ≠ j_3 ≠ … ≠ j_t, which indicates that no busy node belongs to the s nodes i, r+i, 2r+i, …, (s-1)r+i. Based on the repair method shown in fig. 3, there are t repair codes that cannot be downloaded from the t busy nodes, and the number of unknown codes to be solved is r+t.

To solve the r + t unknown codes, repair codes may be downloaded from n-t-1 nodes other than the failed node and the busy node among the n storage nodes to repair the lost codes.

Specifically, based on the corresponding description of S301 shown in fig. 3, the i-th code of the MDS code stored in each of the n-t-1 nodes can be determined as a repair code, and the ((i+g) mod r)-th code of the MDS code stored in each of the target nodes can be determined as a repair code, where g is an integer from 1 to r-1. The number of repair codes determined so far is 2n-r-s-t, which is not enough to solve the r+t unknown codes. Based on the structure of the check matrix H, further repair codes can be determined from other rows of elements in the check matrix H.

Specifically, first, t row vectors may be selected from rows 0 to r-1 of the check matrix H. Assume that the selected rows are the e_1-th, e_2-th, …, e_t-th rows, where these rows and rows i, r+i, 2r+i, …, (r-1)r+i are pairwise different. Based on the elements of the e_b-th row, the e_b-th code of each of the n-t-s nodes among the n storage nodes, other than the t busy nodes and the s nodes numbered i, r+i, 2r+i, …, (s-1)r+i, can be determined as a repair code, where b = 1, 2, …, t.

Then, in order to keep the repair bandwidth as small as possible, row vectors containing as few unknown variables as possible can be selected to determine the other repair codes. For example, t×t row vectors may be selected from rows r to r²-1 of the check matrix H, each of which satisfies the following: in the sub-row vector of length r corresponding to a busy node z_b*r+j_b, only the i-th, e_1-th, e_2-th, …, e_t-th elements are non-zero, and all other elements are 0. These t×t rows and rows i, r+i, 2r+i, …, (r-1)r+i are pairwise different. The t×t row vectors and the t row vectors of the e_1-th, e_2-th, …, e_t-th rows, i.e. a total of t×(t+1) row vectors, are used in the subsequent solution of the system of equations; see the description of S503 for details, which are not repeated here.

Based on the above description, (t + 2) n-r- (t + 1) (t + s) repair codes can be determined. In particular, based on e ₁ ,e ₂ ,…,e _t The row may determine t x (n-t-s) repair codes, plus the 2n-r-s-t repair codes described above, for a total of (t + 2) n-r- (t + 1) (t + s) repair codes.
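The total above can be checked by adding the two contributions and comparing with the closed form, as in the following Python sketch. The function name `multi_busy_repair_count` is hypothetical, and n is assumed to be a multiple of r.

```python
def multi_busy_repair_count(n, r, t):
    """Number of repair codes for one failed node and t busy nodes."""
    s = n // r
    from_fig3_rows = 2 * n - r - s - t   # codes determined as in the method of fig. 3
    from_e_rows = t * (n - t - s)        # e_b-th codes, b = 1..t, of n-t-s nodes each
    return from_fig3_rows + from_e_rows


count_12_4_2 = multi_busy_repair_count(12, 4, 2)
```

For n = 12, r = 4 and t = 2 the contributions are 15 and 14, giving 29 = (t+2)n-r-(t+1)(t+s). Setting t = 1 in the same expression gives 3n-r-2s-2, the count stated for a single busy node.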

S502, downloading the repair codes in n-t-1 nodes except the failure node and the t busy nodes in the n storage nodes.

The server determines the (t + 2) n-r- (t + 1) (t + s) repair codes, and can send a code acquisition request to the storage node where the repair code is located, and the corresponding storage node sends the corresponding code to the server based on the request, so that the repair code is downloaded. For example, taking the storage node with the number 0 as an example, the server determines that the 0 th code in the node 0 is a repair code, and then sends a request for acquiring the 0 th code to the node 0, and after receiving the request, the node 0 reads the 0 th code from its own memory and sends the 0 th code to the server.

S503: Calculate the lost codes in the failed node based on the downloaded repair codes.

After obtaining the (t+2)n - r - (t+1)(t+s) repair codes, the server can calculate the lost codes in the failed node by combining the elements of rows i, r+i, 2r+i, …, (r-1)r+i with the t(t+1) row vectors described above.

Specifically, based on the foregoing description, the product of the check matrix H and the column vector C is zero. Multiplying the elements of row i by the column vector C therefore gives 0, which yields one linear equation. Likewise, multiplying the elements of row g × r + i by the column vector C gives 0 for each integer g from 1 to r-1, which yields r-1 further linear equations. That is, r linear equations can be obtained from the elements of rows i, r+i, 2r+i, …, (r-1)r+i. The repair codes determined from the elements of these rows are then substituted into the r linear equations, and solving them jointly gives the following calculation result:

It can be seen that the vector on the right in calculation result (14) is composed of the r + t unknown codes, and the matrix on the left is the coefficient matrix of these unknown codes. To distinguish the coefficient matrix of the lost codes in the failed node from the coefficient matrix of the unknown repair codes in the busy nodes, the two are separated by a vertical line in (14), and the matrix E_r denotes the coefficient matrix of the r lost codes in the failed node; see (11) above for details.

In addition, the t(t+1) row vectors in the check matrix H may be represented as rows:

η_a, v_{a,1} × r + η_a, v_{a,2} × r + η_a, …, v_{a,t} × r + η_a,

where a is an integer from 1 to t, each v_{a,b} is an integer between 1 and r-1, and:

{v_{a,1} × r + η_a mod r, v_{a,2} × r + η_a mod r, …, v_{a,t} × r + η_a mod r} = {i, η₁, η₂, …, η_{a-1}, η_{a+1}, …, η_t}.

Based on the t(t+1) row vectors, t(t+1) linear equations can be obtained. Together with the r linear equations in (14), there are r + t(t+1) linear equations in total, containing r + t(t+1) unknown codes: r of these are the lost codes in the failed node, and the remaining t(t+1) are the unknown codes newly introduced by the t(t+1) row vectors. The t(t+1) unknown codes are:

The r + t(t+1) linear equations form a system of linear equations, which is solved to obtain the r + t(t+1) unknown codes.
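Solving such a system can be illustrated with ordinary Gaussian elimination over F₂. The sketch below is a simplification under stated assumptions: the embodiments work over a binary polynomial ring, whereas this example treats every unknown as a single bit (multiplication by a fixed ring element is linear over F₂, so a ring system could in principle be expanded into this form, but the embodiments do not prescribe doing so). The function name `solve_gf2` and the bitmask encoding of the equations are hypothetical.

```python
def solve_gf2(rows, rhs, num_unknowns):
    """Solve a square or overdetermined linear system over F_2.

    rows[e] is an int whose bit c is the coefficient of unknown c in equation e;
    rhs[e] is the right-hand side bit of equation e.
    Returns the unique solution as a list of bits, or None if the system is
    inconsistent or does not determine every unknown.
    """
    # Store the right-hand side in bit position num_unknowns of each row.
    aug = [row | (bit << num_unknowns) for row, bit in zip(rows, rhs)]
    pivot_row = 0
    for col in range(num_unknowns):
        sel = next((e for e in range(pivot_row, len(aug)) if (aug[e] >> col) & 1), None)
        if sel is None:
            return None  # no pivot for this unknown: not uniquely solvable
        aug[pivot_row], aug[sel] = aug[sel], aug[pivot_row]
        # Row addition over F_2 is XOR; clear this column in all other rows.
        for e in range(len(aug)):
            if e != pivot_row and (aug[e] >> col) & 1:
                aug[e] ^= aug[pivot_row]
        pivot_row += 1
    # Remaining rows have all-zero coefficients; a set bit means "0 = 1".
    if any(aug[e] for e in range(pivot_row, len(aug))):
        return None
    return [(aug[c] >> num_unknowns) & 1 for c in range(num_unknowns)]


# x0 + x1 = 1, x1 + x2 = 0, x0 + x1 + x2 = 1
print(solve_gf2([0b011, 0b110, 0b111], [1, 0, 1], 3))  # [1, 0, 0]
```

Each row operation is a single XOR of two machine words, which mirrors the XOR-only arithmetic that the binary construction relies on.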

In addition, the system of r + t(t+1) linear equations can be converted into a determinant of order r + t(t+1), which is a polynomial in the binary polynomial ring F₂[x], and it can be verified that the maximum degree of this polynomial is less than

which satisfies the condition for the system of equations to be solvable.

With the long-tail repair method for t busy nodes described above, the long-tail repair bandwidth of the binary (n, k) MDS code with t busy nodes is (t+2)n - r - (t+1)(t+s) codes. The long-tail repair bandwidth ratio with t busy nodes is therefore ((t+2)n - r - (t+1)(t+s))/(kr). For the same parameters and the same sub-packetization level, the long-tail repair algorithm provided in the embodiments of the present application has the smallest long-tail repair bandwidth among existing repair algorithms, and its computational complexity is lower than that of existing algorithms.
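As a worked illustration of the ratio, the following Python sketch evaluates the expression for a few values of t. The function name `long_tail_bandwidth_ratio` and the example parameters (n = 12, k = 8) are assumptions chosen for this illustration; they are not taken from the embodiments.

```python
from fractions import Fraction


def long_tail_bandwidth_ratio(n: int, k: int, t: int) -> Fraction:
    """Repair bandwidth with t busy nodes divided by k * r.

    Hypothetical helper: r = n - k, s = n / r (r is assumed to divide n).
    """
    r = n - k
    if r <= 0 or n % r != 0:
        raise ValueError("this sketch assumes r > 0 and that r divides n")
    s = n // r
    bandwidth = (t + 2) * n - r - (t + 1) * (t + s)  # codes downloaded
    return Fraction(bandwidth, k * r)


for t in range(3):
    print(t, long_tail_bandwidth_ratio(12, 8, t))
# 0 17/32
# 1 3/4
# 2 29/32
```

With t = 0 the numerator reduces to 2n - r - s, the number of repair codes used when no node is busy; each additional busy node raises the ratio in this example, but it stays below 1.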

The above mainly describes the methods for constructing and repairing MDS codes provided in the embodiments of the present application. It is understood that, to implement the corresponding functions, each device comprises corresponding hardware structures and/or software modules for executing each function. The elements and steps of the various examples described in connection with the embodiments disclosed herein may be implemented as hardware or as a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

In the embodiment of the present application, the device may be divided into the functional modules according to the method example, for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. It should be noted that the division of the modules in the embodiment of the present application is schematic, and is only a logic function division, and there may be another division manner in actual implementation.

In the case of dividing each functional module according to each function, fig. 6 shows a schematic diagram of a possible logical structure of an apparatus, which may be the server, a chip in the server, a processing system in the server, or the like. The apparatus 600 comprises an obtaining unit 601, a constructing unit 602 and a calculating unit 603. Wherein:

an obtaining unit 601, configured to obtain a parameter n, a parameter k, and a data code to be stored; the parameter n and the parameter k indicate that the data codes are stored in k nodes of n storage nodes and indicate that check codes of the data codes are stored in r nodes of the n storage nodes; r = n-k, n is an integer greater than 1, and k and r are both integers greater than 0;

a constructing unit 602, configured to construct a check matrix H of the MDS code based on the parameter n and the parameter k; the check matrix H is a matrix of r² rows and (r² × s) columns, where s = n/r; the check matrix H comprises a check matrix of each of the n storage nodes, and the check matrix of each node is a matrix of r² rows and r columns; the r² × r² matrix formed by the check matrices of any r nodes in the check matrix H is invertible over the binary polynomial ring F₂[x] mod (1 + x + … + x^(p-1));

a calculating unit 603, configured to calculate the check code based on the data code and the check matrix H; the check code and the data code obtained by calculation constitute the MDS code.

In a possible embodiment, the check matrix H is composed of s sub-check matrices S, and each sub-check matrix S is a matrix of r² rows and r² columns; each sub-check matrix S is composed of r² sub-check matrices R, and each sub-check matrix R is a matrix of r rows and r columns; the check matrix of each node consists of r sub-check matrices R;

In one possible embodiment, the matrix R_u comprises the r sub-check matrices R in the u-th column of the matrix array, where u is an integer greater than or equal to 0 and less than r;

in each of the r sub-check matrices R included in the matrix R_u, the positions of the non-zero elements other than those on the diagonal are determined based on the row number and the column number of that matrix in the matrix array and on a cyclic shift rule.

In a possible embodiment, the parameter n and the parameter k further indicate that r data codes are stored in each of the k nodes and r check codes are stored in each of the r nodes; the k × r data codes and the r² check codes form a column vector C of n × r rows, and the product of the check matrix H and the column vector C is zero; the calculating unit 603 is specifically configured to:

In a possible implementation, the aforementioned calculating unit 603 is specifically configured to:

r check codes are calculated based on each of the sub-coefficient matrices.

For specific operations and beneficial effects of each unit in the apparatus 600 shown in fig. 6, reference may be made to the corresponding description in fig. 2 and possible method embodiments thereof, which are not described herein again.

In the case of dividing each functional module according to each function, fig. 7 shows a schematic diagram of a possible logical structure of an apparatus, which may be the above-mentioned server, or may be a chip in the server, or may be a processing system in the server, and the like. The apparatus 700 comprises a determination unit 701, a download unit 702 and a calculation unit 703. Wherein:

a determining unit 701, configured to determine 2n-r-s repair codes based on r rows of elements in the check matrix H, where the repair codes include data codes and/or check codes;

a downloading unit 702, configured to download the repair codes from the n-1 nodes of the n storage nodes that have not failed;

a calculating unit 703, configured to calculate the lost codes in the failed node based on the downloaded repair codes.

the determining unit 701 is specifically configured to:

In a possible implementation manner, the aforementioned calculating unit 703 is specifically configured to:

For specific operations and benefits of each unit in the apparatus 700 shown in fig. 7, reference may be made to the corresponding description in fig. 3 and possible method embodiments thereof, which are not described herein again.

In the case of dividing each functional module according to each function, fig. 8 shows a schematic diagram of a possible logical structure of an apparatus, which may be the above-mentioned server, or may be a chip in the server, or may be a processing system in the server, and the like. The apparatus 800 includes a determination unit 801, a download unit 802, and a calculation unit 803. Wherein:

a determining unit 801, configured to determine 3n-2s-r-2 repair codes based on r + 2 rows of elements in the check matrix H, where the repair codes include data codes and/or check codes;

a downloading unit 802, configured to download the repair codes from the n-2 nodes of the n storage nodes other than the failed node and the busy node;

a calculating unit 803, configured to calculate the lost codes in the failed node based on the downloaded repair codes.

In one possible embodiment, the n storage nodes are numbered with the integers from 0 to n-1, the failed node is numbered h × r + i, and the busy node is numbered z × r + j, where h and z are both integers greater than or equal to 0 and less than s, i and j are both integers greater than or equal to 0 and less than r, and i ≠ j;

the aforementioned determination unit 801 is specifically configured to:

In one possible embodiment, r data codes are stored in each of the k nodes, and r check codes are stored in each of the r nodes; the k × r data codes and the r² check codes form a column vector C of (r² × s) rows, and the product of the check matrix H and the column vector C is zero;

the aforementioned calculation unit 803 is specifically configured to:

For specific operations and benefits of each unit in the apparatus 800 shown in fig. 8, reference may be made to the corresponding description in fig. 4 and its possible method embodiments, which are not described herein again.

In the case of dividing each functional module according to each function, fig. 9 shows a schematic diagram of a possible logical structure of an apparatus, which may be the above-mentioned server, or may be a chip in the server, or may be a processing system in the server, and the like. The apparatus 900 comprises a determining unit 901, a downloading unit 902 and a calculating unit 903. Wherein:

a determining unit 901, configured to determine (t+2)n - r - (t+1)(t+s) repair codes based on r + t(t+1) rows of elements in the check matrix H, where the repair codes include data codes and/or check codes;

a downloading unit 902, configured to download the repair codes from the n-t-1 nodes of the n storage nodes other than the failed node and the t busy nodes;

a calculating unit 903, configured to calculate the lost codes in the failed node based on the downloaded repair codes.

For specific operations and benefits of each unit in the apparatus 900 shown in fig. 9, reference may be made to the corresponding description in fig. 5 and its possible method embodiment, which is not described herein again.

Fig. 10 is a schematic diagram illustrating a possible hardware structure of the apparatus provided in the present application, where the apparatus may be the server, a chip in the server, a processing system in the server, or the like. The apparatus 1000 comprises: a processor 1001, a memory 1002, and a communication interface 1003. The processor 1001, the communication interface 1003, and the memory 1002 may be connected to one another directly or through a bus 1004.

Illustratively, the memory 1002 is used for storing computer programs and data of the apparatus 1000, and the memory 1002 may include, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM), etc.

The communication interface 1003 includes a transmitting interface and a receiving interface, and the number of the communication interfaces 1003 may be multiple, and is used for supporting the apparatus 1000 to perform communication, such as receiving or transmitting data or messages.

The processor 1001 may illustratively be a central processing unit, a general-purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, transistor logic, a hardware component, or any combination thereof. A processor may also be a combination of computing functions, e.g., a combination of one or more microprocessors, a digital signal processor and a microprocessor, or the like. The processor 1001 may be configured to read the program stored in the memory 1002, so that the apparatus 1000 performs the MDS code construction method described above in fig. 2 and its possible embodiments.

In one possible implementation, the processor 1001 may be configured to read the program stored in the memory 1002, and perform the following operations: acquiring a parameter n, a parameter k and a data code to be stored; the parameter n and the parameter k indicate that the data codes are stored in k nodes of n storage nodes and indicate that check codes of the data codes are stored in r nodes of the n storage nodes; r = n-k, n is an integer greater than 1, and k and r are both integers greater than 0; constructing a check matrix H of the MDS code based on the parameter n and the parameter k; the check matrix H is a matrix of r² rows and (r² × s) columns, where s = n/r; the check matrix H comprises a check matrix of each of the n storage nodes, and the check matrix of each node is a matrix of r² rows and r columns; the r² × r² matrix formed by the check matrices of any r nodes in the check matrix H is invertible over the binary polynomial ring F₂[x] mod (1 + x + … + x^(p-1)); calculating the check code based on the data code and the check matrix H; the check code and the data code obtained by calculation constitute the MDS code.

For specific operations and beneficial effects of each unit in the apparatus 1000 shown in fig. 10, reference may be made to the corresponding description in fig. 2 and possible method embodiments thereof, which are not described herein again.

Fig. 11 is a schematic diagram illustrating a possible hardware structure of the apparatus provided in the present application, where the apparatus may be the server, a chip in the server, a processing system in the server, or the like. The apparatus 1100 comprises: a processor 1101, a memory 1102, and a communication interface 1103. The processor 1101, the communication interface 1103, and the memory 1102 may be connected to one another directly or through a bus 1104.

Illustratively, the memory 1102 is used for storing computer programs and data of the apparatus 1100, and the memory 1102 may include, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM), etc.

The communication interface 1103 includes a sending interface and a receiving interface, and the number of the communication interfaces 1103 may be plural, so as to support the apparatus 1100 to perform communication, such as receiving or sending data or messages.

The processor 1101 may be, for example, a central processing unit, a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, transistor logic, a hardware component, or any combination thereof. A processor may also be a combination of computing functions, e.g., a combination of one or more microprocessors, a digital signal processor and a microprocessor, or the like. The processor 1101 may be configured to read the program stored in the memory 1102, so that the apparatus 1100 performs the MDS code repair method as described in the above fig. 3 and its possible embodiments.

In one possible implementation, the processor 1101 may be configured to read the program stored in the memory 1102, and perform the following operations: determining 2n-r-s repair codes based on r rows of elements in the check matrix H, wherein the repair codes comprise data codes and/or check codes; downloading the repair codes from the n-1 nodes of the n storage nodes that have not failed; calculating the lost codes in the failed node based on the downloaded repair codes.

For specific operations and benefits of each unit in the apparatus 1100 shown in fig. 11, reference may be made to the corresponding description in fig. 3 and possible method embodiments thereof, which are not described herein again.

Fig. 12 is a schematic diagram illustrating a possible hardware structure of the apparatus provided in the present application, where the apparatus may be the server, a chip in the server, a processing system in the server, or the like. The apparatus 1200 includes: a processor 1201, a memory 1202, and a communication interface 1203. The processor 1201, the communication interface 1203, and the memory 1202 may be connected to one another directly or through a bus 1204.

Illustratively, the memory 1202 is used for storing computer programs and data of the apparatus 1200, and the memory 1202 may include, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM), etc.

The communication interfaces 1203 include a transmitting interface and a receiving interface, and the number of the communication interfaces 1203 may be multiple, and the communication interfaces are used for supporting the apparatus 1200 to perform communication, such as receiving or transmitting data or messages.

Illustratively, the processor 1201 may be a central processing unit, a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, transistor logic, a hardware component, or any combination thereof. A processor may also be a combination of computing functions, e.g., a combination of one or more microprocessors, a digital signal processor and a microprocessor, or the like. The processor 1201 may be configured to read the program stored in the memory 1202, so that the apparatus 1200 performs the MDS code repairing method as described in fig. 4 and its possible embodiments.

In a possible implementation, the processor 1201 may be configured to read a program stored in the memory 1202, and perform the following operations: determining 3n-2s-r-2 repair codes based on r +2 row elements in the check matrix H, wherein the repair codes comprise data codes and/or check codes; downloading the repair code in n-2 nodes of the n storage nodes except the failed node and the busy node; calculating the lost codes in the failed nodes based on the downloaded repair codes.

For specific operations and advantages of each unit in the apparatus 1200 shown in fig. 12, reference may be made to the corresponding description in fig. 4 and possible method embodiments thereof, which are not described herein again.

Fig. 13 is a schematic diagram illustrating a possible hardware structure of the apparatus provided in the present application, where the apparatus may be the server, a chip in the server, a processing system in the server, or the like. The apparatus 1300 includes: a processor 1301, a memory 1302, and a communication interface 1303. The processor 1301, the communication interface 1303, and the memory 1302 may be connected to one another directly or through a bus 1304.

Illustratively, the memory 1302 is used for storing computer programs and data of the apparatus 1300, and the memory 1302 may include, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM), etc.

The communication interfaces 1303 include a transmitting interface and a receiving interface, and the number of the communication interfaces 1303 may be multiple, and the communication interfaces 1303 are used for supporting the apparatus 1300 to perform communication, such as receiving or transmitting data or messages.

The processor 1301 may illustratively be a central processing unit, a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, transistor logic, a hardware component, or any combination thereof. A processor may also be a combination of computing functions, e.g., a combination of one or more microprocessors, a digital signal processor and a microprocessor, or the like. The processor 1301 may be configured to read the program stored in the memory 1302, so that the apparatus 1300 performs the MDS code repairing method as described in fig. 5 and its possible embodiments.

In one possible implementation, the processor 1301 may be configured to read the program stored in the memory 1302, and perform the following operations: determining (t + 2) n-r- (t + 1) (t + s) repair codes based on r + t (t + 1) row elements in the check matrix H, wherein the repair codes comprise data codes and/or check codes; downloading the repair code from n-t-1 nodes except the failed node and the t busy nodes in the n storage nodes; calculating the lost codes in the failed nodes based on the downloaded repair codes.

For specific operations and benefits of each unit in the apparatus 1300 shown in fig. 13, reference may be made to the corresponding description in fig. 5 and possible method embodiments thereof, which are not described herein again.

An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored, and the computer program is executed by a processor to implement the MDS code constructing method described in any one of the above fig. 2 and its possible method embodiments.

An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored, and the computer program is executed by a processor to implement the MDS code repairing method described in any one of the above fig. 3 and its possible method embodiments.

An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored, and the computer program is executed by a processor to implement the MDS code repairing method described in any one of the above fig. 4 and its possible method embodiments.

An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored, where the computer program is executed by a processor to implement the MDS code repairing method described in any one of the above-described fig. 5 and its possible method embodiments.

Embodiments of the present application further provide a computer program product, which when read and executed by a computer, is used for implementing the method for constructing MDS codes described in any embodiment of fig. 2 and its possible method embodiments.

An embodiment of the present application further provides a computer program product, which, when read and executed by a computer, performs the method for repairing MDS codes described in any one of the embodiments of fig. 3 and its possible method embodiments.

An embodiment of the present application further provides a computer program product, which when read and executed by a computer, is configured to perform the method for repairing MDS codes according to any one of the embodiments of fig. 4 and its possible method embodiments.

Embodiments of the present application further provide a computer program product, which when read and executed by a computer, performs the method for repairing MDS codes described in any one of the embodiments of fig. 5 and its possible embodiments.

In summary, the construction method for binary polynomial MDS codes proposed in the present application has lower computational complexity than conventional MDS code construction methods. Conventional MDS codes are constructed over a sufficiently large finite field, which leads to high computational complexity; the binary polynomial MDS code proposed in the embodiments of the present application is constructed over a binary polynomial ring with a cyclic structure, so encoding and decoding involve only exclusive-or (XOR) operations and cyclic shift operations. In addition, the construction method provided in the embodiments of the present application is explicit: the check matrix used in the construction is given and fixed, and the MDS property is satisfied simply by making the parameter p sufficiently large. Existing MDS code construction methods (such as ε-MSR) are not explicit; the MDS property has to be verified by software search over a sufficiently large finite field, which adds further computation. In particular, when the parameters n and k are both large, the MDS property of ε-MSR is difficult to verify even by software search, so it cannot be guaranteed that the constructed code is an MDS code or that the codes of a failed node can be repaired correctly. Furthermore, owing to the structure of the constructed check matrix H, the lost codes in a failed node can be repaired with a smaller repair bandwidth.
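The XOR and cyclic shift operations mentioned above can be illustrated as follows. This is a minimal sketch under stated assumptions, not the encoder of the embodiments: polynomials are stored as Python integers (bit i is the coefficient of x^i), arithmetic is first carried out modulo 1 + x^p, where multiplication by x^j is a cyclic shift of p bits, and the result is then reduced modulo 1 + x + … + x^(p-1), which is valid because 1 + x^p = (1 + x)(1 + x + … + x^(p-1)) over F₂. The helper names and the choice p = 5 are hypothetical.

```python
def cyclic_shift(bits: int, j: int, p: int) -> int:
    """Multiply a p-bit polynomial by x^j modulo 1 + x^p (a cyclic shift)."""
    j %= p
    mask = (1 << p) - 1
    return ((bits << j) | (bits >> (p - j))) & mask


def reduce_mod_mp(bits: int, p: int) -> int:
    """Reduce a p-bit polynomial modulo M_p(x) = 1 + x + ... + x^(p-1)."""
    if (bits >> (p - 1)) & 1:
        bits ^= (1 << p) - 1  # subtract (XOR) M_p(x) to clear the top bit
    return bits


def ring_add(a: int, b: int) -> int:
    """Addition of two ring elements is a bitwise XOR."""
    return a ^ b


def ring_mul(a: int, b: int, p: int) -> int:
    """Multiply two ring elements using only cyclic shifts and XORs."""
    acc = 0
    for j in range(p):
        if (b >> j) & 1:
            acc ^= cyclic_shift(a, j, p)
    return reduce_mod_mp(acc, p)


p = 5
x = 0b00010
x4 = 0b10000
print(ring_mul(x, x4, p))  # 1, because x^5 = 1 in this ring
```

No finite-field multiplication tables are needed in this sketch: every product is assembled from shifts of one operand combined by XOR.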

The terms "first," "second," and the like in this application are used for distinguishing between similar items and items that have substantially the same function or similar functionality, and it should be understood that "first," "second," and "nth" do not have any logical or temporal dependency or limitation on the number or order of execution. It will be further understood that, although the following description uses the terms first, second, etc. to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first image may be referred to as a second image, and similarly, a second image may be referred to as a first image, without departing from the scope of the various described examples. The first image and the second image may both be images, and in some cases, may be separate and distinct images.

It should also be understood that, in the embodiments of the present application, the size of the serial number of each process does not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.

It will be further understood that the terms "comprises," "comprising," "includes," and/or "including," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

It should also be appreciated that reference throughout this specification to "one embodiment," "an embodiment," "one possible implementation" means that a particular feature, structure, or characteristic described in connection with the embodiment or implementation is included in at least one embodiment of the present application. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" or "one possible implementation" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims

1. A method of constructing a maximum distance separable MDS code, the method comprising:

acquiring a parameter n, a parameter k and a data code to be stored; the parameter n and the parameter k indicate that the data code is stored at k nodes of n storage nodes and indicate that the check code of the data code is stored at r nodes of the n storage nodes; r = n-k, n is an integer greater than 1, and k and r are both integers greater than 0;

constructing a check matrix H of the MDS code based on the parameter n and the parameter k; the check matrix H is a matrix of r² rows and (r² × s) columns, where s = n/r; the check matrix H comprises a check matrix of each of the n storage nodes, the check matrix of each node is a matrix of r² rows and r columns, and the r² × r² matrix formed by the check matrices of any r nodes in the check matrix H is invertible over the binary polynomial ring F₂[x] mod (1 + x + … + x^(p-1));

calculating the check code based on the data code and the check matrix H; and the check code and the data code which are obtained through calculation form the MDS code.

2. The method of claim 1, wherein the check matrix H comprises S sub-check matrices S, and each sub-check matrix S is r ² Line r ² A matrix of columns; each of the sub-check matrices Sby r ² The device comprises sub-check matrixes R, a matrix control unit and a control unit, wherein each sub-check matrix R is a matrix with R rows and R columns; the check matrix of each node consists of R sub check matrixes R;

said r ² The sub-check matrixes R form a matrix array of R rows and R columns; the diagonal elements in each sub-check matrix R are nonzero; the R sub-check matrixes R in the 0 th row in the matrix array are all unit matrixes; the sub-check matrix R from the 1 st row to the R-1 st row in the matrix array has a non-zero element besides the diagonal elements, and other elements are all zero.

3. The method of claim 2, wherein a matrix R_u comprises the r sub-check matrices R in the u-th column of the matrix array, and u is an integer greater than or equal to 0 and less than r;

in each of the r sub-check matrices R included in the matrix R_u, the position of the non-zero element outside the diagonal is determined based on the row number and column number of that matrix in the matrix array and a cyclic shift rule.

4. The method of any of claims 1-3, wherein the parameter n and the parameter k further indicate that r data codes are stored at each of the k nodes and r check codes are stored at each of the r nodes; the k × r data codes and the r² check codes form a column vector C of n × r rows, and the product of the check matrix H and the column vector C is zero;

substituting the k × r data codes into the equation stating that the product of the check matrix H and the column vector C is zero;

converting the equation into a system of linear equations consisting of r² equations;

calculating the r² check codes based on the system of linear equations.
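The substitute-and-solve step of claim 4 can be sketched on a deliberately simplified code. In the example below, a generic Vandermonde check matrix over the prime field GF(7) stands in for the claimed matrix H and for the binary polynomial ring, and each node holds one symbol rather than r codes; these substitutions, the function names and the sample data are assumptions chosen to keep the example short, so it shows only how the unknown check symbols follow from H · C = 0.

```python
def vandermonde_check_matrix(n: int, r: int, q: int):
    """r x n matrix with entry (i, j) = (j + 1)^i mod q; requires n <= q - 1."""
    return [[pow(j + 1, i, q) for j in range(n)] for i in range(r)]


def solve_mod(A, b, q: int):
    """Solve A x = b over GF(q) by Gauss-Jordan elimination (A square, invertible)."""
    m = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(m):
        # Raises StopIteration if A is singular; the sketch assumes it is not.
        pivot = next(i for i in range(col, m) if M[i][col] % q)
        M[col], M[pivot] = M[pivot], M[col]
        inv = pow(M[col][col], -1, q)  # modular inverse (Python 3.8+)
        M[col] = [(v * inv) % q for v in M[col]]
        for i in range(m):
            if i != col and M[i][col] % q:
                f = M[i][col]
                M[i] = [(vi - f * vc) % q for vi, vc in zip(M[i], M[col])]
    return [M[i][m] for i in range(m)]


def encode(data, H, q: int):
    """Append check symbols so that H times the codeword is zero mod q."""
    k, n = len(data), len(H[0])
    r = n - k
    # Move the known data terms of H * C = 0 to the right-hand side.
    A = [[H[i][k + j] for j in range(r)] for i in range(r)]
    b = [(-sum(H[i][j] * data[j] for j in range(k))) % q for i in range(r)]
    return list(data) + solve_mod(A, b, q)


H = vandermonde_check_matrix(n=5, r=2, q=7)
codeword = encode([3, 1, 4], H, 7)  # three data symbols, two check symbols
```

Every row of `H` multiplied by `codeword` is zero modulo 7, which is the condition the check codes are solved from.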

5. The method of claim 4, wherein the calculating the r² check codes based on the system of linear equations comprises:

calculating, based on the system of linear equations, a coefficient matrix composed of the coefficients of the r² check codes;

and calculating r check codes based on each sub coefficient matrix.

6. A method of repairing a maximum distance separable MDS code constructed by the method of any one of claims 1 to 5; the n storage nodes comprise a failure node;

the method comprises the following steps:

downloading the repair codes from the n-1 non-failed nodes of the n storage nodes;

calculating missing codes in the failed node based on the downloaded repair codes.

7. The method of claim 6, wherein the n storage nodes are numbered with integers from 0 to n-1, the failed node is numbered h × r + i, h is an integer greater than or equal to 0 and less than s, and i is an integer greater than or equal to 0 and less than r; the r row elements comprise rows i, r + i, 2r + i, …, (r-1)r + i of the check matrix H;

the determining 2n-r-s repair codes based on the r row elements in the check matrix H comprises:

determining, based on the i-th row, that the i-th code in the MDS code stored by each of the n-1 nodes is a repair code;

determining, based on the (g × r + i)-th row, that the ((i + g) mod r)-th code in the MDS code stored by each of the target nodes is a repair code, wherein the target nodes comprise the nodes numbered i, r + i, 2r + i, …, (s-1)r + i other than the node numbered h × r + i, and g is an integer from 1 to r-1.
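The selection rule of claim 7 can be enumerated directly. The sketch below lists the (node, code index) pairs that would be downloaded for a single failed node and allows the total to be compared with 2n-r-s; the function name and the set-of-pairs representation are assumptions made for illustration.

```python
def single_failure_repair_codes(n: int, r: int, failed: int):
    """Return the set of (node, code index) pairs selected per claim 7."""
    s = n // r
    h, i = divmod(failed, r)  # failed node number is h*r + i
    selected = set()
    # Row i: the i-th code of each of the n-1 surviving nodes.
    for node in range(n):
        if node != failed:
            selected.add((node, i))
    # Rows g*r + i, g = 1..r-1: the ((i+g) mod r)-th code of each target node,
    # i.e. of the nodes numbered i, r+i, ..., (s-1)r+i other than the failed one.
    targets = [m * r + i for m in range(s) if m != h]
    for g in range(1, r):
        for node in targets:
            selected.add((node, (i + g) % r))
    return selected
```

For n = 12 and r = 4 (so k = 8 and s = 3), the function selects 2n-r-s = 17 codes for any failed node, compared with the k × r = 32 codes implied by the conventional repair described in the Background.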

8. The method of claim 7, wherein the calculating missing codes in a failed node based on the downloaded repair codes comprises:

calculating the i-th code in the failed node based on the i-th code in the MDS code stored by each of the n-1 nodes;

calculating the ((i + g) mod r)-th code in the failed node based on the ((i + g) mod r)-th code in the MDS code stored by each of the target nodes, the calculated i-th code in the failed node, and the i-th codes of those of the n-1 nodes other than the nodes numbered i, r + i, 2r + i, …, (s-1)r + i.

9. A method of repairing a maximum distance separable MDS code constructed by the method of any one of claims 1 to 5; the n storage nodes comprise a failure node and a busy node;

the method comprises the following steps:

10. The method of claim 9, wherein the n storage nodes are numbered with integers from 0 to n-1, the failed node is numbered h × r + i, the busy node is numbered z × r + j, h and z are both integers greater than or equal to 0 and less than s, i and j are both integers greater than or equal to 0 and less than r, and i ≠ j;

the r + 2 row elements comprise rows i, r + i, 2r + i, …, (r-1)r + i, q and u × r + q of the check matrix H, where q ≠ i, (u + q) mod r = i mod r, and q and u are integers from 0 to r-1;

the determining 3n-2s-r-2 repair codes based on the r + 2 row elements in the check matrix H comprises:

determining, based on the i-th row, that the i-th code in the MDS code stored by each of the n-2 nodes is a repair code;

determining, based on the (g × r + i)-th row, that the ((i + g) mod r)-th code in the MDS code stored by each of the first target nodes is a repair code, wherein the first target nodes are the nodes numbered i, r + i, 2r + i, …, (s-1)r + i among the n storage nodes other than the node numbered h × r + i, and g is an integer from 1 to r-1;

determining, based on the q-th row and the (u × r + q)-th row, that the q-th code in the MDS code stored by each of the second target nodes is a repair code, wherein the second target nodes are the nodes other than the nodes numbered i, r + i, 2r + i, …, (s-1)r + i and the node numbered z × r + j.
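The three selection steps of claim 10 can be enumerated in the same way as for a single failure. In the sketch below the function name, the set-of-pairs representation and the choice of passing q as an argument are assumptions; u is not needed for the selection itself because it only identifies which additional row of H is used.

```python
def one_busy_repair_codes(n: int, r: int, failed: int, busy: int, q: int):
    """Return the set of (node, code index) pairs selected per claim 10."""
    s = n // r
    h, i = divmod(failed, r)  # failed node number is h*r + i
    z, j = divmod(busy, r)    # busy node number is z*r + j
    if i == j:
        raise ValueError("claim 10 requires i != j")
    if not 0 <= q < r or q == i:
        raise ValueError("q must be in 0..r-1 and differ from i")
    same_class = {m * r + i for m in range(s)}  # nodes i, r+i, ..., (s-1)r+i
    # Row i: the i-th code of each of the n-2 nodes that are neither failed nor busy.
    selected = {(node, i) for node in range(n) if node not in (failed, busy)}
    # Rows g*r + i: the ((i+g) mod r)-th code of each first target node.
    for g in range(1, r):
        for node in same_class - {failed}:
            selected.add((node, (i + g) % r))
    # Rows q and u*r + q: the q-th code of each second target node.
    for node in range(n):
        if node not in same_class and node != busy:
            selected.add((node, q))
    return selected
```

For n = 12, r = 4, failed node 5, busy node 6 and q = 0, the selection contains 3n-2s-r-2 = 24 codes, none of which is read from the busy node.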

11. The method of claim 10, wherein r data codes are stored at each of the k nodes, and r check codes are stored at each of the r nodes; the k × r data codes and the r² check codes form a column vector C of (r² × s) rows, and the product of the check matrix H and the column vector C is zero;

said calculating missing codes in failed nodes based on said downloaded repair codes, comprising:

converting the equation into a system of linear equations consisting of r + 2 equations;

calculating missing codes in the failed node based on the system of linear equations.

12. A method of repairing a maximum distance separable MDS code constructed by the method of any one of claims 1 to 5; the n storage nodes comprise a failure node and t busy nodes, wherein t is an integer larger than 1 and smaller than r-1;

the method comprises the following steps:

determining (t + 2)n - r - (t + 1)(t + s) repair codes based on r + t(t + 1) row elements in the check matrix H, the repair codes comprising data codes and/or check codes;

downloading the repair codes from the n-t-1 nodes of the n storage nodes other than the failed node and the t busy nodes;
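The counts stated in claim 12 can be written as two small functions, which also makes it easy to see that they reduce to the counts of claims 7 and 10. The function names are assumptions; evaluating the formulas at t = 0 and t = 1 is done here only as a consistency check, since claim 12 itself restricts t to integers greater than 1 and less than r-1.

```python
def repair_code_count(n: int, r: int, t: int) -> int:
    """Number of repair codes for one failed node and t busy nodes (claim 12 formula)."""
    s = n // r
    return (t + 2) * n - r - (t + 1) * (t + s)


def repair_row_count(r: int, t: int) -> int:
    """Number of rows of the check matrix H used (claim 12 formula)."""
    return r + t * (t + 1)
```

With n = 12 and r = 4, the formulas give 17 codes from 4 rows for t = 0, 24 codes from 6 rows for t = 1, and 29 codes from 10 rows for t = 2.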

13. An apparatus for constructing maximum distance separable MDS codes, the apparatus comprising:

an acquisition unit, configured to acquire a parameter n, a parameter k and a data code to be stored; the parameter n and the parameter k indicate that the data code is stored at k nodes of n storage nodes and indicate that the check code of the data code is stored at r nodes of the n storage nodes; r = n-k, n is an integer greater than 1, and k and r are both integers greater than 0;

a construction unit, configured to construct a check matrix H of the MDS code based on the parameter n and the parameter k; the check matrix H is a matrix of r² rows and (r² × s) columns, where s = n/r; the check matrix H comprises a check matrix of each node of the n storage nodes, the check matrix of each node is a matrix of r² rows and r columns, and the r² × r² matrix formed by the check matrices of any r nodes in the check matrix H is invertible over the binary polynomial ring F₂[x] mod (1 + x + … + x^(p-1));

a calculation unit, configured to calculate the check code based on the data code and the check matrix H; and the check code and the data code which are obtained through calculation form the MDS code.

14. The apparatus of claim 13, wherein the check matrix H is composed of s sub-check matrices S, and each sub-check matrix S is a matrix of r² rows and r² columns; each sub-check matrix S is composed of r² sub-check matrices R, and each sub-check matrix R is a matrix of r rows and r columns; the check matrix of each node consists of r sub-check matrices R;

15. The apparatus of claim 14, wherein a matrix R_u comprises the r sub-check matrices R in the u-th column of the matrix array, and u is an integer greater than or equal to 0 and less than r;

in each of the r sub-check matrices R included in the matrix R_u, the position of the non-zero element outside the diagonal is determined based on the row number and column number of that matrix in the matrix array and a cyclic shift rule.

16. The apparatus of any of claims 13-15, wherein the parameter n and the parameter k further indicate that r data codes are stored at each of the k nodes and r check codes are stored at each of the r nodes; the k × r data codes and the r² check codes form a column vector C of n × r rows, and the product of the check matrix H and the column vector C is zero; the computing unit is specifically configured to:

convert the equation into a system of linear equations consisting of r² equations;

17. The apparatus according to claim 16, wherein the computing unit is specifically configured to:

and calculating r check codes based on each sub coefficient matrix.

18. A device for repairing a maximum distance separable MDS code constructed by the device of any one of claims 13 to 17; the n storage nodes comprise a failed node;

the repair device includes:

19. The apparatus of claim 18, wherein the n storage nodes are numbered with integers from 0 to n-1, the failed node is numbered h × r + i, h is an integer greater than or equal to 0 and less than s, and i is an integer greater than or equal to 0 and less than r; the r row elements comprise rows i, r + i, 2r + i, …, (r-1)r + i of the check matrix H;

the determining unit is specifically configured to:

20. The apparatus according to claim 19, wherein the computing unit is specifically configured to:

21. A device for repairing a maximum distance separable MDS code constructed by the device of any one of claims 13 to 17; the n storage nodes comprise a failed node and a busy node;

the repair device includes:

22. The apparatus of claim 21, wherein the n storage nodes are numbered with integers from 0 to n-1, the failed node is numbered h × r + i, the busy node is numbered z × r + j, h and z are both integers greater than or equal to 0 and less than s, i and j are both integers greater than or equal to 0 and less than r, and i ≠ j;

the determining unit is specifically configured to:

23. The apparatus of claim 22, wherein r data codes are stored at each of the k nodes, and r check codes are stored at each of the r nodes; the k × r data codes and the r² check codes form a column vector C of (r² × s) rows, and the product of the check matrix H and the column vector C is zero;

the computing unit is specifically configured to:

24. A device for repairing a maximum distance separable MDS code constructed by the device of any one of claims 13 to 17; the n storage nodes comprise a failed node and t busy nodes, wherein t is an integer greater than 1 and less than r-1;

the device comprises:

a downloading unit, configured to download the repair code in n-t-1 nodes of the n storage nodes except the failed node and the t busy nodes;

25. An apparatus comprising a processor and a memory; wherein the memory is configured to store a computer program and the processor is configured to invoke the computer program to cause the apparatus to perform the method according to any of claims 1-5.

26. An apparatus comprising a processor and a memory; wherein the memory is adapted to store a computer program and the processor is adapted to invoke the computer program to cause the apparatus to perform the method of any of claims 6-8.

27. An apparatus comprising a processor and a memory; wherein the memory is adapted to store a computer program and the processor is adapted to invoke the computer program to cause the apparatus to perform the method of any of claims 9-11.

28. An apparatus comprising a processor and a memory; wherein the memory is adapted to store a computer program and the processor is adapted to invoke the computer program to cause the apparatus to perform the method of claim 12.

29. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the method of any one of claims 1-5;

or the computer program, when executed by a processor, implements the method of any of claims 6-8;

or the computer program, when executed by a processor, implements the method of any one of claims 9-11;

alternatively, the computer program, when executed by a processor, implements the method of claim 12.

30. A computer program product comprising a computer program, characterized in that the computer program realizes the method of any one of claims 1-5 when executed by a processor;

or the computer program, when executed by a processor, implements the method of any one of claims 6-8;

or the computer program, when executed by a processor, implements the method of claim 12.

31. A method for repairing a maximum distance separable MDS code, wherein the MDS code has parameters (n, k); n and k indicate that k of n storage nodes store data codes of the MDS code and r of the n storage nodes store check codes of the data codes of the MDS code; r = n-k, n is an integer greater than 1, k is an integer greater than 0, and r is an integer greater than 3; the check matrix of the MDS code is H';

the size of the check matrix H' is r² rows and r × n columns; in the matrix H'_i, i is an integer from 0 to r-1;

the n storage nodes are evenly divided into r groups, each group comprises s nodes, and s = n/r; the v1-th node of the u1-th group is a failed node, and the v1'-th node of the u1'-th group is a busy node; u1' ≠ u1; the method comprises the following steps:

downloading a first part of the repair codes of the failed node based on a row matrix h(0, u1^) consisting of the u1^-th row of the matrix H'₀ and a row matrix h(3, u1^) consisting of the u1^-th row of the matrix H'₃, wherein,

calculating missing codes in the failed node based on the downloaded first partial repair codes.

32. The method of claim 31, wherein each of the n storage nodes stores r codes, and the MDS code is represented as a column matrix C^T of (r × n) rows;

the downloading a first part of the repair codes of the failed node based on the row matrix h(0, u1^) consisting of the u1^-th row of the matrix H'₀ and the row matrix h(3, u1^) consisting of the u1^-th row of the matrix H'₃ comprises:

obtaining a row matrix E based on a linear combination of h(0, u1^), h(3, u1^) and λ, wherein λ is an element on the diagonal of the check sub-matrix corresponding to the busy node in the check matrix H', and the product of the row matrix E and the column matrix C^T is equal to zero;

downloading the first part of the repair codes based on the equation in which the product of the row matrix E and the column matrix C^T is equal to zero.

33. The method of claim 30 or 31, wherein r is 4.

34. The method according to any one of claims 30-33, further comprising:

downloading a second part of the repair codes of the failed node based on the u1-th row of each of the r matrices H'₀ to H'_(r-1);

said calculating missing codes in said failed node based on said downloaded first partial repair codes, comprising:

calculating a missing code in the failed node based on the downloaded first partial repair code and the second partial repair code.