WO2010049276A1

WO2010049276A1 - Multiple erasure protection

Info

Publication number: WO2010049276A1
Application number: PCT/EP2009/063453
Authority: WO
Inventors: Mike Harvey
Original assignee: International Business Machines Corporation
Priority date: 2008-10-28
Filing date: 2009-10-15
Publication date: 2010-05-06

Abstract

A method and system for multiple erasure protection are provided. The method includes providing (202) a set of a number Nd of data elements; generating (201) a number Np of parity elements using constant weight codes, wherein Nd=binomial (Np, C) with a fan-out parameter of C, including: providing (203) an identity matrix for the Nd data elements; creating (204) a parity matrix with all binary strings of Np digits containing exactly C number of "1"s for the Nd data elements, wherein each data element contributes to a different combination of parity elements. The method further includes: defining (205) parity sets with one parity element, and data elements corresponding to a column of the parity matrix, that sum to zero under XOR operations; and generating (206) parity sets of linear combinations of columns of the parity matrix combined using XOR operations. Data recovery is carried out by the steps of: selecting (401) parity sets including non-erased parity elements; reading (402) the parity elements of the selected parity sets; applying (403) XOR operations to the parity sets to recover erased data elements; and writing (404) the recovered data elements.

Description

MULTIPLE ERASURE PROTECTION

This invention relates to the field of multiple erasure protection. In particular, the invention relates to multiple erasure protection with constant weight extendable codes.

Methods of protecting against erasures in a data channel, such as a RAID (Redundant Array of Independent Disks) array, are known in the prior art. Existing methods include Reed-Solomon codes, HoVer ("HoVer Erasure Codes For Disk Arrays" by J. L. Hafner, Dependable Systems and Networks 2006, Pages 217 - 226, Digital Object Identifier 10.1109/DSN.2006.40), and Modified Even-Odd (MEO) codes. Reed-Solomon codes can have very high space efficiency but require special hardware for good performance. HoVer and MEO work with a simple XOR operation.

The characteristics of existing codes depend very much on the number of elements. For example, a Reed-Solomon checksum code with a large number of elements is very space efficient, but the rebuild times, in the event of an erasure, will be long because a large number of elements will be read. Further, in the usual arrangement with data + P + Q, where P is an XOR of the data and Q is a Reed-Solomon checksum, all the remaining data becomes critical in the event of two erasures, i.e. any further erasures will necessarily result in data loss, and this condition will persist until one of the erased elements is rebuilt.

According to a first aspect of the present invention there is provided a method for multiple erasure protection, comprising: providing a set of a number Nd of data elements; generating a number Np of parity elements using constant weight codes, wherein Nd=binomial (Np, C) with a fan-out parameter of C, including: providing an identity matrix for the Nd data elements; creating a parity matrix with all binary strings of Np digits containing exactly C number of "1"s for the Nd data elements, wherein each data element contributes to a different combination of parity elements.

According to a second aspect of the present invention there is provided a computer program product stored on a computer readable storage medium for multiple erasure protection, comprising computer readable program code means for performing the steps of: providing a set of a number Nd of data elements; generating a number Np of parity elements using constant weight codes, wherein Nd=binomial (Np, C) with a fan-out parameter of C, including: providing an identity matrix for the Nd data elements; creating a parity matrix with all binary strings of Np digits containing exactly C number of "1"s for the Nd data elements, wherein each data element contributes to a different combination of parity elements.

According to a third aspect of the present invention there is provided a system for multiple erasure protection of data, comprising: a processor; an input mechanism for inputting a set of a number Nd of data elements; a module for generating a number Np of parity elements using constant weight codes, wherein Nd=binomial (Np, C) with a fan-out parameter of C, including: an identity matrix generator for the Nd data elements; a parity matrix generator with all binary strings of Np digits containing exactly C number of "1"s for the Nd data elements, wherein each data element contributes to a different combination of parity elements.

A method of protecting against up to three erasures in a data channel, such as a RAID array, is described. The method uses a number of checksum elements to protect a number of data elements against erasure.

The subject matter regarded as the invention is particularly pointed out and distinctly claimed in the concluding portion of the specification. The invention, both as to organization and method of operation, together with objects, features, and advantages thereof, may best be understood by reference to the following detailed description when read with the accompanying drawings in which:

Figure 1 is a schematic representation of a data structure in accordance with the present invention;

Figure 2 is a flow diagram of a method in accordance with an aspect of the present invention;

Figure 3 is a flow diagram of a method in accordance with another aspect of the present invention;

Figure 4 is a flow diagram of a method in accordance with a further aspect of the present invention;

Figure 5 is a block diagram of a system in accordance with the present invention; and

Figure 6 is a block diagram of a computer system in which the present invention may be implemented.

It will be appreciated that for simplicity and clarity of illustration, elements shown in the figures have not necessarily been drawn to scale. For example, the dimensions of some of the elements may be exaggerated relative to other elements for clarity. Further, where considered appropriate, reference numbers may be repeated among the figures to indicate corresponding or analogous features.

In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be understood by those skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, and components have not been described in detail so as not to obscure the present invention.

The use of a family of Constant Weight Extendable (CWE) codes to correct erasures in a data channel is described. The combinatorial codes consist of:

• A fan-out parameter C equal to the number of parity elements contributed to by each data element (sometimes called the out-degree of the data element).

• A number Np of parity elements.

• A number Nd = binomial(Np, C) (sometimes called Np choose C) of data elements.

This results in a CWE[Nd, Np, C] code. Not all data elements need to be used. Any unused data elements are taken to be zero and contribute nothing to the parity calculation. All checksums (parity elements) can be calculated using only XOR operations.

The number of parity elements Np, and hence the number of data elements Nd, may be increased dynamically because the parity sets of CWE[Nd, Np, C] are all subsets of the parity sets in CWE[Nd+x, Np+n, C], where n is a positive integer and x (also positive) depends on Np, C and n.

Note that although Np and Nd are dynamic, the fan-out C is fixed. It will be clear to one skilled in the art that a fan-out of 2 will give codes with performance very similar to HoVer or MEO, while a fan-out of 3 gives codes with protection against 3 erasures and, as Np increases, higher space efficiency. Higher fan-outs are possible; although no increase in protection against erasure is gained and the cost of modifying a data element together with all its parity elements goes up, the space efficiency also tends to improve.

The 3-way CWE codes show some very desirable characteristics. CWE[4, 4, 3] is an alternative to RAID 10, having 50% space efficiency, 3-way protection, efficient full-stride operations and the ability to recover from the loss of all 4 data elements. CWE[56, 8, 3] has 87.5% space efficiency, 3-way protection and could be used in large arrays where both space efficiency and a high degree of protection are required. Even larger codes, with hundreds or even thousands of data elements, have the potential to be used in MAID (Massive Array of Idle Disks) arrays. Not all data elements need be used, so CWE[20, 6, 3] could be used with only 18 of its data elements to give 75% space efficiency (18 data + 6 parity).

The code CWE[4, 4, 3] is described in detail and it is shown how it can be extended to CWE[10, 5, 3], CWE[20, 6, 3], CWE[35, 7, 3] and so on.

The general layout of the Constant Weight Extendable (CWE) code is illustrated in Figure 1. Figure 1 shows a representation 100 of the construction of a CWE code with a number Nd of data elements 101 and a number Np of parity elements 102. The Nd data elements 101 are represented as an identity matrix 103 and the parity elements 102 are binary strings 104 of Np digits containing exactly C "1"s.

The size of the code is shown in the table below:

Available data                  Required parity   Erasures       Fan-out = data        Zero sets
elements, Nd                    elements, Np      without loss   elements per parity   (see text)

4                               4                 3              3                     16
10                              5                 3              6                     32
20                              6                 3              10                    64
35                              7                 3              15                    128
56                              8                 3              21                    256
Np choose 3 = binomial(Np, 3)   Np                3              (Np-1)(Np-2)/2        2^Np
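By way of illustration only, the rows of the table above may be reproduced with a short Python sketch. The function names `cwe_size` and `space_efficiency` are hypothetical and do not appear in the specification; the sketch assumes that every available data element is used.

```python
from math import comb

def cwe_size(np_, c=3):
    """Return (Nd, Np, fan-out per parity element, number of zero sets)."""
    nd = comb(np_, c)                   # Nd = binomial(Np, C)
    per_parity = comb(np_ - 1, c - 1)   # data elements contributing to one parity element
    zero_sets = 2 ** np_                # XOR combinations of the Np defined parity sets
    return nd, np_, per_parity, zero_sets

def space_efficiency(np_, c=3):
    """Fraction of elements holding data when every data element is used."""
    nd = comb(np_, c)
    return nd / (nd + np_)

for np_ in range(4, 9):
    print(cwe_size(np_), f"{space_efficiency(np_):.1%}")
```

For C = 3 the third value reduces to (Np-1)(Np-2)/2, as given in the last row of the table.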

Example Codes are given below:

CWE[4, 4, 3]

    d0 d1 d2 d3 p0 p1 p2 p3
0:   1  0  0  0  0  1  1  1
1:   0  1  0  0  1  0  1  1
2:   0  0  1  0  1  1  0  1
3:   0  0  0  1  1  1  1  0

CWE[10, 5, 3]

    d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 p0 p1 p2 p3 p4
0:   1  0  0  0  0  0  0  0  0  0  0  0  1  1  1
1:   0  1  0  0  0  0  0  0  0  0  0  1  0  1  1
2:   0  0  1  0  0  0  0  0  0  0  0  1  1  0  1
3:   0  0  0  1  0  0  0  0  0  0  0  1  1  1  0
4:   0  0  0  0  1  0  0  0  0  0  1  0  0  1  1
5:   0  0  0  0  0  1  0  0  0  0  1  0  1  0  1
6:   0  0  0  0  0  0  1  0  0  0  1  0  1  1  0
7:   0  0  0  0  0  0  0  1  0  0  1  1  0  0  1
8:   0  0  0  0  0  0  0  0  1  0  1  1  0  1  0
9:   0  0  0  0  0  0  0  0  0  1  1  1  1  0  0

For larger codes it is convenient to omit the identity matrix on the left:

CWE[35, 7, 3]

     p0 p1 p2 p3 p4 p5 p6
d0:   1  1  1  0  0  0  0
d1:   1  1  0  1  0  0  0
d2:   1  0  1  1  0  0  0
d3:   0  1  1  1  0  0  0
d4:   1  1  0  0  1  0  0
d5:   1  0  1  0  1  0  0
d6:   0  1  1  0  1  0  0
d7:   1  0  0  1  1  0  0
d8:   0  1  0  1  1  0  0
d9:   0  0  1  1  1  0  0
d10:  1  1  0  0  0  1  0
d11:  1  0  1  0  0  1  0
d12:  0  1  1  0  0  1  0
d13:  1  0  0  1  0  1  0
d14:  0  1  0  1  0  1  0
d15:  0  0  1  1  0  1  0
d16:  1  0  0  0  1  1  0
d17:  0  1  0  0  1  1  0
d18:  0  0  1  0  1  1  0
d19:  0  0  0  1  1  1  0
d20:  1  1  0  0  0  0  1
d21:  1  0  1  0  0  0  1
d22:  0  1  1  0  0  0  1
d23:  1  0  0  1  0  0  1
d24:  0  1  0  1  0  0  1
d25:  0  0  1  1  0  0  1
d26:  1  0  0  0  1  0  1
d27:  0  1  0  0  1  0  1
d28:  0  0  1  0  1  0  1
d29:  0  0  0  1  1  0  1
d30:  1  0  0  0  0  1  1
d31:  0  1  0  0  0  1  1
d32:  0  0  1  0  0  1  1
d33:  0  0  0  1  0  1  1
d34:  0  0  0  0  1  1  1

A process is described below which can generate these tables for given values of C and Np.

How to use the codes to protect data

First, consider CWE[4, 4, 3], the case with C=3 and Np=4 parity elements. There are just 4 ways to choose 3 elements from 4 (by omitting each element in turn), which gives Nd=4 data elements.

To illustrate the use of this code in a disk array, consider a stride of data consisting of 4 data elements written to 4 disks d0...d3, together with 4 parity disks p0...p3 constructed in such a way that each data element contributes to a different combination of 3 parity disks. For example, d0 contributes to all parity disks except p0, d1 contributes to all except p1, and so on. The result is the following table.

CWE[4, 4, 3]

d0 d1 d2 d3 p0 p1 p2 p3
 1  0  0  0  0  1  1  1
 0  1  0  0  1  0  1  1
 0  0  1  0  1  1  0  1
 0  0  0  1  1  1  1  0

To be clear, the right hand side of the table is constructed row-by-row, using all binomial(Np, C) combinations of p0..p3. Then the parity columns are read to find the parity sets. The parity elements can be defined for this code as follows:

p0 = xor(d1, d2, d3)
p1 = xor(d0, d2, d3)
p2 = xor(d0, d1, d3)
p3 = xor(d0, d1, d2)
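By way of illustration only, reading the parity columns of the table may be sketched in Python as follows. The function name `compute_parity`, the representation of each table row as a string of parity bits, and the use of integers for the elements are assumptions made for this sketch and are not part of the specification.

```python
def compute_parity(rows, data):
    """XOR each data element into every parity element marked '1' in its row.

    rows[i] is the parity part of the table row for data element i,
    written as a string over the columns p0..p(Np-1).
    """
    parity = [0] * len(rows[0])
    for row, value in zip(rows, data):
        for j, bit in enumerate(row):
            if bit == "1":
                parity[j] ^= value
    return parity

# Parity columns of the CWE[4, 4, 3] table, rows d0..d3.
CWE_4_4_3 = ["0111", "1011", "1101", "1110"]
print(compute_parity(CWE_4_4_3, [0x11, 0x22, 0x44, 0x88]))
```

The same function applies unchanged to the larger tables, since only the row strings differ.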

Perhaps a better way to look at this is to define parity sets which sum to zero (under XOR). The idea is that any parity set which is missing exactly one element due to erasure can be used to reconstruct the missing element.

Defined Parity Sets

{p0, d1, d2, d3}
{p1, d0, d2, d3}
{p2, d0, d1, d3}
{p3, d0, d1, d2}

These are not the only parity sets. In fact any linear combination of the parity elements, together with the appropriate data elements (combined using XOR) makes a parity set. The effect of using XOR is to retain just those data elements that occur an odd number of times in the parity elements in the set. Here is a complete list of parity sets for this code:

All Parity Sets

z[0] = {}
z[1] = {p3, d0, d1, d2}*
z[2] = {p2, d0, d1, d3}*
z[3] = {p2, p3, d2, d3}
z[4] = {p1, d0, d2, d3}*
z[5] = {p1, p3, d1, d3}
z[6] = {p1, p2, d1, d2}
z[7] = {p1, p2, p3, d0}
z[8] = {p0, d1, d2, d3}*
z[9] = {p0, p3, d0, d3}
z[10] = {p0, p2, d0, d2}
z[11] = {p0, p2, p3, d1}
z[12] = {p0, p1, d0, d1}
z[13] = {p0, p1, p3, d2}
z[14] = {p0, p1, p2, d3}
z[15] = {p0, p1, p2, p3, d0, d1, d2, d3}

* indicates a Defined Parity Set - all the rest are calculated.
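By way of illustration only, the calculated parity sets may be enumerated with the following Python sketch. The function name `all_parity_sets` is hypothetical. The sketch assumes, consistently with the list above, that the index of z is a 4-bit number in which the bits of value 8, 4, 2 and 1 select p0, p1, p2 and p3 respectively.

```python
NP = 4
# Parity columns p0..p3 of the CWE[4, 4, 3] table, rows d0..d3.
ROWS = ["0111", "1011", "1101", "1110"]

def all_parity_sets():
    """Map each index z to the element names of the corresponding parity set."""
    sets = {}
    for z in range(2 ** NP):
        # The bit of value 8 selects p0, 4 selects p1, 2 selects p2, 1 selects p3.
        chosen = [j for j in range(NP) if z & (1 << (NP - 1 - j))]
        members = {f"p{j}" for j in chosen}
        for i, row in enumerate(ROWS):
            # XOR retains a data element that occurs an odd number of times.
            if sum(int(row[j]) for j in chosen) % 2 == 1:
                members.add(f"d{i}")
        sets[z] = members
    return sets

for z, members in all_parity_sets().items():
    print(f"z[{z}] =", sorted(members))
```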

Referring to Figure 2, a flow diagram 200 shows the basic steps involved in the construction of data elements in accordance with the described method. The number of parity elements Np and the fan-out parameter C are chosen 201 and the number of data elements Nd is calculated as Nd=binomial(Np, C). A stride of data from a data structure written to disks such as a RAID array is input 202. An identity matrix is provided 203 for Nd data elements. The parity matrix is constructed 204 with Np parity rows, with each data element contributing to different combinations of parity elements as described above. Parity sets are defined 205 with one parity element and data elements corresponding to a column of the parity matrix that sum to zero under XOR operations. The parity columns are read 206 to find parity sets which sum to zero under XOR operations.

Writing the Data and Parity Elements

In a full-stride write, the parity elements can be calculated from the data elements as shown. In addition to writing the 4 data elements (4W), the operation takes 6 binary XORs (6X) between elements and an additional 4 writes (4W) for the parity elements:

To create p0, p1, p2 and p3

Let t1 = xor(d0, d1)

Let t2 = xor(d2, d3), then

z[8] => p0 = xor(d1, d2, d3) = xor(d1, t2)
z[4] => p1 = xor(d0, d2, d3) = xor(d0, t2)
z[2] => p2 = xor(d0, d1, d3) = xor(d3, t1)
z[1] => p3 = xor(d0, d1, d2) = xor(d2, t1)
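By way of illustration only, the six XOR operations of the full-stride write may be sketched in Python. The function name `full_stride_parity` is hypothetical, and the elements are represented as integers purely for the purposes of the sketch.

```python
def full_stride_parity(d0, d1, d2, d3):
    """Compute p0..p3 for CWE[4, 4, 3] using six XOR operations."""
    t1 = d0 ^ d1        # shared by p2 and p3
    t2 = d2 ^ d3        # shared by p0 and p1
    p0 = d1 ^ t2        # xor(d1, d2, d3)
    p1 = d0 ^ t2        # xor(d0, d2, d3)
    p2 = d3 ^ t1        # xor(d0, d1, d3)
    p3 = d2 ^ t1        # xor(d0, d1, d2)
    return p0, p1, p2, p3

print(full_stride_parity(0x11, 0x22, 0x44, 0x88))
```

Sharing the intermediate values t1 and t2 is what reduces the count from the eight XOR operations of the direct definitions to six.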

Modifying a data element involves updating 3 parity elements. This is inevitable with 3-way protection. It needs 4 read operations (4R), 4 XOR operations (4X) and 4 write operations (4W). For example:

To modify d0, p1, p2 and p3 must also be updated. First read the old data d0_old, p1_old, p2_old and p3_old:

Let t = xor(d0, d0_old)

z[4] => p1 = xor(t, p1_old)
z[2] => p2 = xor(t, p2_old)
z[1] => p3 = xor(t, p3_old)

Referring to Figure 3, a flow diagram 300 shows the basic steps of modifying a data element. A data element is modified 301 and the previous data element is read 302 together with the parity elements. XOR operations are applied 303 to the previous data element and the modified data element to obtain t. The parity elements are updated 304 by applying an XOR operation of t with the previous parity elements.
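By way of illustration only, the steps of Figure 3 may be sketched in Python for CWE[4, 4, 3]. The function name `modify` and the in-memory lists standing in for the disks are assumptions made for this sketch and are not part of the specification.

```python
# Parity columns p0..p3 of the CWE[4, 4, 3] table, rows d0..d3.
ROWS = ["0111", "1011", "1101", "1110"]

def modify(data, parity, i, new_value):
    """Replace data[i] and update, in place, the parity elements it contributes to."""
    t = data[i] ^ new_value          # XOR of the previous and the modified data element
    data[i] = new_value
    for j, bit in enumerate(ROWS[i]):
        if bit == "1":
            parity[j] ^= t           # XOR of t with the previous parity element

data = [0x11, 0x22, 0x44, 0x88]
parity = [0xEE, 0xDD, 0xBB, 0x77]
modify(data, parity, 0, 0x10)
print(data, parity)
```

Only the three parity elements marked in the row of the modified data element are touched, which corresponds to the 4R, 4X and 4W counted above.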

In the special case of CWE[4, 4, 3] a further optimisation is possible when modifying a single data element. It is not necessary to read the data being changed. Instead, the remaining 3 data elements are read, the 3 parity elements that need to change are generated using all the data, and the modified data and the 3 changed parity elements are written. Thus there are 3 read operations (3R), 5 XOR operations (5X) and 4 write operations (4W).

Recovery

It is demonstrated that recovery can be made from 3 erasures and, in some cases, 4 erasures. In particular, the code allows the recovery of all 4 data elements from the 4 parity elements. The recovery takes 4 disk reads (4R), 6 binary XORs (6X) and 4 disk writes (4W), so 4R+6X+4W in total:

To recover d0, d1, d2 and d3

Let t1 = xor(p0, p1)

Let t2 = xor(p2, p3)

z[7] => d0 = xor(p1, p2, p3) = xor(p1, t2)
z[11] => d1 = xor(p0, p2, p3) = xor(p0, t2)
z[13] => d2 = xor(p0, p1, p3) = xor(t1, p3)
z[14] => d3 = xor(p0, p1, p2) = xor(t1, p2)
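By way of illustration only, the recovery of all four data elements from the four parity elements may be sketched in Python. The function name `recover_all_data` is hypothetical and the elements are represented as integers for the purposes of the sketch.

```python
def recover_all_data(p0, p1, p2, p3):
    """Rebuild d0..d3 of CWE[4, 4, 3] from the parity elements alone (six XORs)."""
    t1 = p0 ^ p1
    t2 = p2 ^ p3
    d0 = p1 ^ t2        # z[7]:  xor(p1, p2, p3)
    d1 = p0 ^ t2        # z[11]: xor(p0, p2, p3)
    d2 = t1 ^ p3        # z[13]: xor(p0, p1, p3)
    d3 = t1 ^ p2        # z[14]: xor(p0, p1, p2)
    return d0, d1, d2, d3

print(recover_all_data(0xEE, 0xDD, 0xBB, 0x77))
```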

One Erasure

It is, of course, possible to recover any single data element from one of the three Defined Parity Sets which contain it, but it may be preferable to use one of the parity sets z[7], z[11], z[13] or z[14]. Each contains 3 parity elements and only one data element, allowing the data element to be reconstructed using only parity elements. The cost of this recovery is 3 disk reads (3R), 2 binary XORs (2X) and one disk write (1W), so 3R+2X+1W.

For example: To recover d3

z[14] => d3 = xor(p0, p1, p2)

Two Erasures

It has already been shown that all data elements can be recovered using just the parity elements and vice versa. Because of the symmetry of the code all such cases are essentially similar. Here is an example to show that the number of steps involved is 4R+3X+2W:

For example: To recover d1 and d3:

Let t = xor(p0, p2)

z[11] => d1 = xor(p0, p2, p3) = xor(t, p3)
z[14] => d3 = xor(p0, p1, p2) = xor(t, p1)

For two erasures, if one is data and one is parity, consider the 4 Defined Parity Sets; the data element occurs in 3 of the 4 sets, the parity in only one. Therefore at least two of the sets have lost just the erased data element and either may be used to reconstruct it. There are essentially two cases, according to whether or not the parity element and the data element occur in the same defined parity set. Both can be achieved in 4R+3X+2W.

For example: To recover p2 and d2 using z[2] and z[8]:

Let t = xor(d1, d3)

z[8] => d2 = xor(p0, d1, d3) = xor(p0, t)
z[2] => p2 = xor(d0, d1, d3) = xor(d0, t)

For example: To recover p2 and d1 using z[1] and z[3]:

Let t = xor(p3, d2)

z[1] => d1 = xor(p3, d0, d2) = xor(t, d0)
z[3] => p2 = xor(p3, d2, d3) = xor(t, d3)

Three Erasures

It has already been shown that if all the erasures are data, or all are parity, it is possible to recover them. That leaves the cases where the erasures are 1 data and 2 parity or 2 data and 1 parity. It is shown that there must be at least one parity set with exactly one erasure in either case. Consider the defined parity sets z[1], z[2], z[4] and z[8]:

z[8] = {p0, d1, d2, d3}
z[4] = {p1, d0, d2, d3}
z[2] = {p2, d0, d1, d3}
z[1] = {p3, d0, d1, d2}

The case of 1 data and 2 parity is solved by noting that each data element is in 3 of the sets and each parity is in one only. So there must be either one or two parity sets that have lost just the data element but not the parity. The data element is recovered using this set then the parity is recovered as before.

If the erasure is 2 data and 1 parity element, first note that for every pair of data elements, two sets contain both of the data elements and two sets contain exactly one of the elements. At least one of the latter two sets does not have the parity element erased. This set is used to recover one data element, then proceed as for two erasures. An upper bound on the cost of recovery is 5R+6X+3W. In the example below the cost is 4R+5X+3W:

For example: To recover p0, p1 and d0:

Let t = xor(d1, d2)

z[1] => d0 = xor(p3, d1, d2) = xor(p3, t)
z[8] => p0 = xor(d1, d2, d3) = xor(t, d3)
z[12] => p1 = xor(p0, d0, d1)

Referring to Figure 4, a flow diagram 400 shows the basic steps in recovery of data elements, recovery of parity elements, or recovery of a combination of data and parity elements. Parity sets are selected 401 that can re-generate the erased data or parity elements. The parity elements or data elements needed for the parity sets are read 402. XOR operations are applied 403 to the defined parity sets to recover the erased data. The recovered data or parity elements are written 404.
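By way of illustration only, the steps 401 to 404 may be sketched in Python for CWE[4, 4, 3]. The function names `parity_sets` and `recover`, the dictionary standing in for the disks, and the choice of the smallest usable parity set are assumptions made for this sketch; the specification does not prescribe a particular selection rule.

```python
NP = 4
# Parity columns p0..p3 of the CWE[4, 4, 3] table, rows d0..d3.
ROWS = ["0111", "1011", "1101", "1110"]

def parity_sets():
    """All non-empty XOR combinations of the defined parity sets, as sets of names."""
    result = []
    for mask in range(1, 2 ** NP):
        cols = [j for j in range(NP) if (mask >> j) & 1]
        members = {f"p{j}" for j in cols}
        members |= {f"d{i}" for i, row in enumerate(ROWS)
                    if sum(row[j] == "1" for j in cols) % 2}
        result.append(members)
    return result

def recover(elements, erased):
    """Return a copy of elements (name -> int) with the erased names rebuilt."""
    elements = {k: v for k, v in elements.items() if k not in erased}
    erased = set(erased)
    sets = parity_sets()
    while erased:
        # 401: select a parity set that is missing exactly one element.
        usable = [s for s in sets if len(s & erased) == 1]
        if not usable:
            raise ValueError("erasure pattern is not recoverable")
        chosen = min(usable, key=len)        # assumption: prefer the fewest reads
        (lost,) = chosen & erased
        value = 0
        for name in chosen - {lost}:         # 402, 403: read the survivors and XOR them
            value ^= elements[name]
        elements[lost] = value               # 404: write the recovered element
        erased.remove(lost)
    return elements

stripe = {"d0": 0x11, "d1": 0x22, "d2": 0x44, "d3": 0x88,
          "p0": 0xEE, "p1": 0xDD, "p2": 0xBB, "p3": 0x77}
print(recover(stripe, {"p0", "p1", "d0"}) == stripe)
```

Enumerating all 2^Np combinations is only practical for small Np; it is used here because it mirrors the list of parity sets given for this code.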

Extending the codes

A feature of these codes is that each one is a subset of the subsequent one. It can be observed that the rows d0..d3 and columns p0..p3 of the CWE[20, 6, 3] table give CWE[4, 4, 3], and that rows d0..d9 with columns p0..p4 give CWE[10, 5, 3]. In the table below, the order of the parity columns is reversed to make this clear.

CWE[20, 6, 3]

     p0 p1 p2 p3 p4 p5
d0:   1  1  1  0  0  0
d1:   1  1  0  1  0  0
d2:   1  0  1  1  0  0
d3:   0  1  1  1  0  0
d4:   1  1  0  0  1  0
d5:   1  0  1  0  1  0
d6:   0  1  1  0  1  0
d7:   1  0  0  1  1  0
d8:   0  1  0  1  1  0
d9:   0  0  1  1  1  0
d10:  1  1  0  0  0  1
d11:  1  0  1  0  0  1
d12:  0  1  1  0  0  1
d13:  1  0  0  1  0  1
d14:  0  1  0  1  0  1
d15:  0  0  1  1  0  1
d16:  1  0  0  0  1  1
d17:  0  1  0  0  1  1
d18:  0  0  1  0  1  1
d19:  0  0  0  1  1  1

The defined zero sets of each code (derived from the columns of the table) are subsets of the zero sets of all larger codes with the same C. This is apparent when it is considered that the top n rows of the table are contained within the top n + x rows for any positive integer x.

The codes are therefore extendable as smaller codes can be converted to larger codes by the addition of elements that have been initialized to zero.
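By way of illustration only, the extendable property may be checked with the following Python sketch. The function names `cwe_rows` and `is_extension` are hypothetical, and the ordering rule used (combinations sorted by their highest-numbered parity element first) is an assumption chosen because it reproduces the row order of the tables.

```python
from itertools import combinations

def cwe_rows(np_, c=3):
    """Parity rows of CWE[binomial(np_, c), np_, c] in the row order of the tables."""
    # Sort the C-element subsets of {0..np_-1} by comparing their largest members first.
    subsets = sorted(combinations(range(np_), c), key=lambda s: s[::-1])
    return ["".join("1" if j in s else "0" for j in range(np_)) for s in subsets]

def is_extension(small_np, large_np, c=3):
    """True if the smaller table equals the top rows of the larger one, zero padded."""
    small = cwe_rows(small_np, c)
    large = cwe_rows(large_np, c)
    pad = "0" * (large_np - small_np)
    return large[:len(small)] == [row + pad for row in small]

print(cwe_rows(4))
print(is_extension(4, 6), is_extension(5, 6), is_extension(6, 8))
```

Because the added parity columns are zero for all existing rows, existing parity elements are unchanged and the new parity elements start out as zero.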

It is shown that data recovery is possible from any one, two or three erasures. The cost of the recovery is not considered here, except to note that the cost of recovery increases in line with the fan-out given in the Code Size Table. For example, using CWE[56, 8, 3], the fan-out is 21 (21 data elements contribute to each parity element), so the cost of recovering a single element would be 21R + 20X + 1W. As with all codes, there is an unavoidable trade-off between efficiency and cost of recovery.

Recovery from one erasure

Each data element can be recovered from any of the three zero sets to which it contributes.

Parity can be recovered from data.

Recovery from two erasures

If two data elements are lost, they have a maximum of two parity elements in common. As each data element contributes to three parity elements, there must exist a parity element for each data element that contains that data element but not the other. If one data element and one parity element are lost, at least two parity sets remain from which to recover the data. After recovering the data, or if both losses are parity elements, the parity elements are rebuilt from the data elements.

Recovery from three erasures

There are four possibilities for the erasures: 3 parity elements, 2 parity and 1 data, 1 parity and 2 data, and finally 3 data.

Three parity erasures may be recovered from the data.

Two parity and one data - there is still one parity element containing the missing data element, and as all other data is present, the data element is recoverable. Then the two parity elements may be recovered from the data.

One parity and two data - just as for two data erasures there must be at least one parity element for each of the two data elements containing that data element but not the other. In this case one of these two parity elements might have been erased. Recover using the other one, then proceed as for two erasures.

Three data - this provides the only case where recovery is not possible using the defined parity sets only. Consider the three erased data elements. Each contributes to exactly three parity elements, giving nine contributions in all. The number of parity elements contributed to by the erasures must be at least three and, in fact, cannot be exactly three because that would imply that there are three identical lines in the abbreviated code table. In fact all the lines are different. If there are five or more parity elements contributed to then, when we distribute the nine contributions among them, we see that one of the parity sets must be missing only one element (9/5 < 2), so we recover using that parity set, then proceed as for two erasures. That leaves the case where the nine contributions of the erasures are distributed among exactly four parity elements. For example, the erasure of d0, d1 and d2 in the table above is such a set, the parity elements they contribute to being p0, p1, p2 and p3:

     p0 p1 p2 p3 p4 p5
d0:   1  1  1  0  0  0
d1:   1  1  0  1  0  0
d2:   1  0  1  1  0  0

Nine contributions in three rows distributed between 4 columns means that one column contains contributions from all three erasures. The other columns contain contributions from two erasures each. Note: there cannot be a column with only one contribution because that would imply that two rows in the table were the same, which is impossible from the definition of the code. So, one column has three contributions from erased data elements and the other three have two each. We take the XOR of the parity set with 3 erasures with any of the parity sets with two erasures to give a new parity set with exactly one erasure. We use this to recover one of the data elements and then proceed as for two erasures.
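By way of illustration only, the four-column case may be sketched in Python using the erasure of d0, d1 and d2. The function name `single_erasure_combination` is hypothetical; the sketch assumes that the three erased rows contribute to exactly four parity columns, so that one column with three contributions and one with two are guaranteed to exist.

```python
def single_erasure_combination(rows):
    """rows maps each of three erased data elements to its parity row string.

    Returns (col_a, col_b, survivor): XOR-ing the parity sets of columns
    col_a and col_b leaves survivor as the only erased element.
    """
    width = len(next(iter(rows.values())))
    # Erased data elements contributing to each parity column.
    hit = [{name for name, row in rows.items() if row[j] == "1"}
           for j in range(width)]
    triple = next(j for j in range(width) if len(hit[j]) == 3)
    double = next(j for j in range(width) if len(hit[j]) == 2)
    # The symmetric difference mirrors XOR: elements in both sets cancel.
    (survivor,) = hit[triple] ^ hit[double]
    return triple, double, survivor

erased = {"d0": "111000", "d1": "110100", "d2": "101100"}
print(single_erasure_combination(erased))
```

Here the parity sets of p0 and p1 are combined; d0 and d1 occur in both and cancel, leaving d2 as the only erased element of the combined set.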

Pseudo-code to generate CWE[Nd, Np, C]

Let Codeword = C x "1" followed by (Np - C) x "0"

Loop:

Output Codeword

If Codeword does not contain "10" then stop

Split Codeword before and after the first occurrence of "10" into 3 parts to give <part1> "10" <part2>

Let Codeword = Reverse(<part1>) "01" <part2>

Goto Loop

Starting with Codeword = "11100000" and, after formatting the output, this will produce:

     p0 p1 p2 p3 p4 p5 p6 p7
d0:   1  1  1  0  0  0  0  0
d1:   1  1  0  1  0  0  0  0
d2:   1  0  1  1  0  0  0  0
d3:   0  1  1  1  0  0  0  0
d4:   1  1  0  0  1  0  0  0
d5:   1  0  1  0  1  0  0  0
d6:   0  1  1  0  1  0  0  0
d7:   1  0  0  1  1  0  0  0
d8:   0  1  0  1  1  0  0  0
d9:   0  0  1  1  1  0  0  0
d10:  1  1  0  0  0  1  0  0
d11:  1  0  1  0  0  1  0  0
d12:  0  1  1  0  0  1  0  0
d13:  1  0  0  1  0  1  0  0
d14:  0  1  0  1  0  1  0  0
d15:  0  0  1  1  0  1  0  0
d16:  1  0  0  0  1  1  0  0
d17:  0  1  0  0  1  1  0  0
d18:  0  0  1  0  1  1  0  0
d19:  0  0  0  1  1  1  0  0
d20:  1  1  0  0  0  0  1  0
d21:  1  0  1  0  0  0  1  0
d22:  0  1  1  0  0  0  1  0
d23:  1  0  0  1  0  0  1  0
d24:  0  1  0  1  0  0  1  0
d25:  0  0  1  1  0  0  1  0
d26:  1  0  0  0  1  0  1  0
d27:  0  1  0  0  1  0  1  0
d28:  0  0  1  0  1  0  1  0
d29:  0  0  0  1  1  0  1  0
d30:  1  0  0  0  0  1  1  0
d31:  0  1  0  0  0  1  1  0
d32:  0  0  1  0  0  1  1  0
d33:  0  0  0  1  0  1  1  0
d34:  0  0  0  0  1  1  1  0
d35:  1  1  0  0  0  0  0  1
d36:  1  0  1  0  0  0  0  1
d37:  0  1  1  0  0  0  0  1
d38:  1  0  0  1  0  0  0  1
d39:  0  1  0  1  0  0  0  1
d40:  0  0  1  1  0  0  0  1
d41:  1  0  0  0  1  0  0  1
d42:  0  1  0  0  1  0  0  1
d43:  0  0  1  0  1  0  0  1
d44:  0  0  0  1  1  0  0  1
d45:  1  0  0  0  0  1  0  1
d46:  0  1  0  0  0  1  0  1
d47:  0  0  1  0  0  1  0  1
d48:  0  0  0  1  0  1  0  1
d49:  0  0  0  0  1  1  0  1
d50:  1  0  0  0  0  0  1  1
d51:  0  1  0  0  0  0  1  1
d52:  0  0  1  0  0  0  1  1
d53:  0  0  0  1  0  0  1  1
d54:  0  0  0  0  1  0  1  1
d55:  0  0  0  0  0  1  1  1
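By way of illustration only, the pseudo-code may be rendered in Python as a generator. The function name `cwe_codewords` is hypothetical; the steps follow the pseudo-code, with the part of the codeword before the first "10" reversed and the part after it kept unchanged.

```python
def cwe_codewords(np_, c):
    """Yield every binary string of np_ digits containing exactly c '1's."""
    codeword = "1" * c + "0" * (np_ - c)
    while True:
        yield codeword
        k = codeword.find("10")
        if k < 0:
            return                     # no "10" left: all codewords have been output
        part1, part2 = codeword[:k], codeword[k + 2:]
        codeword = part1[::-1] + "01" + part2

for i, word in enumerate(cwe_codewords(8, 3)):
    print(f"d{i}:", " ".join(word))
```

The part before the first "10" always consists of zeros followed by ones, so reversing it moves its ones back to the left before the next codeword is formed.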

Referring to Figure 5, a block diagram shows a system 500. The system 500 includes data 510 including data elements 511, for which a parity matrix 512 is stored, as well as defined parity sets 513.

A module 520 for generating parity elements is provided, which operates on data elements 511 input via an input mechanism 501. The module 520 includes an identity matrix generator 521 for the data elements 511 and a parity matrix generator 522 which generates a parity matrix with all binary strings of parity digits containing exactly C number of "1"s for the data elements.

A parity set module 502 defines parity sets with one parity element and data elements corresponding to a column of the parity matrix 512 that sum to zero under XOR operations. A parity set generator 503 is also included which generates parity sets of linear combinations of columns of the parity matrix combined using XOR operations. The parity sets 513 are stored in the data 510.

The system 500 also includes a data element modifier 530 to make corresponding modification to the parity matrix 512. A modified data element is input via the input mechanism 501. The modifier 530 includes a data/parity reader 531 for reading a previous data element and its parity elements from the data 510, an XOR operator 532 to apply an XOR operation to the previous data element and the modified data element to obtain t, and a data/parity writer 533 for updating the parity elements by applying an XOR operation of t and the previous parity elements.

The system 500 includes a recovery module 540 for recovery of data or parity elements using the remaining data elements 511, the parity matrix 512 and the parity sets 513. The recovery module 540 includes a parity set selector 541 for selecting parity sets including non-erased parity elements or data elements, a data/parity reader 542 for reading the parity elements or the data elements of the selected parity sets, an XOR operator 543 for applying XOR operations to the parity sets to recover erased data elements or parity elements, and a data/parity writer 544 for writing the recovered data elements or parity elements.

The system 500 may also include a coder extender 504 for extending the codes by converting a smaller code to a larger code by the addition of elements that have been initialized to zero.

The system 500 may further include a dynamic increaser 505 for dynamically increasing the number of parity elements and hence the number of data elements.

Referring to Figure 6, an exemplary system for implementing the invention includes a data processing system 600 suitable for storing and/or executing program code including at least one processor 601 coupled directly or indirectly to memory elements through a bus system 603. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.

The memory elements may include system memory 602 in the form of read only memory (ROM) 604 and random access memory (RAM) 605. A basic input/output system (BIOS) 606 may be stored in ROM 604. System software 607 may be stored in RAM 605 including operating system software 608. Software applications 610 may also be stored in RAM 605.

The system 600 may also include a primary storage means 611 such as a magnetic hard disk drive and secondary storage means 612 such as a magnetic disc drive and an optical disc drive. The drives and their associated computer-readable media provide non-volatile storage of computer-executable instructions, data structures, program modules and other data for the system 600. Software applications may be stored on the primary and secondary storage means 611, 612 as well as the system memory 602.

The computing system 600 may operate in a networked environment using logical connections to one or more remote computers via a network adapter 616.

Input/output devices 613 can be coupled to the system either directly or through intervening I/O controllers. A user may enter commands and information into the system 600 through input devices such as a keyboard, pointing device, or other input devices (for example, microphone, joy stick, game pad, satellite dish, scanner, or the like). Output devices may include speakers, printers, etc. A display device 614 is also connected to system bus 603 via an interface, such as video adapter 615.

A system for generating parity matrices using the described methods may be provided as a service to a customer over a network.

The invention can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment containing both hardware and software elements. In an embodiment, the invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.

The invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer usable or computer readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus or device.

The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks include compact disk read only memory (CD-ROM), compact disk read/write (CD-R/W), and DVD.

Improvements and modifications can be made to the foregoing without departing from the scope of the present invention.

Claims

1. A method for multiple erasure protection, comprising: providing (202) a set of a number Nd of data elements; generating (201) a number Np of parity elements using constant weight codes, wherein Nd=binomial (Np, C) with a fan-out parameter of C, including: providing (203) an identity matrix for the Nd data elements; creating (204) a parity matrix with all binary strings of Np digits containing exactly C number of "1"s for the Nd data elements, wherein each data element contributes to a different combination of parity elements.
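
As an illustration only, the parity matrix construction recited in claim 1 might be sketched in Python as follows. The function names `parity_matrix` and `encode`, the row-per-data-element layout, and the sample values are assumptions made for this sketch and do not appear in the specification.

```python
from itertools import combinations
from math import comb


def parity_matrix(num_parity, fan_out):
    """One row per data element: every binary string of num_parity digits
    that contains exactly fan_out ones (hypothetical layout for this sketch)."""
    rows = []
    for ones in combinations(range(num_parity), fan_out):
        rows.append([1 if j in ones else 0 for j in range(num_parity)])
    return rows


def encode(data, rows):
    """Parity element j is the XOR of the data elements whose row has a 1 in column j."""
    parity = [0] * len(rows[0])
    for value, row in zip(data, rows):
        for j, bit in enumerate(row):
            if bit:
                parity[j] ^= value
    return parity


rows = parity_matrix(4, 2)        # Np = 4, C = 2
assert len(rows) == comb(4, 2)    # Nd = binomial(Np, C) = 6
data = [0x11, 0x22, 0x33, 0x44, 0x55, 0x66]
parity = encode(data, rows)
```

Because every row is a distinct string of weight C, each data element contributes to a different combination of C parity elements.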

2. A method as claimed in claim 1, including: defining (205) parity sets with one parity element and data elements corresponding to a column of the parity matrix, that sum to zero under XOR operations; and generating (206) parity sets of linear combinations of columns of the parity matrix combined using XOR operations.

3. A method as claimed in claim 2, wherein data recovery includes: selecting (401) parity sets including non-erased parity elements; reading (402) the parity elements of the selected parity sets; applying (403) XOR operations to the parity sets to recover erased data elements; and writing (404) the recovered data elements.

4. A method as claimed in claim 3, including: selecting (401) parity sets including non-erased parity elements and data elements; reading (402) the parity elements and the data elements of the selected parity sets; applying (403) XOR operations to the parity sets to recover erased data elements; and writing (404) the recovered data elements.
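
A minimal sketch of the recovery steps of claim 4, assuming the simplest case in which a single parity set containing the erased data element has no other erasure. The names `recover_data_element`, `parity_matrix` and `encode`, and the use of `None` to mark an erased element, are hypothetical; recovery through linear combinations of parity sets is not shown.

```python
from itertools import combinations


def parity_matrix(num_parity, fan_out):
    """Rows are all binary strings of num_parity digits with fan_out ones."""
    return [[1 if j in ones else 0 for j in range(num_parity)]
            for ones in combinations(range(num_parity), fan_out)]


def encode(data, rows):
    """Parity element j is the XOR of the data elements with a 1 in column j."""
    parity = [0] * len(rows[0])
    for value, row in zip(data, rows):
        for j, bit in enumerate(row):
            if bit:
                parity[j] ^= value
    return parity


def recover_data_element(erased, data, parity, rows):
    """Rebuild data[erased] from one parity set that contains it.
    Entries of data or parity equal to None are treated as erased."""
    for j, bit in enumerate(rows[erased]):
        if not bit or parity[j] is None:
            continue  # parity set j does not cover the element, or is erased
        members = [i for i, row in enumerate(rows) if row[j] and i != erased]
        if any(data[i] is None for i in members):
            continue  # a second erasure in this parity set; try another
        value = parity[j]
        for i in members:
            value ^= data[i]  # the parity set sums to zero under XOR
        return value
    raise ValueError("no single parity set recovers this element")


rows = parity_matrix(4, 2)
original = [0x11, 0x22, 0x33, 0x44, 0x55, 0x66]
parity = encode(original, rows)
damaged = list(original)
damaged[0] = None
damaged[1] = None
recovered = recover_data_element(0, damaged, parity, rows)
```

In this example the first parity set covering element 0 also contains the erased element 1, so the sketch falls through to the second parity set, which has no other erasure.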

5. A method as claimed in claim 3 or claim 4, including: selecting (401) parity sets including non-erased parity elements and data elements; reading (402) the parity elements and the data elements of the selected parity sets; applying (403) XOR operations to the parity sets to recover erased data elements and erased parity elements; and writing (404) the recovered data elements and parity elements.

6. A method as claimed in any one of the preceding claims, including extending the codes by converting a smaller code to a larger code by the addition of elements that have been initialized to zero.

7. A method as claimed in any one of the preceding claims, wherein the number of parity elements Np, and hence the number of data elements Nd, is increased dynamically because the parity sets of the constant weighted codes for [Nd, Np, C] are all subsets of the parity sets in constant weighted codes for [Nd+x, Np+n, C], where n is a positive integer and x is a positive integer which depends on Np, C and n.
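
The subset property relied on in claim 7 can be checked for a small case with the sketch below. The function name `parity_sets` and the choice to identify each data element by the tuple of parity indices it contributes to are assumptions of this sketch. The value of x is taken here as binomial(Np+n, C) minus binomial(Np, C), which follows from the relation Nd=binomial (Np, C) of claim 1.

```python
from itertools import combinations
from math import comb


def parity_sets(num_parity, fan_out):
    """Map each parity index to the set of data elements in its parity set.
    A data element is identified by the tuple of parity indices it feeds."""
    sets = {j: set() for j in range(num_parity)}
    for ones in combinations(range(num_parity), fan_out):
        for j in ones:
            sets[j].add(ones)
    return sets


small = parity_sets(4, 2)   # Np = 4, C = 2, Nd = 6
large = parity_sets(5, 2)   # Np + n = 5 with n = 1, Nd + x = 10

# Every parity set of the smaller code is contained in the matching
# parity set of the larger code.
contained = all(small[j] <= large[j] for j in small)
extra = comb(5, 2) - comb(4, 2)   # x = 4 additional data elements
```

If the added elements are initialized to zero, they contribute nothing under XOR, so the existing parity values remain valid for the larger code.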

8. A method as claimed in any one of the preceding claims, wherein not all data elements are used, with any unused data elements taken to be zero and contributing nothing to the parity calculation.

9. A method as claimed in any one of the preceding claims, wherein modifying a data element includes: reading (302) a previous data element and its parity elements; applying (303) an XOR operation to the previous data element and a modified data element to obtain t; and updating (304) the parity elements by applying an XOR operation of t and the previous parity elements.
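
The update steps of claim 9 amount to two XOR passes, sketched below under the assumption that elements are Python integers. The function name `update_parities` and the sample values are hypothetical.

```python
def update_parities(old_value, new_value, old_parities):
    """Return the updated parity elements for one modified data element.

    old_parities holds only the parity elements that the data element
    contributes to (C of them in the constant weight scheme).
    """
    t = old_value ^ new_value                # difference between old and new data
    return [p ^ t for p in old_parities]     # fold the difference into each parity


# Two data elements share one parity element; the first one is modified.
d0, d1 = 0x0F, 0x33
parity = d0 ^ d1
new_parities = update_parities(d0, 0xF0, [parity])
assert new_parities == [0xF0 ^ d1]
```

Only the modified data element and its own parity elements are read; the other data elements in each parity set are not touched.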

10. A computer program product stored on a computer readable storage medium for multiple erasure protection, comprising computer readable program code means for performing the steps of: providing (202) a set of a number Nd of data elements; generating (201) a number Np of parity elements using constant weight codes, wherein Nd=binomial (Np, C) with a fan-out parameter of C, including: providing (203) an identity matrix for the Nd data elements; creating (204) a parity matrix with all binary strings of Np digits containing exactly C number of "1"s for the Nd data elements, wherein each data element contributes to a different combination of parity elements.

11. A system for multiple erasure protection of data, comprising: a processor (601); an input mechanism (501) for inputting a set of a number Nd of data elements; a module (520) for generating a number Np of parity elements using constant weight codes, wherein Nd=binomial (Np, C) with a fan-out parameter of C, including: an identity matrix generator (521) for the Nd data elements; a parity matrix generator (522) with all binary strings of Np digits containing exactly C number of "1"s for the Nd data elements, wherein each data element contributes to a different combination of parity elements.

12. A system as claimed in claim 11, including: a parity set module (502) for defining parity sets with one parity element and data elements corresponding to a column of the parity matrix that sum to zero under XOR operations; and a parity set generator (503) for generating parity sets of linear combinations of columns of the parity matrix combined using XOR operations.

13. A system as claimed in claim 12, wherein the system includes a data recovery module (540) including: a parity set selector (541) including non-erased parity elements or data elements; a reader (542) for reading the parity elements or the data elements of the selected parity sets; an XOR operator (543) for applying XOR operations to the parity sets to recover erased data elements or erased parity elements; and a writer (544) for writing the recovered data elements or parity elements.

14. A system as claimed in any one of claims 11 to 13, including a code extender (504) for extending the codes by converting a smaller code to a larger code by the addition of elements that have been initialized to zero.

15. A system as claimed in any one of claims 11 to 14, including a dynamic increaser (505) for dynamically increasing the number of parity elements Np and hence the number of data elements Nd, because the parity sets of the constant weighted codes for [Nd, Np, C] are all subsets of the parity sets in constant weighted codes for [Nd+x, Np+n, C], where n is a positive integer and x is a positive integer which depends on Np, C and n.

16. A system as claimed in any one of claims 11 to 15, including a modifier (530) for modifying a data element including: a reader (531) for reading a previous data element and its parity elements; an XOR operator (532) for applying an XOR operation to the previous data element and a modified data element to obtain t; and a writer (533) for updating the parity elements by applying an XOR operation of t and the previous parity elements.