A construction method for optimal locally repairable systematic codes based on packets
Technical field
The invention belongs to the field of coding technology, and more particularly relates to a construction method for optimal locally repairable systematic codes based on packets. The method can be used to repair lost or damaged data on failed disk nodes in a distributed storage system.
Background technique
Large-scale cloud storage and distributed file systems, such as Amazon Elastic Block Store (EBS) and the Google File System (GoogleFS), have reached a very large scale. At this scale, disk failure has become the norm. There are two common schemes for protecting data against disk failure in such systems. The first scheme copies the data blocks directly to different disks to ensure the integrity of the information. This scheme is the simplest, but it requires a huge storage overhead. The second scheme encodes the data to be stored with a maximum distance separable (MDS) code, and real systems generally prefer this scheme. If k information symbols are extended to n symbols by encoding, then when any one of the k information symbols is lost or damaged, an MDS code needs only k of the remaining n-1 symbols to recover it. Compared with copying to different disks, MDS codes significantly improve redundancy and reliability. Nevertheless, an MDS code with n symbols and k information symbols also has a drawback: whenever a single lost or damaged symbol is restored, k existing symbols must be read from disk to take part in the repair. In large-scale distributed file systems in particular, the storage overhead of MDS codes is relatively large.
Summary of the invention
In view of the deficiencies of the prior art, the purpose of the present invention is to propose a construction method for optimal locally repairable systematic codes based on packets, so as to solve the problem of excessive storage overhead in distributed storage, reduce the number of existing symbols that take part in the repair of a failed disk node, and improve update efficiency.
To achieve the above purpose, the present invention adopts the following technical solution:
In a construction method for optimal locally repairable systematic codes based on packets, the optimal locally repairable systematic code to be constructed is taken as the target code. The packet used to generate the code is constructed from the parameters of the given target code, and the number n2 of non-local check columns is then obtained from the same parameters. Depending on the value of n2, one of two different methods is chosen to construct the optimal locally repairable systematic code. The method proceeds as follows:
Step 1: construct the packet used to generate the optimal locally repairable systematic code;
Step 2: obtain the number n1 of local check columns and the number n2 of non-local check columns;
Step 3: obtain the optimal locally repairable systematic code.
Further, according to the construction method, the packet used to generate the optimal locally repairable systematic code is constructed in step 1 as follows:
A packet describes how the original information data are divided. A (k, R, 1) packet is represented by a pair (X, β), where k means that k original information data items are to be divided, i.e. the original information data contain k information symbols, and the set of these k information symbols is X. A subset of X is denoted by B and is called a block; the set whose elements are the blocks B is β;
The set X is divided into blocks. Under a given division, the blocks are pairwise disjoint and the union of all blocks is exactly X. Different blocks may contain different numbers of elements. For the set β of blocks, the element counts of all blocks in β together form the set R; the elements of R are positive integers and the largest of them does not exceed k;
If every block B contains the same number r of elements, R can be written as R = {r}, where r means that when one information symbol of a codeword is lost or damaged, at most r of the remaining n-1 symbols of that codeword need to take part in recovering it. In this case the (k, R, 1) packet is equivalent to a (k, r, 1) packet;
Each division of X yields a number of blocks, and combining the blocks obtained under all divisions gives the full collection of blocks. Let a denote any element of X. If every element a appears in exactly t blocks of the full collection, the (k, R, 1) packet is called regular and is denoted a t-regular (k, R, 1) packet;
If X has u divisions with u ≥ 2, the packet (X, β) is called decomposable and is denoted (k, R, 1; u). Clearly, a (k, R, 1; u) packet is a u-regular (k, R, 1; u) packet.
Further, according to the construction method, the number n1 of local check columns and the number n2 of non-local check columns are obtained in step 2 as follows:
Let G = (e_1, e_2, ..., e_k | p_1, p_2, ..., p_{n-k}) be the generator matrix of the target code. G consists of the k × k identity matrix I and the k × (n-k) matrix P; the columns of I are denoted e_i and the columns of P are denoted p_i. supp(p_i) denotes the set of row numbers of all non-zero entries in the i-th column of P, and |supp(p_i)| denotes the number of non-zero entries in that column;
If the i-th column of P satisfies |supp(p_i)| ≤ r, the column is called a local check column, and n1 denotes the number of local check columns in a generator matrix G;
If the i-th column of P does not satisfy |supp(p_i)| ≤ r, the column is called a non-local check column, and n2 denotes the number of non-local check columns in a generator matrix G;
For an (n, k, d) linear code with n symbols, k information symbols and minimum Hamming distance d, the relation n1 + n2 + k = n holds, where the minimum Hamming distance d is the minimum number of positions in which any two distinct codewords of the code C differ;
n1 is obtained from the parameters of the given target code as the largest integer not exceeding a bound determined by those parameters (the expression is not reproduced in this text). The symbol δ denotes the minimum value of the minimum distance of the code obtained by deleting one symbol of C. From n1 + n2 + k = n, the number of non-local check columns is n2 = n - n1 - k.
Further, according to the construction method, the optimal locally repairable systematic code is obtained in step 3 as follows: depending on the number n2 of non-local check columns, the cases n2 = 0 and n2 > 0 each correspond to one method of constructing the generator matrix of the optimal locally repairable systematic code, and the code is obtained from the generator matrix.
Further, according to the construction method, when the number of non-local check columns is n2 = 0, a packet (X, β) in which the data set X has only one division is constructed. X is divided into n1 blocks, and the set β whose elements are these n1 blocks is written β = {B_1, B_2, ..., B_{n1}}. Each block B_i (1 ≤ i ≤ n1) in β is rewritten as a k × 1 local check column p_i (k rows, 1 column); p_i is initially an all-zero column, and i is the index of the corresponding block B_i;
Let the elements of the i-th block B_i be B_{i1}, B_{i2}, ..., B_{i|B_i|}, where |B_i| is the number of elements of B_i. The entries of the local check column p_i in rows B_{i1}, B_{i2}, ..., B_{i|B_i|} are set to 1;
The local check columns p_i are arranged in order to form a k × n1 submatrix, which is combined with the k × k identity matrix to give the generator matrix G = (e_1, e_2, ..., e_k | p_1, p_2, ..., p_{n1}) of the optimal locally repairable systematic code;
The k original information symbols to be encoded are denoted M = (m_1, m_2, m_3, ..., m_k), also called the 1 × k information matrix. With G the k × n generator matrix of the optimal locally repairable systematic code, the code is obtained from the expression M × G = C.
Further, according to the construction method, when the number of non-local check columns is n2 > 0, the optimal locally repairable systematic code is constructed from a decomposable packet and a maximum distance separable (MDS) code:
The decomposable packet (X, β) has u divisions of X, denoted β_1, β_2, ..., β_u. The elements of the i-th block B_{ui} of the u-th division β_u are denoted B_{ui1}, B_{ui2}, ..., B_{ui|B_{ui}|}, where |B_{ui}| is the number of elements in the i-th block;
The i-th block of β_u is rewritten as an all-zero k × 1 check column p_{ui} (k rows, 1 column), where p_{ui} denotes the column generated from the i-th block of X under the u-th division;
The entries of p_{ui} in rows B_{ui1}, B_{ui2}, ..., B_{ui|B_{ui}|} are set, in order, to the entries in the same rows of the i-th column of the generator matrix G' = (e_1, e_2, ..., e_k | p_1, p_2, ..., p_{n-k}) of the MDS code; all other entries remain unchanged;
The columns p_{ui} are arranged in order and together replace columns k+1 to k+u of the generator matrix G' of the MDS code, giving the generator matrix G of the optimal locally repairable systematic code;
The k original information symbols to be encoded are denoted M = (m_1, m_2, m_3, ..., m_k), also called the 1 × k information matrix. With G the k × n generator matrix of the optimal locally repairable systematic code, the code is obtained from the expression M × G = C.
Compared with the prior art, the present invention has the following technical effects:
The construction method for optimal locally repairable systematic codes based on packets proposed by the present invention divides the set X of the packet and thereby reduces the number of existing symbols that take part in the repair of a failed disk node. This in turn reduces storage overhead and improves update efficiency, so that the locally repairable systematic code finally generated is optimal. Here, update efficiency refers to the maximum number of symbols of the code C that must change when one information symbol changes. Update efficiency is denoted by t, with t ≥ d; when t = d, the code C has optimal update efficiency. The minimum Hamming distance d is the minimum number of positions in which any two distinct codewords of C differ.
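As a hedged illustration of the update-efficiency definition: for a linear code with C = M × G, changing a single information symbol m_i changes exactly those code symbols in which row i of G is non-zero, so t can be read off as the largest row weight of G. The matrix below is a hypothetical toy example and not a code taken from this document.

```python
def update_efficiency(G):
    """Largest number of code symbols affected when one information symbol changes.

    G is a generator matrix given as a list of rows; row i lists the
    coefficients with which information symbol i enters each code symbol.
    """
    return max(sum(1 for value in row if value != 0) for row in G)


# Hypothetical systematic generator matrix [I_3 | P] (illustration only).
G_toy = [
    [1, 0, 0, 1, 0],
    [0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1],
]

t = update_efficiency(G_toy)
print("update efficiency t =", t)
```

In this toy matrix the second information symbol enters three code symbols (itself and both check symbols), which is the maximum over all rows.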
Description of the drawings
Fig. 1 is a flow chart of the construction method for optimal locally repairable systematic codes based on packets according to the present invention.
Specific embodiment
The technical solution of the present invention is described in detail below with reference to the drawing and embodiments, so that those skilled in the art can understand the solution more clearly; the description is not intended to limit the protection scope of the present invention.
As shown in Fig. 1, the optimal locally repairable systematic code to be constructed is taken as the target code. The packet used to generate the code can be constructed from the parameters of the given target code, and the number n2 of non-local check columns can then be obtained from these parameters. Depending on the value of n2, one of two different methods is used to construct the optimal locally repairable systematic code. The specific steps of the construction method based on packets according to the present invention are as follows:
Step 1: construct the packet used to generate the optimal locally repairable systematic code.
A packet describes how the original information data are divided. A (k, R, 1) packet can be represented by a pair (X, β), where k means that k original information data items are to be divided, i.e. the original information data contain k information symbols, and the set of these k information symbols is X. A subset of X is denoted by B and is called a block; the set whose elements are the blocks B is β. Given a division, the data set X is divided into blocks; under that division the blocks are pairwise disjoint and the union of all blocks is exactly X. Different blocks may contain different numbers of elements. For the set β of blocks, the element counts of all blocks in β together form the set R. It follows that the elements of R are positive integers and the largest of them does not exceed k.
If every block B contains the same number r of elements, R can be written as R = {r}, where r means that when one information symbol of a codeword is lost or damaged, at most r of the remaining n-1 symbols of that codeword need to take part in recovering it. In this case the (k, R, 1) packet is equivalent to a (k, r, 1) packet. Each division of X yields a number of blocks, and combining the blocks obtained under all divisions gives the full collection of blocks. Let a denote any element of X. If every element a appears in exactly t blocks of the full collection, the (k, R, 1) packet is called regular and is denoted a t-regular (k, R, 1) packet. If X has u divisions with u ≥ 2, the packet (X, β) is called decomposable and is denoted (k, R, 1; u). Clearly, a (k, R, 1; u) packet is a u-regular (k, R, 1; u) packet.
Given a data set X = {1, 2, 3, 4, 5, 6, 7, 8} with 8 elements, four divisions of X according to the definitions above are:
β1 = {{2,3,8}, {6,7,4}, {1,5}}, β2 = {{3,4,1}, {7,8,5}, {2,6}}
β3 = {{4,5,2}, {8,1,6}, {3,7}}, β4 = {{5,6,3}, {1,2,7}, {4,8}}
(X, β) is a decomposable (8, {3,2}, 1; 4) packet, where β is the union of β1, β2, β3 and β4.
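The properties claimed for this example can be checked mechanically. The following sketch verifies that each of the four divisions is a partition of X, collects the block sizes into R, and counts how often each element occurs across all blocks; the helper name is_partition is an illustrative choice, not terminology from the text.

```python
from collections import Counter

X = set(range(1, 9))
divisions = [
    [{2, 3, 8}, {6, 7, 4}, {1, 5}],  # beta_1
    [{3, 4, 1}, {7, 8, 5}, {2, 6}],  # beta_2
    [{4, 5, 2}, {8, 1, 6}, {3, 7}],  # beta_3
    [{5, 6, 3}, {1, 2, 7}, {4, 8}],  # beta_4
]


def is_partition(blocks, ground_set):
    """True if the blocks are pairwise disjoint and their union is ground_set."""
    total = sum(len(block) for block in blocks)
    return set().union(*blocks) == ground_set and total == len(ground_set)


all_blocks = [block for division in divisions for block in division]
R = {len(block) for block in all_blocks}                # set of block sizes
occurrences = Counter(x for block in all_blocks for x in block)
is_regular = len(set(occurrences.values())) == 1        # same count for every element
t = occurrences[1]                                      # blocks containing element 1
u = len(divisions)

print("all partitions:", all(is_partition(d, X) for d in divisions))
print("R =", R, "regular:", is_regular, "t =", t, "u =", u)
```

Because every element lies in exactly one block of each division, the regularity t equals the number of divisions u, which matches the statement that a (k, R, 1; u) packet is u-regular.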
Step 2: obtain the number n1 of local check columns and the number n2 of non-local check columns.
Let G = (e_1, e_2, ..., e_k | p_1, p_2, ..., p_{n-k}) be the generator matrix of the target code. G consists of the k × k identity matrix I and the k × (n-k) matrix P; the columns of I are denoted e_i and the columns of P are denoted p_i. supp(p_i) denotes the set of row numbers of all non-zero entries in the i-th column of P, and |supp(p_i)| denotes the number of non-zero entries in that column. If the i-th column of P satisfies |supp(p_i)| ≤ r, the column is called a local check column, and n1 denotes the number of local check columns in a generator matrix G. If the i-th column of P does not satisfy |supp(p_i)| ≤ r, the column is called a non-local check column, and n2 denotes the number of non-local check columns in a generator matrix G.
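The classification of check columns by the size of their support can be sketched as follows. The matrix P_columns is a hypothetical example chosen only to show both kinds of column; the function names are likewise illustrative.

```python
def support(column):
    """Row numbers (1-based) of the non-zero entries of a column."""
    return {row for row, value in enumerate(column, start=1) if value != 0}


def count_check_columns(P_columns, r):
    """Return (n1, n2): columns of P with |supp| <= r, and the remaining columns."""
    n1 = sum(1 for column in P_columns if len(support(column)) <= r)
    n2 = len(P_columns) - n1
    return n1, n2


# Hypothetical check part P with k = 4 rows, listed column by column.
P_columns = [
    [1, 1, 0, 0],  # support {1, 2}       -> local for r = 2
    [0, 0, 1, 1],  # support {3, 4}       -> local for r = 2
    [1, 2, 3, 4],  # support {1, 2, 3, 4} -> non-local for r = 2
]
k, r = 4, 2
n1, n2 = count_check_columns(P_columns, r)
n = k + len(P_columns)

print("n1 =", n1, "n2 =", n2, "n =", n)
```

Since every column of P is either local or non-local, the counts always satisfy n1 + n2 + k = n.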
For an (n, k, d) linear code with n symbols, k information symbols and minimum Hamming distance d, the relation n1 + n2 + k = n holds, where the minimum Hamming distance d is the minimum number of positions in which any two distinct codewords of the code C differ. n1 is obtained from the parameters of the given target code as the largest integer not exceeding a bound determined by those parameters (the expression is not reproduced in this text). Here the symbol δ denotes the minimum value of the minimum distance of the code obtained by deleting one symbol of C. From n1 + n2 + k = n, the number of non-local check columns is n2 = n - n1 - k.
Step 3: obtain the optimal locally repairable systematic code.
Depending on the number n2 of non-local check columns (n2 is non-negative), the cases n2 = 0 and n2 > 0 each correspond to one method of constructing the generator matrix of the optimal locally repairable systematic code, and the code is obtained from the generator matrix. The specific steps of each method are as follows:
(3-1) The number n2 of non-local check columns is 0.
Construct a packet (X, β) in which the data set X has only one division. Under this division X is divided into n1 blocks, and the set β whose elements are these n1 blocks is written β = {B_1, B_2, ..., B_{n1}}. Each block B_i (1 ≤ i ≤ n1) in β is rewritten as a k × 1 local check column p_i (k rows, 1 column); p_i is initially an all-zero column, and i is the index of the corresponding block B_i.
Let the elements of the i-th block B_i be B_{i1}, B_{i2}, ..., B_{i|B_i|}, where |B_i| is the number of elements of B_i. The entries of the local check column p_i in rows B_{i1}, B_{i2}, ..., B_{i|B_i|} are set to 1.
The local check columns p_i are arranged in order to form a k × n1 submatrix, which is combined with the k × k identity matrix to give the generator matrix G = (e_1, e_2, ..., e_k | p_1, p_2, ..., p_{n1}) of the optimal locally repairable systematic code.
The k original information symbols to be encoded are denoted M = (m_1, m_2, m_3, ..., m_k), also called the 1 × k information matrix. With G the k × n generator matrix of the optimal locally repairable systematic code, the code is obtained from the expression M × G = C.
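The steps of method (3-1) can be sketched in a few lines. The text does not state the field over which M × G is computed; the sketch assumes binary arithmetic (modulo 2), and the function names and the small block set are hypothetical.

```python
def build_generator(k, blocks):
    """Return G = [I_k | P] as a list of k rows.

    Column i of P is the indicator of blocks[i]: entry 1 in the rows
    listed in the block (1-based), 0 elsewhere.
    """
    G = []
    for row in range(1, k + 1):
        identity_part = [1 if col == row else 0 for col in range(1, k + 1)]
        check_part = [1 if row in block else 0 for block in blocks]
        G.append(identity_part + check_part)
    return G


def encode(M, G):
    """Compute C = M x G with arithmetic modulo 2 (assumed field)."""
    n = len(G[0])
    return [sum(m * row[j] for m, row in zip(M, G)) % 2 for j in range(n)]


# Hypothetical toy input: k = 4 information symbols, two blocks.
blocks = [{1, 2}, {3, 4}]
G = build_generator(4, blocks)
C = encode([1, 0, 1, 1], G)
print(C)
```

The first k entries of C repeat the information symbols, and each further entry is the sum of the information symbols indexed by one block.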
(3-2) The number n2 of non-local check columns is greater than 0.
In this case, the optimal locally repairable systematic code is constructed from a decomposable packet and a maximum distance separable (MDS) code. The specific steps are as follows:
The decomposable packet (X, β) has u divisions of X, denoted β_1, β_2, ..., β_u. The elements of the i-th block B_{ui} of the u-th division β_u are denoted B_{ui1}, B_{ui2}, ..., B_{ui|B_{ui}|}, where |B_{ui}| is the number of elements in the i-th block.
The i-th block of β_u is rewritten as an all-zero k × 1 check column p_{ui} (k rows, 1 column), where p_{ui} denotes the column generated from the i-th block of X under the u-th division.
The entries of p_{ui} in rows B_{ui1}, B_{ui2}, ..., B_{ui|B_{ui}|} are set, in order, to the entries in the same rows of the i-th column of the generator matrix G' = (e_1, e_2, ..., e_k | p_1, p_2, ..., p_{n-k}) of the MDS code; all other entries remain unchanged.
The columns p_{ui} are arranged in order and together replace columns k+1 to k+u of the generator matrix G' of the MDS code, giving the generator matrix G of the optimal locally repairable systematic code.
The k original information symbols to be encoded are denoted M = (m_1, m_2, m_3, ..., m_k), also called the 1 × k information matrix. With G the k × n generator matrix of the optimal locally repairable systematic code, the code is obtained from the expression M × G = C.
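The column replacement of method (3-2) can be sketched as below. Several points are assumptions made only for illustration: the MDS generator matrix is taken to be [I | A] with A a Cauchy matrix over the prime field GF(17), which is one standard way to obtain a systematic MDS code; the function names are hypothetical; and, following the replacement of columns k+1 to k+u, every block of the j-th division copies its entries from column k+j.

```python
Q = 17  # assumed prime field size, chosen only for illustration


def cauchy_mds_generator(k, m, q=Q):
    """Hypothetical MDS generator [I_k | A], A a k x m Cauchy matrix over GF(q)."""
    rows = []
    for x in range(k):
        identity = [1 if c == x else 0 for c in range(k)]
        # Entry 1/(x - y) mod q, with x in 0..k-1 and y in k..k+m-1 all distinct.
        cauchy = [pow((x - y) % q, -1, q) for y in range(k, k + m)]
        rows.append(identity + cauchy)
    return rows


def split_column(column, division):
    """One new column per block: entries in the block's rows are copied, others are 0."""
    return [[value if row in block else 0
             for row, value in enumerate(column, start=1)]
            for block in division]


def replace_columns(W, k, divisions):
    """Replace columns k+1..k+u of W by the split columns of divisions 1..u."""
    columns = [list(c) for c in zip(*W)]
    new_columns = columns[:k]
    for j, division in enumerate(divisions):
        new_columns.extend(split_column(columns[k + j], division))
    new_columns.extend(columns[k + len(divisions):])
    return [list(r) for r in zip(*new_columns)]


k = 8
W = cauchy_mds_generator(k, 8)          # 8 x 16 matrix
beta1 = [{2, 3, 8}, {6, 7, 4}, {1, 5}]
beta2 = [{3, 4, 1}, {7, 8, 5}, {2, 6}]
G = replace_columns(W, k, [beta1, beta2])
print(len(G), "x", len(G[0]))
```

Each replaced column is split into as many columns as its division has blocks, so two columns of W become six columns of G and the code length grows from 16 to 20; the untouched columns of W are carried over unchanged.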
A specific embodiment based on the inventive principle is given below. Assume that the optimal locally repairable systematic code to be constructed has code length n = 16, i.e. each codeword has 16 symbols. Let the number of information symbols of the original information data in the distributed storage system be k = 8, let the recovery of one lost or damaged information symbol require at most r = 3 other existing symbols of the code C, and let the minimum distance of the code obtained by deleting one symbol of C be at least δ = 4.
Embodiment one
This embodiment corresponds to step (3-1), in which n2 equals 0.
Construct a data set X = {1, 2, 3, 4, 5, 6, 7, 8} with 8 elements and assume that X has only one division. From step 1, the division of a 3-regular (8, 3, 1) packet can be written as:
β = {{2,3,8}, {3,4,1}, {4,5,2}, {5,6,3}, {6,7,4}, {7,8,5}, {8,1,6}, {1,2,7}}
The number n1 of local check columns can be expressed as:
The number n2 of non-local check columns can be expressed as:
n2 = n - n1 - k = 16 - 8 - 8 = 0.
The first block of β is {2,3,8}, which is a subset of X. In the column vector corresponding to this block, the entries in rows 2, 3 and 8 are 1 and the remaining entries are 0, so the column is written (0,1,1,0,0,0,0,1)^T. Blocks 2 to 8 generate 7 further column vectors in the same way. Placing the 8 column vectors in order to the right of the k × k identity matrix gives the generator matrix of the optimal locally repairable systematic code:
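The generator matrix described for Embodiment one can be produced programmatically from the eight blocks. The sketch below also encodes a hypothetical message and repairs one lost information symbol from r = 3 other symbols; binary arithmetic is an assumption, since the field is not stated in the text.

```python
k = 8
blocks = [{2, 3, 8}, {3, 4, 1}, {4, 5, 2}, {5, 6, 3},
          {6, 7, 4}, {7, 8, 5}, {8, 1, 6}, {1, 2, 7}]

# G = [I_8 | P]; column i of P has 1 in the rows listed in blocks[i].
G = [[1 if col == row else 0 for col in range(1, k + 1)]
     + [1 if row in block else 0 for block in blocks]
     for row in range(1, k + 1)]

for row in G:
    print(" ".join(str(value) for value in row))

# Encode a hypothetical message M with arithmetic modulo 2 (assumed field).
M = [1, 0, 1, 1, 0, 0, 1, 0]
C = [sum(M[i] * G[i][j] for i in range(k)) % 2 for j in range(len(G[0]))]

# Local repair: suppose information symbol 2 is lost. The first block {2, 3, 8}
# gives the check symbol p1 = m2 + m3 + m8, so m2 follows from m3, m8 and p1.
lost = 2
block_index = 0
helpers = [C[x - 1] for x in sorted(blocks[block_index]) if x != lost]
helpers.append(C[k + block_index])
recovered = sum(helpers) % 2
print("recovered m2 =", recovered)
```

Only three symbols (m3, m8 and the first check symbol) are read during the repair, compared with the k = 8 symbols an MDS code would need.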
Embodiment two
This embodiment corresponds to step (3-2), in which n2 is greater than 0.
Let the generator matrix of the MDS code be W, that is:
Construct a data set X = {1, 2, 3, 4, 5, 6, 7, 8} with 8 elements. From step 1, assume that X has two divisions, denoted β1 = {{2,3,8}, {6,7,4}, {1,5}} and β2 = {{3,4,1}, {7,8,5}, {2,6}}.
The number n1 of local check columns can be expressed as:
The number n2 of non-local check columns can be expressed as:
n2 = n - n1 - k = 20 - 8 - 8 = 4 > 0
The data set X has two divisions, β1 and β2, so columns 9 and 10 of the generator matrix W of the MDS code are to be replaced. The method is as follows:
β1 has three blocks, {2,3,8}, {6,7,4} and {1,5}, so the matrix that replaces column 9 of W is a new matrix with 8 rows and 3 columns. In the first column of the new matrix, the entries in rows 2, 3 and 8 are the same as the entries in rows 2, 3 and 8 of column 9 of W. In the second column, the entries in rows 6, 7 and 4 are the same as the entries in rows 6, 7 and 4 of column 9 of W. In the third column, the entries in rows 1 and 5 are the same as the entries in rows 1 and 5 of column 9 of W. All other entries of the new matrix are 0.
β2 has three blocks, {3,4,1}, {7,8,5} and {2,6}, so the matrix that replaces column 10 of W is also a new matrix with 8 rows and 3 columns. In the first column of the new matrix, the entries in rows 3, 4 and 1 are the same as the entries in rows 3, 4 and 1 of column 10 of W. In the second column, the entries in rows 7, 8 and 5 are the same as the entries in rows 7, 8 and 5 of column 10 of W. In the third column, the entries in rows 2 and 6 are the same as the entries in rows 2 and 6 of column 10 of W. All other entries of the new matrix are 0.
Following the above method gives 2 new matrices. Replacing columns 9 and 10 of W with these 2 new matrices gives the generator matrix of the optimal locally repairable systematic code:
The above describes only preferred embodiments of the present invention, and the technical solution of the present invention is not limited to them. Any known variation made by those skilled in the art on the basis of the main technical concept of the present invention falls within the technical scope claimed by the present invention; the specific protection scope of the present invention is defined by the claims.