A construction method for optimal locally repairable systematic codes based on packings
Technical field
The invention belongs to the field of coding technology, and more particularly relates to a construction method for
optimal locally repairable systematic codes based on packings, which can be used in distributed storage systems to
repair data lost or damaged on failed disk nodes.
Background technology
Large-scale cloud storage and distributed file systems, such as Amazon Elastic Block Store (EBS) and the Google
File System (GoogleFS), have reached a very large scale. At this scale, disk failure has become the norm. Two
conventional schemes protect data against disk failure in such systems. The first copies the data packets directly
to different disks, thereby ensuring the integrity of the information. This scheme is the simplest but requires a
huge storage overhead. The second encodes the data to be stored with a maximum distance separable (MDS) code; in
general, practical systems prefer this scheme. If k information symbols are encoded into n symbols, then when any
one of the k information symbols is lost or damaged, an MDS code needs only k of the remaining n-1 symbols to
recover it. Compared with copying to different disks, an MDS code can significantly improve redundancy and
reliability. Even so, an MDS code with n symbols and k information symbols also has a drawback: recovering a single
lost or damaged symbol always requires reading k surviving symbols from disk to take part in the repair. In
large-scale distributed file systems in particular, this makes the overhead of MDS codes large.
Summary of the invention
In view of the deficiencies of the above prior art, the purpose of the present invention is to propose a
construction method for optimal locally repairable systematic codes based on packings, so as to solve the problem
of excessive storage overhead in distributed storage, reduce the number of surviving symbols that take part in
repairing a failed disk node, and lower the update efficiency (the number of code symbols that must change when one
information symbol changes).
To achieve the above purpose, the present invention adopts the following technical solution:
A construction method for optimal locally repairable systematic codes based on packings, in which the optimal
locally repairable systematic code to be constructed is called the target code. A packing used to generate the
optimal locally repairable systematic code is constructed from the parameters of the given target code, and the
number n2 of non-local parity columns is then obtained from these parameters. Depending on the value of n2, one of
two different methods is selected to construct the optimal locally repairable systematic code. The method proceeds
as follows:
Step one: construct the packing used to generate the optimal locally repairable systematic code;
Step two: obtain the number n1 of local parity columns and the number n2 of non-local parity columns;
Step three: obtain the optimal locally repairable systematic code.
Further, in the construction method for optimal locally repairable systematic codes, the packing of step one is
constructed as follows:
A packing describes how the original information data is partitioned. A (k, R, 1) packing is represented by a pair
(X, β), where k is the number of information symbols in the original information data to be partitioned and X is
the set formed by these k information symbols. A subset of X is denoted B and is called a block, and β is the set
whose elements are the blocks B;
A partition divides the set X into different blocks such that no two blocks intersect and the union of all blocks
is exactly X. Different blocks may have different numbers of elements. For the set β of blocks, the block sizes
occurring in β together form the set R; every element of R is a positive integer not exceeding k;
If every block B has the same number of elements r, then R can be written as R = {r}. Here r means that when one
information symbol of a codeword is lost or damaged, at most r of the remaining n-1 symbols of the original
codeword are needed to recover it. In this case the (k, R, 1) packing is equivalent to a (k, r, 1) packing;
Each partition of X yields a number of blocks, and the blocks obtained under all partitions together form the
total collection of blocks. Let a denote any element of X. If every element a appears in exactly t blocks of the
total collection, the (k, R, 1) packing is called regular and is denoted a t-regular (k, R, 1) packing;
If X has u partitions with u ≥ 2, the packing (X, β) is called resolvable and is denoted (k, R, 1; u). Clearly, a
(k, R, 1; u) packing is a u-regular (k, R, 1; u) packing.
Further, in the construction method for optimal locally repairable systematic codes, the number n1 of local parity
columns and the number n2 of non-local parity columns of step two are obtained as follows:
Let G = (e_1, e_2, ..., e_k | p_1, p_2, ..., p_{n-k}) be the generator matrix of the target code. G consists of the
k × k identity matrix I and a k × (n-k) matrix P; the columns of I are denoted e_i and the columns of P are denoted
p_i. supp(p_i) denotes the set of row indices of the nonzero entries in the i-th column of P, and |supp(p_i)|
denotes the number of nonzero entries in the i-th column of P;
If the i-th column of P satisfies |supp(p_i)| ≤ r, the column is called a local parity column; n1 denotes the
number of local parity columns in a generator matrix G;
If the i-th column of P does not satisfy |supp(p_i)| ≤ r, the column is called a non-local parity column; n2
denotes the number of non-local parity columns in a generator matrix G;
For an (n, k, d) linear code with n symbols, k information symbols and minimum Hamming distance d, it follows that
n1 + n2 + k = n, where the minimum Hamming distance d is the minimum number of positions in which any two distinct
codewords of the code C differ;
From the parameters of the given target code, n1 is obtained as an integer not exceeding an expression in these
parameters [formula not reproduced in this text], where δ denotes the minimum value of the minimum distance of the
code obtained after deleting one symbol of the code C. From n1 + n2 + k = n, the number of non-local parity columns
is n2 = n - n1 - k.
Further, in the construction method for optimal locally repairable systematic codes, the optimal locally repairable
systematic code of step three is obtained according to the value of n2: the cases n2 = 0 and n2 > 0 each correspond
to one construction of the generator matrix of the optimal locally repairable systematic code, and the code is
obtained from the generator matrix.
Further, in the construction method for optimal locally repairable systematic codes, when the number of non-local
parity columns n2 = 0, a packing (X, β) is constructed in which the data set X has only one partition. X is divided
into n1 blocks, and the set β with these n1 blocks as elements is written β = {B_1, B_2, ..., B_{n1}}. Each block
B_i (1 ≤ i ≤ n1) in β is rewritten as a k × 1 local parity column p_i (k rows, 1 column), where i is the index of
the corresponding block B_i; p_i is initially all zeros;
If the elements of the i-th block B_i are written B_{i,1}, B_{i,2}, ..., B_{i,|B_i|}, where |B_i| is the number of
elements of B_i, the entries of p_i in rows B_{i,1}, B_{i,2}, ..., B_{i,|B_i|} are set to 1;
The k × n1 submatrix formed by arranging the local parity columns p_i in order is combined with the k × k identity
matrix to obtain the generator matrix G = (e_1, e_2, ..., e_k | p_1, p_2, ..., p_{n1}) of the optimal locally
repairable systematic code;
The k original information symbols to be encoded are written M = (m_1, m_2, m_3, ..., m_k), also called the 1 × k
information matrix. G is the k × n generator matrix of the optimal locally repairable systematic code, and the code
is obtained from the expression M × G = C.
Further, in the construction method for optimal locally repairable systematic codes, when the number of non-local
parity columns n2 > 0, the optimal locally repairable systematic code is constructed using a resolvable packing and
an MDS code:
The resolvable packing (X, β) has u partitions of X, denoted β_1, β_2, ..., β_u. The elements of the i-th block
B_{u,i} of the u-th partition β_u are denoted B_{u,i,1}, B_{u,i,2}, ..., B_{u,i,|B_{u,i}|}, where |B_{u,i}| is the
number of elements in the i-th block;
The i-th block of β_u is rewritten as an all-zero k × 1 parity column p_{u,i} (k rows, 1 column); p_{u,i} denotes
the column generated from the i-th block of X under the u-th partition;
The entries of p_{u,i} in rows B_{u,i,1}, B_{u,i,2}, ..., B_{u,i,|B_{u,i}|} are set, in turn, to the entries in the
same rows of the i-th column of the MDS generator matrix G' = (e_1, e_2, ..., e_k | p_1, p_2, ..., p_{n-k}); the
remaining entries are unchanged;
The columns p_{u,i} are arranged in order and together replace columns k+1 to k+u of the MDS generator matrix G',
giving the generator matrix G of the optimal locally repairable systematic code;
The k original information symbols to be encoded are written M = (m_1, m_2, m_3, ..., m_k), also called the 1 × k
information matrix. G is the k × n generator matrix of the optimal locally repairable systematic code, and the code
is obtained from the expression M × G = C.
Compared with the prior art, the present invention has the following technical effects:
The packing-based construction method for optimal locally repairable systematic codes proposed by the present
invention partitions the set X in the packing, which reduces the number of surviving symbols that take part in
repairing a failed disk node, thereby reducing the storage overhead and lowering the update efficiency, so that the
resulting locally repairable systematic code is optimal. Here, the update efficiency is the maximum number of
symbols of the code C that must change when one information symbol changes. The update efficiency is denoted t,
with t ≥ d; when t = d, the code C has optimal update efficiency. The minimum Hamming distance d is the minimum
number of positions in which any two distinct codewords of the code C differ.
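As a minimal sketch of the update efficiency defined above: for a linear code with C = M × G, changing the
information symbol m_i changes exactly those code symbols where row i of G is nonzero, so t equals the largest row
weight of G. The function name and the small example matrix are illustrative assumptions and are not taken from the
invention.

```python
def update_efficiency(G):
    """Return t, the largest number of nonzero entries in any row of G."""
    return max(sum(1 for entry in row if entry != 0) for row in G)


# Hypothetical 2 x 4 systematic generator matrix, used only for illustration.
G_example = [
    [1, 0, 1, 1],
    [0, 1, 0, 1],
]
t = update_efficiency(G_example)
```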
Brief description of the drawings
Fig. 1 is a flow chart of the packing-based construction method for optimal locally repairable systematic codes of
the present invention.
Specific embodiment
The technical solution of the present invention is described in detail below with reference to the drawings and
embodiments, so that those skilled in the art can understand the solution more clearly; this does not limit the
scope of protection of the present invention.
As shown in Fig. 1, the optimal locally repairable systematic code to be constructed is called the target code.
From the parameters of the given target code, a packing used to generate the optimal locally repairable systematic
code can be constructed, and the number n2 of non-local parity columns can then be obtained from these parameters.
Depending on the value of n2, one of two different methods is selected to construct the optimal locally repairable
systematic code. The specific steps of the packing-based construction method of the present invention are as
follows:
Step one: construct the packing used to generate the optimal locally repairable systematic code.
A packing can be used to describe how the original information data is partitioned. A (k, R, 1) packing can be
represented by a pair (X, β), where k is the number of information symbols in the original information data to be
partitioned and X is the set formed by these k information symbols. A subset of X is denoted B and is called a
block, and β is the set whose elements are the blocks B. Given a partition, the data set X is divided into
different blocks such that no two blocks intersect and the union of all blocks is exactly X. Different blocks may
have different numbers of elements. For the set β of blocks, the block sizes occurring in β together form the set
R. It follows that every element of R is a positive integer not exceeding k.
If every block B has the same number of elements r, then R can be written as R = {r}. Here r means that when one
information symbol of a codeword is lost or damaged, at most r of the remaining n-1 symbols of the original
codeword are needed to recover it. In this case the (k, R, 1) packing is equivalent to a (k, r, 1) packing. Each
partition of X yields a number of blocks, and the blocks obtained under all partitions together form the total
collection of blocks. Let a denote any element of X. If every element a appears in exactly t blocks of the total
collection, the (k, R, 1) packing is called regular and is denoted a t-regular (k, R, 1) packing. If X has u
partitions with u ≥ 2, the packing (X, β) is called resolvable and is denoted (k, R, 1; u). Clearly, a
(k, R, 1; u) packing is a u-regular (k, R, 1; u) packing.
Given a data set X = {1, 2, 3, 4, 5, 6, 7, 8} with 8 elements, four partitions of X that follow the definitions
above are:
β_1 = {{2,3,8}, {6,7,4}, {1,5}}, β_2 = {{3,4,1}, {7,8,5}, {2,6}},
β_3 = {{4,5,2}, {8,1,6}, {3,7}}, β_4 = {{5,6,3}, {1,2,7}, {4,8}}.
Then (X, β) is an (8, {3,2}, 1; 4) resolvable packing, where β is the union of β_1, β_2, β_3 and β_4.
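The properties claimed for this example can be checked mechanically. The sketch below tests that each β_j is a
partition of X, computes the set R of block sizes, and finds the regularity t. The helper names are hypothetical.
The last check reads the "1" in (k, R, 1) as "every pair of elements lies in at most one block", which is the usual
meaning in combinatorial design but is an assumption here, since the text does not spell it out.

```python
from itertools import combinations

X = set(range(1, 9))
partitions = [
    [{2, 3, 8}, {6, 7, 4}, {1, 5}],   # beta_1
    [{3, 4, 1}, {7, 8, 5}, {2, 6}],   # beta_2
    [{4, 5, 2}, {8, 1, 6}, {3, 7}],   # beta_3
    [{5, 6, 3}, {1, 2, 7}, {4, 8}],   # beta_4
]


def is_partition(blocks, ground_set):
    """True if the blocks are pairwise disjoint and their union is ground_set."""
    total = sum(len(b) for b in blocks)
    return total == len(ground_set) and set().union(*blocks) == ground_set


def regularity(blocks, ground_set):
    """Return t if every element lies in exactly t blocks, else None."""
    counts = {a: sum(a in b for b in blocks) for a in ground_set}
    values = set(counts.values())
    return values.pop() if len(values) == 1 else None


def max_pair_multiplicity(blocks):
    """Largest number of blocks that contain the same pair of elements."""
    seen = {}
    for b in blocks:
        for pair in combinations(sorted(b), 2):
            seen[pair] = seen.get(pair, 0) + 1
    return max(seen.values())


all_blocks = [b for part in partitions for b in part]
R = {len(b) for b in all_blocks}                      # set of block sizes
all_are_partitions = all(is_partition(p, X) for p in partitions)
t = regularity(all_blocks, X)                         # regularity of the packing
pair_limit = max_pair_multiplicity(all_blocks)        # assumed meaning of the "1"
```

With u = 4 partitions, each element lies in one block per partition, which is why the regularity equals u.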
Step two: obtain the number n1 of local parity columns and the number n2 of non-local parity columns.
Let G = (e_1, e_2, ..., e_k | p_1, p_2, ..., p_{n-k}) be the generator matrix of the target code. G consists of the
k × k identity matrix I and a k × (n-k) matrix P; the columns of I are denoted e_i and the columns of P are denoted
p_i. supp(p_i) denotes the set of row indices of the nonzero entries in the i-th column of P, and |supp(p_i)|
denotes the number of nonzero entries in the i-th column of P. If the i-th column of P satisfies |supp(p_i)| ≤ r,
the column is called a local parity column; n1 denotes the number of local parity columns in a generator matrix G.
If the i-th column of P does not satisfy |supp(p_i)| ≤ r, the column is called a non-local parity column; n2
denotes the number of non-local parity columns in a generator matrix G.
For an (n, k, d) linear code with n symbols, k information symbols and minimum Hamming distance d, it follows that
n1 + n2 + k = n, where the minimum Hamming distance d is the minimum number of positions in which any two distinct
codewords of the code C differ. From the parameters of the given target code, n1 is obtained as an integer not
exceeding an expression in these parameters [formula not reproduced in this text], where δ denotes the minimum
value of the minimum distance of the code obtained after deleting one symbol of the code C. From n1 + n2 + k = n,
the number of non-local parity columns is n2 = n - n1 - k.
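The classification of parity columns in step two can be sketched as follows. The code counts n1 and n2 directly
from a given parity part P by applying the test |supp(p_i)| ≤ r; it does not use the closed-form expression for n1.
The function names and the small matrix P are hypothetical and serve only as an illustration.

```python
def support(column):
    """Row numbers (1-based) of the nonzero entries of a column."""
    return {row + 1 for row, value in enumerate(column) if value != 0}


def count_parity_columns(P_columns, r):
    """Return (n1, n2): numbers of local and non-local parity columns."""
    n1 = sum(1 for col in P_columns if len(support(col)) <= r)
    n2 = len(P_columns) - n1
    return n1, n2


# Hypothetical parity part P for k = 4, listed column by column.
P_columns = [
    [1, 1, 0, 0],   # support {1, 2}: local when r = 2
    [0, 0, 1, 1],   # support {3, 4}: local when r = 2
    [1, 2, 3, 4],   # support {1, 2, 3, 4}: non-local when r = 2
]
k, r = 4, 2
n1, n2 = count_parity_columns(P_columns, r)
n = n1 + n2 + k     # code length implied by n1 + n2 + k = n
```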
Step three: obtain the optimal locally repairable systematic code.
According to the value of the number n2 of non-local parity columns (n2 is non-negative), the cases n2 = 0 and
n2 > 0 each correspond to one construction of the generator matrix of the optimal locally repairable systematic
code, and the code is obtained from the generator matrix. The specific steps of each method are as follows:
(3-1) When the number of non-local parity columns n2 is 0.
Construct a packing (X, β) in which the data set X has only one partition. Under this partition, X is divided into
n1 blocks, and the set β with these n1 blocks as elements is written β = {B_1, B_2, ..., B_{n1}}. Each block B_i
(1 ≤ i ≤ n1) in β is rewritten as a k × 1 local parity column p_i (k rows, 1 column), where i is the index of the
corresponding block B_i; p_i is initially all zeros.
If the elements of the i-th block B_i are written B_{i,1}, B_{i,2}, ..., B_{i,|B_i|}, where |B_i| is the number of
elements of B_i, the entries of p_i in rows B_{i,1}, B_{i,2}, ..., B_{i,|B_i|} are set to 1.
The k × n1 submatrix formed by arranging the local parity columns p_i in order is combined with the k × k identity
matrix to obtain the generator matrix G = (e_1, e_2, ..., e_k | p_1, p_2, ..., p_{n1}) of the optimal locally
repairable systematic code.
The k original information symbols to be encoded are written M = (m_1, m_2, m_3, ..., m_k), also called the 1 × k
information matrix. G is the k × n generator matrix of the optimal locally repairable systematic code, and the code
is obtained from the expression M × G = C.
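Step (3-1) can be sketched in a few lines. The text does not name the field over which M × G is computed, so the
sketch assumes binary symbols with arithmetic modulo 2; the function names and the toy parameters (k = 4, two
blocks) are likewise hypothetical. The last line illustrates local repair: a lost information symbol is recovered
from the parity symbol of its block and the other members of that block.

```python
def build_generator(k, blocks):
    """Return G = (I_k | p_1 ... p_n1) as a list of k rows."""
    G = []
    for row in range(1, k + 1):
        identity_part = [1 if col == row else 0 for col in range(1, k + 1)]
        parity_part = [1 if row in block else 0 for block in blocks]
        G.append(identity_part + parity_part)
    return G


def encode(M, G):
    """Compute C = M x G with arithmetic modulo 2 (assumed field GF(2))."""
    n = len(G[0])
    return [sum(m * G[i][j] for i, m in enumerate(M)) % 2 for j in range(n)]


# Hypothetical toy parameters: k = 4 and one partition of X = {1, 2, 3, 4}.
blocks = [{1, 2}, {3, 4}]
G = build_generator(4, blocks)
C = encode([1, 0, 1, 1], G)

# Local repair: a lost m_1 is recovered from the parity of block {1, 2} and m_2.
repaired_m1 = (C[4] + C[1]) % 2
```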
(3-2) When the number of non-local parity columns n2 is greater than 0.
In this case, the optimal locally repairable systematic code can be constructed using a resolvable packing and an
MDS code. The specific steps are as follows:
The resolvable packing (X, β) has u partitions of X, denoted β_1, β_2, ..., β_u. The elements of the i-th block
B_{u,i} of the u-th partition β_u are denoted B_{u,i,1}, B_{u,i,2}, ..., B_{u,i,|B_{u,i}|}, where |B_{u,i}| is the
number of elements in the i-th block.
The i-th block of β_u is rewritten as an all-zero k × 1 parity column p_{u,i} (k rows, 1 column); p_{u,i} denotes
the column generated from the i-th block of X under the u-th partition.
The entries of p_{u,i} in rows B_{u,i,1}, B_{u,i,2}, ..., B_{u,i,|B_{u,i}|} are set, in turn, to the entries in the
same rows of the i-th column of the MDS generator matrix G' = (e_1, e_2, ..., e_k | p_1, p_2, ..., p_{n-k}); the
remaining entries are unchanged.
The columns p_{u,i} are arranged in order and together replace columns k+1 to k+u of the MDS generator matrix G',
giving the generator matrix G of the optimal locally repairable systematic code.
The k original information symbols to be encoded are written M = (m_1, m_2, m_3, ..., m_k), also called the 1 × k
information matrix. G is the k × n generator matrix of the optimal locally repairable systematic code, and the code
is obtained from the expression M × G = C.
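The column replacement of step (3-2) can be sketched as below. The sketch follows the worked example in Embodiment
two, where every block of the j-th partition copies its entries from parity column k+j of G'; that reading, the
function name, and the small matrix standing in for G' are assumptions. The placeholder entries are plain integers
and are not claimed to form an MDS code.

```python
def split_parity_columns(G_prime, k, partitions):
    """Replace columns k+1..k+u of G_prime by one column per block."""
    u = len(partitions)
    new_columns = []
    for j, partition in enumerate(partitions):       # j-th partition (0-based)
        source = [row[k + j] for row in G_prime]      # parity column k+j+1 of G'
        for block in partition:
            column = [0] * k                          # start from an all-zero column
            for row_number in block:                  # row numbers are 1-based
                column[row_number - 1] = source[row_number - 1]
            new_columns.append(column)
    G = []
    for i, row in enumerate(G_prime):
        G.append(row[:k] + [col[i] for col in new_columns] + row[k + u:])
    return G


# Hypothetical 4 x 7 matrix standing in for G'; its parity entries are
# placeholders and are NOT claimed to form an MDS code.
k = 4
G_prime = [
    [1, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 2, 3],
    [0, 0, 1, 0, 1, 3, 2],
    [0, 0, 0, 1, 1, 4, 5],
]
partitions = [
    [{1, 2}, {3, 4}],   # first partition, applied to column k+1
    [{1, 3}, {2, 4}],   # second partition, applied to column k+2
]
G = split_parity_columns(G_prime, k, partitions)
```

Because the blocks of one partition are disjoint and cover all rows, the new columns derived from one partition sum
to the column of G' they replace.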
Specific embodiments based on the principle of the present invention are given below. Assume that the optimal
locally repairable systematic code to be constructed has code length n = 16, i.e. each codeword has 16 symbols. Let
the number of information symbols of the original information data in the distributed storage system be k = 8, let
the recovery of one lost or damaged information symbol require at most r = 3 other surviving symbols of the code C,
and let the minimum distance of the code obtained after deleting one code symbol of C be at least δ = 4.
Embodiment one
This embodiment corresponds to step (3-1), where n2 equals 0.
Construct a data set X = {1, 2, 3, 4, 5, 6, 7, 8} with 8 elements and assume that X has only one partition scheme.
From step one, the blocks of a 3-regular (8, 3, 1) packing can be written as:
β = {{2,3,8}, {3,4,1}, {4,5,2}, {5,6,3}, {6,7,4}, {7,8,5}, {8,1,6}, {1,2,7}}
The number n1 of local parity columns is n1 = 8, one column for each block of β.
The number n2 of non-local parity columns is:
n2 = n - n1 - k = 16 - 8 - 8 = 0.
The first block of β is {2, 3, 8}, a subset of X. In the column vector corresponding to this block, the entries in
rows 2, 3 and 8 are 1 and the remaining entries are 0, giving (0,1,1,0,0,0,0,1)^T. Blocks 2 to 8 generate 7 further
column vectors in the same way. Placing the 8 column vectors in order to the right of the k × k identity matrix
gives the generator matrix of the optimal locally repairable systematic code, G = (I_8 | p_1, p_2, ..., p_8).
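The generator matrix of Embodiment one follows directly from the eight blocks listed above, so it can be written
out programmatically. The sketch below builds the 8 × 16 matrix and prints it row by row; the variable names are
illustrative only.

```python
k = 8
blocks = [
    {2, 3, 8}, {3, 4, 1}, {4, 5, 2}, {5, 6, 3},
    {6, 7, 4}, {7, 8, 5}, {8, 1, 6}, {1, 2, 7},
]

# Row i of G is row i of the 8 x 8 identity followed by the block indicators.
G = [
    [1 if col == row else 0 for col in range(1, k + 1)]
    + [1 if row in block else 0 for block in blocks]
    for row in range(1, k + 1)
]

first_parity_column = [G[i][k] for i in range(k)]
column_weights = [sum(G[i][k + j] for i in range(k)) for j in range(len(blocks))]
row_weights = [sum(row) for row in G]

for row in G:
    print(" ".join(str(entry) for entry in row))
```

Every parity column has weight 3, matching r = 3, and every row of G has weight 4, because each element of X lies
in exactly three blocks.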
Embodiment two
This embodiment corresponds to step (3-2), where n2 is greater than 0.
Let the generator matrix of the MDS code be W [matrix not reproduced in this text].
Construct a data set X = {1, 2, 3, 4, 5, 6, 7, 8} with 8 elements. From step one, assume that X has two partitions,
written β_1 = {{2,3,8}, {6,7,4}, {1,5}} and β_2 = {{3,4,1}, {7,8,5}, {2,6}}.
The number n1 of local parity columns is n1 = 8.
The number n2 of non-local parity columns is:
n2 = n - n1 - k = 20 - 8 - 8 = 4 > 0
The data set X has two partitions, denoted β_1 and β_2, and columns 9 and 10 of the MDS generator matrix W are to
be replaced. The method is as follows:
β_1 has three blocks, {2,3,8}, {6,7,4} and {1,5}, so the matrix that replaces column 9 of W is a new 8 × 3 matrix
(8 rows, 3 columns). In the first column of the new matrix, the entries in rows 2, 3 and 8 equal the entries in
rows 2, 3 and 8 of column 9 of W. In the second column, the entries in rows 6, 7 and 4 equal the entries in rows 6,
7 and 4 of column 9 of W. In the third column, the entries in rows 1 and 5 equal the entries in rows 1 and 5 of
column 9 of W. All other entries of the new matrix are 0.
β_2 has three blocks, {3,4,1}, {7,8,5} and {2,6}, so the matrix that replaces column 10 of W is also a new 8 × 3
matrix. In the first column of the new matrix, the entries in rows 3, 4 and 1 equal the entries in rows 3, 4 and 1
of column 10 of W. In the second column, the entries in rows 7, 8 and 5 equal the entries in rows 7, 8 and 5 of
column 10 of W. In the third column, the entries in rows 2 and 6 equal the entries in rows 2 and 6 of column 10 of
W. All other entries of the new matrix are 0.
The above method yields 2 new matrices. Replacing columns 9 and 10 of W with these 2 new matrices gives the
generator matrix of the optimal locally repairable systematic code [matrix not reproduced in this text].
The above only describes preferred embodiments of the present invention, and the technical solution of the present
invention is not limited thereto. Any known variation made by those skilled in the art on the basis of the main
technical concept of the present invention falls within the technical scope claimed by the present invention; the
specific scope of protection of the present invention is defined by the claims.