A method for reducing the amount of computation in a binary-coded storage system
Technical field
The present invention relates to a method of computing on data using a binary matrix, and more particularly to a method for reducing the amount of computation in a binary-coded storage system.
Background technology
In recent years, with the widespread application of computer technology and related sensor technology across all industries, information describing the physical world is being produced every second. At the same time, the Internet services used by hundreds of millions of users continually produce new data, and the historical records of people's lives are also growing explosively. This rapid growth of data inevitably brings a continuous increase in storage devices. To meet ever-expanding storage demands, the architecture of data storage systems has kept evolving, from traditional centralized storage to distributed storage, and in recent years new mass-storage models such as cloud storage have appeared. As storage systems grow in scale, how to reduce data redundancy, and hence hardware consumption, while keeping data highly reliable has become a focus of attention in the field of information storage.
Unlike the traditional multi-replica backup strategy, a new kind of storage system built around a coding redundancy strategy has been developed in recent years. A coding-redundancy storage system can provide the same system reliability as a replication strategy while greatly reducing the data redundancy of the storage system, which saves a large amount of hardware investment and power consumption. However, unlike a backup strategy, a coding redundancy strategy is more complex to manage and, most importantly, requires an encoding computation on the data at storage time in order to produce the redundant data. The encoding process consumes a certain amount of system computation. When the computing performance of the system is low, or when the system needs its computing resources for other purposes, the encoding speed drops markedly, which in turn affects the storage speed and efficiency of the system. How to reduce the amount of encoding computation when storing files has therefore long been a focus and a difficulty of erasure-code storage technology. To address this, researchers have proposed storage strategies based on binary coding matrices. In practice, however, it is difficult to directly construct a binary coding matrix that both guarantees the erasure tolerance of the system and has the minimum amount of computation. Actual constructions of binary coding matrices therefore aim only at meeting the erasure-tolerance requirement of the storage system and do not consider whether the encoding process has minimal computation. A method that can reduce the amount of computation of a binary coding matrix is thus urgently needed.
Summary of the invention
The purpose of the present invention is to solve the above problem by providing a method for reducing the amount of computation in a binary-coded storage system.
The present invention achieves this purpose through the following technical solution.
The present invention comprises the following steps:
(1) Let G_{r·m} be an arbitrarily determined binary coding matrix composed of the elements "0" and "1". The matrix is used to produce redundant data, and it can be expressed as:
(2) From the number of "1" elements in each row vector l_1, l_2, ..., l_{r·m} of the binary coding matrix, determine the number of XOR operations required to compute the check bit from that vector, and compute the number of differing bits between any two vectors l_a and l_b;
(3) If the number of "1" elements in vector l_a is k, the system needs k-1 XOR operations to produce redundant data using that vector.
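The XOR count of step (3) and the differing-bit count of step (2) can be illustrated with a minimal Python sketch. The function names `xor_count` and `differing_bits` and the two 5-element vectors are hypothetical, chosen only for illustration; they are not part of the invention.

```python
def xor_count(row):
    """XOR operations needed to produce one check bit directly from a 0/1 row."""
    k = sum(row)            # number of "1" elements in the row
    return max(k - 1, 0)    # combining k operands takes k-1 XORs


def differing_bits(a, b):
    """Number of positions at which two equal-length 0/1 vectors differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical example vectors (not taken from the invention).
l_a = [1, 1, 0, 1, 0]
l_b = [1, 0, 0, 1, 1]

print(xor_count(l_a))            # 2
print(xor_count(l_b))            # 2
print(differing_bits(l_a, l_b))  # 2
```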
Further, the optimization flow for encoding the original file with the whole coding matrix G_{r·m} is as follows:
A: From the number of "1" elements in each row vector of the coding matrix G_{r·m}, determine the number of XOR operations required to compute the check bit from that row vector. If the number of "1" elements in the row vector is k, computing the check bit from that row vector requires (k-1)·m XOR operations, where m is the size of each original data block participating in the check computation;
B: Compare any two row vectors of the coding matrix and count the positions where their elements are identical and the positions where they differ, recorded as (e/d), where e is the number of positions with identical elements in the two vectors and d is the number of positions with different elements;
C: If the number of XOR operations required by a row vector l_i (1 ≤ i ≤ r·m) is less than or equal to the number of differing bits d from step B, compute the check data block of that row directly from the vector, and denote the vector l_j;
D: Using the vector l_j determined in step C and the ratio of identical to differing bits from step B, determine the next row vector to compute: when, for some row vector l_k, the number of bits differing from l_j is less than the number of identical bits, and the number of bits by which l_k differs from l_j is the smallest among the numbers of bits by which it differs from all other vectors, compute the check data determined by l_k from the check data already computed for l_j;
E: If check bits remain uncomputed, then, following the rule of step D and taking l_k as the base vector, find the next vector to be computed and return to step D;
F: Determine whether the computation order for all check bits has been fixed. If so, save the resulting sequence of check-bit computations; if not, compute according to the original correspondence.
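The idea behind steps A to F can be sketched as a simple greedy schedule in Python. This is a simplified variant under stated assumptions, not the exact procedure of the invention: it does not apply the (e/d) ratio test of step D, it simply picks, at each round, the remaining row that is cheapest to obtain either directly (k-1 XORs) or from an already computed row (d XORs). Costs are counted per element position, so they would be multiplied by the block size m. The name `plan_schedule` and the 4×6 matrix are hypothetical.

```python
def hamming(a, b):
    """Number of differing positions between two 0/1 vectors (d in step B)."""
    return sum(x != y for x, y in zip(a, b))


def plan_schedule(rows):
    """Greedy plan: list of (row_index, base_index_or_None, xor_cost), plus total."""
    direct = [max(sum(r) - 1, 0) for r in rows]   # step A: k-1 XORs per row
    done, plan, total = [], [], 0
    remaining = set(range(len(rows)))
    while remaining:
        best = None
        for i in sorted(remaining):
            cost, base = direct[i], None          # default: compute directly
            for j in done:                        # or derive from a finished row
                d = hamming(rows[i], rows[j])
                if d < cost:
                    cost, base = d, j
            if best is None or cost < best[2]:
                best = (i, base, cost)
        plan.append(best)
        total += best[2]
        done.append(best[0])
        remaining.remove(best[0])
    return plan, total


# Hypothetical 4x6 binary matrix used only for illustration.
rows = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
]
plan, total = plan_schedule(rows)
naive = sum(max(sum(r) - 1, 0) for r in rows)
print(plan)           # [(0, None, 1), (1, 0, 3), (2, 1, 1), (3, 2, 2)]
print(naive, total)   # 13 7
```

As in the invention, the resulting plan can be stored once and reused for every later encoding with the same matrix.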
The beneficial effects of the present invention are as follows:
Compared with the prior art, the present invention optimizes the encoding process and reduces its amount of computation. When the storage system encodes data for storage, the order in which the check data blocks are computed can be changed according to the characteristics of each row vector of the coding matrix, which reduces the number of operations in the encoding process. The computation order obtained by optimizing the coding matrix with the proposed method can be stored in the computer, and every subsequent computation can follow the optimized rule. The proposed encoding optimization applies to all binary matrices and, in particular, to any process that computes with a binary matrix: not only the encoding process during data storage, but also the reconstruction of lost data blocks using a binary check matrix. It is therefore of value for wide use.
Description of the drawings
Fig. 1 is a schematic diagram of the storage process using the (6,3,4) binary Vandermonde systematic matrix;
Fig. 2 is a schematic diagram of the computation corresponding to a row vector;
Fig. 3 is a schematic diagram of the optimized computation process.
Detailed description of the embodiments
The invention is further described below with reference to the accompanying drawings.
Embodiment 1: The invention is illustrated by the process of block-encoding a file to be stored in order to generate redundant data, using a binary (6,3,4) Vandermonde systematic code constructed over {0,1}. Because a (6,3,4) Vandermonde systematic code is used, the original file is divided into 9 micro data blocks in this embodiment. The nine data blocks of the file (an example image), d_{1,1}, d_{1,2}, d_{1,3}, d_{2,1}, d_{2,2}, d_{2,3}, d_{3,1}, d_{3,2}, d_{3,3}, are arranged in order and correspond one by one to the positions of the 9 elements in each row of the coding matrix G. With the (6,3,4) Vandermonde systematic code, 9 further check data blocks can be produced from the 9 original data blocks: p_{1,1}, p_{1,2}, p_{1,3}, p_{2,1}, p_{2,2}, p_{2,3}, p_{3,1}, p_{3,2}, p_{3,3}. The data block groups {d_{1,1}, d_{1,2}, d_{1,3}}; {d_{2,1}, d_{2,2}, d_{2,3}}; {d_{3,1}, d_{3,2}, d_{3,3}}; {p_{1,1}, p_{1,2}, p_{1,3}}; {p_{2,1}, p_{2,2}, p_{2,3}}; {p_{3,1}, p_{3,2}, p_{3,3}} each form one macro data block, and each macro data block is stored as one storage unit on a separate storage node of the system. The amount of computation needed to generate the 9 check data blocks depends on the number of "1" elements in each row vector of the generator matrix and on the overall distribution of "1" elements across the row vectors. The 0-1 distribution in each row of G determines the generation rule of one coded data block: the file data blocks at the positions where that row of G has the value "1" are summed modulo 2 (XOR between data blocks), and the result is the coded data block determined by that row, as shown in Fig. 2. The overall computation performed on the data blocks with the matrix G is shown in Fig. 1.
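The generation rule for one coded data block (XOR of the data blocks selected by the "1" entries of a row of G) can be sketched in Python as follows. The function name `encode_block`, the example row, and the one-byte data blocks are hypothetical and serve only as an illustration; real data blocks would be much larger.

```python
def encode_block(row, data_blocks):
    """XOR together the data blocks at the positions where `row` has a 1."""
    selected = [blk for bit, blk in zip(row, data_blocks) if bit]
    if not selected:                      # all-zero row: the result is all zeros
        return bytes(len(data_blocks[0]))
    out = bytearray(selected[0])
    for blk in selected[1:]:
        for i, byte in enumerate(blk):
            out[i] ^= byte                # modulo-2 sum, byte by byte
    return bytes(out)


# Nine hypothetical one-byte data blocks standing in for d_{1,1} ... d_{3,3}.
data = [bytes([v]) for v in (0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0xFF)]

# Illustrative row with "1" at positions 2, 4, 7 and 9.
row = [0, 1, 0, 1, 0, 0, 1, 0, 1]

check = encode_block(row, data)
print(check.hex())   # b5  (0x02 ^ 0x08 ^ 0x40 ^ 0xFF)
```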
To describe in detail how the encoding optimization algorithm operates over the symbol field {0,1}, the low-computation optimization process determined by the binary (6,3,4) Vandermonde systematic code is given below.
From the submatrix V' of the coding matrix G of the (6,3,4) Vandermonde systematic code over {0,1}, the vectors l_1, l_2, ..., l_9 are determined, and the relation table of the coding vectors then follows from the first step of the optimization algorithm. A table entry has the form l_a(b), where a is the index of the row vector in V' and b is the number of XOR operations required to generate the check bit directly from row vector a. In (e/d), e is the number of positions with identical elements between two vectors of the generator matrix V', and d is the number of positions with different elements. For example, the first row vector of V' is [0 1 0 1 0 0 1 0 1], which is written l_1(3) in the table. The second row vector is written l_2(4). The 6th elements of l_1 and l_2 are identical, as are their 7th elements, and all other elements differ, so the number of identical elements between l_1 and l_2 is 2 and the number of differing elements is 7, written (2/7) in the table. The coding vector relation table is shown in Table 1:
Table 1: Coding vector relation table
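The two entries discussed in the text can be recomputed with a short Python sketch. The vector l_1 is the one quoted above; l_2 is an assumption, reconstructed from the description (equal to l_1 at positions 6 and 7, different at every other position), since the full matrix V' is not reproduced here. The helper name `relation` is hypothetical.

```python
from itertools import combinations


def relation(a, b):
    """Return (e, d): counts of identical and differing positions of two rows."""
    e = sum(x == y for x, y in zip(a, b))
    return e, len(a) - e


rows = {
    "l1": [0, 1, 0, 1, 0, 0, 1, 0, 1],   # quoted in the text
    "l2": [1, 0, 1, 0, 1, 0, 1, 1, 0],   # reconstructed from the description
}

for (na, a), (nb, b) in combinations(rows.items(), 2):
    e, d = relation(a, b)
    # l_a(b) notation: row name followed by its direct XOR count (k-1)
    print(f"{na}({sum(a) - 1}) vs {nb}({sum(b) - 1}): ({e}/{d})")
# l1(3) vs l2(4): (2/7)
```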
As Table 1 shows, if the check data are computed directly from the vectors of the coding matrix, each check bit determined by l_1 needs 3 XOR operations on average, and each check bit determined by l_2 needs 4. Table 1 also shows that computing the check data directly from l_3 takes fewer XOR operations than deriving it indirectly from other check bits, so the check bit p_{1,3} generated by l_3 is computed directly from that row vector, i.e.:
Similarly, the check bits corresponding to l_1, l_7 and l_8 are generated directly by XOR from their own row vectors:
The check blocks computed from l_1, l_3, l_7 and l_8 are then used to compute the check data blocks of the remaining vectors. From the numbers of identical and differing elements between l_1 and the other vectors in Table 1, the check block determined by l_1 could be used to compute the check data block determined by l_9. However, computing l_9 from the check data blocks determined by l_3, l_6 needs fewer XOR operations, so the check data block determined by l_1 does not serve as the base check data block for any other vector. Similarly, the check blocks of l_3 and l_7 do not serve as base check data blocks for the remaining check data blocks. Because the check data block determined by l_4 can be obtained from the one determined by l_8 with only three XOR operations on average, the check block determined by l_8 is used to compute the check data block determined by l_4:
The check data block determined by l_6 can be computed in the same way:
Continuing the search, the check data block determined by l_5 can be obtained from the one determined by l_4 after three XOR operations:
The check data block determined by l_6 can in turn be used to compute the check data blocks determined by l_2 and l_9:
At this point, all check data blocks determined by the coding matrix have been computed. The overall computation flow is shown in Fig. 3.
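The derivations in this embodiment rely on one identity: XORing an already computed check block with the data blocks at the positions where two row vectors differ yields the check block of the second vector. A minimal Python sketch of this identity follows. The row vectors, the integer data blocks, and the names `direct` and `derive` are hypothetical and are not the rows of V'.

```python
from functools import reduce
from operator import xor


def direct(row, data):
    """Check block computed directly: XOR of the data blocks where row has a 1."""
    return reduce(xor, (blk for bit, blk in zip(row, data) if bit), 0)


def derive(base_check, base_row, row, data):
    """Check block for `row`, derived from the check block of `base_row`."""
    out = base_check
    for a, b, blk in zip(base_row, row, data):
        if a != b:           # only the differing positions cost another XOR
            out ^= blk
    return out


# Hypothetical data blocks (small integers) and row vectors.
data = [3, 5, 9, 17, 33, 65]
base_row = [1, 1, 1, 1, 1, 0]   # direct cost: 4 XORs
row = [0, 1, 1, 1, 1, 1]        # direct cost: 4 XORs, derived cost: 2 XORs

base_check = direct(base_row, data)
assert derive(base_check, base_row, row, data) == direct(row, data)
```

In this example the two rows differ in only two positions, so deriving the second check block costs 2 XOR operations instead of 4.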
For the coding matrix built as described above, computing all check blocks directly from the original data blocks with the original method requires 38 XOR operations, whereas the optimized computation proposed here requires only 26, a saving of about 31.58% of the total operations. Since the coding matrix generates 9 check data blocks and the optimized computation needs 26 XOR operations, each check data block needs 26/9 ≈ 2.89 XOR operations on average. The improved algorithm thus greatly reduces the computation required to produce the check data and substantially lowers the CPU load.
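The figures stated above can be checked with a few lines of Python. The counts 38, 26 and 9 are taken from the text; the variable names are illustrative only.

```python
direct_xors = 38      # XORs when every check block is computed directly
optimized_xors = 26   # XORs with the optimized computation order
check_blocks = 9      # number of check data blocks generated

saving = (direct_xors - optimized_xors) / direct_xors
per_block = optimized_xors / check_blocks

print(f"{saving:.2%}")      # 31.58%
print(f"{per_block:.2f}")   # 2.89
```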