CN103761171A - Low-bandwidth data reconstruction method for binary coding redundancy storage system - Google Patents

Low-bandwidth data reconstruction method for binary coding redundancy storage system

Info

Publication number
CN103761171A
Authority
CN
China
Prior art keywords
data
low bandwidth
matrix
check matrix
storage system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410048536.0A
Other languages
Chinese (zh)
Other versions
CN103761171B (en)
Inventor
蒋海波
陈建中
李娜
周星梅
王晓京
蒋小强
陈怡�
李�范
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Xinghe Shandong Intelligent Technology Co ltd
Original Assignee
Chengdu Institute of Biology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Institute of Biology of CAS filed Critical Chengdu Institute of Biology of CAS
Priority to CN201410048536.0A priority Critical patent/CN103761171B/en
Publication of CN103761171A publication Critical patent/CN103761171A/en
Application granted granted Critical
Publication of CN103761171B publication Critical patent/CN103761171B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention provides a low-bandwidth data reconstruction method for a binary coding redundancy storage system. The low-bandwidth data reconstruction method for the binary coding redundancy storage system comprises the steps that (1) the corresponding relation between a lost data block and row vectors of a data check matrix is established, and low-bandwidth check matrixes are determined according to sub-matrixes composed of column vectors of the data check matrix corresponding to data blocks which are not lost in the storage system; (2) whether the number of the low-bandwidth check matrixes is larger than one is judged; (3) if yes, whether the I/O pressures, caused by the fact that the lost data block is recovered through all the low-bandwidth check matrixes, on each storage node are equal or not is judged; (4) if not, the low-bandwidth check matrix exerting the smallest I/O pressure is selected to conduct data reconstruction on the lost data block. Compared with the prior art, the low-bandwidth check matrixes are determined, the storage node with the smallest I/O reading pressure is found for recovery of the lost data block, and therefore the network bandwidth consumption caused by data reading of the system storage nodes can be reduced in the process of data block reconstruction.

Description

A low-bandwidth data reconstruction method for a binary-coded redundant storage system
Technical field
The present invention relates to the field of coded storage of electronic information data, and in particular to data disaster tolerance and the low-bandwidth reconstruction of lost data on a distributed underlying storage architecture.
Background art
With the rapid development of wired and wireless network technology, building distributed storage systems over networks has become a trend. However, because a network covers a wide area, any node may suffer an unpredictable accident, such as operator error, component failure, earthquake, flood, fire, typhoon, or even malicious theft, that permanently loses or damages important data stored in the distributed system, and the resulting loss can be hard to estimate. This is especially true of sensor network nodes used for field monitoring: field conditions vary widely and some environments are harsh, so data acquisition nodes fail frequently. Preserving collected data intact therefore calls for new storage techniques that store data with high reliability.
Data storage today relies mainly on the redundancy mechanism of data backup, multi-machine backup, and hot swapping, whose core is file replication. Remote mirror backup and backup servers at different sites are the usual approaches; for example, important files in Google's storage system all have three or more copies. File replication makes storage simple and reads fast, but its data redundancy is high and it has many drawbacks for disaster recovery. If protection relies mainly on replication, at least one extra copy's worth of storage, and often several, must be held on the network, and large numbers of backup servers sit idle most of the time. This wastes a striking amount of resources, and the waste grows with the size of the network.
Storage strategies based on binary-coded redundancy are gradually becoming a key technology for the underlying storage architecture of new information systems. The technique has broad application value in data center storage systems, field sensor networks, and similar settings, and it copes well with systems whose operating environment varies widely or is harsh, whose nodes have limited storage capacity, and whose data is easily lost.
More and more systems use an erasure-code redundancy strategy to store critical data. With an erasure code, an original file f of size M is split into s original blocks of equal size M/s; r coded blocks of the same size are computed from the s original blocks, and the s original blocks together with the r coded blocks form the data to be stored. The technique that derives the check blocks from the original blocks is erasure coding, and such a code is usually called an "(s+r, s) erasure code". All s+r blocks are stored on s+r different storage nodes. As long as no more than r nodes fail, the original file can still be recovered from the surviving nodes, so an erasure code with parameters s and r tolerates at most r simultaneous node failures. Repairing the data on a failed node, however, is much more complex with erasure coding than with replication. Taking an RS code as an example, when one data block is lost the system must transfer any s surviving blocks to a new node, decode them to obtain the original file, and then re-encode to regenerate the lost block.
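The (s+r, s) scheme described above can be illustrated with the simplest possible case, a single XOR parity block (r = 1). This is a minimal sketch rather than the RS code the text mentions; the helper names `split_file` and `xor_blocks` and the sample data are assumptions made for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def split_file(data, s):
    """Split data into s equal blocks of size ceil(M / s), zero-padding the tail."""
    size = -(-len(data) // s)
    padded = data.ljust(size * s, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(s)]

original = b"erasure coded storage"
s = 3
blocks = split_file(original, s)   # s original blocks
parity = xor_blocks(blocks)        # r = 1 coded block
stored = blocks + [parity]         # s + r blocks, one per storage node

lost = 1                           # the node holding block 1 fails
survivors = [b for i, b in enumerate(stored) if i != lost]
recovered = xor_blocks(survivors)  # XOR of the remaining s blocks
print(recovered == blocks[lost])
```

Even in this toy case the repair reads s surviving blocks to rebuild one, which is the bandwidth cost the following paragraphs set out to reduce.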
Because recovering a lost block requires transferring any s surviving blocks to a new node for decoding, conventional data reconstruction puts heavy network bandwidth pressure on the storage system when the data volume is large, and it lengthens the time needed to recover the lost block.
Summary of the invention
To address the problems of the prior art, the main purpose of the present invention is to provide a low-bandwidth data reconstruction method for a binary-coded redundant storage system that reduces the network bandwidth pressure placed on the storage system when lost data blocks are recovered.
To achieve this purpose, the invention provides a low-bandwidth data reconstruction method for a binary-coded redundant storage system. The storage system comprises an encoding matrix and a data check matrix, and the data check matrix comprises row vectors and column vectors. When a storage node of the system fails and data blocks are lost, the lost blocks are recovered by the following steps (1) to (4):
(1) establish the correspondence between the lost data blocks and the row vectors of the data check matrix, and determine the low-bandwidth check matrices from the submatrices formed by the column vectors of the data check matrix that correspond to the surviving (not lost) data blocks in the storage system;
(2) judge whether there is more than one low-bandwidth check matrix;
(3) if there is more than one, judge whether each low-bandwidth check matrix needs the same number of surviving data blocks to recover the lost blocks, that is, whether recovering the lost blocks with the different low-bandwidth check matrices puts the same I/O pressure on each storage node of the system;
(4) if the I/O pressure differs between the low-bandwidth check matrices, select the low-bandwidth check matrix that needs the fewest reconstruction blocks (surviving data blocks) and has the least effect on storage node I/O pressure, and use it to reconstruct the lost data blocks.
Further, when step (2) finds only one low-bandwidth check matrix, that matrix is used to reconstruct the lost data blocks.
Further, when step (3) finds that every low-bandwidth check matrix needs the same amount of reconstruction data (surviving data blocks) and so has the same effect on the I/O pressure of each storage node, any one of the low-bandwidth check matrices is selected to reconstruct the lost data blocks.
Further, the lost data blocks are reconstructed using the low-bandwidth check matrix and only part of the surviving data blocks.
Further, the data check matrix is $H_{(k+r)m \times rm}$, which comprises $(k+r)m$ row vectors and $rm$ column vectors. The number of failed nodes is $r'$ ($1 \le r' \le r$), that is, the system loses $r'$ data blocks, and these contain $r'm$ micro data blocks. Step (1) comprises the following steps (11) and (12):
(11) select $r'm$ column vectors from the data check matrix $H_{(k+r)m \times rm}$ such that the matrix formed by these $r'm$ column vectors, restricted to the row vectors corresponding to the lost micro data blocks, is a full-rank $(r'm) \times (r'm)$ matrix;
(12) this full-rank $(r'm) \times (r'm)$ matrix $H_{r'm \times r'm}$ is the determined low-bandwidth check matrix.
Further, step (11) comprises the following steps (111) to (115):
(111) count the number of "1" elements in each column vector of the data check matrix $H_{(k+r)m \times rm}$;
(112) extract from $H_{(k+r)m \times rm}$ the row vectors corresponding to the lost data blocks and form the binary matrix $H_{r'm \times rm}$ from them; the remaining row vectors of $H_{(k+r)m \times rm}$ form the binary matrix $H_{(k+r-r')m \times rm}$, whose bottom $rm$ row vectors form an identity matrix and whose top $(k-r')m$ row vectors form the binary matrix $H_{(k-r')m \times rm}$;
(113) examine in turn the number of "0" elements in each row vector of the top matrix $H_{(k-r')m \times rm}$; when a row vector contains at least $r'm$ zeros, record the column vectors in which those "0" elements lie;
(114) within the recorded column vectors, search further for another row vector that has at least $r'm$ "0" elements; if there is none, keep the column vectors recorded in step (113); if there is one, record the new column vectors;
(115) from the number of "1" elements in each group of column vectors recorded in step (114), determine the $r'm$ column vectors with the smallest total number of "1" elements, and confirm that the corresponding matrix $H_{r'm \times r'm}$ has full rank, forming the full-rank $(r'm) \times (r'm)$ matrix $H_{r'm \times r'm}$.
Further, the step of reconstructing the lost data blocks using the low-bandwidth check matrix and part of the surviving data blocks comprises: using the low-bandwidth check matrix $H_{r'm \times r'm}$ to determine which micro data blocks must take part in the reconstruction; forming $r'm$ equations that contain the $r'm$ lost micro data blocks; and solving the system formed by these $r'm$ equations to obtain the $r'm$ lost micro data blocks.
Further, if every row vector of the binary matrix $H_{(k-r')m \times rm}$ has fewer than $r'm$ "0" elements, no low-bandwidth check matrix can be obtained. In that case the system has no reconstruction method that lowers the I/O bandwidth of the storage nodes when recovering the lost data blocks.
Compared with the prior art, first, the present invention recovers lost data blocks using the low-bandwidth check matrix that needs the least reconstruction data. This reduces the network bandwidth consumed by reading data from storage nodes during reconstruction, relieves pressure on the maintenance bandwidth of the storage system's internal network, reduces the volume of data transferred inside the system, and reduces the number of reads from storage devices. Second, the present invention can determine the optimal reconstruction strategy from the operating state of the storage nodes, the network bandwidth, and the I/O situation, so that the system reads the fewest data blocks and reconstructs lost blocks with minimum maintenance bandwidth. The invention has good application value in mass data storage systems, sensor networks, and similar areas.
Brief description of the drawings
Fig. 1 is a schematic diagram of encoded data storage in an existing binary-coded redundant storage system.
Fig. 2 is a flowchart of the low-bandwidth data reconstruction method of the present invention for a binary-coded redundant storage system.
Fig. 3 shows the correspondence between the check matrix of the (6,3,4) Vandermonde systematic code and the data blocks.
Embodiments
Specific embodiments of the present invention are described in detail below with reference to the drawings.
Fig. 1 is a schematic diagram of an existing binary-coded redundant storage system encoding and storing an image file. The image file to be stored is divided into micro data blocks such as $d_{1,1}, d_{1,2}, d_{1,3}$, which correspond to data blocks such as $D_1, D_2$, also called macro data blocks. Each macro data block corresponds to one storage node. A macro data block is a set of micro data blocks, and the micro data block is the smallest partition unit in the storage system; its size varies with the size of the stored file. Each macro data block consists of m micro data blocks and is stored as a whole on its own storage node. Encoding and decoding operate in units of micro data blocks, whereas storage is in units of macro data blocks.
If storage node 1 fails, data block $D_1$ is lost, that is, its micro data blocks $d_{1,1}, d_{1,2}, d_{1,3}$ are lost, and the lost block must be decoded and reconstructed. The leftmost part of Fig. 1 is the encoding matrix, which existing techniques convert into the data check matrix; once a file has been encoded and stored, the data check matrix used for recovery is fixed. The encoding matrix is used to redundancy-encode the original file and produce check blocks (redundant blocks). When a storage node fails and data blocks are lost, the data check matrix is used to reconstruct them.
Fig. 2 is a flowchart of the low-bandwidth data reconstruction method of the present invention for a binary-coded redundant storage system. The method comprises the following steps:
S1: a storage node of the binary-coded redundant storage system fails, causing data blocks to be lost;
S2: determine the data check matrix of the storage system; by the coding principle of data storage, the data check matrix used for recovery is fixed once a file has been encoded and stored, and it comprises row vectors and column vectors;
S3: judge whether low-bandwidth check matrices can be determined, using the data check matrix that corresponds to the encoding matrix originally used to store the file, together with its relation to the data blocks, to determine all low-bandwidth check matrices that can reconstruct the lost data blocks. They are determined as follows: establish the correspondence between the lost data blocks and the row vectors of the data check matrix, and determine the low-bandwidth check matrices from the submatrices formed by the column vectors of the data check matrix that correspond to the surviving data blocks. If none can be determined, go to step S4; otherwise go to step S5;
S4: reconstruct the data by the conventional method;
S5: judge whether there is more than one low-bandwidth check matrix; if not, go to step S6; if so, go to step S7;
S6: use this low-bandwidth check matrix, reading the corresponding data blocks from the intact storage nodes associated with it, to reconstruct the lost data blocks;
S7: judge whether every low-bandwidth check matrix needs the same number of surviving data blocks to recover the lost blocks, that is, whether the different low-bandwidth check matrices put the same I/O pressure on each intact storage node; if so, go to step S8; if not, go to step S9;
S8: arbitrarily select one low-bandwidth check matrix and, reading the corresponding data blocks from the intact storage nodes associated with it, reconstruct the lost data blocks;
S9: for each low-bandwidth check matrix, compute the sum of the I/O read pressure on its corresponding storage nodes;
S10: select the low-bandwidth check matrix whose group of storage nodes has the smallest total I/O read pressure as the final reconstruction matrix, and use it, reading the corresponding data blocks from the intact storage nodes associated with it, to reconstruct the lost data blocks.
The principle of the method is as follows. Low-bandwidth check matrices are selected from the check matrix according to which data blocks are lost. A low-bandwidth check matrix fixes the number of surviving data blocks needed to reconstruct the lost blocks, and because it needs fewer data blocks than the original method, it lowers the I/O pressure on each storage node during recovery.
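The decision logic of steps S4 to S10 can be sketched as follows. This is only an illustration under stated assumptions: the candidate format (a `reads` mapping from node id to the number of micro data blocks read from that node), the per-node weights in `node_cost`, and the function names are all hypothetical, and the text does not say how I/O read pressure is measured.

```python
def total_blocks(reads):
    """Number of surviving micro data blocks a candidate must read."""
    return sum(reads.values())

def io_pressure(reads, node_cost):
    """Assumed pressure model: blocks read from each node times that node's weight."""
    return sum(n * node_cost.get(node, 1.0) for node, n in reads.items())

def choose_check_matrix(candidates, node_cost):
    """Pick a low-bandwidth check matrix following steps S4 to S10."""
    if not candidates:
        return None                      # S4: fall back to conventional decoding
    if len(candidates) == 1:
        return candidates[0]             # S6: only one choice
    counts = {total_blocks(c["reads"]) for c in candidates}
    if len(counts) == 1:
        return candidates[0]             # S8: all equal, any candidate will do
    # S9 and S10: smallest total I/O read pressure wins
    return min(candidates, key=lambda c: io_pressure(c["reads"], node_cost))

candidates = [
    {"name": "A", "reads": {2: 3, 3: 3, 4: 2}},  # 8 micro data blocks
    {"name": "B", "reads": {2: 3, 3: 3, 5: 3}},  # 9 micro data blocks
]
node_cost = {2: 1.0, 3: 1.0, 4: 1.0, 5: 1.0}
print(choose_check_matrix(candidates, node_cost)["name"])
```

With uniform weights the choice reduces to the candidate that reads the fewest blocks; non-uniform weights would let a busy node be avoided even at the cost of reading slightly more data.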
Let the data check matrix be $H_{(k+r)m \times rm}$, comprising $(k+r)m$ row vectors and $rm$ column vectors, and let the number of failed nodes be $r'$ ($1 \le r' \le r$), so that the system loses $r'$ data blocks containing $r'm$ micro data blocks. In general the number of failed nodes does not exceed r, so the number of lost data blocks does not exceed r and the number of lost micro data blocks does not exceed $r \times m$. When $1 \le r' < r$, there are many ways to select $r'm$ column vectors from $H_{(k+r)m \times rm}$ to form a low-bandwidth check matrix for recovering the lost blocks. Determining whether a method with lower maintenance bandwidth exists, and how to select the low-bandwidth check matrix that needs the minimum reconstruction bandwidth, is therefore one of the innovations of the present invention. The low-bandwidth check matrix of step S3 is determined by the following steps S31 and S32:
S31: select $r'm$ column vectors from the data check matrix $H_{(k+r)m \times rm}$ such that the matrix formed by these $r'm$ column vectors, restricted to the row vectors corresponding to the lost micro data blocks, is a full-rank $(r'm) \times (r'm)$ matrix;
S32: this full-rank $(r'm) \times (r'm)$ matrix $H_{r'm \times r'm}$ is the determined low-bandwidth check matrix.
Step S31 comprises the following steps S311 to S315:
S311: count the number of "1" elements in each column vector of the data check matrix $H_{(k+r)m \times rm}$;
S312: extract from $H_{(k+r)m \times rm}$ the row vectors corresponding to the lost data blocks and form the binary matrix $H_{r'm \times rm}$ from them; the remaining row vectors of $H_{(k+r)m \times rm}$ form the binary matrix $H_{(k+r-r')m \times rm}$, whose bottom $rm$ row vectors form an identity matrix and whose top $(k-r')m$ row vectors form the binary matrix $H_{(k-r')m \times rm}$. Because each column vector of the identity matrix corresponds to exactly one data block, the identity matrix does not affect the number of data blocks that take part in the reconstruction. The search can therefore be restricted to the top part of $H_{(k+r-r')m \times rm}$, the binary matrix $H_{(k-r')m \times rm}$;
S313: examine in turn the number of "0" elements in each row vector of the top matrix $H_{(k-r')m \times rm}$; when a row vector contains at least $r'm$ zeros, record the column vectors in which those "0" elements lie;
S314: within the recorded column vectors, search further for another row vector that has at least $r'm$ "0" elements; if there is none, keep the column vectors recorded in step S313; if there is one, record the new column vectors; repeat in this way and record the column vectors determined in each pass;
S315: from the number of "1" elements in each group of column vectors recorded in step S314, determine the $r'm$ column vectors with the smallest total number of "1" elements, and confirm that the corresponding matrix $H_{r'm \times r'm}$ has full rank, forming the full-rank $(r'm) \times (r'm)$ matrix $H_{r'm \times r'm}$.
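The goal of steps S31 to S315 can be sketched with an exhaustive search, which is a deliberate simplification and not the zero-counting heuristic the steps describe: try every set of r'm columns, keep those whose submatrix on the lost rows has full rank over GF(2), and prefer the set that touches the fewest surviving rows. The function names and the 7 x 3 toy matrix are hypothetical; its bottom three rows form an identity matrix, as in step S312.

```python
from itertools import combinations

def gf2_rank(matrix):
    """Rank of a 0/1 matrix over GF(2), using integers as bit rows."""
    rows = [int("".join(str(b) for b in row), 2) for row in matrix]
    rank = 0
    while rows:
        pivot = max(rows)
        if pivot == 0:
            break
        rows.remove(pivot)
        top = pivot.bit_length()
        # clear the pivot's leading bit from every row that shares it
        rows = [r ^ pivot if r.bit_length() == top else r for r in rows]
        rank += 1
    return rank

def find_low_bandwidth_columns(H, lost_rows):
    """Return (columns, helper_rows) needing the fewest surviving rows, or None."""
    need = len(lost_rows)
    lost = set(lost_rows)
    best = None
    for cols in combinations(range(len(H[0])), need):
        sub = [[H[i][c] for c in cols] for i in lost_rows]
        if gf2_rank(sub) < need:
            continue  # these columns cannot determine the lost blocks
        helpers = {i for i in range(len(H))
                   if i not in lost and any(H[i][c] for c in cols)}
        if best is None or len(helpers) < len(best[1]):
            best = (cols, helpers)
    return best

# Toy check matrix: rows 0-3 are data blocks, rows 4-6 are check blocks.
H = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 1],
    [0, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
print(find_low_bandwidth_columns(H, [0]))
```

Exhaustive search grows combinatorially with the number of columns, which is presumably why the text prunes candidates by counting zeros first; the sketch is meant only to make the selection criterion concrete.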
Once the low-bandwidth check matrix has been determined, the lost data blocks are reconstructed from it and the surviving data blocks: the low-bandwidth check matrix $H_{r'm \times r'm}$ determines which micro data blocks must take part in the reconstruction; $r'm$ equations containing the $r'm$ lost micro data blocks are formed; and the system formed by these $r'm$ equations is solved to obtain the $r'm$ lost micro data blocks.
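Solving the r'm equations amounts to Gaussian elimination over GF(2) in which the right-hand sides are XORs of surviving blocks. The sketch below assumes each selected column of H gives one parity equation (the XOR of all blocks whose row has a "1" in that column is zero); the function names and the toy 7 x 3 matrix with one-byte blocks are hypothetical.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equally sized byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def reconstruct(H, cols, lost_rows, blocks, block_size):
    """Recover lost micro data blocks from the chosen columns of H.

    blocks maps the row index of each surviving block to its bytes.
    Raises StopIteration if the chosen columns are not full rank.
    """
    eqs = []
    for c in cols:
        coeffs = [H[i][c] for i in lost_rows]
        rhs = bytes(block_size)
        for i, row in enumerate(H):
            if row[c] and i not in lost_rows:
                rhs = xor_bytes(rhs, blocks[i])  # only these blocks are read
        eqs.append((coeffs, rhs))
    n = len(lost_rows)
    for col in range(n):  # Gauss-Jordan elimination over GF(2)
        pivot = next(r for r in range(col, n) if eqs[r][0][col])
        eqs[col], eqs[pivot] = eqs[pivot], eqs[col]
        for r in range(n):
            if r != col and eqs[r][0][col]:
                eqs[r] = ([a ^ b for a, b in zip(eqs[r][0], eqs[col][0])],
                          xor_bytes(eqs[r][1], eqs[col][1]))
    return {lost_rows[j]: eqs[j][1] for j in range(n)}

# Toy code: check blocks p0 = d0^d1, p1 = d2^d3, p2 = d0^d1^d2^d3.
H = [[1, 0, 1], [1, 0, 1], [0, 1, 1], [0, 1, 1],
     [1, 0, 0], [0, 1, 0], [0, 0, 1]]
codeword = [b"\x01", b"\x02", b"\x04", b"\x08", b"\x03", b"\x0c", b"\x0f"]
lost_rows = [0, 2]
surviving = {i: b for i, b in enumerate(codeword) if i not in lost_rows}
print(reconstruct(H, (0, 1), lost_rows, surviving, 1))
```

With columns 0 and 1 the check block in row 6 is never read, which is the kind of saving a low-bandwidth check matrix is chosen to provide.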
In addition, if every row vector of the binary matrix $H_{(k-r')m \times rm}$ has fewer than $r'm$ "0" elements, no low-bandwidth check matrix can be obtained and the data can only be reconstructed by the conventional method.
Embodiment 1
When nodes fail inside a storage system built on a conventional (n, k) MDS erasure code and the number of failed nodes is at most n-k, the system must read the data on k nodes to recover them; such codes are usually constructed with n-k ≤ k.
Suppose the check matrix of the (6,3) Vandermonde systematic erasure code constructed over the binary field is as shown in Fig. 3. It satisfies $\beta \cdot H_{(k+r)r} = 0$, where $\beta$ denotes the original file blocks and check blocks stored in the system, written $[D_1, D_2, D_3, D_4, \ldots, D_{10}, \ldots, D_{(k+r)r}]$. If the first storage node fails, that is, the file blocks $[D_1, D_2, D_3]$ are lost, then $\beta = [X_1, X_2, X_3, D_4, \ldots, D_{10}, \ldots, D_{18}]$, where $[X_1, X_2, X_3]$ are the lost data blocks and $[D_{10}, \ldots, D_{18}]$ are check blocks. Clearly, whichever three column vectors of the data check matrix H are chosen as the basis for recovering the lost blocks, three check blocks take part in the reconstruction. The low-bandwidth reconstruction method can therefore be determined only by examining the distribution of "0" and "1" elements in the submatrix $[l_4, l_5, \ldots, l_9]^T$ of H.
To reconstruct the three lost data blocks, three of the nine column vectors of H must be selected such that the matrix they form has rank 3. Further, to obtain the reconstruction method with minimum reconstruction bandwidth and the least computation during decoding, the procedure is as follows (see Fig. 3):
When data block $D_4$ is not to take part in the reconstruction, the column vectors of H whose 4th element is "0" are $C_2, C_3, C_5, C_9$. Column vector $C_2$ has "1" in 6 positions, written $C_2(6)$; $C_3$ has "1" in 4 positions, written $C_3(4)$; $C_5$ has "1" in 7 positions, written $C_5(7)$; and $C_9$ has "1" in 7 positions, written $C_9(7)$.
When data block $D_5$ is not to take part, the column vectors whose 5th element is "0" are $C_1, C_3, C_7$, written $C_1(5), C_3(4), C_7(6)$.
Likewise, when data block $D_6$ is not to take part, the column vectors whose 6th element is "0" are $C_1, C_2, C_6, C_8, C_9$, written $C_1(5), C_2(6), C_6(8), C_8(5), C_9(7)$.
When data block $D_7$ is not to take part, the column vectors whose 7th element is "0" are $C_3, C_4, C_5, C_6, C_8$, written $C_3(4), C_4(8), C_5(7), C_7(6), C_8(5)$.
When data block $D_8$ is not to take part, the column vectors whose 8th element is "0" are $C_1, C_5, C_8$, written $C_1(5), C_5(7), C_8(5)$.
When data block $D_9$ is not to take part, the column vectors whose 9th element is "0" are $C_2, C_3$, written $C_2(6), C_3(4)$. Because this row vector has only two zero elements while three column vectors are needed, data block $D_9$ must take part whenever the macro data block is reconstructed.
For the (6,3) Vandermonde systematic code over the binary field, a low-bandwidth reconstruction that saves one micro data block is easy to determine. No reconstruction saves two micro data blocks, however, because the check matrix has no three column vectors that are all zero in the rows of two different data blocks. Any one of the recovery vector groups found by the search can be used to reconstruct the lost data blocks.
In this low-bandwidth reconstruction, the original method must read the macro data blocks on 3 storage nodes, that is, 9 micro data blocks, to reconstruct one macro data block, whereas the present invention needs 8 micro data blocks. For the whole system this saves 11.1% of the reconstruction bandwidth, which has practical significance for storage systems whose internal network is limited.
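The 11.1% figure follows directly from the two block counts given above:

```latex
\text{saving} = \frac{9 - 8}{9} = \frac{1}{9} \approx 11.1\%
```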
Embodiment 2
To further illustrate the effectiveness of the invention, the decoding process of the STAR code is optimized for low bandwidth using the method proposed here. In this embodiment the STAR code has information size m = 5 and three check columns. From the construction of the STAR code, when m = 5 its generator matrix can be written as $G = \begin{bmatrix} I \\ P \end{bmatrix}$, where P is:
$$
P = \begin{bmatrix}
1&0&0&0&1&0&0&0&1&0&0&0&1&0&0&0&1&0&0&0\\
0&1&0&0&0&1&0&0&0&1&0&0&0&1&0&0&0&1&0&0\\
0&0&1&0&0&0&1&0&0&0&1&0&0&0&1&0&0&0&1&0\\
0&0&0&1&0&0&0&1&0&0&0&1&0&0&0&1&0&0&0&1\\
1&0&0&0&0&0&0&1&0&0&1&1&0&1&1&0&1&1&0&0\\
0&1&0&0&1&0&0&1&0&0&1&0&0&1&0&1&1&0&1&0\\
0&0&1&0&0&1&0&1&1&0&1&0&0&1&0&0&1&0&0&1\\
0&0&0&1&0&0&1&1&0&1&1&0&1&1&0&0&1&0&0&0\\
1&0&0&0&0&1&1&0&0&0&0&1&1&1&0&0&0&0&1&1\\
0&1&0&0&0&1&0&1&1&0&0&1&1&0&1&0&0&0&1&0\\
0&0&1&0&0&1&0&0&0&1&0&1&1&0&0&1&1&0&1&0\\
0&0&0&1&1&1&0&0&0&0&1&1&1&0&0&0&0&1&1&0
\end{bmatrix}
$$
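By the properties of linear block codes, the data check matrix H of this code can be obtained. Writing H as $H = \begin{bmatrix} Q \\ I \end{bmatrix}$ and labelling the column vectors of Q gives:
[Figure BDA0000465138970000131: the matrix Q with its labelled column vectors; the image is not reproduced in this text.]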
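The relation between a systematic generator matrix and its check matrix can be sketched as follows. The sketch assumes the standard result for binary systematic codes that Q is the transpose of P; since the figure giving Q is not reproduced here, that is an assumption about its contents rather than a reading of it. The function names and the small 2 x 3 example matrix are hypothetical.

```python
from itertools import product

def transpose(M):
    return [list(col) for col in zip(*M)]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def build_G_H(P):
    """P has one row per check symbol and one column per information symbol."""
    r, k = len(P), len(P[0])
    G = identity(k) + P              # (k + r) x k, i.e. G = [I; P]
    H = transpose(P) + identity(r)   # (k + r) x r, i.e. H = [Q; I] with Q = P^T
    return G, H

def encode(G, info):
    """Codeword = G * info over GF(2)."""
    return [sum(g * x for g, x in zip(row, info)) % 2 for row in G]

def syndrome(codeword, H):
    """codeword * H over GF(2); all zeros for a valid codeword."""
    return [sum(codeword[i] * H[i][c] for i in range(len(H))) % 2
            for c in range(len(H[0]))]

P_small = [[1, 1, 0],
           [0, 1, 1]]
G, H = build_G_H(P_small)
print(all(syndrome(encode(G, list(x)), H) == [0, 0]
          for x in product([0, 1], repeat=3)))
```

The bottom rows of H built this way form an identity matrix, matching the structure the search steps rely on.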
Using the method proposed by the invention, a low-bandwidth reconstruction can be obtained for each failure case. For example, when the first storage node fails, that is, micro data blocks $c_{0,0}, c_{1,0}, c_{2,0}, c_{3,0}$ are lost, the submatrix formed by column vectors $C_3, C_4, C_5, C_6$ of Q can be used to reconstruct the lost blocks. Micro data blocks $c_{1,1}, c_{0,2}, c_{1,2}, c_{0,3}$ do not take part in the reconstruction, so compared with the original method the system saves 20% of the network transmission bandwidth when reconstructing the data blocks of the first storage node. Table 1 shows the effect achieved by the low-bandwidth reconstruction method when different storage nodes fail:
Table 1. Performance of the low-bandwidth reconstruction algorithm for the STAR code
[Table 1 appears as images BDA0000465138970000132 and BDA0000465138970000141 in the original publication and is not reproduced in this text.]
When two nodes fail at the same time, each file has 8 lost micro data blocks. To reconstruct them, 8 column vectors must be selected from the data check matrix such that the matrix they form has rank 8; the lost blocks can then be reconstructed. For any two failed nodes in the system, the data check matrix admits a reconstruction method that saves 1 micro data block: recovering the data blocks of two storage nodes needs at least 19 data blocks, which saves one data block.
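The savings quoted for the STAR code can be checked with the same arithmetic, on the assumption (not stated explicitly in the text) that the conventional method reads 20 micro data blocks in both cases:

```latex
\text{one failed node: } \frac{20 - 16}{20} = 20\%, \qquad
\text{two failed nodes: } \frac{20 - 19}{20} = 5\%
```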
The above describes a low-bandwidth data reconstruction method for a binary-coded redundant storage system. Based on the distribution of "0" and "1" elements in the data check matrix, the invention uses an optimizing search, together with the I/O read pressure of each storage node, to find the best low-bandwidth check matrix and apply it to the reconstruction of lost data blocks. The method reduces the reconstruction bandwidth the system needs when rebuilding data blocks and relieves bandwidth pressure on the storage system's internal network. The invention is general and can be applied to any coded storage system built on binary matrices. The invention is not limited to the above embodiments; any improvement or change known to a person of ordinary skill in the art that does not depart from the technical solution of the invention falls within its scope of protection.

Claims (8)

1. A low-bandwidth data reconstruction method for a binary coding redundancy storage system, the binary coding redundancy storage system comprising an encoding matrix and a data check matrix, the data check matrix comprising row vectors and column vectors, characterized in that, when a storage node of the binary coding redundancy storage system is damaged and data blocks are lost as a result, the lost data blocks are recovered, the low-bandwidth data reconstruction method comprising the following steps:
(1) establishing the correspondence between the lost data blocks and the row vectors of the data check matrix, and determining low-bandwidth check matrices from the submatrices formed by the column vectors of the data check matrix that correspond to the data blocks not lost in the binary coding redundancy storage system;
(2) judging whether there is more than one low-bandwidth check matrix;
(3) if there is more than one low-bandwidth check matrix, judging whether the number of non-lost data blocks required to recover the lost data blocks is the same for each low-bandwidth check matrix, i.e. judging whether recovering the lost data blocks with different low-bandwidth check matrices puts the same I/O pressure on each storage node of the binary coding redundancy storage system;
(4) if recovering the lost data blocks with different low-bandwidth check matrices does not put the same I/O pressure on each storage node of the binary coding redundancy storage system, selecting the low-bandwidth check matrix that puts the least I/O pressure on the storage nodes to perform data reconstruction on the lost data blocks.
2. The low-bandwidth data reconstruction method for a binary coding redundancy storage system as claimed in claim 1, characterized in that: when step (2) determines that there is only one low-bandwidth check matrix, this low-bandwidth check matrix is used to perform data reconstruction on the lost data blocks.
3. The low-bandwidth data reconstruction method for a binary coding redundancy storage system as claimed in claim 2, characterized in that: when step (3) determines that recovering the lost data blocks with different low-bandwidth check matrices puts the same I/O pressure on each storage node of the binary coding redundancy storage system, any one low-bandwidth check matrix is selected to perform data reconstruction on the lost data blocks.
4. The low-bandwidth data reconstruction method for a binary coding redundancy storage system as claimed in claim 3, characterized in that: data reconstruction is performed on the lost data blocks using the low-bandwidth check matrix and a part of the non-lost data blocks.
5. The low-bandwidth data reconstruction method for a binary coding redundancy storage system as claimed in any one of claims 1 to 4, wherein the data check matrix is $H_{(k+r)m \times rm}$, i.e. the data check matrix comprises $(k+r)m$ row vectors and $rm$ column vectors, the number of damaged nodes is $r'$ ($1 \le r' \le r$), i.e. the system loses $r'$ data blocks, and the $r'$ lost data blocks comprise $r'm$ micro data blocks, characterized in that step (1) comprises the following steps:
(11) selecting $r'm$ column vectors from the data check matrix $H_{(k+r)m \times rm}$ such that the matrix formed by the $r'm$ column vectors and the row vectors corresponding to the lost micro data blocks is a full-rank $(r'm) \times (r'm)$ matrix;
(12) taking the full-rank $(r'm) \times (r'm)$ matrix $H_{(r'm) \times (r'm)}$ as the determined low-bandwidth check matrix.
6. The low-bandwidth data reconstruction method for a binary coding redundancy storage system as claimed in claim 5, characterized in that step (11) comprises the following steps:
(111) computing the number of "1" elements in each column vector of the data check matrix $H_{(k+r)m \times rm}$;
(112) extracting from the data check matrix $H_{(k+r)m \times rm}$ the row vectors corresponding to the lost data blocks, the extracted row vectors forming a binary matrix $H_{(r'm) \times (rm)}$ and the remaining row vectors of $H_{(k+r)m \times rm}$ forming a binary matrix $H_{(k+r-r')m \times rm}$, wherein the bottom $rm$ vectors of $H_{(k+r-r')m \times rm}$ form an identity matrix and the top $(k-r')m$ vectors of $H_{(k+r-r')m \times rm}$ form a binary matrix $H_{(k-r')m \times rm}$;
(113) determining in turn the number of "0" elements in each row vector of the binary matrix $H_{(k-r')m \times rm}$ formed by the top $(k-r')m$ vectors, and, when the number of "0" elements in a row vector is greater than or equal to $r'm$, recording the column vectors in which each of these "0" elements lies;
(114) within the recorded column vectors containing the "0" elements, further searching for a row vector whose number of "0" elements is greater than or equal to $r'm$; if there is none, recording the column vectors determined in step (113); if there is one, recording the new column vectors;
(115) according to the number of "1" elements in each group of column vectors recorded in step (114), determining the $r'm$ column vectors whose total number of "1" elements is smallest, and confirming that the matrix $H_{(r'm) \times (r'm)}$ corresponding to these $r'm$ column vectors has full rank, thereby forming the full-rank $(r'm) \times (r'm)$ matrix $H_{(r'm) \times (r'm)}$.
7. The low-bandwidth data reconstruction method for a binary coding redundancy storage system as claimed in claim 6, characterized in that the step of performing data reconstruction on the lost data blocks using the low-bandwidth check matrix and a part of the non-lost data blocks comprises the following steps:
using the low-bandwidth check matrix $H_{(r'm) \times (r'm)}$ to determine the micro data blocks that need to take part in the data reconstruction;
forming $r'm$ equations that contain the $r'm$ lost micro data blocks, and solving the system formed by these $r'm$ equations to obtain the $r'm$ lost data blocks.
8. The low-bandwidth data reconstruction method for a binary coding redundancy storage system as claimed in claim 7, characterized in that, if the number of "0" elements in every row vector of the binary matrix $H_{(k-r')m \times rm}$ is less than $r'm$, no low-bandwidth check matrix can be obtained.
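The equation-solving step of claim 7 can be illustrated with Gauss-Jordan elimination over GF(2). The sketch below rests on assumptions: each row of `matrix` stands for one equation restricted to the lost micro data blocks (that is, one selected column vector of the check matrix), and each entry of `rhs` is the XOR of the surviving micro data blocks appearing in that equation. The names `xor_bytes` and `solve_gf2` and the toy data are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def solve_gf2(matrix, rhs):
    """Solve matrix * x = rhs over GF(2).

    matrix: n x n list of 0/1 rows (one row per equation).
    rhs:    n equal-length byte strings (XOR of the surviving blocks).
    Returns the n unknown blocks; raises ValueError if not full rank.
    """
    n = len(matrix)
    a = [row[:] for row in matrix]  # work on copies
    b = list(rhs)
    for col in range(n):
        # Find a row at or below the diagonal with a 1 in this column.
        pivot = next((r for r in range(col, n) if a[r][col]), None)
        if pivot is None:
            raise ValueError("matrix is not full rank over GF(2)")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        # Eliminate this column from every other row (row addition = XOR).
        for r in range(n):
            if r != col and a[r][col]:
                a[r] = [x ^ y for x, y in zip(a[r], a[col])]
                b[r] = xor_bytes(b[r], b[col])
    return b  # `a` is now the identity, so `b` holds the unknowns


# Toy example with two lost blocks x0 = 0x0f and x1 = 0xf0:
#   x0 ^ x1 = 0xff
#        x1 = 0xf0
recovered = solve_gf2([[1, 1], [0, 1]], [b"\xff", b"\xf0"])
print(recovered)  # [b'\x0f', b'\xf0']
```

Because the coefficients are binary, the whole solve uses only XOR on data blocks, which is what makes a full-rank binary check matrix sufficient for recovery.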
CN201410048536.0A 2014-02-11 2014-02-11 Low-bandwidth data reconstruction method for binary coding redundancy storage system Active CN103761171B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410048536.0A CN103761171B (en) 2014-02-11 2014-02-11 Low-bandwidth data reconstruction method for binary coding redundancy storage system

Publications (2)

Publication Number Publication Date
CN103761171A true CN103761171A (en) 2014-04-30
CN103761171B CN103761171B (en) 2017-04-05

Family

ID=50528413

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410048536.0A Active CN103761171B (en) 2014-02-11 2014-02-11 Low-bandwidth data reconstruction method for binary coding redundancy storage system

Country Status (1)

Country Link
CN (1) CN103761171B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020083379A1 (en) * 2000-11-02 2002-06-27 Junji Nishikawa On-line reconstruction processing method and on-line reconstruction processing apparatus
CN101404563A (en) * 2008-11-20 2009-04-08 吕晓雯 Error control method and system
CN103135946A (en) * 2013-03-25 2013-06-05 中国人民解放军国防科学技术大学 Solid state drive(SSD)-based file layout method in large-scale storage system

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105353974A (en) * 2015-10-08 2016-02-24 华东交通大学 Dual fault-tolerant encoding method applicable to disk array and distributed storage system
CN105353974B (en) * 2015-10-08 2018-02-02 华东交通大学 Dual fault-tolerant encoding method applicable to disk array and distributed storage system
US10031807B2 (en) 2015-11-04 2018-07-24 International Business Machines Corporation Concurrent data retrieval in networked environments
CN106788891A (en) * 2016-12-16 2017-05-31 陕西尚品信息科技有限公司 Construction method for optimal locally repairable codes suitable for distributed storage
EP3364541A4 (en) * 2016-12-24 2018-08-22 Huawei Technologies Co., Ltd. Storage controller, data processing chip, and data processing method
US10210044B2 (en) 2016-12-24 2019-02-19 Huawei Technologies Co., Ltd Storage controller, data processing chip, and data processing method
CN106911793A (en) * 2017-03-17 2017-06-30 上海交通大学 I/O optimized distributed storage data repair method
CN106911793B (en) * 2017-03-17 2020-06-16 上海交通大学 I/O optimized distributed storage data repair method
CN110968454A (en) * 2018-09-28 2020-04-07 杭州海康威视系统技术有限公司 Method and apparatus for determining recovery data for lost data blocks
CN110968454B (en) * 2018-09-28 2022-09-09 杭州海康威视系统技术有限公司 Method and apparatus for determining recovery data for lost data blocks
CN111679793A (en) * 2020-06-16 2020-09-18 成都信息工程大学 Single-disk fault rapid recovery method based on STAR code
CN111679793B (en) * 2020-06-16 2023-03-14 成都信息工程大学 Single-disk fault rapid recovery method based on STAR code

Also Published As

Publication number Publication date
CN103761171B (en) 2017-04-05

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20211215

Address after: Room 339, Gaoxin building, north of Yuqing East Street and west of central secondary road, Gaoxin District, Weifang City, Shandong Province, 261000

Patentee after: SHANDONG QIFENG ELECTRONIC TECHNOLOGY Co.,Ltd.

Address before: No. 9, Section 4, South Renmin Road, Wuhou District, Chengdu, Sichuan 610041

Patentee before: CHENGDU INSTITUTE OF BIOLOGY, CHINESE ACADEMY OF SCIENCES

TR01 Transfer of patent right

Effective date of registration: 20230612

Address after: 274032 Floor 1, South Span, Building 1, New Century Technopole, No. 666, Jinan Road, Development Zone, Heze City, Shandong Province

Patentee after: Zhongke Xinghe (Shandong) Intelligent Technology Co.,Ltd.

Address before: Room 339, Gaoxin building, north of Yuqing East Street and west of central secondary road, Gaoxin District, Weifang City, Shandong Province, 261000

Patentee before: SHANDONG QIFENG ELECTRONIC TECHNOLOGY Co.,Ltd.