CN101834899B - Distributed adaptive coding and storing method - Google Patents


Info

Publication number
CN101834899B
Authority
CN
China
Prior art keywords
block
node
information
files
blocks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2010101596517A
Other languages
Chinese (zh)
Other versions
CN101834899A (en)
Inventor
王晓京
蒋海波
唐聃
王一丁
肖宜龙
方佳嘉
蔡红亮
王谦
孙宣东
陈峥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Information Technology Co Ltd of CAS
Original Assignee
Chengdu Information Technology Co Ltd of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Information Technology Co Ltd of CAS filed Critical Chengdu Information Technology Co Ltd of CAS
Priority to CN2010101596517A priority Critical patent/CN101834899B/en
Publication of CN101834899A publication Critical patent/CN101834899A/en
Priority to PCT/CN2011/070002 priority patent/WO2011134285A1/en
Application granted granted Critical
Publication of CN101834899B publication Critical patent/CN101834899B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/70 Media network packetisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/75 Media network packet handling
    • H04L65/762 Media network packet handling at the source

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a distributed adaptive coding and storage method comprising the following steps: detecting the number of storage nodes in a distributed system; adaptively adjusting the code word according to the number of storage nodes in the system and adaptively encoding the file to be stored; dividing the encoded file into equal parts according to the number of nodes; packaging the equally divided file into file blocks in a unified file encapsulation format; storing the packaged file blocks on the nodes of the system; when a node issues a file request, detecting the online storage nodes and judging whether the nodes are intact; and, if they are not, obtaining the missing information blocks by decoding the file blocks of the online storage nodes and recombining the decoded information blocks with the existing information blocks in order to obtain the original file. Compared with the prior art, because files are stored by means of a coding method, the invention tolerates the failure of more nodes at lower redundancy.

Description

A distributed adaptive coding and storage method
Technical field
The present invention relates to a coding and storage method in the field of secure information storage, and in particular to a distributed adaptive coding and storage method.
Background technology
In the data-centered information age, and particularly in recent years, data has grown explosively. How to keep data safely and reliably available with little storage space is a major problem facing the storage field. Storing data safely and reliably at lower redundancy is one of the challenges the storage field has set for the new century, and how to improve the sustained disaster tolerance of storage systems has also become a focus of industry research. At present, replication is the main scheme used to improve the reliability or performance of a system, but keeping shared data consistent in a large-scale distributed system is the main difficulty that replication faces. Moreover, replicating files to improve reliability greatly increases data redundancy and, for large data volumes, greatly increases storage cost; dedicated storage devices are also comparatively expensive and not easy to expand.
Summary of the invention
In view of the problems of the prior art, the main purpose of the present invention is to provide a distributed adaptive coding and storage method that has low redundancy and is suitable for peer-to-peer or self-organizing networks in which the number of storage nodes is uncertain.
To achieve this purpose, the invention provides a distributed adaptive coding and storage method applied in a distributed system, comprising the following steps:
(1) detecting the number of storage nodes in the distributed system;
(2) adaptively adjusting the code word according to the number of storage nodes in the system, and adaptively encoding the file to be stored;
(3) dividing the encoded file into equal parts according to the number of nodes;
(4) packaging the equally divided file into file blocks in a unified file encapsulation format, each file block comprising an encoding block, an information block and a check block, the encoding block containing the codeword information;
(5) storing the packaged file blocks on the nodes of the system;
(6) when a node issues a file request, detecting the online storage nodes and judging whether the nodes are intact;
(7) if the nodes are intact, sending the information blocks of all online storage nodes to the requesting node and recombining them in order to obtain the original file;
(8) if the nodes are not intact, decoding the file blocks of the storage nodes that are still online to obtain the missing information blocks, and recombining the decoded information blocks with the existing information blocks in order to obtain the original file; after the lost information blocks have been recovered, applying the adaptive coding method again to re-encode and re-encapsulate them, thereby restoring the lost file blocks, and storing the restored file blocks in order on the storage nodes that are still online.
Step (2) of the above method, "adaptively adjusting the code word according to the number of storage nodes in the system, and adaptively encoding the file to be stored", specifically comprises the following steps:
(1) Let the number of storage nodes in the distributed system be m. Construct an equidistant erasure code with parameters (n-t, n-1, t, (n-t)*t/(n-1)), that is, an (n-t)×(n-1) matrix in which every row contains the same number of 1s and every column also contains the same number of 1s, and save it as the codeword information, where n ≥ m and n > t. Specifically:
  1. let n = m;
  2. search between 1 and n for a t such that (n-t)*t/(n-1) is an integer, with t ≠ 1 and t ≠ n;
  3. if no t makes (n-t)*t/(n-1) an integer, set n = n+1 and return to sub-step 2, until such a t exists;
  4. compute the t that minimizes the expression given in
Figure GSA00000103270600031
  (formula image not reproduced in this text); the n and t obtained at that point are the selected parameters;
  5. set every element in columns 0 through t-1 of the (n-t)×(n-1) matrix A to 1, and every other element to 0;
  6. for each column j with t ≤ j < n-1, count the 1s in column j of A; if the count is less than (n-t)*t/(n-1), there must be an element a_{i,j'} = 1 such that column j' contains more than (n-t)*t/(n-1) ones, where 0 ≤ j' < n-1 and j' ≠ j; then assign a_{i,j} = 1 and a_{i,j'} = 0;
  7. repeat sub-step 6 until every column of A contains exactly (n-t)*t/(n-1) ones;
(2) Randomly divide the points of the constructed matrix A into t sets D_0, ..., D_{t-1} that contain equal numbers of elements. The check blocks are then given by a formula (not reproduced in this text) indexed by i = n-t, ..., n-1 and j = 0, ..., n-1, in which d_{i-(n-t),s} denotes the point of matrix A corresponding to an element of the set D_{i-(n-t)}.
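The parameter search in sub-steps 1 to 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the function name `candidate_parameters` is hypothetical, and because the expression minimized in sub-step 4 survives only as a figure reference, the sketch returns every admissible t for the first workable n instead of choosing one.

```python
def candidate_parameters(m):
    """Return (n, [t, ...]) for the smallest n >= m that admits some t.

    A value t is admissible when 1 < t < n and (n - t) * t is divisible
    by n - 1. The final choice among the returned t values (sub-step 4)
    is NOT made here, because its objective is not available in the text.
    """
    if m < 2:
        raise ValueError("need at least two storage nodes")
    n = m  # sub-step 1
    while True:
        # sub-step 2: look for t strictly between 1 and n
        ts = [t for t in range(2, n) if ((n - t) * t) % (n - 1) == 0]
        if ts:
            return n, ts
        n += 1  # sub-step 3: no admissible t, so enlarge n and retry


if __name__ == "__main__":
    print(candidate_parameters(7))
```

For m = 7 the admissible values at n = 7 are t = 3, 4 and 6, since 12 and 6 are divisible by 6 while 10 is not.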
In addition to the codeword information, the encoding block in the above method also contains: the data block size, encoding block size, information block size, check block size, data block label, extension information and grouping information. The data block size, that is, the size of the whole file block, equals the sum of the encoding block size, the information block size and the check block size; using the encoding block size, information block size and check block size, the encoding block, information block and check block can each be extracted from the file block.
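How the size fields let a receiver cut a file block into its three parts can be illustrated as follows. The byte layout is an assumption made for this sketch only: the text does not specify field widths, so the four size fields are taken to be big-endian 32-bit integers at the start of the encoding block, and the remaining header fields (label, extension, codeword and grouping information) are treated as an opaque byte string. The names `make_file_block` and `split_file_block` are hypothetical.

```python
import struct

# Assumed layout: total size, encoding block size, information block size,
# check block size, each as a big-endian unsigned 32-bit integer.
SIZES = struct.Struct(">4I")


def make_file_block(rest_of_header, info, check):
    """Assemble encoding block + information block + check block."""
    enc_size = SIZES.size + len(rest_of_header)
    total = enc_size + len(info) + len(check)
    header = SIZES.pack(total, enc_size, len(info), len(check)) + rest_of_header
    return header + info + check


def split_file_block(blob):
    """Return (encoding_block, information_block, check_block)."""
    total, enc_size, info_size, check_size = SIZES.unpack_from(blob, 0)
    if total != enc_size + info_size + check_size or total != len(blob):
        raise ValueError("inconsistent size fields")
    encoding = blob[:enc_size]
    info = blob[enc_size:enc_size + info_size]
    check = blob[enc_size + info_size:total]
    return encoding, info, check


if __name__ == "__main__":
    block = make_file_block(b"meta", b"hello", b"xy")
    print(split_file_block(block))
```

The consistency check mirrors the stated rule that the overall size equals the sum of the three part sizes.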
The step "if the nodes are not intact, decoding the file blocks of the storage nodes that are still online to obtain the missing information blocks" can be implemented in several ways, as follows:
The first implementation comprises the following steps:
(1) sending the file blocks of all online storage nodes to the requesting node;
(2) decoding according to the codeword information in the encoding blocks and the check blocks of the file blocks to obtain the missing information blocks.
The second implementation comprises the following steps:
(1) detecting the network connection conditions between the requesting node and each online storage node;
(2) judging whether the network connection conditions between the requesting node and the individual online storage nodes differ greatly;
(3) if they do not, sending the file blocks of all online storage nodes to the requesting node, and decoding according to the codeword information in the encoding blocks and the check blocks to obtain the missing information blocks;
(4) if they do, calculating the minimum number p of file blocks needed to recover the missing information blocks, selecting the p online storage nodes with the best network connections to the requesting node, sending the file blocks of these p nodes to the requesting node, and decoding according to the codeword information in the encoding blocks and the check blocks to obtain the missing information blocks.
The third implementation comprises the following steps:
(1) calculating the minimum number p of file blocks needed to recover the missing information blocks;
(2) detecting the network connection conditions between the requesting node and each online storage node, and judging whether these conditions differ greatly;
(3) if they do not, selecting any p online storage nodes, sending the file blocks of these p nodes to the requesting node, and decoding according to the codeword information in the encoding blocks and the check blocks to obtain the missing information blocks;
(4) if they do, selecting the p online storage nodes with the best network connections to the requesting node, sending the file blocks of these p nodes to the requesting node, and decoding according to the codeword information in the encoding blocks and the check blocks to obtain the missing information blocks.
In addition, once the nodes have been judged not intact, if more than one file has missing information blocks, the storage nodes that are still online can each recover a different one of those files; the online storage node responsible for recovering a given file then acts as the requesting node.
The step "decoding according to the codeword information in the encoding blocks and the check blocks to obtain the missing information blocks" can fully recover the missing information blocks when the number of damaged storage nodes is ≤ t/2, and specifically comprises the following steps:
(1) the state of every check block in all file blocks received by the requesting node is initially marked "available";
(2) randomly select a check block a_{i,j} whose state is "available" and check whether any of the information blocks it covers has been erased; if none has been erased, mark the check block "useless"; if exactly one has been erased, likewise mark the check block "useless" and recover the erased information block by the following formula:
Figure GSA00000103270600051
where i-(n-t) ≠ i', s ≠ j', i = n-t, ..., n-1 and j = 0, ..., n-1;
(3) repeat step (2) until every check block is marked "useless".
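The three decoding steps describe an iterative "peeling" procedure, which can be sketched as below. The recovery formula itself is available only as a figure reference, so this sketch assumes, purely for illustration, that each check block is the bytewise XOR of the information blocks it covers; the function names and the data layout (`info` as a list with `None` for erased blocks, `checks` as pairs of covered indices and parity bytes) are likewise hypothetical. Two further simplifications: check blocks are scanned in order rather than picked at random, and the loop stops when a full pass makes no progress instead of running until every check block is "useless".

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def peel(info, checks):
    """Recover erased information blocks from XOR check blocks.

    info   -- list of bytes objects, with None for an erased block
    checks -- list of (covered_indices, parity_bytes) pairs
    """
    info = list(info)
    available = list(range(len(checks)))  # step (1): all "available"
    progress = True
    while available and progress:
        progress = False
        for c in list(available):
            indices, parity = checks[c]
            missing = [k for k in indices if info[k] is None]
            if len(missing) > 1:
                continue  # not usable yet, stays "available"
            if len(missing) == 1:
                value = parity
                for k in indices:
                    if info[k] is not None:
                        value = xor_bytes(value, info[k])
                info[missing[0]] = value  # the single erased block
            available.remove(c)  # step (2): mark "useless"
            progress = True
    return info


if __name__ == "__main__":
    blocks = [b"ab", b"cd", b"ef"]
    checks = [([0, 1], xor_bytes(b"ab", b"cd")),
              ([1, 2], xor_bytes(b"cd", b"ef"))]
    print(peel([None, None, b"ef"], checks))
```

In the example the first check block covers two erased blocks and cannot be used at once; the second recovers block 1, after which the first recovers block 0.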
Compared with the prior art, the present invention has the following advantages: (1) because a coding method is used to keep files continuously available, it tolerates the failure of more nodes at lower redundancy, and its disaster tolerance improves as the number of storage nodes grows; (2) because the coding method adapts to the number of storage nodes in the distributed system, and the storage nodes have equal ability and authority to obtain files, the extensibility and adaptivity of the system are improved, making the method suitable for peer-to-peer or self-organizing networks in which the number of storage nodes is highly uncertain; (3) because the storage nodes are self-organizing, when a storage node is damaged the system can automatically, on demand, use the remaining online storage nodes to recover files and migrate data according to the principle of balancing computation and storage load across nodes, thereby achieving sustained disaster tolerance.
Description of drawings
Fig. 1 is a flow chart of the storing process of the distributed adaptive coding and storage method of the present invention.
Fig. 2 is a schematic diagram of storage into the system according to the method of the present invention.
Fig. 3 is a schematic diagram of a file block in the method of the present invention.
Fig. 4 is a flow chart of the first embodiment of the file acquisition process of the method of the present invention.
Fig. 5 is a flow chart of the second embodiment of the file acquisition process of the method of the present invention.
Fig. 6 is a flow chart of the third embodiment of the file acquisition process of the method of the present invention.
Fig. 7 is a schematic diagram of file acquisition when the nodes are intact.
Fig. 8 is a schematic diagram of the first embodiment of file acquisition when a node is damaged.
Fig. 9 is a schematic diagram of the second embodiment of file acquisition when a node is damaged.
Embodiment
The specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.
The distributed adaptive coding and storage method of the present invention is applied in a distributed system and comprises a file storing process and a file acquisition process.
The file storing process is described in detail below with reference to Figs. 1 to 3. As shown in Fig. 1, the file storing process mainly comprises the following steps:
S11: detect the number of storage nodes in the distributed system; Fig. 2 shows a distributed system in which m storage nodes are connected by a computer network;
S12: adaptively adjust the code word according to the number of storage nodes in the system and adaptively encode the file to be stored, specifically as follows:
(1) Let the number of storage nodes in the distributed system be m. Construct an equidistant erasure code with parameters (n-t, n-1, t, (n-t)*t/(n-1)), that is, an (n-t)×(n-1) matrix in which every row contains the same number of 1s and every column also contains the same number of 1s, and save it as the codeword information, where n ≥ m and n > t. Specifically:
  1. let n = m;
  2. search between 1 and n for a t such that (n-t)*t/(n-1) is an integer, with t ≠ 1 and t ≠ n;
  3. if no t makes (n-t)*t/(n-1) an integer, set n = n+1 and return to sub-step 2, until such a t exists;
  4. compute the t that minimizes the expression given in
Figure GSA00000103270600071
  (formula image not reproduced in this text); the n and t obtained at that point are the selected parameters;
  5. set every element in columns 0 through t-1 of the (n-t)×(n-1) matrix A to 1, and every other element to 0;
  6. for each column j with t ≤ j < n-1, count the 1s in column j of A; if the count is less than (n-t)*t/(n-1), there must be an element a_{i,j'} = 1 such that column j' contains more than (n-t)*t/(n-1) ones, where 0 ≤ j' < n-1 and j' ≠ j; then assign a_{i,j} = 1 and a_{i,j'} = 0;
  7. repeat sub-step 6 until every column of A contains exactly (n-t)*t/(n-1) ones;
(2) Randomly divide the points of the constructed matrix A into t sets D_0, ..., D_{t-1} that contain equal numbers of elements. The check blocks are then given by a formula (not reproduced in this text) indexed by i = n-t, ..., n-1 and j = 0, ..., n-1, in which d_{i-(n-t),s} denotes the point of matrix A corresponding to an element of the set D_{i-(n-t)}.
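The matrix construction in sub-steps 5 to 7 amounts to starting with all the 1s packed into the first t columns and then moving 1s sideways within a row until every column holds the same number. A minimal Python sketch follows; the function name `build_matrix` is hypothetical, and the extra condition that the target cell a[i][j] is still 0 is an assumption added here so that each row keeps exactly t ones.

```python
def build_matrix(n, t):
    """Build an (n-t) x (n-1) 0/1 matrix with t ones per row and
    (n-t)*t/(n-1) ones per column, following sub-steps 5 to 7."""
    rows, cols = n - t, n - 1
    if not (1 < t < n) or (rows * t) % cols != 0:
        raise ValueError("(n - t) * t must be divisible by n - 1, with 1 < t < n")
    per_col = rows * t // cols
    # sub-step 5: columns 0 .. t-1 are all 1, everything else is 0
    a = [[1 if j < t else 0 for j in range(cols)] for _ in range(rows)]
    col_ones = [rows] * t + [0] * (cols - t)
    # sub-steps 6 and 7: fill each under-full column from an over-full one
    for j in range(t, cols):
        while col_ones[j] < per_col:
            jp = next(k for k in range(cols) if k != j and col_ones[k] > per_col)
            # assumed extra condition: a[i][j] must still be 0
            i = next(r for r in range(rows) if a[r][jp] == 1 and a[r][j] == 0)
            a[i][j], a[i][jp] = 1, 0
            col_ones[j] += 1
            col_ones[jp] -= 1
    return a


if __name__ == "__main__":
    for row in build_matrix(7, 3):
        print(row)
```

A suitable row always exists while column j is under-full: an over-full column has more 1s than column j, so at least one of its rows has a 0 in column j.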
S13: divide the encoded file into equal parts according to the number of nodes;
S14: package the equally divided file into file blocks in a unified file encapsulation format. As shown in Fig. 3, an encapsulated file block consists of three parts: an encoding block, an information block and a check block. The encoding block contains, in order, eight fields: data block size, encoding block size, information block size, check block size, data block label, extension information, codeword information and grouping information. The data block size, that is, the size of the whole file block, equals the sum of the encoding block size, information block size and check block size; using the encoding block size, information block size and check block size, the encoding block, information block and check block can each be extracted from the file block. The extension information is space reserved for later use by the file block, and the codeword information is the erasure code produced by the adaptive coding method above. The information blocks are formed by dividing the original file into equal parts according to the number of nodes. The check blocks are formed by encoding the original file with the codeword information using the adaptive coding method above.
S15: store the packaged file blocks on the nodes of the system. The result is shown in Fig. 2: the file blocks formed from one file are stored on storage node 1, storage node 2, and so on up to storage node m. The distributed system can of course store more than one file; each file is encoded, encapsulated and stored by the same method, so each storage node may hold more than one file block, forming a set of file blocks.
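The equal division into information blocks, and the in-order recombination used later when the file is fetched, can be sketched as follows. The text does not say how a file whose length is not a multiple of the node count is handled, so zero-padding the last block and recording the original length are assumptions of this sketch, as are the function names.

```python
def split_into_information_blocks(data, m):
    """Divide data into m equal-length blocks, zero-padding the tail.

    Returns (blocks, original_length); the length is kept so that the
    padding can be dropped again on reassembly.
    """
    if m < 1:
        raise ValueError("need at least one storage node")
    block_len = -(-len(data) // m)  # ceiling division
    padded = data.ljust(block_len * m, b"\0")
    blocks = [padded[i * block_len:(i + 1) * block_len] for i in range(m)]
    return blocks, len(data)


def reassemble(blocks, original_length):
    """Concatenate information blocks in node order and strip padding."""
    return b"".join(blocks)[:original_length]


if __name__ == "__main__":
    blocks, length = split_into_information_blocks(b"abcdefghij", 4)
    print(blocks)
    print(reassemble(blocks, length))
```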
The first embodiment of the file acquisition process is described in detail below with reference to Figs. 4, 7 and 8. As shown in Fig. 4, the file acquisition process mainly comprises the following steps:
S41: a node issues a file request; the requesting node may be a storage node in the distributed system or a node outside it;
S42: detect the online storage nodes;
S43: judge whether the nodes are intact; if so, go to S44; otherwise, go to S45;
S44: send the information blocks of all online storage nodes to the requesting node (the data transfer response protocol comprises: data response control word Req1, sent in response to a request from the requesting node for an encoding block; Req2, sent in response to a request for an information block; Req3, sent in response to a request for a check block; and Req4, sent in response to a request for a whole file block);
S46: recombine the blocks in order to obtain the complete original file.
File acquisition when the nodes are intact is illustrated in Fig. 7. During the storing stage the original file was divided into m file blocks according to the number m of storage nodes; each file block contains an encoding block, an information block and a check block, and the information blocks are the equal parts of the original file. Therefore, when the storage nodes are intact, only the information block of each storage node needs to be transmitted over the network to the requesting node, which recombines all received information blocks in order to obtain the complete original file.
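The four data response control words named in step S44 can be modeled as a small dispatch table on the storage-node side. This is a sketch only: the text names Req1 to Req4 and what each answers, but gives no wire encoding, so the enum values and the function `respond` are hypothetical.

```python
from enum import Enum


class Response(Enum):
    """Control words from step S44; the string values are placeholders."""
    REQ1 = "encoding block"
    REQ2 = "information block"
    REQ3 = "check block"
    REQ4 = "file block"


def respond(control, encoding, info, check):
    """Return the part of a stored file block that the control word answers."""
    if control is Response.REQ1:
        return encoding
    if control is Response.REQ2:
        return info
    if control is Response.REQ3:
        return check
    if control is Response.REQ4:
        return encoding + info + check  # the whole file block
    raise ValueError("unknown control word")


if __name__ == "__main__":
    print(respond(Response.REQ2, b"E", b"I", b"C"))
```

In the intact case only Req2 responses are needed; the damaged case uses Req4 so that the encoding and check blocks arrive as well.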
S45: send the file blocks of all online storage nodes to the requesting node;
S451: decode according to the codeword information in the encoding blocks and the check blocks to obtain the missing information blocks;
S46: recombine the blocks in order to obtain the complete original file.
File acquisition when the nodes are not intact is illustrated in Fig. 8. When one or several storage nodes are damaged, the missing information blocks can be recovered from the remaining storage nodes; they can be fully recovered provided the number k of damaged storage nodes satisfies k ≤ t/2 (t being obtained in step (1) of the adaptive coding method above). For example, suppose there are m storage nodes, each holding one file block: file block 1, file block 2, ..., file block m. When storage node 4 is damaged, file block 4 is missing. The recovery method is as follows: the file blocks of all online storage nodes are sent over the network to the requesting node; using the file block size, encoding block size, information block size and check block size stored in each file block, the requesting node extracts the encoding block, information block and check block from each file block; it then decodes using the codeword information stored in the encoding blocks together with the check blocks to obtain the missing information block; finally, the existing information blocks and the recovered information block are recombined in order to obtain the complete original file.
In addition, decoding with the codeword information stored in the encoding blocks and the check blocks to obtain the missing information blocks (if the erasure code array contains k ≤ t/2 erased columns, all k erased columns can be recovered) mainly comprises the following steps:
(1) the state of every check block in all file blocks received by the requesting node is marked "available";
(2) randomly select a check block a_{i,j} whose state is "available" and check whether any of the information blocks it covers has been erased; if none has been erased, mark the check block "useless"; if exactly one has been erased, likewise mark the check block "useless" and recover the erased information block by the following formula:
Figure GSA00000103270600091
where i-(n-t) ≠ i', s ≠ j', i = n-t, ..., n-1 and j = 0, ..., n-1;
(3) repeat step (2) until every check block is marked "useless".
The second embodiment of the file acquisition process is described in detail below with reference to Figs. 5, 7, 8 and 9. As shown in Fig. 5, the file acquisition process mainly comprises the following steps:
S51: a node issues a file request; the requesting node may be a storage node in the distributed system or a node outside it;
S52: detect the online storage nodes;
S53: judge whether the storage nodes are intact; if so, go to S54; otherwise, go to S55;
S54: send the information blocks of all online storage nodes to the requesting node;
S56: recombine the blocks in order to obtain the complete original file.
File acquisition when the nodes are intact is the same as in the first embodiment and is illustrated in Fig. 7: the information block of each storage node is transmitted over the network to the requesting node, which recombines all received information blocks in order to obtain the complete original file.
S55: detect the network connections between the requesting node and each online storage node;
S551: judge whether the network connection conditions between the requesting node and the individual online storage nodes differ greatly; if they do, go to S552; if not, go to S556;
S552: calculate the minimum number p of file blocks needed to recover the missing information blocks. p is obtained as follows: after storage into the system according to the method of the present invention, if the number of storage nodes is m, the missing information blocks can be fully recovered provided the number k of damaged storage nodes satisfies k ≤ t/2 (t being obtained in step (1) of the adaptive coding method above). Hence, if all information blocks are to be recoverable, at most t/2 storage nodes may be damaged, and the minimum number of file blocks needed to recover the missing information blocks is p = m - t/2;
S553: rank the network connections, that is, sort the online storage nodes by the condition of their network connection to the requesting node;
S554: select the p online storage nodes with the best network connections;
S555: send the file blocks of the p selected online storage nodes, together with the information blocks of the remaining online storage nodes, to the requesting node;
S557: decode to obtain the missing information blocks, that is, recover the missing information blocks from the file blocks of the p best-connected online storage nodes; the decoding procedure is the same as the one described for the first embodiment of the file acquisition process;
S56: obtain the complete original file, that is, recombine in order the missing information blocks decoded from the file blocks of the p best-connected online storage nodes and the information blocks sent by the remaining online storage nodes.
When the nodes are not intact and the network connection conditions between the requesting node and the individual online storage nodes differ greatly, the file acquisition process is as illustrated in Fig. 9. Recovering the missing information blocks does not require the file blocks of all online storage nodes; the file blocks of p online storage nodes are enough to recover every missing information block. Therefore, to balance the network load, the file blocks of the p online storage nodes with the best network connections are used to recover the missing information blocks, and to obtain the complete original file it then suffices to fetch the information blocks of the remaining online storage nodes and recombine everything in order.
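Steps S552 to S554 reduce to computing p and ranking the online nodes by connection quality. The sketch below assumes two things the text leaves open: that t/2 is rounded down when t is odd, and that connection quality is measured by a single latency figure per node (lower is better). The function names and the `latency_ms` mapping are hypothetical.

```python
def min_blocks_needed(m, t):
    """p = m - t/2, taking t/2 as integer division (an assumption)."""
    return m - t // 2


def pick_best_nodes(latency_ms, p):
    """Return the p online nodes with the lowest measured latency.

    latency_ms -- mapping from node name to latency in milliseconds
    """
    if p > len(latency_ms):
        raise ValueError("fewer than p storage nodes are online")
    ranked = sorted(latency_ms, key=latency_ms.get)  # S553: rank connections
    return ranked[:p]                                # S554: keep the best p


if __name__ == "__main__":
    p = min_blocks_needed(7, 4)
    online = {"n1": 30, "n2": 10, "n3": 20, "n5": 45, "n6": 15, "n7": 60}
    print(p, pick_best_nodes(online, p))
```

The chosen nodes send whole file blocks, while the remaining online nodes only need to send their information blocks.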
S556: send the file blocks of all online storage nodes to the requesting node;
S557: decode to obtain the missing information blocks;
S56: obtain the complete original file.
When the nodes are not intact and the network connection conditions between the requesting node and the individual online storage nodes differ little, the file acquisition process is as illustrated in Fig. 8 and is the same as the process described in the first embodiment for nodes that are not intact. That is, the file blocks of all online storage nodes are sent over the network to the requesting node; using the file block size, encoding block size, information block size and check block size stored in each file block, the requesting node extracts the encoding block, information block and check block from each file block; it then decodes using the codeword information stored in the encoding blocks together with the check blocks to obtain the missing information blocks; finally, the existing information blocks and the recovered information blocks are recombined in order to obtain the complete original file.
The third embodiment of the file acquisition process of the distributed adaptive coding and storage method of the present invention is described in detail below with reference to Fig. 6, Fig. 7 and Fig. 9. As shown in Fig. 9, this file acquisition process mainly comprises the following steps:
S61, a node issues a file request; this file-requesting node may be a storage node within the distributed system or a node outside the distributed system;
S62, the online storage nodes are detected;
S63, it is judged whether the storage nodes are complete; if so, go to S64; otherwise, go to S65;
S64, the information blocks of all online storage nodes are sent to the file-requesting node;
S67, the information blocks are reassembled in order to obtain the complete original file.
When the storage nodes are complete, the file acquisition process is the same as in the first and second embodiments; see Fig. 7 for the schematic. That is, the information block portions of all storage nodes are transmitted over the network to the file-requesting node, which then reassembles all received information blocks in order to obtain the complete original file.
S65, the minimum number p of file blocks required to recover the missing information blocks is calculated, p=m-t/2, by the same calculation method as described in the second embodiment;
S66, the network connections between the file-requesting node and each online storage node are detected;
S661, it is judged whether the network connection conditions between the file-requesting node and the online storage nodes differ considerably; if so, go to S662; if not, go to S665;
S662, the network links are sorted, i.e. the online storage nodes are ranked by the condition of their network connection to the file-requesting node;
S663, the p online storage nodes with the best network connections are selected;
S664, the file blocks of the p selected best online storage nodes and the information blocks of the remaining online storage nodes are sent to the file-requesting node;
S666, the missing information blocks are obtained by decoding, i.e. they are recovered from the file blocks of the p best online storage nodes; the decoding process is the same as that described in the first embodiment of the file acquisition process of the distributed adaptive coding and storage method of the present invention;
S67, the complete original file is obtained, i.e. the missing information blocks decoded from the file blocks of the p best online storage nodes and the information blocks sent by the remaining online storage nodes are reassembled in order to obtain the complete original file.
When the storage nodes are incomplete and the network connection conditions between the file-requesting node and the individual online storage nodes differ considerably, the file acquisition process is as shown schematically in Fig. 9. The process is broadly the same as the corresponding case in the second embodiment (incomplete nodes, large difference in network connection conditions): the file blocks of the p online storage nodes with the best network connections are selected to recover the missing information blocks, and the information blocks of the remaining online storage nodes are then fetched and reassembled in order to obtain the complete original file.
S665, the file blocks of any p online storage nodes and the information blocks of the remaining online storage nodes are sent to the file-requesting node;
S666, the missing information blocks are obtained by decoding, i.e. they are recovered from the file blocks of the p arbitrarily chosen online storage nodes; the decoding process is the same as that described in the first embodiment of the file acquisition process of the distributed adaptive coding and storage method of the present invention;
S67, the complete original file is obtained, i.e. the missing information blocks decoded from the file blocks of the p arbitrarily chosen online storage nodes and the information blocks sent by the remaining online storage nodes are reassembled in order to obtain the complete original file.
When the storage nodes are incomplete and the network connection conditions between the file-requesting node and the individual online storage nodes differ little, the file acquisition process is as shown schematically in Fig. 9. Because the file blocks of only p online storage nodes suffice to recover all missing information blocks, the file blocks of all online storage nodes need not be obtained. And because the network connection conditions differ little, the network links need not be sorted: the file blocks of any p online storage nodes are selected to decode the missing information blocks, which are then reassembled in order with the information blocks of the remaining online storage nodes to obtain the complete original file.
The first, second and third embodiments of the file acquisition process of the distributed adaptive coding and storage method of the present invention have been described above. When the storage nodes are complete, the three embodiments obtain the file in the same way. When the storage nodes are incomplete, the missing information blocks can be recovered in different ways. The most basic method recovers them from the file blocks of all online storage nodes, which incurs more network latency. In fact not all file blocks are needed: p file blocks suffice to recover all missing information blocks. It is therefore judged whether the network connection conditions between the file-requesting node and the online storage nodes differ considerably; if so, the file blocks of the p best online storage nodes are selected for recovery; if not, the file blocks of any p online storage nodes are selected. This balances the network load and keeps network latency to a minimum.
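The choice between the "p best" and "any p" online storage nodes can be sketched as follows. This is only an illustration under stated assumptions: the function name `choose_source_nodes`, the use of a measured latency in milliseconds as the link metric, and the threshold that decides whether the connections "differ considerably" are all hypothetical, since the description does not say how the difference is quantified.

```python
from typing import Dict, List


def choose_source_nodes(latency_ms: Dict[str, float], p: int,
                        spread_threshold_ms: float = 50.0) -> List[str]:
    """Pick the p online storage nodes whose file blocks will be fetched.

    latency_ms maps each online node to a measured link cost. The metric and
    the threshold are assumptions made for this sketch only.
    """
    if not 0 < p <= len(latency_ms):
        raise ValueError("need 0 < p <= number of online storage nodes")
    nodes = list(latency_ms)
    spread = max(latency_ms.values()) - min(latency_ms.values())
    if spread > spread_threshold_ms:
        # Large difference: sort the links and keep the p best ones.
        nodes.sort(key=latency_ms.get)
    # Small difference: any p nodes will do, so the first p listed are taken.
    return nodes[:p]
```

With links of 120, 15, 40 and 300 ms the spread exceeds the assumed threshold, so the two lowest-latency nodes are returned; with links of 20, 15 and 22 ms no sorting takes place and the first two nodes are used as listed.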
In addition, once the storage nodes have been judged incomplete, if the distributed system stores more than one file, more than one file will have missing information blocks. To balance the computation and storage load across the remaining nodes of the system, the remaining online storage nodes can each be used to recover a different file that has missing information blocks; the online storage node responsible for recovering a given file then acts as the file-requesting node. For example, suppose that before the failure the distributed system has m storage nodes k_0, k_1, ..., k_{m-1}. When l nodes are lost (l < m/4), the remaining nodes are k_l, k_{l+1}, ..., k_{m-1}. If the system stores h files at this point, f_0, f_1, ..., f_{h-1}, then h × l file blocks are lost. File recovery requires a large number of exclusive-OR operations, and the h × l recovered file blocks must be redistributed and stored. To balance the computation and storage load across the nodes, the remaining nodes k_l, k_{l+1}, ..., k_{m-1} are therefore used to recover the missing information blocks of the h files in turn: k_l recovers the missing information blocks of f_0, k_{l+1} recovers those of f_1, and so on. If a user or a system node issues a file request during this period, recovery of that file's missing information blocks is moved to the front of the remaining tasks, so that the missing information blocks of that file are recovered first.
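The assignment of files to recovering nodes, and the promotion of a requested file to the front of the task list, might be sketched as below. The names `assign_recovery` and `prioritize` are hypothetical, nodes and files are represented only by their indices, and wrapping around to k_l again once every surviving node has a file is an assumption: the text says only "and so on".

```python
from collections import deque
from typing import Deque, List, Tuple


def assign_recovery(m: int, l: int, h: int) -> List[Tuple[int, int]]:
    """Return (file index, node index) pairs: f_i is recovered by a surviving node."""
    if not 0 < l < m / 4:
        raise ValueError("the description assumes 0 < l < m/4 lost nodes")
    survivors = m - l  # nodes k_l ... k_{m-1}
    # Round-robin over the survivors (the wrap-around is an assumption).
    return [(i, l + i % survivors) for i in range(h)]


def prioritize(pending: Deque[int], requested: int) -> None:
    """Move a requested file to the front of the pending-recovery queue."""
    if requested in pending:
        pending.remove(requested)
        pending.appendleft(requested)
```

For m = 9 and l = 2, files f_0, f_1 and f_2 go to k_2, k_3 and k_4, and the eighth file f_7 wraps back to k_2.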
The process of recovering missing information blocks is described below, taking the recovery by k_l of the missing information blocks of f_0 as an example:
(1) detect the network connections between the file-requesting node k_l and each online storage node;
(2) calculate the minimum number p of file blocks required to recover the missing information blocks of file f_0; p is calculated in the same way as in the second embodiment of the file acquisition process of the distributed adaptive coding and storage method of the present invention;
(3) judge whether the network connection conditions between the file-requesting node k_l and the online storage nodes differ considerably; if so, sort the network links by sending test packets over each link to evaluate the data path, select the p online storage nodes with the best network connections, and send the file blocks of these p nodes to the file-requesting node; if not, send the file blocks of any p online storage nodes to the file-requesting node k_l;
(4) decode to obtain the missing information blocks, i.e. recover them from the file blocks of the p online storage nodes; the decoding process is the same as that described in the first embodiment of the file acquisition process of the distributed adaptive coding and storage method of the present invention.
The above describes how storage node k_l recovers the missing information blocks of f_0; the remaining storage nodes recover the missing information blocks of the other files in the same way.
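The decoding in step (4) is recited in claim 1 as an iterative procedure over the check blocks: a check block that covers exactly one deleted information block is used to restore it, and the procedure repeats until no check block is left to use. A minimal sketch follows. It assumes that each check block is the bytewise XOR of the information blocks it covers, which is an assumption because the recovery formula appears only as a figure; it also scans the check blocks in passes rather than picking them at random, and it stops when a pass makes no progress. All names are hypothetical.

```python
from typing import List, Optional, Sequence


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def peel_decode(info: Sequence[Optional[bytes]],
                checks: Sequence[bytes],
                covers: Sequence[Sequence[int]]) -> List[Optional[bytes]]:
    """Recover deleted information blocks (None entries) where possible.

    checks[c] is assumed to be the XOR of info[j] for every j in covers[c].
    """
    blocks = list(info)
    available = set(range(len(checks)))  # all check blocks start as "available"
    progress = True
    while available and progress:
        progress = False
        for c in sorted(available):
            missing = [j for j in covers[c] if blocks[j] is None]
            if len(missing) > 1:
                continue  # not usable yet, stays "available"
            if len(missing) == 1:
                value = checks[c]
                for j in covers[c]:
                    if blocks[j] is not None:
                        value = xor_bytes(value, blocks[j])
                blocks[missing[0]] = value
            available.discard(c)  # mark this check block "useless"
            progress = True
    return blocks
```

Any entry that is still `None` in the result could not be recovered from the check blocks supplied.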
After a storage node has recovered a lost file, it re-encodes the recovered file using the adaptive coding method, according to the markers identifying the lost storage nodes and file blocks (the codeword information is the same as in the first encoding), thereby regenerating the lost check blocks. The lost file blocks are then re-encapsulated in the unified file encapsulation format according to the size of each lost file block, which restores them. The restored file blocks are then stored in order on the storage nodes that are still active. For example, after storage node k_l has restored the l file blocks of file f_0, it stores them in order, evenly distributed, on k_l, k_{l+1}, ..., k_{2l-1}.
Using the method described above, the l file blocks restored at a storage node are stored sequentially on the l storage nodes starting from that node. In this way the h × l restored file blocks are evenly distributed over the m-l nodes of the system that are still active. When a damaged storage node resumes operation, the file blocks temporarily stored on the other nodes are moved to the newly rejoined node, thereby giving the files a sustained fault-tolerance capability.
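The placement rule can be illustrated with a short sketch. It assumes that file f_i is recovered by the surviving nodes in round-robin order and that the run of l consecutive nodes wraps around from k_{m-1} back to k_l; neither detail is spelled out in the text, and the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, List


def place_recovered_blocks(m: int, l: int, h: int) -> Dict[int, List[int]]:
    """Map each file index to the l surviving node indices holding its restored blocks."""
    survivors = list(range(l, m))  # k_l ... k_{m-1}
    s = len(survivors)
    placement: Dict[int, List[int]] = {}
    for f in range(h):
        start = f % s  # position of the node that recovered f (assumed round-robin)
        placement[f] = [survivors[(start + b) % s] for b in range(l)]
    return placement


# Example: m = 9 nodes, l = 2 lost, h = 7 files.
layout = place_recovered_blocks(9, 2, 7)
load = Counter(node for nodes in layout.values() for node in nodes)
```

In this example f_0 is placed on k_2 and k_3, matching k_l, ..., k_{2l-1}, and each of the seven surviving nodes ends up with two restored blocks.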
The distributed adaptive coding and storage method has been described above, but the present invention is not limited to the above embodiments. Any improvement or modification known to those of ordinary skill in the art that does not depart from the technical solution of the present invention falls within the scope of protection of the present invention.

Claims (5)

1. A distributed adaptive coding and storage method, applied in a distributed system, characterized in that the method comprises the following steps:
detecting the number of storage nodes in the distributed system;
adaptively adjusting the codeword according to the number of storage nodes in the system and adaptively encoding the file to be stored, specifically comprising the following steps: (1) with m storage nodes in the distributed system, constructing an erasure code with parameters (n-t, n-1, t, (n-t)×t/(n-1)) such that the distances between its codewords are equal, i.e. constructing a matrix with parameters (n-t, n-1, t, (n-t)×t/(n-1)) in which every row contains the same number of 1s and every column also contains the same number of 1s, and saving it as the codeword information, where n ≥ m and n > t, specifically comprising the following steps: ① setting n=m; ② searching for t between 1 and n such that (n-t)×t/(n-1) is an integer, with t≠1 and t≠n; ③ if no t makes (n-t)×t/(n-1) an integer, setting n=n+1 and returning to step ②, until some t makes (n-t)×t/(n-1) an integer; ④ computing the t for which the minimum value is attained, the n and t at that point being the selected parameters; ⑤ in the (n-t)×(n-1) matrix A, assigning 1 to all elements in columns 0 through t-1 and 0 to all other elements; ⑥ for each column j with t≤j<n-1, counting the number of 1s in column j of matrix A; if the number of 1s in column j is less than (n-t)×t/(n-1), there necessarily exists
Figure FSB00000957021200012
a_{i,j'}=1 such that the number of 1s in column j' is greater than (n-t)×t/(n-1), where 0≤j'<n-1 and j'≠j; then assigning a_{i,j}=1 and a_{i,j'}=0; ⑦ repeating step ⑥ until the number of 1s in every column of matrix A equals (n-t)×t/(n-1); (2) randomly partitioning the points of the constructed matrix A into t sets D_0, ..., D_{t-1} containing equal numbers of elements, the check blocks being given by the following formula:
Figure FSB00000957021200013
i=n-t, ..., n-1, j=0, ..., n-1, where d_{i-(n-t),s} is the point of matrix A corresponding to an element of the set D_{i-(n-t)};
dividing the encoded file into equal parts according to the number of storage nodes;
encapsulating the divided file into file blocks in a unified file encapsulation format, each file block comprising a coding block, an information block and a check block, the coding block containing the codeword information;
storing the encapsulated file blocks on the storage nodes of the system;
when a node issues a file request, detecting the online storage nodes and judging whether the storage nodes are complete;
if the nodes are complete, sending the information blocks of all online storage nodes to the file-requesting node and reassembling them in order to obtain the original file;
if the nodes are incomplete, decoding the file blocks of the online storage nodes to obtain the missing information blocks, and reassembling the decoded missing information blocks and the existing information blocks in order to obtain the original file; after an online storage node has recovered the lost information blocks, performing secondary coding with the adaptive coding method and file encapsulation again to restore the lost file blocks, and storing the restored file blocks in order on the online storage nodes; wherein, if the nodes are incomplete, decoding the file blocks of the online storage nodes to obtain the missing information blocks specifically comprises the following steps: after the nodes are judged incomplete, if more than one file has missing information blocks, using the online storage nodes to recover the files having missing information blocks respectively, the online storage node responsible for recovering a given file having missing information blocks acting as the file-requesting node; when the number of damaged storage nodes is ≤ t/2, the missing information blocks can be recovered completely, i.e. the missing information blocks are obtained by decoding according to the codeword information in the coding blocks of the file blocks and the check blocks, specifically comprising the following steps: (1) initially marking the state of the check blocks in all file blocks received by the file-requesting node as "available"; (2) randomly selecting a check block a_{i,j} whose state is "available" and checking whether the information blocks it covers have been deleted; if none of them has been deleted, marking the state of this check block as "useless"; if exactly one of them has been deleted, likewise marking the state of this check block as "useless" and recovering the deleted information block by the following formula:
Figure FSB00000957021200021
where i-(n-t)≠i', s≠j', i=n-t, ..., n-1, j=0, ..., n-1; (3) repeating step (2) until the state of every check block is marked "useless".
2. The distributed adaptive coding and storage method according to claim 1, characterized in that, in addition to the codeword information, the coding block further comprises: data block size, coding block size, information block size, check block size, data block label, extension information and grouping information; the coding block, information block and check block in a file block can each be obtained using the coding block size, information block size and check block size.
3. The distributed adaptive coding and storage method according to claim 2, characterized in that all the file blocks received by the file-requesting node are obtained as follows: the file blocks of all online storage nodes are sent to the file-requesting node.
4. The distributed adaptive coding and storage method according to claim 2, characterized in that all the file blocks received by the file-requesting node are obtained as follows:
detecting the network connection conditions between the file-requesting node and each online storage node;
judging whether the network connection conditions between the file-requesting node and the online storage nodes differ considerably;
if the result of the judgment is no, sending the file blocks of all online storage nodes to the file-requesting node;
if the result of the judgment is yes, calculating the minimum number p of file blocks required to recover the missing information blocks, selecting the p online storage nodes with the best network connections to the file-requesting node, and sending the file blocks of these p best online storage nodes to the file-requesting node.
5. The distributed adaptive coding and storage method according to claim 2, characterized in that all the file blocks received by the file-requesting node are obtained as follows:
calculating the minimum number p of file blocks required to recover the missing information blocks;
detecting the network connection conditions between the file-requesting node and each online storage node, and judging whether the network connection conditions between the file-requesting node and the online storage nodes differ considerably;
if the result of the judgment is no, selecting any p online storage nodes and sending the file blocks of these p online storage nodes to the file-requesting node;
if the result of the judgment is yes, selecting the p online storage nodes with the best network connections to the file-requesting node and sending the file blocks of these p best online storage nodes to the file-requesting node.
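The matrix construction recited in claim 1 can be illustrated with a short sketch covering the parameter search and the column balancing. It is not the patented implementation: the function names are hypothetical, the selection criterion among the candidate values of t is left to the caller because it is not legible in the text, and the sketch makes no attempt to verify the equal-distance property of the resulting code.

```python
from typing import List, Tuple


def candidate_parameters(m: int) -> Tuple[int, List[int]]:
    """Grow n from m until some t (1 < t < n) makes (n-t)*t/(n-1) an integer."""
    n = max(m, 2)  # guard against division by zero when n - 1 == 0
    while True:
        ts = [t for t in range(2, n) if ((n - t) * t) % (n - 1) == 0]
        if ts:
            return n, ts
        n += 1


def build_matrix(n: int, t: int) -> List[List[int]]:
    """Build an (n-t) x (n-1) 0/1 matrix with t ones per row and equal column sums."""
    if not 1 < t < n:
        raise ValueError("need 1 < t < n")
    rows, cols = n - t, n - 1
    target, remainder = divmod(rows * t, cols)
    if remainder:
        raise ValueError("(n-t)*t must be divisible by (n-1)")
    # Columns 0 .. t-1 start as all ones, every other element as zero.
    a = [[1 if j < t else 0 for j in range(cols)] for _ in range(rows)]

    def ones(j: int) -> int:
        return sum(row[j] for row in a)

    for j in range(t, cols):
        while ones(j) < target:
            # Some other column must hold more than `target` ones (counting argument).
            donor = next(c for c in range(cols) if c != j and ones(c) > target)
            # Move a single 1 within one row, so every row keeps exactly t ones.
            i = next(r for r in range(rows) if a[r][donor] == 1 and a[r][j] == 0)
            a[i][j], a[i][donor] = 1, 0
    return a
```

For m = 7 the search stays at n = 7 and yields the candidates t = 3, 4 and 6; `build_matrix(7, 3)` then gives a 4 × 6 matrix with three 1s in every row and two 1s in every column.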
CN2010101596517A 2010-04-29 2010-04-29 Distributed adaptive coding and storing method Expired - Fee Related CN101834899B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN2010101596517A CN101834899B (en) 2010-04-29 2010-04-29 Distributed adaptive coding and storing method
PCT/CN2011/070002 WO2011134285A1 (en) 2010-04-29 2011-01-01 Distributed self-adaptive coding and storage method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2010101596517A CN101834899B (en) 2010-04-29 2010-04-29 Distributed adaptive coding and storing method

Publications (2)

Publication Number Publication Date
CN101834899A CN101834899A (en) 2010-09-15
CN101834899B true CN101834899B (en) 2013-01-30

Family

ID=42718827

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010101596517A Expired - Fee Related CN101834899B (en) 2010-04-29 2010-04-29 Distributed adaptive coding and storing method

Country Status (2)

Country Link
CN (1) CN101834899B (en)
WO (1) WO2011134285A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101635669A (en) * 2008-07-25 2010-01-27 中国科学院声学研究所 Method for acquiring data fragments in data-sharing systems
CN101488104A (en) * 2009-02-26 2009-07-22 北京世纪互联宽带数据中心有限公司 System and method for implementing high-efficiency security memory

Also Published As

Publication number Publication date
CN101834899A (en) 2010-09-15
WO2011134285A1 (en) 2011-11-03

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C56 Change in the name or address of the patentee

Owner name: CHENGDU INFORMATION TECHNOLOGY OF CHINESE ACADEMY

Free format text: FORMER NAME: CHENGDU INFORMATION TECHNOLOGY CO., LTD., CAS

CP01 Change in the name or title of a patent holder

Address after: 610041, No. 11, building 5, high tech building, East Road, Chengdu hi tech Zone, Sichuan

Patentee after: CHENGDU INFORMATION TECHNOLOGY OF CHINESE ACADEMY OF SCIENCE Co.,Ltd.

Address before: 610041, No. 11, building 5, high tech building, East Road, Chengdu hi tech Zone, Sichuan

Patentee before: Chengdu Information Technology Co.,Ltd. CAS

C56 Change in the name or address of the patentee
CP02 Change in the address of a patent holder

Address after: 1803, room 18, building 1, building 360, crystal Road, No. 610017, Hui Lu, Chengdu hi tech Zone, Sichuan

Patentee after: CHENGDU INFORMATION TECHNOLOGY OF CHINESE ACADEMY OF SCIENCE Co.,Ltd.

Address before: 610041, No. 11, building 5, high tech building, East Road, Chengdu hi tech Zone, Sichuan

Patentee before: CHENGDU INFORMATION TECHNOLOGY OF CHINESE ACADEMY OF SCIENCE Co.,Ltd.

CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20130130