CN102387179A - Distributed file system and nodes, saving method and saving control method thereof - Google Patents

Distributed file system and nodes, saving method and saving control method thereof

Info

Publication number
CN102387179A
Authority
CN
China
Prior art keywords
storage node
data object
checking
node
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2010102723384A
Other languages
Chinese (zh)
Other versions
CN102387179B (en)
Inventor
李亚飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lenovo Beijing Ltd
Priority to CN201010272338.4A (patent CN102387179B)
Publication of CN102387179A
Application granted
Publication of CN102387179B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a distributed file system, its nodes, a storage method and a storage control method. The storage method includes: cutting a file to be stored into M data objects according to a predetermined segmentation policy; applying a check algorithm to the M data objects to obtain K check data objects corresponding to the file to be stored, where K is greater than or equal to 1; and storing the M data objects and the K check data objects on S available storage nodes among N storage nodes, where S is less than or equal to N. The distributed file system, its nodes, the storage method and the storage control method can improve storage space utilization.

Description

Distributed file system and nodes thereof, storage method and storage control method
Technical field
The present invention relates in particular to a distributed file system and its nodes, a storage method and a storage control method.
Background art
In a large-scale cluster system, the implementation of the storage subsystem is a very important aspect. As a rule, data reliability is the problem users care about most, and every storage architecture devotes a large amount of resources to guaranteeing it. This is even more important in a distributed file system, because once data is corrupted, none of the nodes in the cluster can work normally. Reliability must nevertheless be weighed against performance, because the very reason for adopting a distributed file system in place of NFS or a local file system is to provide more parallel data access.
A distributed file system is a file system that stores files in a distributed manner across a cluster and supports concurrent reads and writes. Compared with a traditional file system, its data is not kept locally but on storage nodes on the network that are dedicated to storage. In general, a distributed file system consists of metadata management nodes, storage nodes and service stations: there are one or more metadata management nodes and two or more storage nodes, and every node in the cluster that provides other services can act as a service station of the distributed file system.
The existing solution is to have the distributed file system keep multiple backup copies of the data to guarantee its reliability.
In the course of implementing the technical solution of the embodiments of the present invention, the inventor found that the above prior art has at least the following problem:
Resources are seriously wasted: the storage system as a whole can use at most half of its storage space.
Summary of the invention
The technical problem to be solved by the present invention is to provide a distributed file system and its nodes, a storage method and a storage control method that achieve high storage space utilization.
To solve the above problem, the invention provides a storage method for a distributed file system, the distributed file system comprising N storage nodes; the method comprises:
Cutting a file to be stored into M data objects according to a predetermined segmentation policy;
Applying a check algorithm to the M data objects to obtain K check data objects corresponding to the file to be stored, where K is greater than or equal to 1;
Storing the M data objects and the K check data objects corresponding to the file to be stored on S available storage nodes among the N storage nodes, where S is less than or equal to N.
Further, the step of storing the M data objects and the K check data objects on S available storage nodes among the N storage nodes comprises:
Dividing the S available storage nodes among the N storage nodes into first storage nodes and second storage nodes;
Storing the M data objects on the first storage nodes;
Storing the K check data objects on the second storage nodes.
The present invention also provides a storage control method for a distributed file system, the distributed file system comprising N storage nodes, a service station and a metadata management node; the method comprises:
Receiving request information sent by the service station, the request information requesting storage nodes for the M data objects into which a first file to be stored has been cut and for the K corresponding check data objects; the check data objects are the result of applying a check algorithm to the M data objects, and K is greater than or equal to 1;
According to the request information, obtaining the states of the N storage nodes and finding S available storage nodes among them, where S is less than or equal to N;
Allocating the M data objects and the K check data objects to the S available storage nodes according to a predetermined storage policy, producing an allocation result;
Feeding back the allocation result.
Further, the step of allocating the M data objects and the K check data objects to the S available storage nodes according to the predetermined storage policy comprises:
Dividing the S available storage nodes into first storage nodes and second storage nodes according to the predetermined storage policy;
Allocating the M data objects to the first storage nodes and the K check data objects to the second storage nodes according to the predetermined storage policy.
The present invention also provides a distributed file system, comprising:
N storage nodes, used to store data;
A metadata management node;
A service station, used to cut a file to be stored into M data objects according to a predetermined segmentation policy, to apply a check algorithm to the M data objects to obtain K check data objects corresponding to the file to be stored, where K is greater than or equal to 1, and to request storage nodes for the file to be stored from the metadata management node;
The metadata management node is used to obtain the states of the storage nodes in response to the request of the service station, to return S available storage nodes to the service station, and to control the service station, according to a predetermined storage policy, to store the M data objects and the K check data objects corresponding to the file to be stored on the S available storage nodes among the N storage nodes, where S is less than or equal to N.
Further, the service station is also used, when reading a file from the storage nodes, to apply the check algorithm to the data objects that were read and the check data objects if a storage node holding a data object is abnormal, thereby obtaining the data object stored on the abnormal storage node.
Further, the metadata management node controlling the service station, according to the predetermined storage policy, to store the M data objects and the K check data objects on the S available storage nodes among the N storage nodes means:
The metadata management node divides the S available storage nodes among the N storage nodes into first storage nodes and second storage nodes according to the predetermined storage policy; allocates the M data objects to the first storage nodes and the K check data objects to the second storage nodes according to the predetermined storage policy, producing an allocation result; and controls the service station, according to the allocation result, to store the M data objects on the first storage nodes and the K check data objects on the second storage nodes.
The present invention also provides a metadata management node of a distributed file system, the distributed file system further comprising N storage nodes and a service station; the metadata management node comprises:
A selection module, used, after receiving from the service station information requesting storage nodes for the M data objects into which a first file to be stored has been cut and for the K corresponding check data objects, to obtain the states of the N storage nodes and find S available storage nodes among them, where S is less than or equal to N; the check data objects are the result of applying a check algorithm to the M data objects, and K is greater than or equal to 1;
An allocation module, used to allocate the M data objects and the K check data objects to the S available storage nodes according to a predetermined storage policy, producing an allocation result;
A feedback module, used to feed back the allocation result.
Further, the allocation module allocating the M data objects and the K check data objects to the S available storage nodes according to the predetermined storage policy means:
The allocation module divides the S available storage nodes into first storage nodes and second storage nodes according to the predetermined storage policy, allocates the M data objects to the first storage nodes, and allocates the K check data objects to the second storage nodes.
The present invention also provides a service station of a distributed file system, the distributed file system comprising N storage nodes; the service station comprises:
A segmentation module, used to cut a file to be stored into M data objects according to a predetermined segmentation policy;
A check module, used to apply a check algorithm to the M data objects to obtain K check data objects corresponding to the file, where K is greater than or equal to 1;
A request module, used to request storage nodes for the file to be stored from the metadata management node;
A storage module, used, after S available storage nodes have been obtained, to store the M data objects and the K check data objects corresponding to the file to be stored on the S available storage nodes among the N storage nodes, where S is less than or equal to N.
The embodiments of the present invention have at least the following advantages. One embodiment improves storage space utilization without relying on special hardware: a highly reliable file system can be built from local storage or low-end disk arrays. Another embodiment stores the check data objects and the data objects cut from the file on different storage nodes, which further improves reliability. Yet another embodiment uses the XOR algorithm to obtain the check data objects, which simplifies implementation.
Description of drawings
Fig. 1 is a schematic block diagram of the distributed file system in Embodiment 3;
Fig. 2 is a schematic flow chart of the distributed file system storing a file in Embodiment 3.
Detailed description
The technical solution of the present invention is explained in more detail below with reference to the drawings and embodiments.
Embodiment 1: a storage method for a distributed file system, the distributed file system comprising N storage nodes; the method comprises:
Cutting a file to be stored into M data objects according to a predetermined segmentation policy;
Applying a check algorithm to the M data objects to obtain K check data objects corresponding to the file, where K is greater than or equal to 1;
Storing the M data objects and the K check data objects corresponding to the file to be stored on S available storage nodes among the N storage nodes, where S is less than or equal to N.
In this embodiment, the predetermined segmentation policy may cut according to the number of storage nodes, according to a set data object size, or according to some other policy.
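The two segmentation policies named above can be sketched as follows. This is a minimal illustration only: the function names `split_by_count` and `split_by_size` are hypothetical, and the choice to spread leftover bytes over the first objects is an assumption, not something the text specifies.

```python
def split_by_count(data: bytes, m: int) -> list[bytes]:
    """Cut data into exactly m data objects of near-equal size.

    Assumed policy: the first (len(data) % m) objects get one extra byte.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    base, extra = divmod(len(data), m)
    objects, start = [], 0
    for i in range(m):
        end = start + base + (1 if i < extra else 0)
        objects.append(data[start:end])
        start = end
    return objects


def split_by_size(data: bytes, object_size: int) -> list[bytes]:
    """Cut data into data objects of a set size; the last one may be shorter."""
    if object_size < 1:
        raise ValueError("object_size must be at least 1")
    return [data[i:i + object_size] for i in range(0, len(data), object_size)]
```

With the first policy M is fixed up front (for example from the node count); with the second, M follows from the file length.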
In this embodiment, the M data objects produced by cutting may be numbered to distinguish them, or any other identifier that distinguishes the data objects may be used. The check data objects may likewise be numbered (or given identifiers); when there is only one check data object, a special number or letter such as "0" or "X" may be used. When multiple files are stored, the number (or identifier) of a data object/check data object may be combined with information that uniquely identifies a file (such as the file name) to identify the data objects/check data objects of a specific file. In addition, an ID (identifier) may be assigned to each storage node to distinguish the storage nodes.
In this embodiment, if some of the M storage nodes holding the data objects become abnormal (the storage node itself is abnormal or the data it holds is abnormal), the remaining normal data objects and the check data objects can be used to obtain the data objects stored on the abnormal storage nodes; this improves storage reliability and recovery capability.
In this embodiment, the correspondence between the data objects, the check data objects and the storage nodes holding them may also be recorded. It may be recorded while storing or after storing; alternatively, the correspondence may be determined and recorded before storing, and the data objects/check data objects then stored according to it.
In this embodiment, the correspondence may be stored in a preset data format. For example, a hash value is obtained by applying a hash transformation to a data object, and this hash value is then stored together with the node ID of the storage node; the stored relation is then the correspondence between the hash values of the data objects and check data objects and the storage node ID values. In practice, the correspondence between the numbers of the data objects/check data objects and the storage node IDs may be stored instead, or the correspondence may be stored in some other form.
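One possible shape for such a record is sketched below. The text does not say which hash function is used or how the table is laid out; SHA-256, the in-memory dictionary, and the names `record_placement` and `nodes_for_file` are all assumptions made for illustration.

```python
import hashlib


def record_placement(table: dict, file_name: str, label: str,
                     payload: bytes, node_id: str) -> str:
    """Record which storage node holds one data or check data object.

    The object's hash value (SHA-256 here, an assumed choice) is stored
    alongside its label and the ID of the storage node holding it.
    """
    digest = hashlib.sha256(payload).hexdigest()
    table.setdefault(file_name, []).append(
        {"label": label, "hash": digest, "node_id": node_id}
    )
    return digest


def nodes_for_file(table: dict, file_name: str) -> dict:
    """Return {object label: storage node ID} for one file."""
    return {entry["label"]: entry["node_id"] for entry in table.get(file_name, [])}
```

Keeping the label next to the hash lets a reader ask for "object 1 of a.txt" without knowing the hash in advance, while the hash remains available to detect abnormal data.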
In one implementation of this embodiment, the data objects and the check data objects are stored separately on different available storage nodes; the step of storing the M data objects and the K check data objects on S available storage nodes among the N storage nodes may specifically comprise:
Dividing the S available storage nodes among the N storage nodes into first storage nodes and second storage nodes;
Storing the M data objects on the first storage nodes;
Storing the K check data objects on the second storage nodes.
In this implementation, the storage nodes that hold the check data objects may be called check nodes.
In this embodiment, the predetermined segmentation policy may cut according to the number of storage nodes; in a concrete example, M + K = N. If S = N, the number of available storage nodes equals the total number of data objects plus check data objects, and they can be stored one-to-one. If S < N, they can be stored according to a predetermined storage policy.
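As a rough illustration of why this layout can use space more efficiently than keeping full backup copies, the following worked formula assumes equal-sized objects and ignores metadata overhead. The symbols U and c, and the example values N = 10 and K = 1, are introduced here for illustration and do not appear in the original text.

```latex
U_{\text{check}} = \frac{M}{M+K}, \qquad U_{\text{copies}} = \frac{1}{c}

% Example: N = 10 nodes with M + K = N and K = 1 gives M = 9
U_{\text{check}} = \frac{9}{9+1} = 0.9, \qquad U_{\text{copies}} = \frac{1}{2} = 0.5 \quad (c = 2)
```

Here U is the fraction of raw capacity that holds file data, and c is the number of full copies kept by a backup-based scheme; c = 2 corresponds to the "at most half" figure cited for the prior art.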
In this embodiment, if there are multiple check data objects, some may be stored on the second storage nodes and some on the first storage nodes; storing multiple check data objects amounts to backing up the check data object and can therefore further improve reliability. If a check data object is stored on a first storage node and that node becomes abnormal, neither the data object stored on that node nor the check data object can be read; storing the check data objects on second storage nodes therefore further improves reliability. Of course, reliability can also be improved by storing multiple check data objects on different first storage nodes. Note that even if the check data object is stored only on a first storage node, the node that becomes abnormal is not necessarily that first storage node, so overall reliability is still improved over the prior art.
In one implementation of this embodiment, the check algorithm is the XOR (exclusive OR) algorithm. In this case, when only one storage node is abnormal at a time, XORing the check data object with the other data objects yields the data object stored on the abnormal storage node. In other implementations, other check algorithms may be used, which correspondingly allow recovery when multiple storage nodes are abnormal.
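The XOR case with a single check data object (K = 1) can be sketched as follows. Zero-padding shorter objects to a common width and passing the lost object's original length to the recovery step are assumptions of this sketch; the text does not say how unequal object sizes are handled, and the function names are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_check_object(objects: list[bytes]) -> bytes:
    """XOR all data objects together (zero-padded) into one check data object."""
    width = max(len(o) for o in objects)
    padded = [o.ljust(width, b"\x00") for o in objects]
    return reduce(xor_bytes, padded)


def recover_object(surviving: list[bytes], check: bytes, original_length: int) -> bytes:
    """Rebuild the single missing data object from the survivors and the check object."""
    padded = [o.ljust(len(check), b"\x00") for o in surviving]
    return reduce(xor_bytes, padded, check)[:original_length]
```

Because XOR is its own inverse, XORing the check data object with every surviving data object cancels them out and leaves only the missing one. This works for exactly one missing object per check data object.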
In this embodiment, the same or different check algorithms may be used for different files; if different check algorithms are used, the check algorithm corresponding to each file is recorded.
In this embodiment, different files have their own check data objects and check nodes, and the check nodes for different files may be the same or different. The check data objects of each file are obtained by applying that file's check algorithm to the data objects cut from that file. When a storage node becomes abnormal and a file must be recovered, that file's check algorithm is applied to the check data objects and the file's other data objects to restore the data object stored on the abnormal storage node.
In this embodiment, different check algorithms may also be used on the same file to obtain separate check data objects; in this case, the step of applying a check algorithm to the M data objects to obtain at least one check data object corresponding to the file may specifically comprise:
Applying different check algorithms to the M data objects respectively, obtaining the check data object corresponding to each check algorithm;
Recording the correspondence between each resulting check data object and its check algorithm, where the different check data objects may be distinguished by unique identifiers; the storage node holding each check data object may also be recorded; the check data objects may be stored on the same or different storage nodes.
In this case, when a storage node becomes abnormal and a file must be recovered, any one check algorithm may be used: it is applied to its corresponding check data object and the file's other data objects to obtain the data object stored on the abnormal storage node. Alternatively, several check algorithms may be used, each applied to its corresponding check data object and the file's other data objects, and the data objects obtained from each run are then compared to verify that the recovered data object is correct. If a storage node holding a check data object also becomes abnormal, the other check data objects and check algorithms can be used to obtain the data object stored on the abnormal storage node.
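The cross-verification idea can be sketched with two check algorithms. The text does not name a second algorithm; the byte-wise sum modulo 256 used here is an assumed stand-in chosen because, like XOR, it can rebuild one missing object. Equal-length data objects and the function names are also assumptions of this sketch.

```python
from functools import reduce
from operator import xor


def xor_check(objects):
    """Check data object: byte-wise XOR over equal-length data objects."""
    return bytes(reduce(xor, column) for column in zip(*objects))


def sum_check(objects):
    """Check data object: byte-wise sum modulo 256 (assumed second algorithm)."""
    return bytes(sum(column) % 256 for column in zip(*objects))


def recover_with_xor(surviving, check):
    return bytes(reduce(xor, column) for column in zip(check, *surviving))


def recover_with_sum(surviving, check):
    return bytes((c - sum(rest)) % 256 for c, *rest in zip(check, *surviving))


def recover_and_cross_verify(surviving, checks):
    """Recover the missing object with both algorithms and compare the results."""
    by_xor = recover_with_xor(surviving, checks["xor"])
    by_sum = recover_with_sum(surviving, checks["sum"])
    if by_xor != by_sum:
        raise ValueError("recovered data objects disagree")
    return by_xor
```

If the storage node holding one of the two check data objects is itself abnormal, either `recover_with_xor` or `recover_with_sum` can still be used on its own.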
For other implementation details, see the other embodiments.
Embodiment 2: a storage control method for a distributed file system, the distributed file system comprising N storage nodes, a service station and a metadata management node; the method comprises:
Receiving request information sent by the service station, the request information requesting storage nodes for the M data objects into which a first file to be stored has been cut and for the K corresponding check data objects; the check data objects are the result of applying a check algorithm to the M data objects, and K is greater than or equal to 1;
According to the request information, obtaining the states of the N storage nodes and finding S available storage nodes among them, where S is less than or equal to N;
Allocating the M data objects and the K check data objects to the S available storage nodes according to a predetermined storage policy, producing an allocation result;
Feeding back the allocation result.
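The steps above can be sketched as a single handler on the metadata management node. The boolean node state, the round-robin storage policy and the name `handle_store_request` are assumptions chosen to keep the example small; the text leaves the storage policy open.

```python
from itertools import cycle


def handle_store_request(node_states: dict, data_labels: list, check_labels: list) -> dict:
    """Select the available storage nodes and allocate every object to one of them.

    node_states maps storage node ID -> True if the node is available.
    Returns the allocation result as {object label: storage node ID}.
    """
    # Find the S available nodes among the N known nodes.
    available = [node_id for node_id, is_up in node_states.items() if is_up]
    if not available:
        raise RuntimeError("no available storage node")
    # Assumed storage policy: plain round-robin over the available nodes.
    targets = cycle(available)
    return {label: next(targets) for label in [*data_labels, *check_labels]}
```

The returned dictionary plays the role of the allocation result that is fed back to the service station.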
In one implementation of this embodiment, the step of allocating the M data objects and the K check data objects to the S available storage nodes according to the predetermined storage policy may specifically comprise:
Dividing the S available storage nodes into first storage nodes and second storage nodes according to the predetermined storage policy;
Allocating the M data objects to the first storage nodes and the K check data objects to the second storage nodes according to the predetermined storage policy.
In a concrete example, the predetermined storage policy may first set aside K of the S available storage nodes as second storage nodes, allocate the K check data objects to these K second storage nodes one-to-one, and use the remaining available storage nodes as first storage nodes. Alternatively, it may first set aside M of the S available storage nodes as first storage nodes, allocate the M data objects to these M first storage nodes one-to-one, and use the remaining available storage nodes as second storage nodes. The first and second storage nodes may also be divided according to a predetermined ratio or according to the ratio of M to K.
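The three ways of dividing the available nodes can be sketched as one function. The mode names, the rule that each group keeps at least one node, and the use of rounding in the ratio mode are assumptions of this sketch.

```python
def partition_nodes(available: list, m: int, k: int, mode: str = "checks_first"):
    """Split available node IDs into (first, second) storage nodes.

    first holds data objects, second holds check data objects.
    """
    s = len(available)
    if s < 2 or m < 1 or k < 1:
        raise ValueError("need at least two nodes, M >= 1 and K >= 1")
    if mode == "checks_first":
        # Set aside K nodes for check data objects; the rest hold data objects.
        n_second = min(k, s - 1)
        second, first = available[:n_second], available[n_second:]
    elif mode == "data_first":
        # Set aside M nodes for data objects; the rest hold check data objects.
        n_first = min(m, s - 1)
        first, second = available[:n_first], available[n_first:]
    elif mode == "ratio":
        # Divide in proportion to M : K, keeping at least one node per group.
        n_first = max(1, min(s - 1, round(s * m / (m + k))))
        first, second = available[:n_first], available[n_first:]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return first, second
```

When S = M + K, all three modes yield groups of M and K nodes and differ only in which node IDs land in each group.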
In another concrete example, when the number of data objects/check data objects exceeds the number of first/second storage nodes, the data objects/check data objects may be distributed as evenly as possible. For example, when 20 data objects are distributed over 10 first storage nodes, each first storage node receives 2; when 15 data objects are distributed over 10 first storage nodes, 5 first storage nodes receive 2 each and the other 5 receive 1 each. Distribution may also follow the remaining space of each storage node, assigning more data objects/check data objects to storage nodes with more free space. In practice, distribution according to other policies is not excluded.
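Both distribution rules can be sketched in a few lines. The function names are hypothetical, and the greedy "largest remaining space first" rule is one assumed way of favoring nodes with more free space.

```python
def distribute_evenly(labels: list, nodes: list) -> dict:
    """Spread objects over nodes round-robin so counts differ by at most one."""
    allocation = {node: [] for node in nodes}
    for i, label in enumerate(labels):
        allocation[nodes[i % len(nodes)]].append(label)
    return allocation


def distribute_by_free_space(sizes: dict, free_space: dict) -> dict:
    """Greedily place each object on the node with the most remaining space.

    sizes maps object label -> size; free_space maps node ID -> free capacity.
    """
    remaining = dict(free_space)
    allocation = {node: [] for node in remaining}
    for label, size in sizes.items():
        node = max(remaining, key=remaining.get)
        if remaining[node] < size:
            raise RuntimeError(f"no storage node has room for {label}")
        allocation[node].append(label)
        remaining[node] -= size
    return allocation
```

Distributing 15 labels over 10 nodes with `distribute_evenly` gives 2 objects to each of the first 5 nodes and 1 to each of the rest, matching the example in the text.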
For other implementation details, see the other embodiments.
Embodiment 3: a distributed file system, as shown in Fig. 1, comprising:
N storage nodes, used to store data; together they can be regarded as a storage node cluster;
A metadata management node;
A service station, used to cut a file to be stored into M data objects according to a predetermined segmentation policy, to apply a check algorithm to the M data objects to obtain K check data objects corresponding to the file to be stored, where K is greater than or equal to 1, and to request storage nodes for the file to be stored from the metadata management node; when there are multiple service stations, they can be regarded as a service node cluster;
The metadata management node is used to obtain the states of the storage nodes in response to the request of the service station, to return S available storage nodes to the service station, and to control the service station, according to a predetermined storage policy, to store the M data objects and the K check data objects corresponding to the file to be stored on the S available storage nodes among the N storage nodes, where S is less than or equal to N.
In this embodiment, the M data objects produced by cutting may be numbered to distinguish them, or any other identifier that distinguishes the data objects may be used. The check data objects may likewise be numbered (or given identifiers); when there is only one check data object, a special number or letter such as "0" or "X" may be used. When multiple files are stored, the number (or identifier) of a data object/check data object may be combined with information that uniquely identifies a file (such as the file name) to identify the data objects/check data objects of a specific file. In addition, an ID (identifier) may be assigned to each storage node to distinguish the storage nodes.
In this embodiment, the service station is also used, when reading the data objects of a file from the storage nodes, to apply the check algorithm to the data objects that were read and the check data objects if a storage node holding a data object is abnormal (the storage node itself is abnormal or the data it holds is abnormal), thereby obtaining the data object stored on the abnormal storage node.
In a SAN storage architecture, parallel data access is usually also achieved by attaching a disk array to multiple storage nodes, but to ensure data reliability an expensive controller must be added to the disk array to implement RAID. To improve the array's performance, multiple controllers are generally needed, which makes the array's structure more complex and its price higher. A SAN also requires dedicated optical fibre switches or high-speed Ethernet switches to build the storage network, which are expensive. By adopting a distributed file system and solving reliability through data checking, the dedicated disk-array hardware can be dropped: the local hard disks of the storage nodes suffice to guarantee data reliability while also providing high read/write performance.
In one implementation of this embodiment, the data objects and the check data objects are stored separately on different available storage nodes; the predetermined segmentation policy used by the service station cuts according to the number N of storage nodes in the distributed file system;
The metadata management node controlling the service station, according to the predetermined storage policy, to store the M data objects and the K check data objects on the S available storage nodes among the N storage nodes may mean:
The metadata management node divides the S available storage nodes among the N storage nodes into first storage nodes and second storage nodes according to the predetermined storage policy; allocates the M data objects to the first storage nodes and the K check data objects to the second storage nodes according to the predetermined storage policy, producing an allocation result; and controls the service station, according to the allocation result, to store the M data objects on the first storage nodes and the K check data objects on the second storage nodes.
In this implementation, the storage nodes that hold the check data objects may be called check nodes.
In this embodiment, the predetermined segmentation policy may cut according to the number of storage nodes; in a concrete example, M + K = N. If S = N, the number of available storage nodes equals the total number of data objects plus check data objects, and they can be stored one-to-one. If S < N, they can be stored according to a predetermined storage policy.
In this implementation, the metadata management node may return only an allocation result to the service station, for example how many of the S available storage nodes are first storage nodes and how many are second storage nodes, and how many data objects/check data objects are to be stored on each first/second storage node. The service station stores the data objects/check data objects on the available storage nodes according to this allocation result and then notifies the metadata management node to record the correspondence between the specific data objects/check data objects and the storage nodes holding them. Alternatively, the metadata management node may directly assign each data object/check data object to a specific storage node and record the assignment, then notify the service station to store the data objects/check data objects according to the allocation result.
In this way, after receiving a request from a service station to read the file, the metadata management node can feed back the recorded storage nodes that hold each data object and the check data objects.
In this implementation, the metadata management node may store the correspondence in a preset data format. For example, a hash value is obtained by applying a hash transformation to a data object, and this hash value is then stored together with the node ID of the storage node; the stored relation is then the correspondence between the hash values of the data objects and check data objects and the storage node ID values. In practice, the correspondence between the numbers of the data objects/check data objects and the storage node IDs may be stored instead, or the correspondence may be stored in some other form.
In a kind of execution mode of present embodiment, the checking algorithm that said service station adopts is XOR (XOR) algorithm; At this moment, a storage node of preserving data object occurs under the unusual situation at one time, and said service station can carry out XOR to checking data object and other data object, obtains the data object of preserving on this unusual storage node.In other embodiments, said service station also can adopt other checking algorithm, corresponding can under the unusual situation of a plurality of memory nodes, the recovery.
In the present embodiment, the service station may adopt the same or different checking algorithms for different files; if different checking algorithms are adopted, the metadata management node is also used to record the checking algorithm corresponding to each file.
In the present embodiment, different files correspond to their own checking data objects and check nodes, and the check nodes of different files may be the same or different. The checking data object of each file is obtained by the service station applying the file's checking algorithm to the data objects into which the file was cut. When a storage node becomes abnormal and a file needs to be recovered, the service station applies the file's checking algorithm to the checking data object and the other data objects of the file, and restores the data object held on the abnormal storage node.
In the present embodiment, the service station may also apply different checking algorithms to the same file to obtain separate checking data objects. In this case, the service station applying a checking algorithm to the M data objects to obtain at least one checking data object corresponding to the file may specifically mean:
The service station applies different checking algorithms to the M data objects respectively, obtaining the checking data object corresponding to each checking algorithm;
The service station or the metadata management node records the correspondence between each resulting checking data object and its checking algorithm; different checking data objects can be distinguished by unique identifiers;
The metadata management node may also record the storage node that holds each checking data object; the service station may save the checking data objects on the same storage node or on different storage nodes.
In this case, the service station is also used, when a storage node holding a data object is abnormal while a file is being read, to select one or more of the adopted checking algorithms and, for each selected algorithm, to apply it to the corresponding checking data object and the other data objects of the file, obtaining the data object held on the abnormal storage node. When several checking algorithms are selected, the data objects obtained from each computation can also be compared to verify the correctness of the recovered data object.
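The cross-check between several checking algorithms can be sketched as follows. The text names only XOR; the second algorithm here, a per-byte sum modulo 256, is an assumption chosen because it can also rebuild a single lost object. All function names and the ALGORITHMS table are hypothetical, and equal-length objects are assumed.

```python
def xor_check(objects):
    """Checking data object under XOR: per-byte XOR of all objects."""
    out = bytearray(len(objects[0]))
    for obj in objects:
        for i, b in enumerate(obj):
            out[i] ^= b
    return bytes(out)


def xor_recover(surviving, check):
    """XORing the survivors with the check object yields the lost object."""
    return xor_check(list(surviving) + [check])


def sum_check(objects):
    """Checking data object under an assumed second algorithm: per-byte sum mod 256."""
    out = bytearray(len(objects[0]))
    for obj in objects:
        for i, b in enumerate(obj):
            out[i] = (out[i] + b) % 256
    return bytes(out)


def sum_recover(surviving, check):
    """Subtract the survivors' per-byte sum from the check object, mod 256."""
    partial = sum_check(surviving)
    return bytes((c - p) % 256 for c, p in zip(check, partial))


# algorithm name -> (compute checking data object, recover lost object)
ALGORITHMS = {"xor": (xor_check, xor_recover), "sum": (sum_check, sum_recover)}


def recover_and_cross_check(surviving, checks):
    """Recover with every selected algorithm and require identical results.

    checks maps an algorithm name to the checking data object it produced.
    """
    results = {name: ALGORITHMS[name][1](surviving, chk) for name, chk in checks.items()}
    candidates = set(results.values())
    if len(candidates) != 1:
        raise ValueError("recovered data objects disagree")
    return candidates.pop()
```

If one of the checking data objects has itself been corrupted, the two recoveries differ and the mismatch is reported instead of silently returning bad data.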
In a specific example, the process by which the distributed file system of the present embodiment stores a file is shown in Fig. 2 and comprises:
(1) The service station applies to the metadata management node for a write operation on a file; the metadata management node checks the file system state and returns a message allowing the service station to write the file.
(2) The service station slices the file, generating a plurality of data objects.
(3) The service station generates the checking data object from the plurality of data objects.
(4) The service station applies to the metadata management node for the storage nodes needed to write the file (including the data objects and the checking data object). The metadata management node stores the correspondence between the data objects, the checking data object and the storage nodes in a suitable data format, and returns the available storage nodes and this correspondence to the service station;
(5) The service station submits the data objects and the checking data object to the storage nodes according to the correspondence, and the storage nodes store them. After the file has been written successfully the metadata management node is informed, and it keeps the correspondence for use when the file is read.
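Steps (1) to (5) can be walked through with a small in-memory simulation. This is a sketch under stated assumptions, not the patented implementation: the classes StorageNode and MetadataNode, the function write_file, the object naming scheme, the round-robin placement, zero padding of the last slice, and the use of a single XOR checking data object are all illustrative choices.

```python
class StorageNode:
    """Hypothetical storage node that keeps objects in a dictionary."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.available = True
        self.objects = {}

    def put(self, name, data):
        self.objects[name] = data


class MetadataNode:
    """Hypothetical metadata management node."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.pending = {}     # allocations not yet confirmed
        self.placements = {}  # file name -> {object name: node ID}

    def allow_write(self, file_name):
        # Step (1): a real node would inspect the file system state here.
        return file_name not in self.placements

    def allocate(self, file_name, object_names):
        # Step (4): map each object to an available node (round-robin, assumed).
        available = [n for n in self.nodes if n.available]
        if not available:
            raise RuntimeError("no available storage node")
        plan = {name: available[i % len(available)].node_id
                for i, name in enumerate(object_names)}
        self.pending[file_name] = plan
        return plan

    def commit(self, file_name):
        # Step (5): keep the correspondence for later reads.
        self.placements[file_name] = self.pending.pop(file_name)


def write_file(meta, file_name, data, m):
    """Service station side of the write flow; assumes non-empty data."""
    if not meta.allow_write(file_name):                      # step (1)
        raise RuntimeError("write refused")
    size = -(-len(data) // m)                                # ceiling division
    objects = [data[i * size:(i + 1) * size].ljust(size, b"\0")
               for i in range(m)]                            # step (2)
    parity = bytearray(size)                                 # step (3): XOR parity
    for obj in objects:
        for i, b in enumerate(obj):
            parity[i] ^= b
    named = {f"{file_name}.d{i}": o for i, o in enumerate(objects)}
    named[f"{file_name}.p0"] = bytes(parity)
    plan = meta.allocate(file_name, list(named))             # step (4)
    by_id = {n.node_id: n for n in meta.nodes}
    for name, node_id in plan.items():                       # step (5)
        by_id[node_id].put(name, named[name])
    meta.commit(file_name)
    return plan
```

The sketch pads the last slice with zero bytes and does not record the original file length, so a real system would need to store that length to strip the padding on read.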
For other implementation details, refer to the other embodiments.
Embodiment four: a metadata management node of a distributed file system, the distributed file system comprising N storage nodes; the metadata management node comprises:
A selection module, used to obtain the states of the N storage nodes and find S available storage nodes among them after receiving a request from the service station for storage nodes for the M data objects into which a first file to be saved has been cut and the K corresponding checking data objects, wherein S is less than or equal to N; the checking data objects are the result of applying a checking algorithm to the M data objects, and K is greater than or equal to 1;
A distribution module, used to distribute the M data objects and the K checking data objects to the S available storage nodes according to a predetermined storage strategy, producing an allocation result;
A feedback module, used to feed back the allocation result.
In the present embodiment, the distribution module distributing the M data objects and the K checking data objects to the S available storage nodes according to the predetermined storage strategy may specifically mean:
The distribution module divides the S available storage nodes into first storage nodes and second storage nodes according to the predetermined storage strategy; the M data objects are distributed to the first storage nodes, and the K checking data objects are distributed to the second storage nodes.
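One way the distribution module could produce an allocation result is sketched below. The text leaves the predetermined storage strategy open, so the rule used here (reserve the last min(K, S-1) available nodes as second storage nodes and place objects round-robin within each group) is purely an assumption, as are the function name and the shape of the returned dictionary.

```python
def allocate(available_nodes, m, k):
    """Assign M data objects and K checking data objects to S available nodes.

    Returns a dictionary describing the two node groups and, for each object
    index, the node it is assigned to.
    """
    s = len(available_nodes)
    if m < 1 or k < 1:
        raise ValueError("need at least one data object and one checking data object")
    if s < 2:
        raise ValueError("need at least two available storage nodes to form two groups")
    # Assumed strategy: the last nodes hold checking data objects.
    n_second = min(k, s - 1)
    first = available_nodes[: s - n_second]
    second = available_nodes[s - n_second:]
    return {
        "first_nodes": first,
        "second_nodes": second,
        "data": {i: first[i % len(first)] for i in range(m)},
        "check": {j: second[j % len(second)] for j in range(k)},
    }
```

Keeping checking data objects on nodes that hold no data objects of the same file means that the failure of a single node never removes both a data object and the checking data object needed to rebuild it.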
For other implementation details, refer to the other embodiments.
Embodiment five: a service station of a distributed file system, the distributed file system comprising N storage nodes; the service station comprises:
A segmentation module, used to cut a file to be saved into M data objects according to a predetermined segmentation strategy;
A verification module, used to apply a checking algorithm to the M data objects to obtain K checking data objects corresponding to the file, K being greater than or equal to 1;
A request module, used to request from the metadata management node the storage nodes for the file to be saved;
A saving module, used to save, after S available storage nodes have been obtained, the M data objects and the K checking data objects corresponding to the file to be saved on S available storage nodes among the N storage nodes, wherein S is less than or equal to N.
In the present embodiment, the service station may further comprise:
A reading module, used to read the data objects and checking data objects of a file from the storage nodes and, if a storage node holding a data object is abnormal during the read, to send the data objects that were read and the checking data object to the verification module;
The verification module is also used to apply the checking algorithm to the data objects that were read and the checking data object, obtaining the data object stored on the abnormal storage node.
In one implementation of the present embodiment, the saving module saving the M data objects and the K checking data objects corresponding to the file to be saved on S available storage nodes among the N storage nodes may specifically mean:
The saving module divides the S available storage nodes among the N storage nodes into first storage nodes and second storage nodes; the M data objects are saved on the first storage nodes, and the K checking data objects are saved on the second storage nodes.
In this implementation, the predetermined segmentation strategy cuts the file according to the number N of storage nodes in the distributed file system; M may be less than N and greater than 1.
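A segmentation strategy driven by N can be sketched as follows. The text only bounds M (greater than 1, possibly less than N); choosing M = N - 1, so that one node remains for a checking data object, and padding the last slice with zero bytes are assumptions, as are the function names.

```python
def choose_m(n: int) -> int:
    """Pick the number of data objects from the number of storage nodes.

    Assumed rule: M = N - 1, which satisfies 1 < M < N when N >= 3.
    """
    if n < 3:
        raise ValueError("need N >= 3 so that 1 < M < N")
    return n - 1


def cut(data: bytes, m: int):
    """Cut data into exactly M equal-length data objects, zero-padding the tail."""
    size = max(1, -(-len(data) // m))  # ceiling division, at least one byte
    return [data[i * size:(i + 1) * size].ljust(size, b"\0") for i in range(m)]
```

Equal-length slices keep a byte-wise checking algorithm such as XOR well defined; the original file length would have to be recorded separately so that the padding can be removed on read.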
In the present embodiment, the verification module applying a checking algorithm to the M data objects to obtain at least one checking data object corresponding to the file may specifically mean:
The verification module applies different checking algorithms to the M data objects respectively, obtaining the checking data object corresponding to each checking algorithm, and records the correspondence between each resulting checking data object and its checking algorithm;
The saving module records the storage node that holds each checking data object.
In the present embodiment, if a storage node holding a data object is abnormal when a file is read, the verification module may select one or more of the adopted checking algorithms and, for each selected algorithm, apply it to the corresponding checking data object and the other data objects of the file, obtaining the data object held on the abnormal storage node. When several checking algorithms are selected, comparing the data objects obtained from each computation can further verify the correctness of the recovered data object.
In the present embodiment, the checking algorithm adopted by the verification module may be, but is not limited to, the XOR algorithm.
For other implementation details, refer to the other embodiments.
Certainly, the present invention may also have various other embodiments. Without departing from the spirit and essence of the present invention, those of ordinary skill in the art can make various corresponding changes and modifications according to the present invention, but all such changes and modifications shall fall within the protection scope of the claims of the present invention.

Claims (10)

1. A storage method for a distributed file system, the distributed file system comprising N storage nodes; the method comprising:
Cutting a file to be saved into M data objects according to a predetermined segmentation strategy;
Applying a checking algorithm to the M data objects to obtain K checking data objects corresponding to the file to be saved, K being greater than or equal to 1;
Saving the M data objects and the K checking data objects corresponding to the file to be saved on S available storage nodes among the N storage nodes, wherein S is less than or equal to N.
2. The method of claim 1, characterized in that the step of saving the M data objects and the K checking data objects on S available storage nodes among the N storage nodes comprises:
Dividing the S available storage nodes among the N storage nodes into first storage nodes and second storage nodes;
Saving the M data objects on the first storage nodes;
Saving the K checking data objects on the second storage nodes.
3. A storage control method for a distributed file system, the distributed file system comprising N storage nodes, a service station and a metadata management node; the method comprising:
Receiving request information sent by the service station, the request information requesting storage nodes for the M data objects into which a first file to be saved has been cut and the K corresponding checking data objects; the checking data objects being the result of applying a checking algorithm to the M data objects, and K being greater than or equal to 1;
Obtaining, according to the request information, the states of the N storage nodes, and finding S available storage nodes among the N storage nodes, wherein S is less than or equal to N;
Distributing the M data objects and the K checking data objects to the S available storage nodes according to a predetermined storage strategy, producing an allocation result;
Feeding back the allocation result.
4. The method of claim 3, characterized in that the step of distributing the M data objects and the K checking data objects to the S available storage nodes according to the predetermined storage strategy comprises:
Dividing the S available storage nodes into first storage nodes and second storage nodes according to the predetermined storage strategy;
Distributing, according to the predetermined storage strategy, the M data objects to the first storage nodes and the K checking data objects to the second storage nodes.
5. A distributed file system, characterized by comprising:
N storage nodes, used to store data;
A metadata management node;
A service station, used to cut a file to be saved into M data objects according to a predetermined segmentation strategy; to apply a checking algorithm to the M data objects to obtain K checking data objects corresponding to the file to be saved, K being greater than or equal to 1; and to request from the metadata management node the storage nodes for the file to be saved;
The metadata management node being used to obtain the states of the storage nodes according to the request of the service station, to return S available storage nodes to the service station, and to control the service station, according to a predetermined storage strategy, to save the M data objects and the K checking data objects corresponding to the file to be saved on S available storage nodes among the N storage nodes; wherein S is less than or equal to N.
6. The system of claim 5, characterized in that:
The service station is also used, when reading a file from the storage nodes, if a storage node holding a data object is abnormal, to apply the checking algorithm to the data objects that were read and the checking data object, obtaining the data object stored on the abnormal storage node.
7. The system of claim 5, characterized in that the metadata management node controlling the service station, according to the predetermined storage strategy, to save the M data objects and the K checking data objects on S available storage nodes among the N storage nodes means:
The metadata management node divides the S available storage nodes among the N storage nodes into first storage nodes and second storage nodes according to the predetermined storage strategy; distributes, according to the predetermined storage strategy, the M data objects to the first storage nodes and the K checking data objects to the second storage nodes, producing an allocation result; and controls the service station, according to the allocation result, to save the M data objects on the first storage nodes and the K checking data objects on the second storage nodes.
8. A metadata management node of a distributed file system, the distributed file system further comprising N storage nodes and a service station; characterized in that the metadata management node comprises:
A selection module, used to obtain the states of the N storage nodes and find S available storage nodes among them after receiving a request from the service station for storage nodes for the M data objects into which a first file to be saved has been cut and the K corresponding checking data objects, wherein S is less than or equal to N; the checking data objects are the result of applying a checking algorithm to the M data objects, and K is greater than or equal to 1;
A distribution module, used to distribute the M data objects and the K checking data objects to the S available storage nodes according to a predetermined storage strategy, producing an allocation result;
A feedback module, used to feed back the allocation result.
9. The metadata management node of claim 8, characterized in that the distribution module distributing the M data objects and the K checking data objects to the S available storage nodes according to the predetermined storage strategy means:
The distribution module divides the S available storage nodes into first storage nodes and second storage nodes according to the predetermined storage strategy; the M data objects are distributed to the first storage nodes, and the K checking data objects are distributed to the second storage nodes.
10. A service station of a distributed file system, the distributed file system comprising N storage nodes; characterized in that the service station comprises:
A segmentation module, used to cut a file to be saved into M data objects according to a predetermined segmentation strategy;
A verification module, used to apply a checking algorithm to the M data objects to obtain K checking data objects corresponding to the file, K being greater than or equal to 1;
A request module, used to request from the metadata management node the storage nodes for the file to be saved;
A saving module, used to save, after S available storage nodes have been obtained, the M data objects and the K checking data objects corresponding to the file to be saved on S available storage nodes among the N storage nodes, wherein S is less than or equal to N.
CN201010272338.4A 2010-09-02 2010-09-02 Distributed file system and node, storage method and storage controlling method Active CN102387179B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010272338.4A CN102387179B (en) 2010-09-02 2010-09-02 Distributed file system and node, storage method and storage controlling method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010272338.4A CN102387179B (en) 2010-09-02 2010-09-02 Distributed file system and node, storage method and storage controlling method

Publications (2)

Publication Number Publication Date
CN102387179A true CN102387179A (en) 2012-03-21
CN102387179B CN102387179B (en) 2016-08-10

Family

ID=45826145

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010272338.4A Active CN102387179B (en) 2010-09-02 2010-09-02 Distributed file system and node, storage method and storage controlling method

Country Status (1)

Country Link
CN (1) CN102387179B (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103067525A (en) * 2013-01-18 2013-04-24 广东工业大学 Cloud storage data backup method based on characteristic codes
CN103379139A (en) * 2012-04-17 2013-10-30 百度在线网络技术(北京)有限公司 A verification method and a verification system for distributed cache content, and apparatuses
CN103581319A (en) * 2013-11-04 2014-02-12 汉柏科技有限公司 Method for device management of cloud computing based on meshing
CN104883381A (en) * 2014-05-27 2015-09-02 陈杰 Data access method and system for distributed storage
WO2017000094A1 (en) * 2015-06-27 2017-01-05 华为技术有限公司 Data storage method, device and system
CN107133334A (en) * 2017-05-15 2017-09-05 成都优孚达信息技术有限公司 Method of data synchronization based on high bandwidth storage system
CN109101531A (en) * 2018-06-22 2018-12-28 联想(北京)有限公司 Document handling method, apparatus and system
CN109407977A (en) * 2018-09-25 2019-03-01 佛山科学技术学院 A kind of big data distributed storage management method and system
CN110569213A (en) * 2018-05-18 2019-12-13 北京果仁宝软件技术有限责任公司 File access method, device and equipment
CN110837660A (en) * 2019-11-05 2020-02-25 广东紫晶信息存储技术股份有限公司 Data storage method and system and data verification method and system
CN111352577A (en) * 2018-12-24 2020-06-30 杭州海康威视系统技术有限公司 Object storage method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1472963A (en) * 2002-07-30 2004-02-04 深圳市中兴通讯股份有限公司 Distributive video interactive system and its data recording and accessing method
CN101316273A (en) * 2008-05-12 2008-12-03 华中科技大学 Distributed safety memory system
CN101504670A (en) * 2009-03-04 2009-08-12 成都市华为赛门铁克科技有限公司 Data operation method, system, client terminal and data server

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103379139B (en) * 2012-04-17 2017-07-25 百度在线网络技术(北京)有限公司 Method of calibration, system and the device of distributed caching content
CN103379139A (en) * 2012-04-17 2013-10-30 百度在线网络技术(北京)有限公司 A verification method and a verification system for distributed cache content, and apparatuses
CN103067525A (en) * 2013-01-18 2013-04-24 广东工业大学 Cloud storage data backup method based on characteristic codes
CN103067525B (en) * 2013-01-18 2015-11-25 广东工业大学 A kind of cloud storing data backup method of feature based code
CN103581319A (en) * 2013-11-04 2014-02-12 汉柏科技有限公司 Method for device management of cloud computing based on meshing
CN104883381A (en) * 2014-05-27 2015-09-02 陈杰 Data access method and system for distributed storage
CN104883381B (en) * 2014-05-27 2018-09-04 陈杰 The data access method and system of distributed storage
WO2017000094A1 (en) * 2015-06-27 2017-01-05 华为技术有限公司 Data storage method, device and system
CN107113323A (en) * 2015-06-27 2017-08-29 华为技术有限公司 A kind of date storage method, device and system
CN107113323B (en) * 2015-06-27 2020-02-21 华为技术有限公司 Data storage method, device and system
CN107133334A (en) * 2017-05-15 2017-09-05 成都优孚达信息技术有限公司 Method of data synchronization based on high bandwidth storage system
CN107133334B (en) * 2017-05-15 2020-01-14 成都优孚达信息技术有限公司 Data synchronization method based on high-bandwidth storage system
CN110569213A (en) * 2018-05-18 2019-12-13 北京果仁宝软件技术有限责任公司 File access method, device and equipment
CN109101531A (en) * 2018-06-22 2018-12-28 联想(北京)有限公司 Document handling method, apparatus and system
CN109101531B (en) * 2018-06-22 2022-05-31 联想(北京)有限公司 File processing method, device and system
CN109407977A (en) * 2018-09-25 2019-03-01 佛山科学技术学院 A kind of big data distributed storage management method and system
CN109407977B (en) * 2018-09-25 2021-08-31 佛山科学技术学院 Big data distributed storage management method and system
CN111352577A (en) * 2018-12-24 2020-06-30 杭州海康威视系统技术有限公司 Object storage method and device
CN111352577B (en) * 2018-12-24 2023-03-14 杭州海康威视系统技术有限公司 Object storage method and device
CN110837660A (en) * 2019-11-05 2020-02-25 广东紫晶信息存储技术股份有限公司 Data storage method and system and data verification method and system

Also Published As

Publication number Publication date
CN102387179B (en) 2016-08-10

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant