Summary of the invention
In view of this, the purpose of the embodiments of the present invention is to provide methods, devices and servers for writing, modifying and recovering data, so as to guarantee the consistency of replicas of the same object stored on different storage nodes.
To achieve the above purpose, the embodiments of the present invention provide the following technical solutions:
A data writing method, used in a distributed file system comprising at least two object servers, the method comprising:
When a write operation request sent by a client is received, sending layout information to the client, the layout information including information about the object servers to which the data to be written in the write operation request will be written;
Instructing the client, according to the layout information, to write the data to be written to a first object server recorded in the layout information, and, after the client completes the write operation, instructing the first object server to write the data to be written to a second object server recorded in the layout information;
Instructing the client to send a write result confirmation request to the first object server and the second object server, and receiving the write results returned by the first object server and the second object server.
A data writing server, comprising:
A sending unit, configured to return layout information to a client when a write operation request sent by the client is received, the layout information including information about the object servers to which the data to be written in the write operation request will be written;
A first indicating unit, configured to instruct the client, according to the layout information, to write the data to be written to a first object server recorded in the layout information, and, after the client completes the write operation, to instruct the first object server to write the data to be written to a second object server recorded in the layout information;
A second indicating unit, configured to instruct the client to send a write result confirmation request to the first object server and the second object server, and to receive the write results returned by the first object server and the second object server.
A data modification method, used in a distributed file system comprising at least two object servers, the method comprising:
Detecting whether the at least two object servers holding the data to be modified are in a normal state;
When the at least two object servers are in a normal state, locking the data to be modified on the at least two object servers, and ensuring that the locking succeeds;
Modifying the data to be modified on the at least two object servers;
Releasing the locked state of the data to be modified on the at least two object servers.
A data modification device, used in a distributed file system comprising at least two object servers, the device comprising:
A detecting unit, configured to detect whether the at least two object servers holding the data to be modified are in a normal state;
A locking unit, configured to lock the data to be modified on the at least two object servers when the object servers are in a normal state, and to ensure that the locking succeeds;
A modifying unit, configured to modify the data to be modified on the at least two object servers;
An unlocking unit, configured to release the locked state of the data to be modified on the at least two object servers.
A data recovery method, used in a distributed file system comprising at least two object servers, the method comprising:
Detecting the running status of the at least two object servers by means of a heartbeat check;
When a faulty object server is found among the at least two object servers, determining the data to be recovered according to layout information;
Sending a recovery message to a target object server so that the target object server recovers the data to be recovered, the recovery message recording the identifier of the target object server and the data to be recovered;
When the target object server has successfully recovered the data to be recovered according to the recovery message, updating the layout information according to the identifier of the target object server and the recovered data.
A data recovery server, used in a distributed file system comprising at least two object servers, the server comprising:
A detecting unit, configured to detect the running status of the at least two object servers by means of a heartbeat check;
A determining unit, configured to determine the data to be recovered according to layout information when a faulty object server exists among the at least two object servers;
A sending unit, configured to send a recovery message to a target object server so that the target object server recovers the data to be recovered, the recovery message recording the identifier of the target object server and the data to be recovered;
An updating unit, configured to update the layout information according to the identifier of the target object server and the recovered data when the target object server has successfully recovered the data to be recovered according to the recovery message.
As can be seen, the solutions provided by the embodiments of the present invention address the three basic operations of writing, modifying and recovering data on object servers. Through a series of methods they guarantee consistency when multiple replicas of the same object data are stored on different object servers at the same time, greatly reduce the possibility of inconsistency between replicas, prevent the single-replica situation from arising, and thereby improve the reliability of the distributed file system.
Detailed description of the embodiments
The embodiments of the present invention disclose methods, devices and systems for writing, modifying and recovering data. To make the purpose, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments.
In a distributed file system, data objects need to be redundantly backed up in order to guarantee the reliability of the data. A typical approach is to make multiple replicas of an object and store these replicas on different object servers. This approach can guarantee the reliability of the data, but it also raises a problem, namely data consistency between the replicas. The redundant replicas are valuable only if the data in each replica remains consistent. In practice, however, for many reasons such as faults that occur while replica data is being written, the data of multiple replicas in a distributed system is at risk of becoming inconsistent. The methods of the embodiments of the present invention are proposed in order to achieve consistency of data across the different object servers in a distributed system.
Through study of the prior art, the inventors found that inconsistency of the same data on different object servers in a distributed system arises mainly during processes such as data writing, data modification and data recovery. The methods provided by the embodiments of the present invention are described in detail below for each of these processes.
Referring to Fig. 1, a data writing method provided by an embodiment of the present invention is used in a distributed file system comprising at least two object servers, and the method comprises:
S101, when a write operation request sent by a client is received, return layout information to the client.
The write process in the method provided by the embodiments of the present invention is carried out entirely according to the layout information. In the embodiments of the present invention, the layout information is generated by the metadata server according to the client's write operation request. That is, the write operations provided by the embodiments of the present invention are all controlled by the metadata server by means of the layout information.
In the method provided by the embodiments of the present invention, the layout information plays an important role. It is equivalent to a data distribution map of the distributed file system: it records how each piece of data is distributed in the distributed file system, and therefore also includes information about the object servers to which the data to be written will be written.
For example, suppose a distributed file system has three object servers, numbered 1 to 3. For a data object Data1 to be written, the layout information may record: Data1~object server 1, object server 3. This means that the data object Data1 needs to be written to object server 1 and object server 3.
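By way of illustration only, the Data1 example can be pictured as a simple lookup table. The following is a minimal sketch, assuming the layout information is held as an in-memory mapping from object name to a list of object server numbers; the names `layout` and `servers_for` are hypothetical and do not come from the embodiment.

```python
# Hypothetical in-memory form of the layout information:
# object name -> numbers of the object servers that must hold a replica.
layout = {"Data1": [1, 3]}


def servers_for(layout, object_name):
    """Return the object servers to which object_name must be written."""
    return list(layout[object_name])


# The client writes to `first`; `first` then forwards the data to `second`.
first, second = servers_for(layout, "Data1")
print(first, second)
```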
S102, instruct the client, according to the layout information, to write the data to be written to the first object server recorded in the layout information; after the client completes the write operation, instruct the first object server to write the data to be written to the second object server recorded in the layout information.
S103, instruct the client to send a write result confirmation request to the first object server and the second object server, and receive the write results returned by the first object server and the second object server.
The write process of the embodiments of the present invention is sequential. The data to be written is written to at least two object servers in the current distributed file system: after the client finishes writing to the first object server, the first object server writes the data to be written to the second object server.
Here, the first object server denotes the object server to which the client writes the data first, and the second object server denotes the server to which the data is written by the first object server.
As can be seen, the method provided by the embodiments of the present invention is applied to a distributed file system comprising at least two object servers. By writing data to the first object server and the second object server in sequence, and by confirming the write afterwards, the method ensures that the write operation succeeds on both the first object server and the second object server. This avoids the appearance of single-replica data and guarantees, during the write operation, the consistency of the replicas of the same data held by the first object server and the second object server.
In practical applications, after the first object server finishes writing to the second object server, it may also write to other object servers, for example a third object server, which is equivalent to writing in parallel.
Taking the write process of Data1 above as an example, with object server 1 as the first object server and object servers 2 and 3 as the second and third object servers, the client first writes Data1 to object server 1; object server 1 then writes Data1 to object server 2 and object server 3 respectively.
In other embodiments, after the first object server finishes writing to the second object server, the second object server may continue the data write on other object servers, for example a third object server.
Taking the write process of Data1 above as an example, the client writes Data1 to object server 1; object server 1 writes Data1 to object server 2; and object server 2 then writes Data1 to object server 3, which realizes a serial write process.
It should be noted that, in the method provided by the embodiments of the present invention, whether the data is written to N object servers (N being greater than or equal to 3) in parallel or serially, the client, as instructed, sends a write result confirmation request to the N object servers and receives the write results returned by the N object servers.
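The difference between the parallel and serial modes lies in which server issues each write. The following single-process sketch is offered only as an illustration; the class `ObjectServer`, the `source` bookkeeping and the function names are assumptions introduced here, and network transport is not modelled.

```python
class ObjectServer:
    """Toy object server that keeps replicas in memory (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.store = {}   # object name -> replica data
        self.source = {}  # object name -> who issued the write (for illustration)

    def write(self, object_name, data, source):
        self.store[object_name] = data
        self.source[object_name] = source


def write_parallel(servers, object_name, data):
    """Client writes to the first server; the first server writes to all others."""
    first, *rest = servers
    first.write(object_name, data, source="client")
    for server in rest:
        server.write(object_name, data, source=first.name)


def write_serial(servers, object_name, data):
    """Client writes to server 1, server 1 to server 2, server 2 to server 3, ..."""
    source = "client"
    for server in servers:
        server.write(object_name, data, source=source)
        source = server.name


def confirm_all(servers, object_name, data):
    """Stand-in for the write result confirmation sent to all N servers."""
    return all(server.store.get(object_name) == data for server in servers)


servers = [ObjectServer(f"os{i}") for i in (1, 2, 3)]
write_serial(servers, "Data1", b"payload")
print([s.source["Data1"] for s in servers], confirm_all(servers, "Data1", b"payload"))
```

In both modes the confirmation covers every one of the N object servers, which is what allows a missing replica to be detected.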
In practical applications, the first object server or the second object server may fail to write. In the method provided by the embodiments of the present invention, when the first object server or the second object server returns a data write failure, the metadata server instructs the first object server and the second object server to re-execute the data write operation.
Optionally, when there are N object servers, if the write fails on any of the N object servers, the write operation is re-executed on the N object servers.
As can be seen, the method provided by the embodiments of the present invention is applied to a distributed file system comprising at least two object servers. By writing data to at least two object servers in sequence and confirming the write afterwards, the method ensures that the write operation succeeds on all of the at least two object servers. If the write does not succeed, the data is rewritten on all of the at least two servers, which avoids the appearance of isolated replicas.
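The rewrite-on-failure behaviour can be sketched as follows. This is a minimal illustration under stated assumptions: `FlakyServer` simulates a server whose first writes fail, and the bound `max_attempts` is an assumption added here so that the example terminates; the embodiment does not specify a limit for write retries.

```python
class FlakyServer:
    """Toy object server whose first `failures` writes fail (illustrative only)."""

    def __init__(self, failures=0):
        self.failures = failures
        self.store = {}

    def write(self, object_name, data):
        if self.failures > 0:
            self.failures -= 1
            return False
        self.store[object_name] = data
        return True


def write_until_consistent(servers, object_name, data, max_attempts=3):
    """Re-execute the write on every server until all of them report success."""
    for attempt in range(1, max_attempts + 1):
        # The write is repeated on all servers, not only on the one that failed.
        results = [server.write(object_name, data) for server in servers]
        if all(results):
            return attempt
    raise RuntimeError(f"write of {object_name} failed {max_attempts} times")


servers = [FlakyServer(), FlakyServer(failures=1)]
attempts = write_until_consistent(servers, "Data1", b"payload")
print(attempts)
```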
Fig. 2 is a schematic diagram of an application of the present invention provided by an embodiment of the present invention. The figure includes a client, a metadata server, object server 1 and object server 2.
The metadata server is the core server in this example. All write actions are completed by the metadata server instructing the client, according to the layout information, to cooperate with the different object servers. Specifically:
S201, the client sends a data write request to the metadata server; the data write request records the data object to be written.
S202, the metadata server returns the layout information to the client.
The layout information includes information about the object servers to which the data to be written will be written, for example the identifiers of the object servers. For ease of description, this embodiment assumes that the data to be written needs to be written to object server 1 and object server 2.
S203, the client writes the data to the first object server specified by the layout information, and waits for the write operation to complete.
S204, after the client completes the write operation on the first object server, the first object server writes the data to the second object server, and waits for the write operation to complete.
S205, the second object server returns the completion status of the write operation to the first object server.
S206, the first object server returns all results of the write operation to the client.
S207, the client confirms to the first object server that the data was written successfully.
S208, the first object server confirms to the second object server that the data was written successfully.
S209, the second object server returns the result of the successful write to the first object server.
S210, the first object server returns the result of the successful write to the client.
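Steps S201 to S210 can be traced in a small single-process sketch. It is illustrative only: the classes `ObjectServer` and `MetadataServer`, the function `client_write`, and in particular the split into `pending` and `committed` replicas are assumptions made here to show the write phase and the confirmation phase separately; the embodiment does not prescribe this internal state.

```python
class ObjectServer:
    """Toy object server that forwards writes and confirmations downstream."""

    def __init__(self, next_server=None):
        self.next_server = next_server
        self.pending = {}    # written but not yet confirmed (assumed state)
        self.committed = {}  # confirmed replicas (assumed state)

    def write(self, object_name, data):
        self.pending[object_name] = data                       # S203 / S204
        if self.next_server is not None:
            return self.next_server.write(object_name, data)   # S204, S205
        return True

    def confirm(self, object_name):
        self.committed[object_name] = self.pending.pop(object_name)  # S207 / S208
        if self.next_server is not None:
            return self.next_server.confirm(object_name)       # S208, S209
        return True


class MetadataServer:
    def __init__(self, layout):
        self.layout = layout

    def handle_write_request(self, object_name):
        return self.layout[object_name]                        # S201, S202


def client_write(metadata_server, servers, object_name, data):
    first_id = metadata_server.handle_write_request(object_name)[0]
    first = servers[first_id]
    if not first.write(object_name, data):                     # S203 to S206
        return False
    return first.confirm(object_name)                          # S207 to S210


os2 = ObjectServer()
os1 = ObjectServer(next_server=os2)
mds = MetadataServer({"Data1": [1, 2]})
ok = client_write(mds, {1: os1, 2: os2}, "Data1", b"payload")
print(ok, os1.committed == os2.committed)
```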
As this embodiment shows, the method provided by the embodiments of the present invention guarantees, during the write operation, the consistency of the replicas of the same data held on different object servers by ensuring that the write of the same object succeeds on at least two different object servers.
Another embodiment of the present invention further provides a data writing server. Referring to Fig. 3, the server comprises:
A first sending unit 301, configured to return layout information to a client when a write operation request sent by the client is received, the layout information including information about the object servers to which the data to be written will be written;
A first indicating unit 302, configured to instruct the client, according to the layout information, to write the data to be written to the first object server recorded in the layout information, and to instruct the first object server, after the client completes the write operation, to write the data to be written to the second object server recorded in the layout information;
A second indicating unit 303, configured to instruct the client to send a write result confirmation request to the first object server and the second object server, and to receive the write results returned by the first object server and the second object server.
Optionally, in another embodiment of the present invention, the server shown in Fig. 3 further comprises:
A third indicating unit 304, configured to instruct the first object server, after the client completes the write operation, to write the data to be written to an Nth object server recorded in the layout information, N being a natural number greater than or equal to 3;
Or, configured to instruct the second object server, after the first object server completes the write operation, to write the data to be written to the Nth object server recorded in the layout information, N being a natural number greater than or equal to 3.
Correspondingly, the second indicating unit 303 is further configured to instruct the client to send a write result confirmation request to the Nth object server and to receive the write result returned by the Nth object server.
Optionally, in other embodiments of the present invention, the server shown in Fig. 3 may further comprise:
A first control unit, configured to instruct the first object server and the second object server to re-execute the data write operation when the first object server or the second object server returns a data write failure.
The server provided by the embodiments of the present invention controls, according to the layout information, the client and the object servers corresponding to the data to be written. It instructs the client and the object servers to write the data to the first object server and the second object server in sequence, and confirms the result of the write operation. By ensuring that at least two replicas of the same object are written successfully on two different object servers, it avoids inconsistency among the replicas of the same data on different object servers.
Referring to Fig. 4, an embodiment of the present invention further provides a data modification method, used in a distributed file system comprising at least two object servers, the method comprising:
S401, detect whether the at least two object servers holding the data to be modified are in a normal state;
S402, when the object servers are in a normal state, lock the data to be modified on the at least two object servers, and ensure that the locking succeeds;
In the embodiments of the present invention, the data to be modified is locked in order to prevent errors caused by multiple parties modifying, at the same moment, the same object located on different servers. Take a spreadsheet recorded on two servers at the same time as an example. If user A operates on the spreadsheet on server S1 while user B operates on the same spreadsheet on server S2, the consistency of the spreadsheet across the servers is destroyed. Therefore, when user A is about to operate on the spreadsheet on server S1, the spreadsheet on server S2 is locked, which prevents multiple users from operating on the same file at the same time.
Locking the data to be modified, and ensuring that the locking succeeds, is necessary for guaranteeing data consistency. Without locking, the same data on the same object server may be modified by multiple clients at the same time; the modifications then conflict, and the data modification ultimately fails.
S403, modify the data to be modified on the at least two object servers;
S404, release the locked state of the data to be modified on the at least two object servers.
In the method for modifying data replicas provided by the embodiments of the present invention, once the at least two object servers are determined to be in a normal state, the data to be modified on the at least two object servers is locked and then modified while locked. This ensures that at least two replicas of the same object are modified successfully on two different object servers, avoids the appearance of single-replica data, and guarantees, during the data modification, the consistency of the replicas of the same data held on the at least two object servers.
Optionally, referring to Fig. 5, in the above embodiment, modifying the data to be modified on the at least two object servers comprises:
S501, copy the data to be modified on the at least two object servers to the client;
S502, modify the data to be modified on the client;
S503, write the modified data to the at least two object servers.
The data write of step S503 may be carried out according to the data writing method provided above by the embodiments of the present invention.
After the modification is completed, the locked state of the data needs to be released so that other terminals can operate on the data.
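The sequence of S401 to S404, with S501 to S503 nested inside S403, can be sketched as follows. This is an illustrative single-process model only; the class `ObjectServer`, the function `modify` and the `change` callback are hypothetical names, and step S503 is reduced to a direct assignment rather than the full data writing method.

```python
class ObjectServer:
    """Toy object server with per-object locks (illustrative only)."""

    def __init__(self, data, healthy=True):
        self.healthy = healthy
        self.data = dict(data)
        self.locks = {}  # object name -> client holding the lock

    def lock(self, object_name, client):
        if object_name in self.locks:
            return False
        self.locks[object_name] = client
        return True

    def unlock(self, object_name, client):
        if self.locks.get(object_name) == client:
            del self.locks[object_name]


def modify(servers, object_name, client, change):
    """Lock every replica, modify it on the client, write it back, then unlock."""
    if not all(server.healthy for server in servers):        # S401
        return False
    locked = []
    try:
        for server in servers:                                # S402
            if not server.lock(object_name, client):
                return False
            locked.append(server)
        local_copy = servers[0].data[object_name]             # S501
        modified = change(local_copy)                         # S502
        for server in servers:                                # S503 (simplified)
            server.data[object_name] = modified
        return True
    finally:
        for server in locked:                                 # S404
            server.unlock(object_name, client)


servers = [ObjectServer({"obj-1": "draft"}), ObjectServer({"obj-1": "draft"})]
ok = modify(servers, "obj-1", "client-1", str.upper)
print(ok, [s.data["obj-1"] for s in servers])
```

Releasing the locks in a `finally` clause is a design choice of this sketch: it ensures that a failed lock attempt or an error during modification does not leave replicas locked.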
Optionally, in the embodiments of the present invention, the data to be modified on the at least two object servers may be locked synchronously, comprising:
Checking whether the data to be modified on the at least two object servers is in a locked state;
When none of the data to be modified on the at least two object servers is locked, locking the data to be modified on the at least two object servers.
In other embodiments, the data to be modified on the at least two object servers may instead be locked sequentially, comprising:
After locking the data to be modified on the first object server of the at least two object servers, locking the data to be modified on the second object server of the at least two object servers.
When the data to be modified on the object servers is locked sequentially, a deadlock may occur. Referring to Fig. 6, after client-1 locks obj-1 on object server-1 and prepares to lock obj-1 on object server-2, obj-1 on object server-2 has at that moment already been locked by another client.
For the deadlock situation, the solution provided by an embodiment of the present invention is:
After locking the data to be modified on the first object server of the at least two object servers, detect whether the data to be modified on the second object server of the at least two object servers is in a locked state;
If so, release the most recent lock on the data to be modified on the second object server, and notify the client that locked successfully first to lock the data to be modified; otherwise, lock the data to be modified on the second object server of the at least two object servers.
In the embodiments of the present invention, the priority of the locks placed on the same file on different object servers is compared by timestamp. The unlocking method for the deadlock situation is described below by way of example.
For example, object 1 exists on object server 1 and also on object server 2. After user A locks object 1 on object server 1 (denoted lock 1) and prepares to lock object 1 on object server 2, detection through object server 2 finds that it is already locked (denoted lock 2) by another user, user B. Clearly, if nothing is done, object 1 can be processed on neither object server 1 nor object server 2, and a deadlock results. The embodiment of the present invention resolves it by comparing the timestamps of the locks on the same object data on the different object servers. Suppose the timestamp of lock 1 is 10:01 and the timestamp of lock 2 is 10:02, which means that user A locked object 1 before user B. According to the timestamps, the lock with the most recent timestamp, lock 2, is released, and the lock of the client that locked successfully first is retained. The client that locked successfully first is then notified to lock object 1 on object server 2.
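The timestamp comparison in this example can be sketched as follows. This is a minimal illustration, assuming each lock carries the client name and a timestamp; the `Lock` dataclass and the function `resolve_deadlock` are hypothetical, and ties between equal timestamps are not handled.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class Lock:
    client: str
    timestamp: time


def resolve_deadlock(locks):
    """Keep the earliest lock, release the later ones, and let the winner relock.

    `locks` maps an object server name to the Lock held there on the same object.
    """
    winner = min(locks.values(), key=lambda lock: lock.timestamp)
    released = [name for name, lock in locks.items() if lock.client != winner.client]
    for name in released:
        # The later lock is released; the client that locked first is notified and
        # relocks here. Reusing its original timestamp is a simplification.
        locks[name] = Lock(winner.client, winner.timestamp)
    return winner.client, released


locks = {
    "object server 1": Lock("user A", time(10, 1)),  # lock 1
    "object server 2": Lock("user B", time(10, 2)),  # lock 2
}
winner, released = resolve_deadlock(locks)
print(winner, released)
```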
The method for modifying data replicas provided by the embodiments of the present invention performs the modification while the data to be modified is locked. This ensures that at least two replicas of the same object are modified successfully on two different object servers, and avoids inconsistency among the replicas of the same data on different object servers.
Referring to Fig. 7, an embodiment of the present invention further provides a data modification device, used in a distributed file system comprising at least two object servers, the device comprising:
A first detecting unit 701, configured to detect whether the at least two object servers holding the data to be modified are in a normal state;
A locking unit 702, configured to lock the data to be modified on the at least two object servers when the object servers are in a normal state, and to ensure that the locking succeeds;
A modifying unit 703, configured to modify the data to be modified on the at least two object servers;
An unlocking unit 704, configured to release the locked state of the data to be modified on the at least two object servers.
Optionally, referring to Fig. 8, the data to be modified on the object servers may be locked synchronously. Specifically, the locking unit 702 comprises:
A first checking subunit 801, configured to check whether the data to be modified on the at least two object servers is in a locked state;
A first locking subunit 802, configured to lock the data to be modified on the at least two object servers when none of the data to be modified on the at least two object servers is locked.
In another embodiment of the present invention, referring to Fig. 9, the data to be modified on the object servers may be locked sequentially. Specifically, the locking unit 702 comprises:
A second locking subunit 901, configured to lock the data to be modified on the first object server of the at least two object servers;
A second checking subunit 902, configured to detect whether the data to be modified on the second object server of the at least two object servers is in a locked state;
A third locking subunit 903, configured to release the most recent lock on the data to be modified on the second object server when the data to be modified on the second object server is in a locked state, and to notify the client that locked successfully first to lock the data to be modified;
A fourth locking subunit 904, configured to lock the data to be modified on the second object server of the at least two object servers when the data to be modified on the second object server is not in a locked state.
Referring to Fig. 10, in an embodiment of the present invention, the modifying unit 703 comprises:
A copying subunit 1001, configured to copy the data to be modified on the at least two object servers to the client;
A modifying subunit 1002, configured to modify the data to be modified on the client;
A writing subunit 1003, configured to write the modified data to the at least two object servers.
The device for modifying data replicas provided by the embodiments of the present invention performs the modification while the data to be modified is locked. This ensures that at least two replicas of the same object are modified successfully on two different object servers, and avoids inconsistency among the replicas of the same data on different object servers.
An embodiment of the present invention further provides a data recovery method, used in a distributed file system comprising at least two object servers. Referring to Fig. 11, the method comprises:
S1101, detect the running status of the at least two object servers by means of a heartbeat check;
S1102, after a faulty object server is found, determine the objects to be recovered according to the layout information;
The layout information records information about the replicas of the objects stored on the faulty object server. Because this object server has failed, the replicas of all objects stored on it need to be transferred to other object servers.
S1103, send a recovery message to the target object server; the recovery message records the identifier of the target object server and the data to be recovered;
In the embodiments of the present invention, the object server to which the replicas are transferred is called the target object server. The target object server may be another object server in the current distributed file system that has not failed, or an object server outside the current distributed file system.
S1104, when the target object server has successfully recovered the data to be recovered according to the recovery message, update the layout information according to the identifier of the target object server and the recovered data.
The recovery operation that the target object server performs on the data to be recovered according to the recovery message is, in essence, writing the data to be recovered to the target object server's local storage.
In other embodiments of the present invention, the first attempt to recover data on the target object server may not succeed. In this case, the method shown in Fig. 11 further comprises:
When the target object server fails to recover the data to be recovered according to the recovery message, resending the recovery message to the target object server.
In practical applications, the number of times the recovery message is resent to the target object server can be controlled by a preset retry count.
The retry mechanism avoids returning a failure result as soon as a single operation fails; automatic retry saves processing time and improves processing efficiency.
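Steps S1102 to S1104 together with the retry mechanism can be sketched as follows. The sketch is illustrative only and rests on several assumptions: the faulty server `failed` has already been identified by the heartbeat check, the target does not already hold a replica of the object, the layout information is a mapping from object name to server numbers, and the callback `restore` stands in for sending a recovery message and receiving the result. None of these names come from the embodiment.

```python
def recover(layout, failed, target, restore, max_retries=3):
    """Move every replica held by `failed` to `target` and update the layout.

    `restore(target, object_name)` returns True when the target object server
    reports that the recovery succeeded. Returns the objects that could not be
    recovered within the preset retry count.
    """
    unrecovered = []
    for object_name, holders in layout.items():
        if failed not in holders:                       # S1102
            continue
        for _ in range(1 + max_retries):                # first try plus retries
            if restore(target, object_name):            # S1103
                # S1104: replace the faulty server with the target in the layout.
                layout[object_name] = [
                    target if holder == failed else holder for holder in holders
                ]
                break
        else:
            unrecovered.append(object_name)
    return unrecovered


layout = {"Data1": [1, 3], "Data2": [2, 3]}
attempts = []


def restore(target, object_name):
    attempts.append(object_name)
    return len(attempts) > 1  # the first recovery message fails, later ones succeed


print(recover(layout, failed=1, target=4, restore=restore), layout)
```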
With the data recovery method provided by the embodiments of the present invention, when a faulty object server is detected, the replicas of the objects held on the faulty object server are promptly recovered on the target object server and the layout information is updated at the same time. This avoids the situation in which the failure of an individual object server leaves the replicas of the same object inconsistent across different object servers.
Referring to Fig. 12, another embodiment of the present invention further provides a data recovery server, used in a distributed file system comprising at least two object servers, the server comprising:
A second detecting unit 1201, configured to detect the running status of the at least two object servers by means of a heartbeat check;
A determining unit 1202, configured to determine the objects to be recovered according to the layout information and the information of the faulty object server after the faulty object server is found;
A second sending unit 1203, configured to send a recovery message to the target object server, the recovery message recording the identifier of the target object server and the data to be recovered;
An updating unit 1204, configured to update the layout information according to the identifier of the target object server and the recovered data when the target object server has successfully recovered the data to be recovered according to the recovery message.
Optionally, if the first attempt to recover data on the target object server is not successful, the server shown in Fig. 12 further comprises:
A second control unit, configured to control the sending unit to repeat the action of sending the recovery message to the target object server when the target object server fails to recover the data to be recovered according to the recovery message.
With the data recovery server provided by the embodiments of the present invention, when a faulty object server is detected, the replicas of the objects held on the faulty object server are promptly recovered on the target object server and the layout information is updated at the same time. This avoids the situation in which the failure of an individual object server leaves the replicas of the same object inconsistent across different object servers.
The present invention may be described in the general context of computer-executable instructions executed by a computer, for example program modules. Generally, program modules include routines, programs, objects, components, data structures and the like that perform particular tasks or implement particular abstract data types. The present invention may also be practiced in distributed computing environments in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
The above is only a preferred implementation of the present invention. It should be noted that a person of ordinary skill in the art may make improvements and modifications without departing from the principle of the present invention, and these improvements and modifications shall also fall within the protection scope of the present invention.