CN103095767B - Distributed cache system and data reconstruction method based on distributed cache system

Info

Publication number: CN103095767B
Authority: CN (China)
Prior art keywords: data, information, service node, malfunctioning, main service
Legal status: Active (granted)
Application number: CN201110343592.3A
Other languages: Chinese (zh)
Other versions: CN103095767A
Inventors: 李豪伟, 陈典强, 郭斌
Current assignee: ZTE Corp
Original assignee: ZTE Corp
Application filed by ZTE Corp; priority to CN201110343592.3A
Publication of CN103095767A; application granted; publication of CN103095767B

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a distributed cache system and a data reconstruction method based on a distributed cache system. In the method, after one or more failed nodes among the slave service nodes have recovered from their faults and been re-enabled, the one or more failed nodes receive first data information from the main service node and perform the data reconstruction operations corresponding to that first data information. Once the first data information has been fully sent, the one or more failed nodes receive second data information recorded by the main service node and perform the data reconstruction operations corresponding to that second data information. With the technical solution provided by the invention, the data on a failed node is brought back into agreement with the data on the other normally operating nodes, which greatly improves the availability of the distributed cache system.

Description

Distributed cache system and data reconstruction method based on distributed cache system
Technical field
The present invention relates to the field of communications, and in particular to a distributed cache system and a data reconstruction method based on a distributed cache system.
Background art
Cloud computing (Cloud Computing) is the product of the development and fusion of traditional computing technologies such as grid computing (Grid Computing), distributed computing (Distributed Computing), parallel computing (Parallel Computing), utility computing (Utility Computing), network storage (Network Storage Technologies), virtualization (Virtualization), and load balancing (Load Balance). It aims to integrate, over the network, many relatively low-cost computing entities into a system with powerful computing capability. Distributed caching is one field within cloud computing; its role is to provide a distributed storage service and high-speed read/write access to massive amounts of data. A distributed cache system consists of a number of interconnected server nodes and clients. In general, written data is not stored on a single server node only; instead, copies of the same data are kept on several nodes as mutual backups. A data item consists of two parts, a key (Key) and a value (Value), where the Key is effectively the index of the data and the Value is the data content identified by the Key. Logically, Key and Value are in a one-to-one relationship. Server nodes are responsible for storing and managing data in memory and on disk, and multiple copies of the data are stored on multiple server nodes so that, after some server nodes go down, the whole system can still use the remaining copies to continue providing normal service to applications. Clients can perform operations such as writing, reading, updating, and deleting data on the server nodes.
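As an informal illustration of the Key/Value model and replica placement described above (the class and field names below are assumptions chosen for illustration, not identifiers from the patent), a cached record and its placement might be represented as follows:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CacheRecord:
    """A single cached data item: the Key indexes the Value."""
    key: str
    value: bytes

@dataclass
class ReplicaSet:
    """Placement of one record: a coordinator (main service node)
    plus several replica servers (slave service nodes) holding backups."""
    coordinator: str                                     # main service node for this key
    replicas: List[str] = field(default_factory=list)    # slave service nodes

# The same record is stored on the coordinator and two replicas.
record = CacheRecord(key="user:42", value=b"{'name': 'alice'}")
placement = ReplicaSet(coordinator="node-a", replicas=["node-b", "node-c"])
```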
In a distributed cache system, after a server node fails and goes down, data can be lost: data newly generated while the node is down cannot be stored on it, so the copies held by different server nodes become inconsistent and applications may read incorrect data. This is a technical problem that is difficult to solve.
Summary of the invention
In the related art, after some server nodes of a distributed cache system fail and go down, data is lost, and after the failed server nodes are repaired and rejoin the system, the data they hold is inconsistent with the data held by the other server nodes. To address at least this problem, the present invention provides a distributed cache system and a data reconstruction method based on a distributed cache system.
According to one aspect of the invention, a data reconstruction method based on a distributed cache system is provided.
The data reconstruction method based on a distributed cache system according to the invention includes: after one or more failed nodes among the slave service nodes have recovered from their faults and been re-enabled, the one or more failed nodes receive first data information from the main service node and perform the data reconstruction operations corresponding to the first data information, where the first data information includes the data requested by the one or more failed nodes and/or the data change information accumulated between the failure of the one or more failed nodes and their re-enabling; when the first data information has been fully sent, the one or more failed nodes receive second data information recorded by the main service node and perform the data reconstruction operations corresponding to the second data information, where the second data information includes the data obtained by the main service node and/or the data change information recorded by it starting from the re-enabling of the one or more failed nodes.
In the above method, before the one or more failed nodes receive the first data information from the main service node, the method further includes: the main service node records the data change information accumulated between the failure of the one or more failed nodes and their re-enabling; the main service node receives, from the one or more failed nodes, a request to obtain the first data information.
In the above method, after the main service node receives the request from the one or more failed nodes to obtain the first data information, the method further includes: the main service node records the data change information starting from the re-enabling of the one or more failed nodes.
In the above method, the one or more failed nodes performing the data reconstruction operations corresponding to the first data information includes: the one or more failed nodes save the data in the received first data information, and/or perform write and/or delete operations according to the data change information in the first data information.
In the above method, the one or more failed nodes performing the data reconstruction operations corresponding to the second data information includes: the one or more failed nodes save the data in the received second data information, and/or perform write and/or delete operations according to the data change information in the second data information.
In the above method, after the one or more failed nodes receive the second data information recorded by the main service node, the method further includes: the main service node sends the one or more failed nodes a message indicating that the second data information has been fully sent, and records the current time as a first moment; the main service node receives, from the one or more failed nodes, a response message indicating that normal service has been restored, and records the current time as a second moment; the main service node determines whether, between the first moment and the second moment, there are still data and/or recorded data change information that have not been sent to the one or more failed nodes; if so, it continues to send the data and/or the recorded data change information to the one or more failed nodes; if not, it directly sends the one or more failed nodes a message indicating that data reconstruction is complete.
In the above method, the data requested by the one or more failed nodes includes at least one of: in-memory data and on-disk data.
In the above method, the data change information accumulated between the failure of the one or more failed nodes and their re-enabling includes on-disk data change information.
According to another aspect of the invention, a distributed cache system is provided.
The distributed cache system according to the invention includes one main service node and at least one slave service node. The slave service node includes: a first receiving module, configured to receive first data information from the main service node after one or more failed nodes among the slave service nodes have recovered from their faults and been re-enabled, where the first data information includes the data requested by the one or more failed nodes and/or the data change information accumulated between the failure of the one or more failed nodes and their re-enabling; a first reconstruction module, configured to perform the data reconstruction operations corresponding to the first data information; a second receiving module, configured to receive, when the first data information has been fully sent, second data information recorded by the main service node, where the second data information includes the data obtained by the main service node and/or the data change information recorded by it starting from the re-enabling of the one or more failed nodes; and a second reconstruction module, configured to perform the data reconstruction operations corresponding to the second data information.
In the above system, the main service node includes: a first recording module, configured to record the data change information accumulated between the failure of the one or more failed nodes and their re-enabling; and a third receiving module, configured to receive, from the one or more failed nodes, the request to obtain the first data information.
In the above system, the main service node further includes: a second recording module, configured to record the data change information starting from the re-enabling of the one or more failed nodes.
In the above system, the first reconstruction module includes: a first storage unit, configured to save the data in the received first data information; and a first execution unit, configured to perform write and/or delete operations according to the data change information in the first data information.
In the above system, the second reconstruction module includes: a second storage unit, configured to save the data in the received second data information; and a second execution unit, configured to perform write and/or delete operations according to the data change information in the second data information.
In the above system, the main service node further includes: a first sending module, configured to send the one or more failed nodes a message indicating that the second data information has been fully sent; a third recording module, configured to record the current time as a first moment; a fourth receiving module, configured to receive, from the one or more failed nodes, the response message indicating that normal service has been restored; a fourth recording module, configured to record the current time as a second moment; a judging module, configured to determine whether, between the first moment and the second moment, there are still data and/or recorded data change information that have not been sent to the one or more failed nodes; and a second sending module, configured to, when the judging module outputs yes, continue to send the one or more failed nodes the data and/or the recorded data change information obtained between the first moment and the second moment, and, when the judging module outputs no, directly send the one or more failed nodes a message indicating that data reconstruction is complete.
With the invention, the problem that data is lost after some server nodes of a distributed cache system fail and go down, and that the data held by a failed server node after it is repaired and rejoins the system is inconsistent with the data held by the other server nodes, is solved. The data on a failed node is thereby kept consistent with the data on the other normally operating nodes, greatly improving the availability of the distributed cache system.
Brief description of the drawings
The drawings described here are provided for a further understanding of the present invention and constitute a part of the application. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not constitute an improper limitation of it. In the drawings:
Fig. 1 is a schematic diagram of a distributed cache system composed of server nodes and clients according to an embodiment of the invention;
Fig. 2 is a flowchart of a data reconstruction method based on a distributed cache system according to an embodiment of the invention;
Fig. 3 is a flowchart of a data reconstruction method based on a distributed cache system according to a preferred embodiment of the invention;
Fig. 4 is a structural block diagram of a distributed cache system according to an embodiment of the invention;
Fig. 5 is a structural block diagram of a distributed cache system according to a preferred embodiment of the invention.
Specific embodiments
The present invention is described in detail below with reference to the drawings and in combination with the embodiments. It should be noted that, as long as they do not conflict, the embodiments of the application and the features in the embodiments can be combined with one another.
Fig. 1 is a schematic diagram of a distributed cache system composed of server nodes and clients according to an embodiment of the invention. As shown in Fig. 1, multiple server nodes in the distributed cache system store the data and its copies; the client establishes connections to multiple cluster service nodes in the distributed cache system, and the server nodes in the cluster establish connections with one another and operate normally. Logically, for the Key of a particular piece of data, a few server nodes in the server cluster can be designated, according to some priority, as the coordination server (also called the main service node) and several replica servers (also called slave service nodes); different Keys may have different coordination servers and replica servers. The coordinator is responsible for handling requests from clients and for writing the data to the other replica servers.
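The patent only states that the coordinator and replicas for a Key are chosen according to some priority; as one hypothetical way to realize such a per-key assignment (not prescribed by the patent), a deterministic hash-based ordering could be used:

```python
import hashlib
from typing import List, Tuple

def pick_servers(key: str, nodes: List[str], replica_count: int = 2) -> Tuple[str, List[str]]:
    """Rank all nodes by a hash of (key, node); the highest-ranked node acts as
    the coordinator (main service node) for this key, and the next
    `replica_count` nodes act as replica servers (slave service nodes)."""
    ranked = sorted(
        nodes,
        key=lambda n: hashlib.sha1(f"{key}:{n}".encode()).hexdigest(),
        reverse=True,
    )
    coordinator, replicas = ranked[0], ranked[1:1 + replica_count]
    return coordinator, replicas

# Different keys may map to different coordinators and replica sets.
print(pick_servers("user:42", ["node-a", "node-b", "node-c", "node-d"]))
```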
It should be noted that the coordination server is selected according to the network conditions at the time.
Fig. 2 is a flowchart of a data reconstruction method based on a distributed cache system according to an embodiment of the invention. As shown in Fig. 2, the method mainly includes the following processing:
Step S202: after one or more failed nodes among the slave service nodes have recovered from their faults and been re-enabled, the one or more failed nodes receive first data information from the main service node and perform the data reconstruction operations corresponding to the first data information, where the first data information includes the data requested by the one or more failed nodes and/or the data change information accumulated between the failure of the one or more failed nodes and their re-enabling;
Step S204: when the first data information has been fully sent, the one or more failed nodes receive second data information recorded by the main service node and perform the data reconstruction operations corresponding to the second data information, where the second data information includes the data obtained by the main service node and/or the data change information recorded by it starting from the re-enabling of the one or more failed nodes.
In the related art, after some server nodes of a distributed cache system fail and go down, data is lost, and after the failed server nodes are repaired and rejoin the system, the data they hold is inconsistent with the data held by the other server nodes. With the method shown in Fig. 2, this problem is solved: the data on a failed node is kept consistent with the data on the other normally operating nodes, which greatly improves the availability of the distributed cache system.
Preferably, before the one or more failed nodes receive the first data information from the main service node, the following processing may also be included:
(1) the main service node records the data change information accumulated between the failure of the one or more failed nodes and their re-enabling;
In a preferred implementation, for any record stored on a failed node there is at least one copy stored on other service nodes. After the failed node fails, every data change request (for example, writing data or deleting data) that the main service node forwards to the failed node fails; the main service node can record the information about these failed changes so that the failed node can later recover the data.
(2) the main service node receives, from the one or more failed nodes, the request to obtain the first data information;
In a preferred implementation, the data requested by the one or more failed nodes may include, but is not limited to, at least one of: in-memory data and on-disk data.
In a preferred implementation, the data change information accumulated between the failure of the one or more failed nodes and their re-enabling may include, but is not limited to, on-disk data change information.
For example, if the failed node requests reconstruction of its in-memory data, it sends the main service node a request to obtain the complete in-memory copy. If the failed node requests reconstruction of its on-disk data, it first determines whether the disk was replaced because of a disk failure, causing all of the data to be lost; if so, it requests from the main service node all of the copy data that was stored on that disk; if not, it requests from the main service node only the copy data change information generated after the failure occurred.
It should be noted that the on-disk data is completely lost only when the disk fails and is replaced with a new one; in that case, all of the on-disk data must be obtained from other nodes to carry out the reconstruction. If the disk itself is not faulty and the data files still exist (for example, only the program terminated abnormally), the failed node only needs to obtain from the other nodes the data change information they received after the node failed, and to perform those data change operations; this guarantees consistency with the copy data on the other service nodes and greatly increases the speed of data reconstruction on the failed node.
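A condensed sketch of this request-selection logic on the failed node is given below; the enum values and function name are illustrative assumptions, not identifiers from the patent:

```python
from enum import Enum, auto

class RebuildRequest(Enum):
    FULL_MEMORY_COPY = auto()        # rebuild all in-memory data
    FULL_DISK_COPY = auto()          # disk was replaced, all on-disk data lost
    CHANGES_SINCE_FAILURE = auto()   # disk intact, only replay missed changes

def choose_rebuild_request(rebuilding_memory: bool, disk_replaced: bool) -> RebuildRequest:
    """Decide what to ask the coordinator for, following the rule that a full
    disk copy is needed only when the disk itself was replaced."""
    if rebuilding_memory:
        return RebuildRequest.FULL_MEMORY_COPY
    if disk_replaced:
        return RebuildRequest.FULL_DISK_COPY
    return RebuildRequest.CHANGES_SINCE_FAILURE
```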
Preferably, after the main service node receives the request from the one or more failed nodes to obtain the first data information, the following processing may also be included: the main service node records the data change information starting from the re-enabling of the one or more failed nodes.
It should be noted that receiving and storing the copy data from the other service nodes takes the failed node some time, during which the distributed cache system still accepts write and/or delete requests from the UE. The failed node can replay, according to the data change information recorded in the main service node's log, the operations sent to the main service node during the reconstruction, ensuring that all of this data change information is also applied on the failed node. This guarantees that, after reconstruction, the copy data on the failed node is completely consistent with that on the other service nodes.
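A minimal sketch of such log replay on the failed node follows, under the assumption (made here for illustration, not stated in the patent) that each log entry is a write or delete keyed by the record's Key:

```python
from typing import Dict, Iterable, Optional, Tuple

# Each log entry: (operation, key, value). value is None for deletes.
LogEntry = Tuple[str, str, Optional[bytes]]

def replay_change_log(store: Dict[str, bytes], entries: Iterable[LogEntry]) -> None:
    """Apply the coordinator's recorded change operations, in order, so the
    failed node's copy catches up with the other replicas."""
    for op, key, value in entries:
        if op == "write":
            store[key] = value
        elif op == "delete":
            store.pop(key, None)
        else:
            raise ValueError(f"unknown operation in change log: {op!r}")
```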
Preferably, the one or more failed nodes performing the data reconstruction operations corresponding to the first data information may further include the following processing: the one or more failed nodes save the data in the received first data information, and/or perform write and/or delete operations according to the data change information in the first data information.
Preferably, the one or more failed nodes performing the data reconstruction operations corresponding to the second data information may further include the following processing: the one or more failed nodes save the data in the received second data information, and/or perform write and/or delete operations according to the data change information in the second data information.
Preferably, after the one or more failed nodes receive the second data information recorded by the main service node, the following processing may also be included (a sketch of this catch-up check follows the list):
(1) the main service node sends the one or more failed nodes a message indicating that the second data information has been fully sent, and records the current time as a first moment (T1);
(2) the main service node receives, from the one or more failed nodes, a response message indicating that normal service has been restored, and records the current time as a second moment (T2);
(3) the main service node determines whether, between T1 and T2, there are still data and/or recorded data change information that have not been sent to the one or more failed nodes; if so, it continues to send the data and/or the recorded data change information to the one or more failed nodes; if not, it directly sends the one or more failed nodes a message indicating that data reconstruction is complete.
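The following coordinator-side sketch illustrates the T1/T2 catch-up check; the transport callables (send_log, send_done, wait_for_node_ready) and the timestamped log-entry format are placeholder assumptions made for illustration, not APIs defined by the patent:

```python
import time
from typing import Callable, List

def finish_reconstruction(
    pending_log: List[dict],                    # entries assumed to carry a "ts" from the same clock
    send_log: Callable[[List[dict]], None],
    send_done: Callable[[], None],
    wait_for_node_ready: Callable[[], None],
) -> None:
    """Coordinator side of the final hand-off: mark T1 when the second data
    information has been fully sent, T2 when the failed node reports that it is
    serving again, then forward any changes logged in between."""
    t1 = time.monotonic()            # second data information fully sent
    wait_for_node_ready()            # failed node resumes normal service
    t2 = time.monotonic()            # response received from the failed node

    # Changes logged between T1 and T2 were not yet seen by the failed node.
    missed = [entry for entry in pending_log if t1 <= entry["ts"] <= t2]
    if missed:
        send_log(missed)             # forward the changes the node missed
    send_done()                      # tell the failed node reconstruction is finished
```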
The above preferred embodiments are described further below with reference to Fig. 3.
Fig. 3 is a flowchart of the data reconstruction method in a distributed cache system according to a preferred embodiment of the invention. As shown in Fig. 3, the method may include the following processing steps (a condensed sketch of the failed node's side of this flow follows the step list):
Step S302: the failed node fails and exits the distributed cache system;
Step S304: after one or more of the slave service nodes fail, the server node in the distributed cache system that stores the same data as the failed node and acts in the coordinator role (that is, the main service node mentioned above) starts recording the data change information that the failed node fails to process, including deleted-record information and newly written data information, so that the failed node can later recover the data changes made during the failure according to this information;
It should be noted that, after the main service node itself fails, another slave service node takes over the coordinator role, becomes the new main service node, and receives the data change information from the UE.
Step S306: after the failed node is repaired and re-enabled, it does not provide service at first; the node's current state is set to "performing data reconstruction";
Step S308: the failed node sends a message to the coordinator server node that stores the same copy data, requesting the copy data and the data change information accumulated since the failure;
Step S310: after receiving the failed node's request, the coordinator server node starts log recording, logging the write and/or delete operations from the UE that it subsequently processes;
Step S312: the coordinator server node responds to the failed node's data request: if the failed node wants the complete in-memory data, the coordinator traverses the data stored in memory item by item and sends it to the failed node until the traversal is finished; if the failed node wants all of the disk copy data, the coordinator reads the data files one by one and sends the read data to the failed node until the traversal is finished; if the failed node wants only the copy data change information generated after the failure time, the coordinator reads, item by item according to the failure time, the data change information generated after the failure and sends it to the failed node;
Step S314: the failed node receives all of the data and stores and processes it;
Step S316: the coordinator server node reads the log information;
Step S318: the coordinator server node sends the failed node the log information recorded during the data reconstruction;
Step S320: the failed node receives the log information and performs the write and/or delete operations;
Step S322: after the coordinator server node has sent the current log messages, it records the current time as T1;
Step S324: the coordinator sends the failed node a notification message that the log information has been fully sent;
Step S326: after receiving the message that the log information has been fully sent, the failed node starts to provide normal service and processes requests such as reading, writing, and/or deleting data;
Step S328: the failed node sends the coordinator server node a notification message that it has started operating normally;
Step S330: after the coordinator server node receives the message that the failed node has started operating normally, it records the current time as T2;
Step S332: the coordinator server node determines whether new log information was generated between T1 and T2; if so, it continues with step S334; if not, it goes to step S338;
Step S334: the coordinator sends the log information to the failed node until the log information has been fully sent;
Step S336: the failed node continues to receive and process the log information from the server node with the longest uptime, until it receives the message that the reconstruction data has been fully sent;
Step S338: the coordinator server node sends the message that the reconstruction data has been fully sent;
Step S340: the failed node clears its reconstruction-state flag, and the data reconstruction is complete.
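As a condensed, informal sketch of the failed node's side of the flow above (the state names and the duck-typed node/coordinator parameters are assumptions made for illustration, not elements defined in the patent):

```python
from enum import Enum, auto

class NodeState(Enum):
    OFFLINE = auto()
    RECONSTRUCTING = auto()   # rejoined but not yet serving (steps S306-S320)
    SERVING = auto()          # normal service restored (step S326 onward)

def rebuild(node, coordinator) -> None:
    """Failed node's side of the Fig. 3 flow, in outline."""
    node.state = NodeState.RECONSTRUCTING               # S306
    coordinator.request_first_data(node.rebuild_kind)   # S308
    for item in coordinator.stream_copy_data():         # S312/S314
        node.store(item)
    for entry in coordinator.stream_change_log():       # S318/S320
        node.apply(entry)
    node.state = NodeState.SERVING                      # S326
    coordinator.notify_serving()                        # S328
    for entry in coordinator.stream_remaining_log():    # S334/S336
        node.apply(entry)
    coordinator.wait_for_done()                         # S338
    node.clear_rebuild_flag()                           # S340
```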
Fig. 4 is a structural block diagram of a distributed cache system according to an embodiment of the invention. As shown in Fig. 4, the distributed cache system may include a main service node 10 and at least one slave service node 20. The slave service node 20 includes: a first receiving module 200, configured to receive first data information from the main service node after one or more failed nodes among the slave service nodes have recovered from their faults and been re-enabled, where the first data information includes the data requested by the one or more failed nodes and/or the data change information accumulated between the failure of the one or more failed nodes and their re-enabling; a first reconstruction module 202, configured to perform the data reconstruction operations corresponding to the first data information; a second receiving module 204, configured to receive, when the first data information has been fully sent, second data information recorded by the main service node, where the second data information includes the data obtained by the main service node and/or the data change information recorded by it starting from the re-enabling of the one or more failed nodes; and a second reconstruction module 206, configured to perform the data reconstruction operations corresponding to the second data information.
With the distributed cache system shown in Fig. 4, the problem that data is lost after some server nodes of a distributed cache system fail and go down, and that the data held by a failed server node after it is repaired and rejoins the system is inconsistent with the data held by the other server nodes, is solved; the data on a failed node is kept consistent with the data on the other normally operating nodes, which greatly improves the availability of the distributed cache system.
Preferably, as shown in Fig. 5, the main service node 10 may include: a first recording module 100, configured to record the data change information accumulated between the failure of the one or more failed nodes and their re-enabling; and a third receiving module 102, configured to receive, from the one or more failed nodes, the request to obtain the first data information.
Preferably, as shown in Fig. 5, the main service node 10 may further include: a second recording module 104, configured to record the data change information starting from the re-enabling of the one or more failed nodes.
Preferably, as shown in Fig. 5, the first reconstruction module 202 of the slave service node 20 may further include: a first storage unit (not shown), configured to save the data in the received first data information; and a first execution unit (not shown), configured to perform write and/or delete operations according to the data change information in the first data information.
Preferably, as shown in Fig. 5, the second reconstruction module 206 of the slave service node 20 may further include: a second storage unit (not shown), configured to save the data in the received second data information; and a second execution unit (not shown), configured to perform write and/or delete operations according to the data change information in the second data information.
Preferably, as shown in Fig. 5, the main service node 10 may further include: a first sending module 106, configured to send the one or more failed nodes a message indicating that the second data information has been fully sent; a third recording module 108, configured to record the current time as a first moment; a fourth receiving module 110, configured to receive, from the one or more failed nodes, the response message indicating that normal service has been restored; a fourth recording module 112, configured to record the current time as a second moment; a judging module 114, configured to determine whether, between the first moment and the second moment, there are still data and/or recorded data change information that have not been sent to the one or more failed nodes; and a second sending module 116, configured to, when the judging module outputs yes, continue to send the one or more failed nodes the data and/or the recorded data change information obtained between the first moment and the second moment, and, when the judging module outputs no, directly send the one or more failed nodes a message indicating that data reconstruction is complete.
Preferably, the data requested by the one or more failed nodes may include, but is not limited to, at least one of: in-memory data and on-disk data.
Preferably, the data change information accumulated between the failure of the one or more failed nodes and their re-enabling may include, but is not limited to, on-disk data change information.
As can be seen from the above description, the present invention achieves the following technical effect: after some server nodes of a distributed cache system fail, go down, and lose data, and the failed server nodes are repaired and rejoin the system, their data is kept consistent with the data held by the other server nodes, which greatly improves the availability of the distributed cache system.
Obviously, those skilled in the art should understand that each module or step of the invention described above can be implemented with a general-purpose computing device; they can be concentrated on a single computing device or distributed over a network formed by multiple computing devices; optionally, they can be implemented with program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; and, in some cases, the steps shown or described can be performed in an order different from that given here, or they can be made into individual integrated circuit modules, or multiple of the modules or steps can be made into a single integrated circuit module. Thus, the invention is not limited to any specific combination of hardware and software.
The above is only the preferred embodiments of the present invention and is not intended to limit the invention; for those skilled in the art, the invention may have various modifications and variations. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the invention shall be included in the protection scope of the invention.

Claims (14)

1. A data reconstruction method based on a distributed cache system, the distributed cache system comprising a main service node and at least one slave service node, characterized by comprising:
after one or more failed nodes among the slave service nodes have recovered from their faults and been re-enabled, the one or more failed nodes receiving first data information from the main service node and performing data reconstruction operations corresponding to the first data information, wherein the first data information comprises: the data requested by the one or more failed nodes and/or the data change information accumulated between the failure of the one or more failed nodes and their re-enabling;
when the first data information has been fully sent, the one or more failed nodes receiving second data information recorded by the main service node and performing data reconstruction operations corresponding to the second data information, wherein the second data information comprises: the data obtained by the main service node and/or the data change information recorded by it starting from the re-enabling of the one or more failed nodes.
2. The method according to claim 1, characterized in that, before the one or more failed nodes receive the first data information from the main service node, the method further comprises:
the main service node recording the data change information accumulated between the failure of the one or more failed nodes and their re-enabling;
the main service node receiving, from the one or more failed nodes, a request to obtain the first data information.
3. The method according to claim 2, characterized in that, after the main service node receives the request from the one or more failed nodes to obtain the first data information, the method further comprises:
the main service node recording the data change information starting from the re-enabling of the one or more failed nodes.
4. The method according to claim 1, characterized in that the one or more failed nodes performing the data reconstruction operations corresponding to the first data information comprises:
the one or more failed nodes saving the data in the received first data information, and/or performing write and/or delete operations according to the data change information in the first data information.
5. The method according to claim 1, characterized in that the one or more failed nodes performing the data reconstruction operations corresponding to the second data information comprises:
the one or more failed nodes saving the data in the received second data information, and/or performing write and/or delete operations according to the data change information in the second data information.
6. The method according to claim 1, characterized in that, after the one or more failed nodes receive the second data information recorded by the main service node, the method further comprises:
the main service node sending the one or more failed nodes a message indicating that the second data information has been fully sent, and recording the current time as a first moment;
the main service node receiving, from the one or more failed nodes, a response message indicating that normal service has been restored, and recording the current time as a second moment;
the main service node determining whether, between the first moment and the second moment, there are still data and/or recorded data change information that have not been sent to the one or more failed nodes; if so, continuing to send the data and/or the recorded data change information to the one or more failed nodes; if not, directly sending the one or more failed nodes a message indicating that data reconstruction is complete.
7. The method according to any one of claims 1 to 6, characterized in that the data requested by the one or more failed nodes comprises at least one of: in-memory data and on-disk data.
8. The method according to any one of claims 1 to 6, characterized in that the data change information accumulated between the failure of the one or more failed nodes and their re-enabling comprises: on-disk data change information.
9. A distributed cache system, characterized in that the distributed cache system comprises a main service node and at least one slave service node;
the slave service node comprising:
a first receiving module, configured to receive first data information from the main service node after one or more failed nodes among the slave service nodes have recovered from their faults and been re-enabled, wherein the first data information comprises: the data requested by the one or more failed nodes and/or the data change information accumulated between the failure of the one or more failed nodes and their re-enabling;
a first reconstruction module, configured to perform data reconstruction operations corresponding to the first data information;
a second receiving module, configured to receive, when the first data information has been fully sent, second data information recorded by the main service node, wherein the second data information comprises: the data obtained by the main service node and/or the data change information recorded by it starting from the re-enabling of the one or more failed nodes;
a second reconstruction module, configured to perform data reconstruction operations corresponding to the second data information.
10. The system according to claim 9, characterized in that the main service node comprises:
a first recording module, configured to record the data change information accumulated between the failure of the one or more failed nodes and their re-enabling;
a third receiving module, configured to receive, from the one or more failed nodes, the request to obtain the first data information.
11. The system according to claim 10, characterized in that the main service node further comprises:
a second recording module, configured to record the data change information starting from the re-enabling of the one or more failed nodes.
12. The system according to claim 9, characterized in that the first reconstruction module comprises:
a first storage unit, configured to save the data in the received first data information;
a first execution unit, configured to perform write and/or delete operations according to the data change information in the first data information.
13. The system according to claim 9, characterized in that the second reconstruction module comprises:
a second storage unit, configured to save the data in the received second data information;
a second execution unit, configured to perform write and/or delete operations according to the data change information in the second data information.
14. The system according to claim 9, characterized in that the main service node further comprises:
a first sending module, configured to send the one or more failed nodes a message indicating that the second data information has been fully sent;
a third recording module, configured to record the current time as a first moment;
a fourth receiving module, configured to receive, from the one or more failed nodes, the response message indicating that normal service has been restored;
a fourth recording module, configured to record the current time as a second moment;
a judging module, configured to determine whether, between the first moment and the second moment, there are still data and/or recorded data change information that have not been sent to the one or more failed nodes;
a second sending module, configured to, when the judging module outputs yes, continue to send the one or more failed nodes the data and/or the recorded data change information obtained between the first moment and the second moment, and, when the judging module outputs no, directly send the one or more failed nodes a message indicating that data reconstruction is complete.
CN201110343592.3A 2011-11-03 2011-11-03 Distributed cache system and data reconstruction method based on distributed cache system Active CN103095767B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110343592.3A CN103095767B (en) 2011-11-03 2011-11-03 Distributed cache system and data reconstruction method based on distributed cache system


Publications (2)

Publication Number Publication Date
CN103095767A CN103095767A (en) 2013-05-08
CN103095767B true CN103095767B (en) 2019-04-23

Family

ID=48207895

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110343592.3A Active CN103095767B (en) 2011-11-03 2011-11-03 Distributed cache system and data reconstruction method based on distributed cache system

Country Status (1)

Country Link
CN (1) CN103095767B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016051512A1 (en) * 2014-09-30 2016-04-07 株式会社日立製作所 Distributed storage system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1617139A (en) * 2003-11-15 2005-05-18 鸿富锦精密工业(深圳)有限公司 Electronic sending file synchronous system and method
CN102024044A (en) * 2010-12-08 2011-04-20 华为技术有限公司 Distributed file system
CN102025758A (en) * 2009-09-18 2011-04-20 成都市华为赛门铁克科技有限公司 Method, device and system for recovering data copy in distributed system


Also Published As

Publication number Publication date
CN103095767A (en) 2013-05-08

Similar Documents

Publication Publication Date Title
US11397648B2 (en) Virtual machine recovery method and virtual machine management device
CN110309218B (en) Data exchange system and data writing method
US20150213100A1 (en) Data synchronization method and system
US9189348B2 (en) High availability database management system and database management method using same
CN107426265A (en) The synchronous method and apparatus of data consistency
CN103942252B (en) A kind of method and system for recovering data
WO2018098972A1 (en) Log recovery method, storage device and storage node
US9098439B2 (en) Providing a fault tolerant system in a loosely-coupled cluster environment using application checkpoints and logs
CN103516736A (en) Data recovery method of distributed cache system and a data recovery device of distributed cache system
CN103929500A (en) Method for data fragmentation of distributed storage system
CN102867035B (en) A kind of distributed file system cluster high availability method and device
CN106933843B (en) Database heartbeat detection method and device
CN109491609B (en) Cache data processing method, device and equipment and readable storage medium
CN103516549B (en) A kind of file system metadata log mechanism based on shared object storage
US8762347B1 (en) Method and apparatus for processing transactional file system operations to enable point in time consistent file data recreation
CN109582686B (en) Method, device, system and application for ensuring consistency of distributed metadata management
WO2019020081A1 (en) Distributed system and fault recovery method and apparatus thereof, product, and storage medium
US20120084260A1 (en) Log-shipping data replication with early log record fetching
WO2015184925A1 (en) Data processing method for distributed file system and distributed file system
CN105049258B (en) The data transmission method of network disaster tolerance system
CN110825562B (en) Data backup method, device, system and storage medium
CN104965835B (en) A kind of file read/write method and device of distributed file system
US20090063486A1 (en) Data replication using a shared resource
WO2017014814A1 (en) Replicating memory volumes
Kończak et al. Recovery algorithms for paxos-based state machine replication

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant