CN102841840B

CN102841840B - Message log recovery method based on message reordering and message number checking

Info

Publication number: CN102841840B
Application number: CN201210239710.0A
Authority: CN
Inventors: 高胜法; 蔡静; 冯振
Original assignee: Shandong University
Current assignee: Shandong University
Priority date: 2012-07-11
Filing date: 2012-07-11
Publication date: 2015-09-09
Anticipated expiration: 2032-07-11
Also published as: CN102841840A

Abstract

The invention discloses a message log recovery method based on message reordering and message number checking. The method adopts message reordering: when a sending process sends a message, it indirectly marks the reception order of that message with an improved logical clock and keeps this order in the local storage of the sending process. When a message receiving process fails, the recovery process first obtains from the sending processes the messages that were saved and those that were not saved to the log file, together with their logical clocks, and then reorders the messages not saved to the log file according to their logical clocks. Finally, the sorted messages are resent to the failed process, which receives and processes them again, thereby replaying the messages. The method thus both improves the performance of processes during failure-free operation and simplifies the recovery algorithm when a process fails.

Description

Message log recovery method based on message reordering and message number inspection

Technical Field

The invention relates to a distributed system, in particular to a message log recovery method based on message reordering and message number inspection.

Background

Recovery protocols based on message logging rely on the piecewise deterministic (PWD) assumption. Under this assumption, the execution of a process is divided into several state intervals; each state interval begins with the execution of a nondeterministic event and is followed by the execution of several deterministic events. According to the PWD assumption, a message receive event is a nondeterministic event, while message send events and the internal events of a process are deterministic events; a state interval therefore usually begins with a message receive event, followed by several internal events and message send events.

If a state interval of a process depends on a nondeterministic event (e.g., a message receive event) that cannot be regenerated during recovery, the process is called an orphan process. Under the piecewise deterministic (PWD) assumption, if process p sends message m_j to process q after receiving message m_i, then the state interval of q after receiving m_j depends on the state interval of p after receiving m_i. If p, after sending m_j, fails before the necessary information of the received message m_i has been saved to a log file, then m_i cannot be recovered during recovery, and process q, whose receipt of m_j depends on p's receipt of m_i, becomes an orphan process. All log recovery protocols require that the global state of the system contain no orphan process when the system rolls back and recovers.

Traditional pessimistic and optimistic message log recovery protocols have to strike a balance between two conflicting objectives: either a process saves the necessary information to the log file asynchronously with inter-process communication, which improves the performance of failure-free execution but may produce orphan processes; or all necessary information is saved while the process runs without failure, so that a failed process can be recovered quickly and no orphan processes arise.

The necessary information of a message can be represented as a quadruple < m.source, m.ssn, m.dest, m.rsn >, wherein m.source represents the sending process id of the message m, m.ssn represents the sending order of the message m, m.dest represents the receiving process id of the message m, and m.rsn represents the receiving order of the message m.

Currently, two main message logging protocols exist in distributed systems, namely optimistic message logging protocol and pessimistic message logging protocol.

Optimistic message logging: a process saves the necessary information of messages to the log file asynchronously with inter-process communication, and orphan processes may exist in the system. In such protocols, any process p is allowed to send a message m_j to another process q before the necessary information (<m.source, m.ssn, m.dest, m.rsn>) of a received message m_i has been saved to the log file. When process p fails, the received message m_i may not yet have been saved to the log file while p may already have sent m_j to q, so process q may become an orphan process. Under an optimistic logging protocol, a process does not need to keep the saving of necessary information to the log file synchronized with inter-process communication, so the process performs well during failure-free operation; however, a complex recovery scheme is required to eliminate orphan processes during failure recovery.

Pessimistic message logging: the saving and the sending of messages are synchronized, and a process is allowed to send a message only after the necessary information (<m.source, m.ssn, m.dest, m.rsn>) of all received messages has been written to stable storage. During failure recovery, because pessimistic message logging never produces orphan processes, the failed process only needs to reprocess the messages it previously processed and saved in order to restore its state. However, during failure-free operation the saving of necessary information must be kept synchronized with the sending of messages, which greatly reduces the failure-free performance of the system.

In an optimistic message logging protocol, the necessary information of a message, <m.source, m.ssn, m.dest, m.rsn>, may not have been saved to the log file when a process fails. Under conventional protocols the reception order m.rsn of such messages is then unrecoverable, so the processes of the system have to roll back in order to resend and re-receive these unsaved messages. In other words, because saving messages to the log file is asynchronous with process communication, optimistic message logging may lose the message reception order of the receiving process.

Compared with the existing message log recovery method:

Different message log recovery protocols have different performance characteristics; the following five indicators can be used to evaluate the performance of a recovery protocol:

1. ckpt, number of required checkpoints per process.

2. Add, an extra amount of information carried by an application message.

3. Num, number of system messages that need to be exchanged to recover each failed process.

4. Rol, the rollback distance of the process.

5. Roll, number of processes to rollback during recovery.

The message log recovery protocol proposed in this application, based on message number checking and message reordering, is denoted MNCMR for brevity. In the MNCMR protocol each process only needs to save one checkpoint asynchronously, so ckpt equals 1. Each application message carries two data items, j and LC_j, so Add is 2. The number of system messages that must be exchanged for each failure is Num = 2n + 2w, where n is the total number of processes in the system and w is the number of messages that could not be saved to the message log file because of the failure. When one or more processes fail, only the failed processes roll back and the non-failed processes continue to execute, so the MNCMR protocol has the smallest rollback distance Rol. The number of processes that must roll back during recovery, Roll, equals the total number of failed processes, which is the same as for a pessimistic protocol. In addition, the MNCMR protocol has the advantages of both pessimistic and optimistic protocols. During failure-free execution, each process saves messages to the log file asynchronously with process communication, as in an optimistic protocol, so the process performs well. During failure recovery, the recovery algorithm is simple, as in a pessimistic protocol. Furthermore, unlike existing protocols, the non-failed processes in the MNCMR protocol neither roll back nor stop and wait when a process fails but continue to execute; this characteristic resembles a forward recovery algorithm and gives the processes in the system higher operating efficiency.

Since the 1980s, a large number of message log recovery protocols have been published in journals at home and abroad; several typical protocols are selected below for comparison with the MNCMR protocol. Sistla and Welch [1] proposed two optimistic recovery protocols based on message logging: in one, each transmitted message carries a transitive dependency vector (denoted prasad.1); in the other, each transmitted application message carries only the current state interval value of the sending process (denoted prasad.2). In the prasad.1 protocol the extra information required by each application message is O(n), and O(n²) system messages must be exchanged for each failure. In the prasad.2 protocol the extra information required by each application message is O(1), and O(n³) system messages must be exchanged for each failure. In the optimistic message logging protocol of Strom and Yemini [2], each sent application message carries a transitive dependency vector with n components, where n is the number of processes in the system. During failure-free execution, each process periodically broadcasts this dependency vector or appends it to the messages it sends.

Table 1 shows the comparison result between the MNCMR protocol and the above protocols, so it is easy to see that the MNCMR protocol is superior to other protocols in each index.

TABLE 1

Disclosure of Invention

The present invention aims to solve the above problems by providing a message log recovery method based on message reordering and message number checking, which gives a message logging protocol the advantages of both optimistic and pessimistic message logging protocols.

In order to achieve the purpose, the invention adopts the following technical scheme:

A message log recovery method based on message reordering and message number checking adopts a message reordering method and stores the reception order of each message at the sending process. When a message receiving process fails, the recovery process first obtains from the sending processes the messages that were saved and those that were not saved to the log file, together with the corresponding logical clocks of the sending processes, and then reorders the messages not saved to the log file according to their logical clocks. Finally, the sorted messages are sent again to the failed process, which receives and processes them again, thereby replaying the messages. The working steps of an ordinary process p_i are as follows:

Step 1. For every k (k = 1, 2 … n), where k is an integer variable, initialize U_ik and T_ik to 0, indicating respectively that the total number of messages p_i has sent to p_k is 0 and that the total number of messages p_i has received from p_k is 0; also set lsn to 0.

Step 2. If the timer has expired, go to step 3; otherwise go to step 4.

Step 3. Save the state of process p_i to the determinant log file mlog, save T_i, U_i and LC_k (k = 1, 2 … n) to local storage, and delete the old checkpoint.

Step 4. If process p_i is to send a message to process p_j (j = 1, 2 … n), go to step 5; otherwise go to step 9.

Step 5. If F_i is 1, process p_i has failed and has not yet recovered; go to step 9. Otherwise process p_i is running normally; go to step 6.

Step 6. If F_j is 1, process p_j has failed and has not yet recovered; wait until F_j becomes 0. Otherwise process p_j is running normally; go to step 7.

Step 7. Process p_i sends a message to process p_j: LC_i is increased by 1, U_ij is increased by 1, and <LC_i, j, m> is appended to the message log file dfile.

Step 8. Save U_ij to local storage and send the message AM(i, LC_i, m) to process p_j.

Step 9. If process p_i receives a message sent by process p_j, go to step 10; otherwise go to step 14.

Step 10. Process p_i has received the message AM(j, LC_j, m) sent by process p_j; it must determine whether this message was sent by process p_j itself or by the recovery process of p_j.

Step 11. If AM_j > LC_j, the message was sent by process p_j during failure-free operation; go to step 12. Otherwise the message was sent by the recovery process of p_j; go to step 14.

Step 12. Update LC_j with the value of AM_j. Hand the received message to the application for processing, and record <j, LC_j, i, lsn> in memory. Because a message has been received, increase LC_i by 1 and then update LC_i with the maximum of LC_i and LC_j.

Step 13. Execute other deterministic events.

Step 14. Determine whether process p_i is idle. If it is idle, go to step 15; otherwise go to step 16.

Step 15. Use the idle time to store the in-memory record <j, LC_j, i, lsn> to mlog. Because the message has been received, increase T_ij by 1. Save T_i and LC_k (k = 1, 2 … n) to the hard disk, and increase lsn by 1.

Step 16. If an error-flag clear message sys_clearf(k) is received from process p_k (k = 1, 2 … n), go to step 17; otherwise go to step 18.

Step 17. Assign 0 to F_k, indicating that process p_k is now running normally.

Step 18. If a message sys_setf(k), which sets the error flag to 1, is received from process p_k (k = 1, 2 … n), go to step 19; otherwise go to step 20.

Step 19. Assign 1 to F_k, indicating that process p_k has failed and has not yet recovered.

Step 20. Determine whether process p_i is restarting after a failure. If so, go to step 21; otherwise go to step 1.

Step 21. Update T_i, U_i and LC_k (k = 1, 2 … n) with the previously saved values of T_i, U_i and LC_k, and set lsn to 0 and F_i to 1. Go to step 1.
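The failure-free send and receive path of steps 7, 8, 10 to 12 and 15 can be illustrated with a minimal Python sketch. The class name `Process`, its method names and the in-memory lists standing in for dfile and mlog are assumptions made for illustration; checkpointing, the failure flags F_k and real message transport are omitted. The receive rule used here is rule 3 of the logical clock definition (maximum with LC_j + 1), and lsn is assigned when a determinant is flushed to mlog.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Hypothetical model of one ordinary process p_i (illustration only)."""
    pid: int
    n: int
    lc: int = 0                                   # own logical clock LC_i
    lsn: int = 0                                  # next determinant sequence number
    dfile: list = field(default_factory=list)     # sender-side message log
    mlog: list = field(default_factory=list)      # determinant log
    pending: list = field(default_factory=list)   # determinants not yet in mlog

    def __post_init__(self):
        self.U = [0] * self.n         # U[k]: messages sent to p_k
        self.T = [0] * self.n         # T[k]: messages from p_k recorded in mlog
        self.last_lc = [0] * self.n   # LC_k: last sender clock seen from p_k

    def send(self, dest, m):
        # Step 7: advance the clock, count the message, log it at the sender.
        self.lc += 1
        self.U[dest] += 1
        self.dfile.append((self.lc, dest, m))
        # Step 8: the application message AM(i, LC_i, m).
        return (self.pid, self.lc, m)

    def receive(self, am):
        src, am_lc, _m = am
        # Step 11: a clock that is not newer than the last one seen is a replay.
        if am_lc <= self.last_lc[src]:
            return False
        # Step 12: remember the sender clock and buffer the determinant.
        self.last_lc[src] = am_lc
        self.pending.append((src, am_lc))
        self.lc = max(self.lc + 1, am_lc + 1)
        return True

    def flush_when_idle(self):
        # Step 15: move buffered determinants to mlog and count them in T.
        for src, am_lc in self.pending:
            self.mlog.append((src, am_lc, self.pid, self.lsn))
            self.T[src] += 1
            self.lsn += 1
        self.pending.clear()
```

In this sketch the difference between the sender's `U[dest]` and the receiver's `T[src]` is exactly the number of messages whose determinants are still only in memory, which is the quantity the recovery process later relies on.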

When an ordinary process fails, the recovery process comprises three stages:

stage one:

the recovery process obtains the saved messages from the message log file dfile in the order in which they were sent and stores them in internal memory. recovery(i) first obtains the triple <j, LC_j, lsn> through the procedure call RequestDeterminant(i, lsn), and then obtains <j, LC_j, m> through the procedure call GetM(LC_j), where m is the content of the message. Finally, recovery(i) stores the triples <j, LC_j, lsn> in the local in-memory array ARRAY.

stage two:

the recovery process retrieves from the message log file dfile messages that failed to be saved to mlog due to process failure and stores them in memory.

recovery(i) first obtains T_ij and U_ji through remote procedure calls. T_ij records the number of messages that process p_i has received from process p_j and that have already been stored in mlog; U_ji records the number of messages that process p_j has sent to process p_i. By these definitions, the difference U_ji - T_ij is the number of messages that p_j sent to p_i but that could not be saved to mlog because p_i failed. Then p_i invokes the remote procedure GetUnLogM(i, U_ji - T_ij), which resides locally at p_j, and obtains the earliest-received such message according to the difference. By calling GetUnLogM(i, U_ji - T_ij) repeatedly, recovery(i) obtains all messages that could not be saved to mlog because of the process failure and stores them in the local in-memory array ARRAY.
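The message number check of stage two amounts to a per-sender subtraction. The following Python sketch illustrates it; the function name `unlogged_counts` and the dictionary layout are assumptions made for this example, not part of the described protocol.

```python
def unlogged_counts(u_to_i, t_i):
    """Return {j: U_ji - T_ij} for every sender j with unlogged messages.

    u_to_i: {j: U_ji}, messages p_j has sent to p_i (hypothetical layout)
    t_i:    {j: T_ij}, messages from p_j already recorded in mlog by p_i
    """
    counts = {}
    for j, sent in u_to_i.items():
        missing = sent - t_i.get(j, 0)
        if missing > 0:          # only senders with U_ji > T_ij matter
            counts[j] = missing
    return counts
```

For example, if p_1 has sent five messages to p_i of which three were logged, and p_2 has sent three messages that were all logged, only p_1 needs to be asked for its two unlogged messages.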

stage three: recovery(i) reorders, by logical clock, all messages that could not be saved to mlog because of the process failure. Finally, all messages, both those saved and those not saved to mlog, are sent again to process p_i, which receives and processes them again until it reaches the point at which it was running before the failure.

The logic clock of stage three is described as follows:

the logical clock LC_p of process p is an integer variable that counts the message sending and receiving events of the process; LC_p satisfies the following conditions:

1. its initial value is zero;

2. whenever a message is sent, LC_p is incremented by one;

3. whenever process p receives a message from process q and stores the necessary information of the message in a log file, LC_p is incremented by one and then LC_p ← max(LC_p, LC_q+1), where LC_q is the logical clock of process q and max takes the maximum of LC_p and LC_q+1.

The specific working steps of the recovery process are as follows:

Step 1. Restart p_i from the latest checkpoint, send to every process the message sys_setf(i) marking process p_i as failed, and set lsn and NUM to 0.

Step 2. Determine whether the determinant file mlog is empty. If mlog is not empty, go to step 3; otherwise go to step 6.

Step 3. Obtain a determinant record <LC_j, j, lsn> from the determinant file mlog by calling RequestDeterminant(i, lsn). Assign the value of LC_j to LC and increase lsn by 1. Obtain the message <LC_j, j, m> by the remote call GetM(LC); this message is uniquely identified by LC_j in the message log file dfile.

Step 4. Determine whether the message <LC_j, j, lsn> is empty. If it is empty, go to step 2; otherwise go to step 5.

Step 5. Store the obtained message <LC_j, j, lsn> in the in-memory array ARRAY, that is, ARRAY[NUM].LC_j = LC, ARRAY[NUM].j = j, ARRAY[NUM].lsn = lsn. Increase NUM by 1 and go to step 2 to check again whether mlog is empty.

Step 6. Save NUM, the total number of messages sent to process p_i and saved in the mlog file, to NUM'.

Step 7. For each process p_j, obtain by the remote call GetU(i, j) the total number of messages that process p_j has sent to process p_i.

Step 8. Remotely call GetT(i) to obtain the number of messages that process p_i has received from each of the other processes.

Step 9. Determine whether there exists a j such that U_ji > T_ij. If U_ji > T_ij, the number of messages that process p_j has sent to process p_i is greater than the number of messages from p_j that process p_i has received; go to step 10. Otherwise go to step 13.

Step 10. Because U_ji > T_ij, there are in-transit messages. Obtain a message by the remote call GetUnLogM(j, U_ji-T_ij) and record the obtained message in <LC_j, j, m>.

Step 11. Determine whether the obtained message <LC_j, j, m> is empty. If it is empty, go to step 9; otherwise go to step 12.

Step 12. Store the obtained message in the array ARRAY. Since a message has been received, increase T_ij by 1 and increase the message count NUM by 1. Then go to step 9 and continue to check whether any message that was not saved to the log file remains.

Step 13. Sort the elements ARRAY[k], for k from NUM' to NUM-1, in ascending order of ARRAY[k].LC_j. That is, the messages sent to process p_i by other processes are sorted in ascending order of their LC_j values.

Step 14. Send to process p_i the messages AM(ARRAY[k].j, ARRAY[k].LC_j, ARRAY[k].m), for k = 0, 1 … (NUM-1).

Step 15. Send to the other processes p_k (k = 1, 2 … n, k ≠ i) the message sys_clearf(i) marking that process p_i has resumed normal operation. The recovery process ends.
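The core of steps 6 to 14, collecting the unlogged messages and sorting only the tail of ARRAY, can be sketched in Python as follows. This is an illustration under assumptions: the function name `build_replay_sequence` and its argument layout are hypothetical, remote calls are replaced by plain dictionaries, and each sender's records for p_i are assumed to be available oldest first.

```python
def build_replay_sequence(logged, sent_to_i, t_i):
    """Return the list of (LC_j, j, m) in the order they are replayed to p_i.

    logged:    [(lc, j, m), ...] messages whose determinants are in mlog,
               already in mlog order (ARRAY[0 .. NUM'-1])
    sent_to_i: {j: [(lc, m), ...]} everything p_j sent to p_i, oldest first
    t_i:       {j: T_ij} number of p_j's messages recorded in mlog
    """
    array = list(logged)
    num_logged = len(array)                       # NUM' of step 6
    for j, records in sent_to_i.items():
        missing = len(records) - t_i.get(j, 0)    # U_ji - T_ij of step 9
        if missing > 0:
            # Steps 10 to 12: fetch the messages that never reached mlog.
            array.extend((lc, j, m) for lc, m in records[-missing:])
    # Step 13: only ARRAY[NUM' .. NUM-1] is sorted, by sender clock.
    array[num_logged:] = sorted(array[num_logged:], key=lambda rec: rec[0])
    return array                                  # step 14 replays in this order
```

Messages that were logged keep their recorded order; only the unlogged tail depends on the sender clocks, which is why the sort is restricted to that slice.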

Step 10 in the recovery process:

The processing flow of GetUnLogM(i, U_ji-T_ij) is as follows:

1. Open the dfile file in read-only mode.

2. Move the pointer p to the first record stored (dfile is a sequential file whose last record is located at the end of the file).

3. Determine whether the pointer p has reached the end of the file. If the end of the file has not been reached, go to 5; otherwise go to 10.

5. Read the record pointed to by p and store it in the triple <LC_j, l, m>.

6. Determine whether the process identifier l equals i. If l equals i, go to 7; otherwise go to 4.

7. Decrement the variable difference by one (difference is initially U_ji-T_ij).

8. Determine whether difference is 0. If difference is 0, the record read in step 5 is the message that p_j sent to p_i first; send the record to the recovery(i) process and go to 11. If difference is not 0, go to 4.

4. Move the pointer p to the next record and go to 3.

10. Close the file, return the invalid triple <NULL, NULL, NULL>, and go to 11.

11. End.
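The flow above can be sketched in Python as a single scan with a countdown. The text does not make the scan direction fully explicit; this sketch assumes the scan starts at the most recently written record, so that the countdown reaching zero lands on the oldest message not yet recorded in mlog. The function name `get_unlog_m`, the list standing in for dfile and the `INVALID` constant are assumptions for illustration.

```python
INVALID = (None, None, None)   # stands in for <NULL, NULL, NULL>


def get_unlog_m(dfile, i, difference):
    """Return the record <LC_j, l, m> selected by the countdown, or INVALID.

    dfile:      list of (lc, dest, m) in the order p_j wrote them
    i:          identifier of the recovering process p_i
    difference: U_ji - T_ij at the time of the call
    """
    # Assumption: records are visited newest first.
    for lc, dest, m in reversed(dfile):
        if dest != i:              # step 6: record is for another process
            continue
        difference -= 1            # step 7
        if difference == 0:        # step 8: this is the wanted record
            return (lc, dest, m)
    return INVALID                 # step 10: ran out of records
```

Because the recovery process increases T_ij after each fetched message, successive calls with a shrinking difference return the unlogged messages from oldest to newest under this assumption.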

The invention has the beneficial effects that:

1. It is shown that the exact replay order of messages required by previous optimistic message logging protocols sometimes cannot be reproduced, because the transmission of messages depends on parameters such as channel delay and CPU speed.

2. A theory and a method for reordering messages are provided, which solve a problem that has troubled message log fault-tolerance technology for decades: the loss of ordering of messages that were not saved under optimistic message logging protocols.

3. The MNCMR message log recovery protocol proposed in this patent combines pessimistic and optimistic message logging. During failure-free operation the protocol behaves like an optimistic protocol: messages are saved to the log file asynchronously with process communication, which ensures good running performance. When a process fails, the recovery algorithm of the failed process is simple and easy to implement, which gives the protocol the advantage of a pessimistic protocol.

4. Message reordering and message number checking allow the non-failed processes under the MNCMR protocol to continue executing while other processes fail, so that the performance of processes during failure recovery is better under the MNCMR protocol than under existing message log recovery protocols.

Drawings

FIG. 1 shows how changes in channel delay change the message reception order;

FIG. 2 shows an example of the always-happens-before relation;

FIG. 3 shows the operation of the improved logical clock;

FIG. 4 shows S(m_j) indirectly happening before R(m_i);

FIG. 5 shows messages being sent, received and saved to the log files;

FIG. 6 shows message content being obtained from the message log dfile;

FIG. 7 is a flow diagram of the GetUnLogM(i, difference) procedure;

FIG. 8 is the flow of an ordinary process;

FIG. 9 is the flow of the recovery process.

Detailed Description

The invention is further described with reference to the following figures and examples.

Basic principle of message reordering

Under the PWD assumption, the message receive events of a process have a certain randomness; that is, the reception of messages is nondeterministic in time and order. As shown in FIG. 1, assume that the distributed system consists of processes p, q and r. In the figure, the state intervals subscripted (p,0), (q,0) and (r,0) are the initial state intervals of p, q and r respectively; the state intervals subscripted (q,1) and (q,2) are the state intervals of process q after it receives messages m₁ and m₂ respectively; t_pq and t_rq are the communication channel delays between p and q and between r and q respectively. Under an optimistic message logging protocol, suppose process q fails at "×" before the necessary information of messages m₁ and m₂ has been saved to the log file. After q fails, processes p, q and r have to restart in order to resend and re-receive m₁ and m₂. Obviously, the order in which process q replays the messages should be m₁, m₂. However, the channel delays t_pq and t_rq are not fixed constants; if t_pq > t_rq, the order in which process q receives the messages may become m₂, m₁. The example of FIG. 1 shows that, although an optimistic message logging protocol requires the failed process to send and receive, exactly as before, the messages that were not saved to the log file, in some cases (for example, when channel delays change or the processes of the system restart at different times) the order in which the processes actually send and receive messages may differ from the order before the failure. Nevertheless, the final result of the system should be the same for every repeated execution, which means that the execution result of the system may be independent of the reception order of some messages.
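The effect of channel delay on reception order in the FIG. 1 scenario can be illustrated with a small Python sketch. The function name `reception_order` and the numeric send times and delays are assumptions chosen for this example only.

```python
def reception_order(sends):
    """Order messages by arrival time at the receiver.

    sends: list of (message, send_time, channel_delay); arrival time is
           modelled simply as send_time + channel_delay (an assumption).
    """
    return [msg for msg, sent_at, delay in
            sorted(sends, key=lambda s: s[1] + s[2])]


# Before the failure: t_pq < t_rq, so q receives m1 and then m2.
before = reception_order([("m1", 0, 1), ("m2", 0, 2)])

# During replay: t_pq > t_rq, so the same sends arrive in the other order.
during_replay = reception_order([("m1", 0, 3), ("m2", 0, 1)])
```

The same two send events produce different reception orders purely because the delays differ, which is the situation the reordering method is meant to handle.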

Always-happens-before relation:

Assume that the channels between processes are reliable FIFO channels, and let e_i and e_j denote sending or receiving events of messages m_i and m_j respectively. If, in every execution of the system, e_i occurs before e_j regardless of channel delay, CPU speed and similar factors, then e_i is said to always happen before e_j.

In FIG. 2, R(m₁), R(m₂), R(m₃) and R(m₄) denote the receive events of messages m₁, m₂, m₃ and m₄ respectively, and S(m₁), S(m₂), S(m₃) and S(m₄) denote their send events. Under the piecewise deterministic (PWD) assumption, q necessarily sends m₂ after receiving m₁, and r necessarily sends m₃ after receiving m₂; that is, R(m₁) always happens before S(m₂), and R(m₂) always happens before S(m₃). Under the FIFO channel assumption, the sending of a message always happens before its reception. Therefore R(m₁) always happens before R(m₃), which means that R(m₃) logically depends on R(m₁); the relationship between R(m₃) and R(m₁) is a logical dependency that is independent of other factors of the system.

As shown in FIG. 2, m₃ and m₄ reach process q over channels with different delays, so R(m₃) does not always happen before R(m₄). If whether event e_i occurs before event e_j depends on channel delay, CPU speed and similar factors, then e_i is said not to always happen before e_j. In FIG. 2, the actual message reception sequence of process q is therefore either m₁, m₃, m₄ or m₁, m₄, m₃.

Equivalent message sequence:

theorem of equivalent sequence of messages

Suppose S is a certain message sequence of process p and S' is a new sequence formed by rearranging the messages in S. The elements in S' satisfy: 1. all messages present in S are still present in S'; 2. if the receive events of some messages have an always-before-occurrence relationship in S, this relationship is still maintained in S'. Under the process channel FIFO and reliable channel assumptions, S and S' are equivalent sequences during process p replay.

Proof: Under the assumptions of the theorem, although some messages of S are reordered in S', the always-happens-before relations between these messages remain unchanged in S'; the reception order of the messages in S' is therefore an order that may actually occur during the replay of the process. If S and S' were not equivalent sequences during the replay of process p, then executing the message receive events of S in process p would not be equivalent to executing the message receive events of S' in process p; that is, different executions of the same process would be inconsistent, which contradicts the consistency of process execution.

As shown in FIG. 2, the sequences m₁, m₃, m₄ and m₁, m₄, m₃ are equivalent; that is, the computation performed by process q after replaying m₁, m₃, m₄ is equivalent to the computation it performs after replaying m₁, m₄, m₃.
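The two conditions of the theorem can be checked mechanically. The Python sketch below is an illustration only: the function name `is_equivalent_reordering` is hypothetical, messages are assumed to be unique strings, and the pair (m1, m4) is taken here as an always-happens-before pair on the assumption that both messages travel over the same FIFO channel.

```python
def is_equivalent_reordering(original, reordered, always_before):
    """Check the two conditions of the equivalent message sequence theorem.

    original, reordered: lists of unique message names
    always_before:       set of (a, b) pairs meaning R(a) always happens
                         before R(b)
    """
    # Condition 1: the reordered sequence contains exactly the same messages.
    if sorted(original) != sorted(reordered):
        return False
    # Condition 2: every always-happens-before pair keeps its relative order.
    position = {msg: k for k, msg in enumerate(reordered)}
    return all(position[a] < position[b] for a, b in always_before
               if a in position and b in position)


pairs = {("m1", "m3"), ("m1", "m4")}   # (m1, m4) is an assumption, see above
ok = is_equivalent_reordering(["m1", "m3", "m4"], ["m1", "m4", "m3"], pairs)
bad = is_equivalent_reordering(["m1", "m3", "m4"], ["m3", "m1", "m4"], pairs)
```

Swapping m₃ and m₄ passes the check because no always-happens-before pair relates them, whereas moving m₃ ahead of m₁ fails it.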

Improved logic clock:

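The improved logical clock LC_p of process p is an integer variable that counts the message sending and receiving events of the process. LC_p satisfies the following conditions:

1. its initial value is zero;

2. whenever a message is sent, LC_p is incremented by one;

3. whenever process p receives a message from process q and stores its necessary information, LC_p is incremented by one and then LC_p ← max(LC_p, LC_q+1), where LC_q is the logical clock of process q.

As shown in FIG. 3, after p sends m1, LC_p = 1, and after p sends m4, LC_p = 2. After q receives and stores m1, LC_q = 2; after q sends m2, LC_q = 3; after q receives and stores m3, LC_q = 6; after q receives and stores m4, LC_q = 7.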
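The clock rules can be traced with a short Python sketch. The class name `ImprovedClock` and its method names are assumptions; the message flow (p sends m1 and m4 to q, q sends m2 to r, r sends m3 to q) is taken from the description of FIG. 2 and is assumed to be the flow of FIG. 3 as well. The receive rule is the max(LC_p + 1, LC_q + 1) form given in the disclosure.

```python
class ImprovedClock:
    """Hypothetical implementation of the improved logical clock."""

    def __init__(self):
        self.value = 0                       # initial value is zero

    def on_send(self):
        self.value += 1                      # add one on every send
        return self.value                    # clock value carried by the message

    def on_receive(self, sender_value):
        self.value += 1                      # add one on every receive
        self.value = max(self.value, sender_value + 1)
        return self.value


p, q, r = ImprovedClock(), ImprovedClock(), ImprovedClock()
trace = {}
m1 = p.on_send();  trace["p sends m1"] = p.value
q.on_receive(m1);  trace["q receives m1"] = q.value
m2 = q.on_send();  trace["q sends m2"] = q.value
r.on_receive(m2);  trace["r receives m2"] = r.value
m3 = r.on_send();  trace["r sends m3"] = r.value
m4 = p.on_send();  trace["p sends m4"] = p.value
q.on_receive(m3);  trace["q receives m3"] = q.value
q.on_receive(m4);  trace["q receives m4"] = q.value
```

Under these assumptions the trace gives LC_p = 1 and 2 for the two sends of p, LC_r = 5 after r sends m3, and LC_q = 7 after q receives m4, in agreement with the values quoted for the figures.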

Basic theorem of reordering according to transmit logic clock

If the piecewise deterministic (PWD) assumption holds and R(m_i) always happens before R(m_j), then LC_p(S(m_i)) < LC_q(S(m_j)).

Here R(m_i) and R(m_j) denote the receive events of messages m_i and m_j at process k, LC_p(S(m_i)) denotes the logical clock of process p after it sends message m_i, and LC_q(S(m_j)) denotes the logical clock of process q after it sends message m_j.

Proof: Because R(m_i) always happens before R(m_j), R(m_j) logically depends on R(m_i); S(m_j) is a deterministic event, therefore R(m_i) always happens before S(m_j). Otherwise, suppose that R(m_i) does not always happen before S(m_j). Then R(m_i) and S(m_j) can occur in either order: either R(m_i) happens before S(m_j), or S(m_j) happens before R(m_i). If S(m_j) happens before R(m_i), then, since S(m_j) always happens before R(m_j), S(m_j) can only happen before R(m_i) indirectly. As shown in FIG. 4, there must be at least one message m_k between m_j and m_i such that S(m_j)→S(m_k), S(m_k)→R(m_k), R(m_k)→S(m_i), S(m_i)→R(m_i), where "→" denotes happens-before. In this case m_j and m_i must reach process k over different transmission channels, which may have different delays, so R(m_i) may not always happen before R(m_j). This contradicts the assumption of the theorem, and thus R(m_i) always happens before S(m_j). By the definition of the improved logical clock, LC_p(R(m_i)) must be less than LC_q(S(m_j)), i.e. LC_p(R(m_i)) < LC_q(S(m_j)). Moreover, because S(m_i) always happens before R(m_i), LC_p(S(m_i)) < LC_p(R(m_i)). Hence LC_p(S(m_i)) < LC_p(R(m_i)) < LC_q(S(m_j)), and so LC_p(S(m_i)) < LC_q(S(m_j)).

As shown in FIG. 4, R(m₁) always happens before R(m₃), and LC_p(S(m₁)) = 1, LC_r(S(m₃)) = 5, so LC_p(S(m₁)) < LC_r(S(m₃)).

The above theorem shows that the order of message reception in the reception sequence of any process can be determined from the logical clocks of the message sending processes. If the sending process, when it sends a message, stores the logical clock associated with the message together with the content of the message in stable storage, then after the receiving process fails, the order of the messages that could not be saved to the log file because of the failure can be determined from the clocks stored at the sending processes, and the messages can be reordered accordingly. For example, in FIG. 4, process p stores <m₁, LC_p = 1> after sending m₁ and <m₄, LC_p = 2> after sending m₄; process r stores <m₃, LC_r = 5> after sending m₃. If the receiving process q fails at "×", then according to the logical clocks in the doublets <m₁, LC_p = 1>, <m₃, LC_r = 5> and <m₄, LC_p = 2>, the sequence of messages to be replayed by process q is reordered as <m₁, LC_p = 1>, <m₄, LC_p = 2>, <m₃, LC_r = 5>.

The distributed system consists of processes p_i (i = 1, 2 … n) and recovery units recovery(i).

The message log system consists of a message log dfile and a determinant log mlog.

The message log dfile is a sequential file stored in the local storage of the message sending process. The local storage media can respectively adopt an internal memory or a hard disk according to different fault-tolerant capability requirements of the system, if the fault-tolerant capability of the system is designed to allow only one process to make an error, the local storage adopts the internal memory to store the dfile file, and if a plurality of processes are allowed to make errors, the dfile file needs to be stored in the hard disk. A dfile file consists of several records, each record being a triplet:<LC_j,i,m>wherein LC_jIndicating an improved logical clock after the process pj sends the message m, i indicating the process identity of the receiving process, and m indicating the content of the message. As shown in FIG. 5, process pj sends a message<j,LC_j,m>First, firstly, the<LC_j,i,m>Storing the dfile.

The determinant log mlog holds the determinants of all messages, each of the form <j, LC_j, i, lsn>, where j denotes the process identifier of the process that sent the message m, LC_j denotes the logical clock after the sending process sent m, i denotes the process identifier of the receiving process p_i, and lsn denotes the sequence number under which the receiving process saves the message. The initial value of lsn is zero, and process p_i increments lsn by one for each message determinant it saves. As shown in FIG. 5, after process p_i receives a message <j, LC_j, m>, it first delivers the message to the application and then saves the determinant to mlog.
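The two record layouts can be illustrated with a minimal Python sketch. The class names DfileRecord, Determinant, Sender and Receiver and their method names are hypothetical, and in-memory lists stand in for the dfile and mlog files. The receiver's own clock update and the idle-time buffering of determinants are left out to keep the sketch short.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DfileRecord:
    """Sender-side record <LC_j, i, m>."""
    lc: int
    receiver: int
    content: str


@dataclass(frozen=True)
class Determinant:
    """Receiver-side record <j, LC_j, i, lsn>."""
    sender: int
    lc: int
    receiver: int
    lsn: int


@dataclass
class Sender:
    pid: int
    lc: int = 0
    dfile: list = field(default_factory=list)  # stands in for the dfile file

    def send(self, receiver, content):
        # Advance the clock and log <LC_j, i, m> before the message leaves.
        self.lc += 1
        self.dfile.append(DfileRecord(self.lc, receiver, content))
        return (self.pid, self.lc, content)  # message <j, LC_j, m>


@dataclass
class Receiver:
    pid: int
    lsn: int = 0
    mlog: list = field(default_factory=list)  # stands in for the mlog file

    def receive(self, message):
        sender, lc, content = message
        delivered = content  # the message goes to the application first
        self.mlog.append(Determinant(sender, lc, self.pid, self.lsn))
        self.lsn += 1  # one increment per saved determinant
        return delivered


p_j = Sender(pid=2)
p_i = Receiver(pid=1)
p_i.receive(p_j.send(receiver=1, content="hello"))
```

After this exchange p_j holds one dfile record with clock 1 and p_i holds one determinant with lsn 0, while p_i's lsn counter has advanced to 1.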

As shown in FIG. 6, the saved messages are obtained from the message log file dfile in the order in which they were sent and are stored in internal memory. The recovery(i) process first obtains the triple <j, LC_j, lsn> through the procedure call RequestDeterminant(i, lsn), and then obtains <j, LC_j, m> through the procedure call GetM(LC_j), where m denotes the content of the message. Finally, recovery(i) stores the triple <j, LC_j, lsn> in the local in-memory array ARRAY.
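A rough Python sketch of this first phase follows. The functions request_determinant and get_m are hypothetical local stand-ins for the procedure calls RequestDeterminant and GetM. The sketch assumes that mlog is a list of (j, LC_j, lsn) tuples indexed by lsn and that each sender's dfile is a list of (LC_j, i, m) tuples. Unlike GetM(LC_j) in the text, get_m is handed the sender's dfile directly.

```python
def request_determinant(mlog, lsn):
    """Return the determinant saved under lsn, or None when mlog is exhausted."""
    return mlog[lsn] if lsn < len(mlog) else None


def get_m(dfile, lc):
    """Return the content a sender logged under logical clock lc, or None."""
    for record_lc, _receiver, content in dfile:
        if record_lc == lc:
            return content
    return None


def phase_one(mlog, dfiles):
    """Collect every message whose determinant was saved, in lsn order."""
    array = []
    lsn = 0
    while (determinant := request_determinant(mlog, lsn)) is not None:
        j, lc, _ = determinant
        array.append({"j": j, "lc": lc, "lsn": lsn, "m": get_m(dfiles[j], lc)})
        lsn += 1
    return array


# Two determinants saved by p_1, and the dfiles of senders p_2 and p_3.
mlog = [(2, 1, 0), (3, 4, 1)]
dfiles = {2: [(1, 1, "a")], 3: [(4, 1, "b")]}
array = phase_one(mlog, dfiles)
```

Here array ends up with the two logged messages "a" and "b" in the order given by lsn.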

As shown in FIG. 7, the processing flow of GetUnLogM(i, U_ji - T_ij) is as follows:

1. Open the dfile file in read-only mode.

7. Decrement the variable difference by one (difference = U_ji - T_ij).

8. Check whether difference is 0. If difference is 0, the record read in step 5 is the message that p_j sent to p_i first; send the record to the recovery(i) process and go to 11. If difference is not 0, go to 4.

4. Move the pointer p to the next record. Go to 3.

11. End.
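One possible reading of this lookup is sketched below in Python. It rests on two assumptions: dfile is modelled as a list of (LC_j, receiver, m) tuples with the oldest record first, and the scan runs from the most recently stored record toward older ones. The function name get_unlog_m is a hypothetical stand-in for GetUnLogM.

```python
def get_unlog_m(dfile, i, difference):
    """Return (LC_j, m) for the oldest message to p_i that is still unlogged.

    `difference` is U_ji - T_ij, the number of messages sent to p_i whose
    determinants never reached mlog. Counting back that many records addressed
    to p_i from the end of dfile lands on the oldest of them.
    """
    for lc, receiver, content in reversed(dfile):
        if receiver != i:
            continue  # record addressed to another process
        difference -= 1
        if difference == 0:
            return (lc, content)
    return None  # fewer matching records than requested


dfile = [(1, 1, "a"), (2, 2, "x"), (3, 1, "b"), (4, 1, "c")]
oldest_of_two = get_unlog_m(dfile, i=1, difference=2)
```

Calling the function again after T_ij has been incremented (so that difference is 1) returns the next message, which is how repeated calls hand back the unlogged messages in sending order.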

As shown in FIG. 8, the operation flow of a normal process p_i is as follows:

1. For each k = 1, 2, …, n (k is an integer variable), initialize U_ik to 0 and T_ik to 0, indicating respectively that the total number of messages process p_i has sent to process p_k is 0 and that the total number of messages process p_i has received from process p_k is 0; also set lsn to 0.

2. If the timer has expired, go to 3; otherwise go to 4.

3. Save the state of process p_i to the determinant log file mlog, save T_i, U_i and LC_k (k = 1, 2, …, n) to local storage, and delete the old checkpoint.

4. If process p_i is to send a message to process p_j (j = 1, 2, …, n), go to 5; otherwise go to 9.

5. If F_i is 1, process p_i has failed and has not yet recovered, so go to 9; otherwise process p_i is running normally, so go to 6.

6. If F_j is 1, process p_j has failed and has not yet recovered, so wait until F_j becomes 0; otherwise process p_j is running normally, so go to 7.

7. Because process p_i sends a message to process p_j, increment LC_i by 1 and U_ij by 1, and append <LC_i, j, m> to the message log file dfile.

8. Save U_ij to local storage and send the message AM(i, LC_i, m) to process p_j.

9. If process p_i receives a message sent by process p_j, go to 10; otherwise go to 14.

10. Process p_i receives the message AM(j, LC_j, m) sent by process p_j and must determine whether the message was sent by process p_j itself or by the recovery process of p_j.

11. If AM_j > LC_j, the message was sent by process p_j during fault-free operation, so go to 12; otherwise the message was sent by the recovery process of p_j, so go to 14.

12. Update LC_j with the value of AM_j. Hand the received message to the application process for processing, and keep the record <j, LC_j, i, lsn> in memory. Because a message has been received, increment LC_i by 1 and then update LC_i with the maximum of LC_i and LC_j.

13. Execute other deterministic events.

14. Check whether process p_i is idle. If process p_i is idle, go to 15; otherwise go to 16.

15. Use the idle time to store the in-memory record <j, LC_j, i, lsn> in mlog. Because the message has been received, increment T_ij by 1. Save the values of T_i and LC_k (k = 1, 2, …, n) to the hard disk, and increment lsn by 1.

16. If an error-flag clear message sys_ClearF(k) is received from process p_k (k = 1, 2, …, n), go to 17; otherwise go to 18.

17. Set F_k to 0, indicating that process p_k is now running normally.

18. If a message sys_SetF(k) setting the error flag to 1 is received from process p_k (k = 1, 2, …, n), go to 19; otherwise go to 20.

19. Set F_k to 1, indicating that process p_k has failed and has not yet recovered.

20. Check whether process p_i is restarting after a failure. If so, go to 21; otherwise go to 1.

21. Update T_i, U_i and LC_k (k = 1, 2, …, n) with their previously saved values, and set lsn = 0 and F_i = 1. Go to 1.
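The send, receive and idle-time logging steps above (steps 7, 8, 12 and 15) can be sketched as a small Python class. This is a simplified illustration under stated assumptions: the class name Process and its method names are hypothetical, in-memory lists replace the dfile and mlog files, and the failure flags F, the checkpoint timer and the AM_j > LC_j check of step 11 are omitted.

```python
from collections import defaultdict


class Process:
    def __init__(self, i):
        self.i = i
        self.lc = 0                # LC_i, this process's logical clock
        self.U = defaultdict(int)  # U[j]: messages sent to p_j
        self.T = defaultdict(int)  # T[j]: messages from p_j logged in mlog
        self.lsn = 0
        self.dfile = []            # sender-side message log
        self.pending = []          # determinants still held in memory
        self.mlog = []             # determinant log

    def send(self, j, m):
        # Steps 7 and 8: advance the clock, count the send, log, then send.
        self.lc += 1
        self.U[j] += 1
        self.dfile.append((self.lc, j, m))
        return (self.i, self.lc, m)  # AM(i, LC_i, m)

    def receive(self, am):
        # Step 12: deliver, keep the determinant in memory, update the clock.
        j, lc_j, m = am
        self.pending.append((j, lc_j))
        self.lc = max(self.lc + 1, lc_j)
        return m

    def flush_when_idle(self):
        # Step 15: write buffered determinants to mlog and count them in T.
        for j, lc_j in self.pending:
            self.mlog.append((j, lc_j, self.i, self.lsn))
            self.T[j] += 1
            self.lsn += 1
        self.pending.clear()


p1, p2 = Process(1), Process(2)
p2.receive(p1.send(2, "x"))
unlogged = p1.U[2] - p2.T[1]  # 1: sent and delivered, but not yet in mlog
p2.flush_when_idle()
```

Between receive and flush_when_idle the gap U_ij - T_ji is 1, which is exactly the situation the recovery process later detects by comparing the two counters.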

As shown in FIG. 9, the operation flow of the recovery process recovery(i) is as follows:

1. Restart p_i from the latest checkpoint, send the message sys_SetF(i), which marks process p_i as being in the failed state, to every process, and set lsn and NUM to 0.

2. Check whether the determinant file mlog is empty. If mlog is not empty, go to 3; otherwise go to 6.

3. Obtain a determinant record <LC_j, j, lsn> from the determinant file mlog by calling RequestDeterminant(i, lsn). Assign the value of LC_j to LC and increment lsn by 1. Obtain the message <LC_j, j, m> with the remote call GetM(LC); LC_j uniquely identifies this message in the message log file dfile.

4. Check whether the record <LC_j, j, lsn> is empty. If it is empty, go to 2; otherwise go to 5.

5. Store the received record <LC_j, j, lsn> in the in-memory array ARRAY, that is, ARRAY[NUM].LC_j = LC_j, ARRAY[NUM].j = j, ARRAY[NUM].lsn = lsn. Increment NUM by 1 and go to 2 to check whether mlog is empty.

6. Save NUM, the total number of messages of process p_i that were sent and saved in the mlog file, to NUM'.

7. For each j = 1, 2, …, n, obtain the total number of messages that process p_j sent to process p_i by remotely calling GetU(i, j).

8. Remotely call GetT(i) to obtain the number of messages process p_i has received from each of the other processes.

9. For each j = 1, 2, …, n, check whether U_ji > T_ij. If U_ji > T_ij, the number of messages that process p_j sent to process p_i exceeds the number of messages from p_j that process p_i received; go to 10. Otherwise go to 13.

10. Because U_ji > T_ij, there are in-transit messages. Obtain a message by remotely calling GetUnLogM(j, U_ji - T_ij) and record it in <LC_j, j, m>.

11. Check whether the obtained message <LC_j, j, m> is empty. If it is empty, go to 9; otherwise go to 12.

12. Store the obtained message in the array ARRAY. Because a message has been received, increment T_ij by 1 and the message count NUM by 1. Then go to 9 to continue checking whether any message not yet saved to the log file remains.

13. Sort the entries of ARRAY in ascending order of ARRAY[k].LC_j, where k runs from NUM' to NUM-1. That is, the messages that process p_i received from other processes are sorted in ascending order of their LC_j values.

14. Send to process p_i the messages AM(ARRAY[k].j, ARRAY[k].LC_j, ARRAY[k].m), where k = 0, 1, …, NUM-1.

15. Send to each of the other processes p_k (k = 1, 2, …, n, k ≠ i) the message sys_ClearF(i), which marks that process p_i has resumed normal operation. The recovery process ends.
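Steps 6 to 14 of the flow above (count comparison, fetching the unlogged messages, and sorting them by clock) can be sketched in Python as follows. The names recover and get_unlog_m are hypothetical. The sketch assumes each sender's dfile is a list of (LC_j, receiver, m) tuples with the oldest record first, that get_unlog_m scans it from the newest record backwards, and that remote calls are replaced by direct access to dictionaries.

```python
def get_unlog_m(dfile, i, difference):
    """Oldest still-unlogged message to p_i, scanning dfile from newest to oldest."""
    for lc, receiver, content in reversed(dfile):
        if receiver == i:
            difference -= 1
            if difference == 0:
                return (lc, content)
    return None


def recover(i, logged, sent_counts, recv_counts, dfiles):
    """Return the replay sequence for p_i as a list of (j, LC_j, m).

    logged      -- messages already recovered through mlog, in lsn order
    sent_counts -- {j: U_ji}, as a GetU-style call might report them
    recv_counts -- {j: T_ij}, as a GetT-style call might report them
    dfiles      -- {j: dfile of sender p_j}
    """
    array = list(logged)
    num_logged = len(array)  # NUM'
    recv = dict(recv_counts)
    for j, u_ji in sent_counts.items():
        while u_ji > recv.get(j, 0):  # in-transit messages remain
            found = get_unlog_m(dfiles[j], i, u_ji - recv.get(j, 0))
            if found is None:
                break
            lc, m = found
            array.append((j, lc, m))
            recv[j] = recv.get(j, 0) + 1
    # Only the unlogged tail ARRAY[NUM' .. NUM-1] is sorted by logical clock.
    array[num_logged:] = sorted(array[num_logged:], key=lambda entry: entry[1])
    return array


dfiles = {2: [(1, 1, "m1"), (2, 1, "m4")], 3: [(5, 1, "m3")]}
replay = recover(1, logged=[], sent_counts={3: 1, 2: 2},
                 recv_counts={}, dfiles=dfiles)
```

With the FIG. 4 clock values the replay sequence comes out as m1, m4, m3, regardless of the order in which the senders were queried.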

Proof of correctness of the recovery(i) recovery algorithm:

First, the recoverability principle of a process state interval is stated: a process state interval is recoverable if, whatever failure the process suffers in the future, the process can re-execute that interval.

Theorem 1. If one or more processes fail, each failed process is restored to its state before the failure under the action of the recovery(i) process.

Proof: For the messages of a process that could not be saved to mlog, the order of receipt and the message content are saved in the local stable storage of the sending process, so by the recoverability principle stated above the process is recoverable. Theorem 1 is proved in detail below for two cases.

1. Assume that only one process fails. After the failure is detected, the failed process restarts from its latest checkpoint. The recovery process first obtains the information about the messages stored in the determinant log mlog, then uses this information to obtain the logical clock and content of each message from the sequential file dfile, and stores them in the memory space of the recovery process. Since a record in mlog is uniquely determined by the value of the variable lsn and a record in dfile is uniquely determined by the logical clock, the logical clocks and contents of all messages already saved in mlog are eventually transferred to the memory space ARRAY of the recovery process. For the messages that could not be saved to mlog because of the process failure, their number can be determined from the saved counts of messages sent and messages received, so the remote procedure call GetUnLogM(i, difference) also transfers the logical clocks and contents of all such messages into ARRAY. After the messages that could not be saved to mlog have been re-sorted by logical clock, the recovery process sends the stored messages to the failed process again; the failed process receives and processes them again and finally reaches the state interval it was in before the failure. That is, a single failed process is recoverable.

2. Assume that several processes fail. Because each failed process is recovered independently by the recovery process, all of the failed processes can be recovered when several processes fail.

Theorem 2. Under the action of the recovery process, the global state of the system after all failed processes have been recovered is a consistent global state.

Proof: As the algorithm of the recovery process describes, when one or more processes have failed, a non-failed process that reaches an event in which it sends a message to a failed process stops and waits for the recovery; otherwise it continues executing.

Case 1: assume that the non-failed process has not sent a message to the failed process. After the failed process is recovered, the number of messages that the non-failed process has sent to the failed process is unchanged, so no orphan messages can exist between the processes.

Case 2: assume that the non-failed process stops at an event in which it sends a message to the failed process. After the failed process is recovered, the number of messages that the non-failed process has sent to the failed process is unchanged, so no orphan messages can exist between the processes.

Combining the two cases, the global state of the system after the failed process is recovered is a consistent global state, by the definition of global state consistency.

However, once all the failed processes have been recovered, several non-failed processes may send messages to the same recovered process. Since the sending processes send their messages to the recovered process over different channels, the receive events of these messages at the receiving process are not always related by precedence, and these events can be executed in any order.

Although the embodiments of the present invention have been described with reference to the accompanying drawings, they are not intended to limit the scope of the present invention, and those skilled in the art should understand that various modifications and variations can be made without inventive effort on the basis of the technical solution of the present invention.

Claims

1. A message log recovery method based on message reordering and message number inspection, characterized in that a message reordering method is adopted and the receiving order of messages is stored at the sending process; when a message receiving process fails, the messages that were stored but not saved to the log file, together with the corresponding logical clocks of the sending process, are first acquired from the sending process under the control of a recovery process, and the messages not saved to the log file are then reordered according to their logical clocks; finally, the ordered messages are sent to the failed process again, and the failed process receives and processes them again, thereby realizing the replay of the messages; the working steps of a normal process in the method are as follows:

step 1, for each k, where k is an integer variable, initializing U_ik to 0 and T_ik to 0, indicating respectively that the total number of messages process p_i has sent to process p_k is 0 and that the total number of messages process p_i has received from process p_k is 0, and setting lsn to 0; lsn denotes the sequence number under which the receiving process saves a message; i = 1, 2, …, n; k = 1, 2, …, n;

step 2, if the timer has expired, going to 3; otherwise going to 4;

step 3, saving the state of process p_i to the determinant file mlog, saving T_i, U_i and LC_k (k = 1, 2, …, n) to local storage, and deleting the old checkpoint; i = 1, 2, …, n;

step 4, if process p_i is to send a message to process p_j (j = 1, 2, …, n), going to 5; otherwise going to 9;

step 5, if F_i is 1, process p_i has failed and has not yet recovered, so going to 9; otherwise process p_i is running normally, so going to 6;

step 6, if F_j is 1, process p_j has failed and has not yet recovered, so waiting until F_j becomes 0; otherwise process p_j is running normally, so going to 7;

step 7, because process p_i sends a message to process p_j, incrementing LC_i by 1 and U_ij by 1, and appending <LC_i, j, m> to the message log file dfile;

step 8, saving U_ij to local storage and sending the message AM(i, LC_i, m) to process p_j, where m denotes the content of the message;

step 9, if process p_i receives a message sent by process p_j, going to 10; otherwise going to 14;

step 10, process p_i receives the message AM(j, LC_j, m) sent by process p_j and must determine whether the message was sent by process p_j itself or by the recovery process of p_j;

step 11, if AM_j > LC_j, the message was sent by process p_j during fault-free operation, so going to 12; otherwise the message was sent by the recovery process of p_j, so going to 14;

step 12, updating LC_j with the value of AM_j; handing the received message to the application process for processing and keeping the record <j, LC_j, i, lsn> in memory; because a message has been received, incrementing LC_i by 1 and then updating LC_i with the maximum of LC_i and LC_j; the determinant file mlog holds the determinants of all messages, each of the form <j, LC_j, i, lsn>, where j denotes the process identifier of the process that sent the message m, LC_j denotes the logical clock after the sending process sent m, i denotes the process identifier of the receiving process p_i, and lsn denotes the sequence number under which the receiving process saves the message; lsn has an initial value of zero, and process p_i increments lsn by one for each message determinant it saves;

step 13, executing other deterministic events;

step 14, checking whether process p_i is idle; if process p_i is idle, going to 15; otherwise going to 16;

step 15, using the idle time to store the in-memory record <j, LC_j, i, lsn> in mlog; because the message has been received, incrementing T_ij by 1; saving the values of T_i and LC_k (k = 1, 2, …, n) to the hard disk, and incrementing lsn by 1;

step 16, if an error-flag clear message sys_ClearF(k) is received from process p_k (k = 1, 2, …, n), going to 17; otherwise going to 18;

step 17, setting F_k to 0, indicating that process p_k is now running normally;

step 18, if a message sys_SetF(k) setting the error flag to 1 is received from process p_k (k = 1, 2, …, n), going to 19; otherwise going to 20;

step 19, setting F_k to 1, indicating that process p_k has failed and has not yet recovered;

step 20, checking whether process p_i is restarting after a failure; if so, going to 21; otherwise going to 1;

step 21, updating T_i, U_i and LC_k (k = 1, 2, …, n) with their previously saved values, and setting lsn = 0 and F_i = 1; going to 1.

2. The message log recovery method based on message reordering and message number checking as claimed in claim 1, wherein the recovery process is entered when a normal process fails, and the recovery process comprises three stages:

stage one:

the saved messages are obtained from the message log file dfile in the order in which they were sent and are stored in internal memory; recovery(i) first obtains the triple <j, LC_j, lsn> through the procedure call RequestDeterminant(i, lsn), and then obtains <j, LC_j, m> through the procedure call GetM(LC_j), where m denotes the content of the message; finally, recovery(i) stores the triple <j, LC_j, lsn> in the local in-memory array ARRAY;

stage two:

the messages that could not be saved to mlog because of the process failure are obtained from the message log file dfile and stored in internal memory;

recovery(i) first obtains T_ij and U_ji through remote procedure calls; T_ij records the number of messages that process p_i has received from process p_j and already saved to mlog, and U_ji records the number of messages that process p_j has sent to process p_i; by the meaning of U_ji and T_ij, their difference U_ji - T_ij is the number of messages that process p_j sent to p_i but that could not be saved to mlog because of the failure of p_i; then p_i calls the remote procedure GetUnLogM(j, U_ji - T_ij), which resides locally at p_j, to obtain, according to the difference, the message that was received first; recovery(i) obtains all the messages that could not be saved to mlog because of the process failure by repeatedly calling GetUnLogM(j, U_ji - T_ij) and stores them in the local in-memory array ARRAY;

stage three: recovery(i) reorders by logical clock all the messages that could not be saved to mlog because of the process failure; finally, all the messages, both those saved and those not saved to mlog, are sent to the p_i process again, and the p_i process receives and processes them again until it has run to the point in time before the failure.

3. The message log recovery method based on message reordering and message number checking as claimed in claim 2, wherein the logical clock LC_p used in stage three of the recovery process is defined as follows:

1) its initial value is zero;

2) each time a message is sent, LC_p is incremented by one;

3) each time, after a message from a process q has been received and the necessary information of the message has been stored in a log file, LC_p is incremented by one and then LC_p ← max(LC_p, LC_q + 1), where LC_q denotes the logical clock of process q and max takes the larger of LC_p and LC_q + 1.

4. The message log recovery method based on message reordering and message number checking as claimed in claim 2, wherein the recovery process comprises the following steps:

step 1, restarting p_i from the latest checkpoint, sending the message sys_SetF(i), which marks process p_i as being in the failed state, to every process, and setting lsn and the message count NUM to 0;

step 2, checking whether the determinant file mlog is empty; if mlog is not empty, going to 3; otherwise going to 6;

step 3, obtaining a determinant record <LC_j, j, lsn> from the determinant file mlog by calling RequestDeterminant(i, lsn); assigning the value of LC_j to LC and incrementing lsn by 1; obtaining the message <LC_j, j, m> with the remote call GetM(LC), where LC_j uniquely identifies this message in the message log file dfile;

step 4, checking whether the record <LC_j, j, lsn> is empty; if it is empty, going to 2; otherwise going to 5;

step 5, storing the received record <LC_j, j, lsn> in the in-memory array ARRAY, that is, ARRAY[NUM].LC_j = LC_j, ARRAY[NUM].j = j, ARRAY[NUM].lsn = lsn; incrementing NUM by 1 and going to 2 to check whether mlog is empty;

step 6, saving NUM, the total number of messages of process p_i that were sent and saved in the mlog file, to NUM';

step 7, for each j = 1, 2, …, n, obtaining the total number of messages that process p_j sent to process p_i by remotely calling GetU(i, j);

step 8, remotely calling GetT(i) to obtain the number of messages process p_i has received from each of the other processes;

step 9, for each j = 1, 2, …, n, checking whether U_ji > T_ij; if U_ji > T_ij, the number of messages that process p_j sent to process p_i exceeds the number of messages from p_j that process p_i received, so going to 10; otherwise going to 13;

step 10, because U_ji > T_ij, there are in-transit messages; obtaining a message by remotely calling GetUnLogM(j, U_ji - T_ij) and recording it in <LC_j, j, m>;

step 11, checking whether the obtained message <LC_j, j, m> is empty; if it is empty, going to 9; otherwise going to 12;

step 12, storing the obtained message in the array ARRAY; because a message has been received, incrementing T_ij by 1 and the message count NUM by 1; then going to 9 to continue checking whether any message not yet saved to the log file remains;

step 13, sorting the entries of ARRAY in ascending order of ARRAY[k].LC_j, where k runs from NUM' to NUM-1; that is, the messages that process p_i received from other processes are sorted in ascending order of their LC_j values;

step 14, sending to process p_i the messages AM(ARRAY[k].j, ARRAY[k].LC_j, ARRAY[k].m), where k = 0, 1, …, NUM-1;

step 15, sending to each of the other processes p_k (k = 1, 2, …, n, k ≠ i) the message sys_ClearF(i), which marks that process p_i has resumed normal operation; the recovery process ends.

5. The message log recovery method based on message reordering and message number checking as claimed in claim 4, wherein in step 10 of the recovery process:

the processing flow of GetUnLogM(j, U_ji - T_ij) is as follows:

(1) opening the dfile file in read-only mode;

(2) moving a pointer p to point to the first stored record, wherein dfile is a sequential file and the last stored record is located at the end of the file;

(3) checking whether the pointer p points to the end of the file; if not, going to (5); otherwise going to (10);

(4) moving the pointer p to point to the previous record;

(5) reading the record pointed to by p and storing it in the triple <LC_j, l, m>;

(6) checking whether the process identifier l is equal to i; if l is equal to i, going to (7); otherwise going to (4);

(7) decrementing the variable difference by one, where difference = U_ji - T_ij;

(8) checking whether difference is 0; if difference is 0, the record read in step (5) is the message that p_j sent to p_i first; sending the record to the recovery(i) process and going to (9); if difference is not 0, going to (4);

(9) closing the file, returning <j, LC_j, m>, and going to (11);

(10) closing the file, returning an invalid triple (NULL, NULL, NULL), and going to (11);

(11) ending.