CN105955837B

CN105955837B - Virtual machine fault-tolerant memory synchronization method and system

Info

Publication number: CN105955837B
Application number: CN201610341484.5A
Authority: CN
Inventors: 史骁; 唐宏伟; 王晖; 赵晓芳
Original assignee: Institute of Computing Technology of CAS
Current assignee: Institute of Computing Technology of CAS
Priority date: 2015-05-21
Filing date: 2016-05-20
Publication date: 2019-07-30
Anticipated expiration: 2036-05-20
Also published as: CN105955837A

Abstract

The present invention provides a virtual machine fault-tolerant memory synchronization method and system. The method comprises: dividing each fault-tolerant stage into multiple state synchronization stages of equal length; before the fault-tolerant stage ends, at the end of each state synchronization stage, selecting pages from the dirty memory pages and synchronizing them; and, at the end of the fault-tolerant stage, synchronizing the dirty memory pages that have not yet been synchronized within that fault-tolerant stage. The invention can shorten the overall downtime of the original virtual machine, improve utilization of the dedicated fault-tolerance network, reduce the fault-tolerance overhead of the virtual machine, and shorten the response latency of client applications in the virtual machine.

Description

Virtual machine fault-tolerant memory synchronization method and system

Technical field

The present invention relates to virtualization technology, and more particularly to a memory state data synchronization technique that can be applied to real-time virtual machine fault tolerance.

Background

Real-time virtual machine fault tolerance ensures high availability of the client applications running in a virtual machine. Existing real-time fault-tolerance techniques are mainly based on either a "lock-step and replay" strategy or a "memory state data replication and synchronization" strategy. At present the latter is more widely used. Its basic principle is as follows: the memory state data of the original virtual machine is periodically synchronized to the backup virtual machine so that the memory states of the two remain consistent; the backup virtual machine uses a failure detection mechanism to discover crashes, exceptions, and similar behavior of the original virtual machine, and a recovery mechanism loads the synchronized memory state so that the backup takes over and continues execution in place of the original virtual machine. Specifically, before the original virtual machine fails, the virtual machine monitor on the original side periodically triggers a checkpoint operation (that is, each time a checkpoint is reached, the data on the side to be synchronized is copied to the target side). After the checkpoint operation is triggered, the original virtual machine is paused; the memory state data changed during the previous period is cached; the original virtual machine then resumes execution, and the memory state data is transmitted asynchronously to the backup virtual machine. The execution of the original virtual machine can thus be divided into three stages: a synchronous state stage, a state synchronization stage, and a speculative execution stage. Fig. 1 depicts the relationship between these stages.

However, the memory state data synchronization technique described above has certain drawbacks. In short, because the volume of memory state data to be synchronized fluctuates, the synchronization performed at each checkpoint can impose excessive extra overhead on the virtual machine monitor and thereby degrade the overall service performance of the virtual machine. Specifically, the possible effects include:

(1) The downtime of the original virtual machine is long. When the volume of memory state data to be synchronized is large, caching takes longer, so the original virtual machine stays paused longer after each checkpoint operation. This reduces the response speed and service efficiency of client applications in the original virtual machine.

(2) Utilization of the dedicated fault-tolerance network is low. While the virtual machine is executing, the dedicated fault-tolerance network sits idle and its bandwidth is wasted; after a checkpoint operation, the concentrated transmission of memory state data produces a sudden large-scale burst on the network. This seriously impairs transmission efficiency and may cause problems such as network congestion that further degrade network performance. The situation worsens as the application load on the virtual machine increases.

(3) Fault tolerance incurs high overhead. The original virtual machine transmits large amounts of data while it executes asynchronously, which consumes additional host resources and may cause the performance of the original virtual machine to fluctuate during the speculative execution stage.

Summary of the invention

To solve the above problems in the prior art, the present invention provides a virtual machine fault-tolerant memory synchronization method, comprising:

Step 1) dividing each fault-tolerant stage into multiple state synchronization stages of equal length, wherein a fault-tolerant stage corresponds to the interval between two checkpoints during fault tolerance;

Step 2) before the fault-tolerant stage ends, at the end of each state synchronization stage, selecting pages from the dirty memory pages and synchronizing them, wherein a dirty memory page is a memory page whose data has been changed;

Step 3) at the end of the fault-tolerant stage, synchronizing the dirty memory pages that have not yet been synchronized within that fault-tolerant stage.

In one embodiment, in step 2), the dirty memory pages whose predicted usage temperature for the next state synchronization stage does not exceed a predetermined threshold are selected for synchronization.

In one embodiment, in each state synchronization stage, if a memory page is accessed, its predicted usage temperature for the next state synchronization stage is raised; otherwise it is lowered. Further, multiple rounds of scanning are performed in each state synchronization stage; in each round, if an access to a memory page is detected, the page's predicted usage temperature for the next state synchronization stage is raised in a slow-start manner, and otherwise it is lowered in a fast-recovery manner.

In one embodiment, step 3) further includes:

Calculating the usage temperature of each memory page in the fault-tolerant stage; if that usage temperature is higher than a predetermined threshold, the historical usage temperature of the page is raised, and otherwise it is lowered. The historical usage temperature may be raised in a slow-start manner and lowered in a fast-recovery manner. Calculating the usage temperature of a memory page in the fault-tolerant stage may include: setting the arithmetic mean of the page's actual usage counts over the state synchronization stages of that fault-tolerant stage as the page's usage temperature in the fault-tolerant stage. When a fault-tolerant stage starts, the weighted average of the page's usage temperature in the previous fault-tolerant stage and its historical usage temperature may be taken as the page's predicted usage temperature for the next state synchronization stage.

In one embodiment, before the fault-tolerant stage ends, at the end of each state synchronization stage, synchronization is performed over the data path corresponding to that state synchronization stage; at the end of the fault-tolerant stage, synchronization is performed over a data path different from the aforementioned data path; the data paths are independent of one another. When synchronization is performed over a data path, the transmitted data may include a sequence number indicating temporal order and the memory state data, where the memory state data includes the memory page number and the memory page data. When the backup virtual machine receives data transmitted from the original virtual machine, it can determine from the sequence number whether the corresponding memory page data is the newest; if so, it applies the update, and otherwise it discards the data.

The present invention also provides a virtual machine fault-tolerant memory synchronization system, comprising:

A preprocessing device, configured to divide each fault-tolerant stage into multiple state synchronization stages of equal length, wherein a fault-tolerant stage corresponds to the interval between two checkpoints during fault tolerance;

A synchronization device, configured to select pages from the dirty memory pages and synchronize them at the end of each state synchronization stage before the fault-tolerant stage ends, wherein a dirty memory page is a memory page whose data has been changed; and, at the end of the fault-tolerant stage, to synchronize the dirty memory pages that have not yet been synchronized within that fault-tolerant stage.

Compared with the prior art, the present invention achieves the following beneficial effects:

(1) The overall downtime of the original virtual machine is reduced. The main operation during the pause of the original virtual machine is caching the memory state data to be synchronized, which accounts for most of the pause time. The memory synchronization method provided by the invention spreads the large volume of memory state data that would otherwise be synchronized after a checkpoint operation across different data paths for transmission. This effectively reduces the amount of memory state data that needs to be cached in one batch in the backup virtual machine after a checkpoint operation, and thus reduces the downtime of the original virtual machine.

(2) Utilization of the dedicated fault-tolerance network is improved. The method can make effective use of network bandwidth that is idle during the checkpoint interval, and it alleviates or even eliminates bursty data transmission after a checkpoint operation (the burst of system resource consumption is spread more evenly over time), so that the overall utilization of the dedicated fault-tolerance network increases substantially.

(3) The fault-tolerance overhead of the virtual machine is reduced. The method improves data transmission efficiency, so the consumption of additional resources on the host is kept under control and the execution performance of the virtual machine is less likely to be affected.

(4) The backup-side virtual machine processes memory state data more efficiently. During memory synchronization the invention provides a sequence number indicating temporal order, which effectively guarantees the validity of the memory state data and improves the efficiency with which the backup-side virtual machine monitor processes it.

(5) The response latency of client applications in the virtual machine is shortened. Because memory state data is synchronized more efficiently, the original and backup virtual machines reach a consistent state sooner, which speeds up the release of the network packets buffered during the memory state synchronization period. Releasing those packets sooner lets requests from interacting users be answered as early as possible.

Brief description of the drawings

Embodiments of the present invention are further described below with reference to the drawings, in which:

Fig. 1 is a schematic diagram of the general virtual machine fault-tolerance process based on the "memory state data replication and synchronization" strategy;

Fig. 2 is a flowchart of the virtual machine fault-tolerant memory synchronization method according to one embodiment of the present invention.

Detailed description of embodiments

To make the purpose, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below through specific embodiments with reference to the drawings. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.

According to one embodiment of the present invention, a virtual machine fault-tolerant memory synchronization method is provided. The method can be applied to various virtual machine monitors (such as QEMU/KVM and Xen).

In general, the method comprises: dividing a fault-tolerant stage into multiple state synchronization stages of equal length; before the fault-tolerant stage ends, at the end of each of its state synchronization stages, selecting memory pages from the current dirty memory pages for synchronization; and, at the end of the fault-tolerant stage, synchronizing the dirty memory pages whose data has changed and that have not yet been synchronized within the fault-tolerant stage.

Before the fault-tolerant stage ends, at the end of a state synchronization stage, the dirty memory pages may be selected at random. Preferably, the usage temperature of each memory page in the state synchronization stage is computed (for example using LRU, LFU, or a random replacement algorithm), and the dirty memory pages whose usage temperature does not exceed a predetermined threshold are selected for synchronization. More preferably, the usage temperature of each memory page in the next state synchronization stage is predicted, and the dirty memory pages whose predicted usage temperature does not exceed a predetermined threshold are selected for synchronization.

Those skilled in the art will understand that "temperature" here refers to the number of times a memory page is accessed: a page that is accessed often has a higher temperature, and its data is more likely to be modified. A "dirty" memory page is a memory page whose data has been changed.

Each step of the virtual machine fault-tolerant memory synchronization method is described in detail below with reference to Fig. 2. In this description, dirty memory pages are selected at the end of each state synchronization stage according to their predicted usage temperature.

Step 1: Initialization.

1. The interval between two checkpoints during virtual machine fault tolerance is regarded as a big window; that is, one big window corresponds to one fault-tolerant stage. The big window mainly refers to the state synchronization stage or the speculative execution stage of the fault-tolerance process.

2. According to the number n of data paths, each big window is divided into n state synchronization stages of equal length, also called small windows. Each small window corresponds to one data path, and the n data paths can transmit data concurrently.

3. For each memory page, initialize four state segments that hold the page's temperature information:

1) History temperature state segment (H segment): records the historical usage temperature (access temperature) of the memory page, for example its usage temperature from the start of the fault-tolerance mechanism up to the current big window;

2) Current big window temperature state segment (P segment): records the usage temperature of the memory page in the current big window;

3) Predicted temperature state segment (C segment): records the predicted usage temperature of the memory page in the next small window;

4) Current temperature state segment (R segment): records the actual usage of the memory page in the current small window, such as its actual access count.

Those skilled in the art will understand that the above state segments may be represented as bitmaps, and that the value of each state segment may be initialized to zero.

4. Set the big-window and small-window timers (used below to determine whether the current big or small window has ended), and enter the fault-tolerant stage proper.
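
The initialization in Step 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `PageState`, `init_page_states`, and `split_big_window` are hypothetical, a plain record stands in for the bitmap representation mentioned above, and window times are in arbitrary units.

```python
from dataclasses import dataclass


@dataclass
class PageState:
    """Per-page temperature segments; all names here are illustrative."""
    h: float = 0  # H segment: historical usage temperature
    p: float = 0  # P segment: usage temperature in the current big window
    c: float = 0  # C segment: predicted usage temperature for the next small window
    r: int = 0    # R segment: actual access count in the current small window


def init_page_states(num_pages: int) -> list[PageState]:
    """Create one zero-initialized state record per memory page."""
    return [PageState() for _ in range(num_pages)]


def split_big_window(start: float, length: float, n: int) -> list[tuple[int, float, float]]:
    """Divide one big window into n equal small windows.

    Returns (data_path_index, window_start, window_end) triples, pairing
    each small window with one data path.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of data paths")
    step = length / n
    return [(i, start + i * step, start + (i + 1) * step) for i in range(n)]
```

For example, `split_big_window(0.0, 100.0, 4)` yields four small windows of length 25, one per data path.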

Step 2: Determine whether virtual machine fault tolerance has ended. If it has not, go to Step 3; otherwise terminate fault tolerance normally.

Step 3: According to the actual usage of each memory page in the current small window (the R segment value), gradually adjust the page's predicted usage temperature for the next small window (the C segment value) until the current small window ends.

In one embodiment, within the current small window the virtual machine monitor performs multiple rounds of scanning to record the R segment value and to gradually adjust each memory page's C segment value for the next small window. In each round, if a page is detected to have been accessed, its C segment value is raised, in keeping with the temporal locality of programs; if no use of the page is detected, its C segment value is lowered. For ease of computation, a fractional C segment value may be rounded to an integer. Adjusting the C segment value gradually moves the prediction toward actual behavior.

Further, the C segment value may be raised in a "slow start" manner and lowered in a "fast recovery" manner. Those skilled in the art will understand that "slow start" means that when a value needs to be raised, it first grows exponentially and, once it exceeds a threshold, grows linearly; "fast recovery" means that when a value needs to be lowered, it is halved, and any subsequent growth again follows "slow start". Exponential growth may multiply the value by 2 on each increase, and linear growth may add 1 on each increase.
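
The two adjustment rules just described (exponential-then-linear growth, and halving) can be sketched as below. This is an illustrative sketch under stated assumptions: the function names and the threshold value 8 are hypothetical, a value of zero is bumped to 1 so that doubling can make progress, and halving uses integer division as one possible rounding choice.

```python
def raise_value(value: int, growth_threshold: int = 8) -> int:
    """Raise a value: double it until it exceeds the threshold, then add 1."""
    if value == 0:
        return 1  # assumption: doubling zero would never grow
    if value <= growth_threshold:
        return value * 2  # exponential phase
    return value + 1  # linear phase


def lower_value(value: int) -> int:
    """Lower a value by halving it (integer division is an assumption)."""
    return value // 2


def adjust_prediction(c_value: int, accessed: bool, growth_threshold: int = 8) -> int:
    """One scan round: raise C if the page was accessed, otherwise lower it."""
    if accessed:
        return raise_value(c_value, growth_threshold)
    return lower_value(c_value)
```

Starting from zero, six consecutive rounds with an access give 1, 2, 4, 8, 16, 17; a round without an access then halves the value to 8, from which growth resumes by doubling.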

Step 4: If the current big window has not ended and the current small window ends, synchronize memory according to the C segment values and then return to Step 2; if the current big window ends, go to Step 5.

1. The virtual machine monitor collects all dirty memory pages in the current small window.

2. Memory is synchronized according to the predicted usage temperature of the dirty memory pages.

Specifically, the C segment values of all dirty memory pages are checked. For the dirty memory pages whose C segment value is less than or equal to a given threshold, the data path corresponding to the current small window is started to synchronize their memory state data (in a concrete implementation, a thread is taken from a thread pool to send the data). For the dirty memory pages whose C segment value is greater than the threshold, synchronization is deferred until the end of the current big window. Note that if the C segment values of all dirty memory pages are greater than the threshold at this point (meaning these pages are likely to be modified in the next small window), no synchronization is needed at the end of the current small window, and the corresponding data path need not be started.

As described above, the data paths corresponding to different small windows (each connecting the original virtual machine and the backup virtual machine) can work independently and concurrently. While the data path of the current small window is transmitting memory state data, the data paths of earlier small windows may still be transmitting, so data is transmitted in parallel.

Preferably, before the dirty memory state data is sent, it may be compressed using any of various compression techniques or algorithms to reduce the amount of data to be sent.

3. Return to Step 2; the next small window becomes the current small window.
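
The selection rule in Step 4 can be sketched as follows. The function name `select_pages_to_sync` and the dictionary of C segment values are hypothetical; the sketch only partitions the dirty pages and leaves out the actual transmission over the data path.

```python
def select_pages_to_sync(dirty_pages, c_values, threshold):
    """Partition dirty pages at the end of a small window.

    dirty_pages: iterable of page numbers dirtied so far
    c_values:    mapping page number -> C segment value (missing means 0)
    Returns (sync_now, deferred): pages to send on this small window's data
    path, and pages held back until the end of the big window.
    """
    sync_now, deferred = [], []
    for page in dirty_pages:
        if c_values.get(page, 0) <= threshold:
            sync_now.append(page)   # cold: unlikely to be modified again soon
        else:
            deferred.append(page)   # hot: likely to be dirtied again
    return sync_now, deferred
```

If `sync_now` comes back empty, every dirty page is predicted hot and the data path for this small window need not be started.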

Step 5: At the end of the current big window, synchronize the dirty memory pages that have not yet been synchronized in the current big window.

1. If any memory page contains dirty memory state data that has not been synchronized, pause the original virtual machine.

2. The virtual machine monitor sends the dirty memory state data not yet synchronized in the big window from the original virtual machine to the backup virtual machine.

The memory state data is transmitted over the data path corresponding to the last small window (that is, the current small window) of the current big window. Preferably, before the dirty memory state data is sent, it may be compressed using any of various compression techniques or algorithms to reduce the amount of data to be sent.

3. Update each memory page's historical usage temperature (H segment value), its usage temperature in the current big window (P segment value), and its C segment value.

1) Compute the cumulative sum of the page's actual usage (R segment values) over the small windows of the current big window. If the cumulative sum is higher than a predetermined threshold, raise the H segment value (for example in a "slow start" manner); otherwise lower it (for example in a "fast recovery" manner).

2) Set the arithmetic mean of the R segment values of the small windows in the current big window as the P segment value.

Those skilled in the art will understand that the R segment values may be accumulated at the end of each small window and averaged at the end of the big window; alternatively, both the accumulation and the averaging may be performed at the end of the big window.

3) Set the C segment value of the memory page according to the H segment value and the P segment value.

In one embodiment, the weighted average of the H segment value and the P segment value is taken as the C segment value; the weights of the two values vary with the importance attached to each kind of historical data.

4. Resume the original virtual machine and return to Step 2; the next big window becomes the current big window.
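
The update of the H, P, and C segment values at the end of a big window (item 3 above) can be sketched as follows. All names, the default thresholds, and the equal default weight are assumptions made for illustration; raising uses doubling followed by increments of 1, lowering uses halving, and a zero value is bumped to 1 so that doubling can make progress.

```python
def update_at_big_window_end(h, r_values, sum_threshold, growth_threshold=8, w_history=0.5):
    """Return the new (H, P, C) values for one memory page.

    h:         current H segment value (non-negative integer)
    r_values:  R segment values, one per small window of the big window
    w_history: weight given to H in the weighted average (assumed in [0, 1])
    """
    if not r_values:
        raise ValueError("need at least one small window")
    total = sum(r_values)
    if total > sum_threshold:
        # raise H: exponential growth first, linear beyond the threshold
        if h == 0:
            new_h = 1
        elif h <= growth_threshold:
            new_h = h * 2
        else:
            new_h = h + 1
    else:
        new_h = h // 2  # lower H by halving
    p = total / len(r_values)  # arithmetic mean of the R values
    # weighted average of H and P, rounded half up to an integer
    c = int(w_history * new_h + (1 - w_history) * p + 0.5)
    return new_h, p, c
```

The sketch computes C from the freshly updated H value, following the order of sub-items 1) to 3); a different weight can be passed through `w_history` to favor either kind of history.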

The specific steps of the virtual machine fault-tolerant memory synchronization method provided by the invention have been described above, with the predicted usage temperature as the criterion for selecting dirty memory pages. In an actual implementation, memory synchronization may be carried out according to the algorithm shown in the following table.

Table 1

In the method described above, the n data paths can transmit in parallel, which can lead to the following situation: updates to the same memory page may be transmitted over several successive data paths, producing a write-after-write (WAW) data dependency.

To guarantee the timeliness of the memory state data, in a preferred embodiment a sequence number indicating temporal order is transmitted along with the memory state data (memory page number plus memory page data) over each data path. The backup server uses the sequence number to determine the order of updates to the same memory page and so decides which content to keep: only newer memory page data may overwrite older memory page data, and older memory page data that arrives late is simply discarded.
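
The receiver-side rule can be sketched as follows. The class `BackupPageStore` and its fields are hypothetical; the sketch assumes sequence numbers are integers that increase with time and are tracked per memory page.

```python
class BackupPageStore:
    """Backup-side page store that discards stale updates."""

    def __init__(self):
        self.pages = {}     # page number -> latest page data
        self.last_seq = {}  # page number -> sequence number of that data

    def apply(self, seq: int, page_no: int, data: bytes) -> bool:
        """Apply an update only if it is newer than what is stored.

        Returns True if the page was updated, False if the update arrived
        late (an older write to the same page) and was dropped.
        """
        if page_no in self.last_seq and seq <= self.last_seq[page_no]:
            return False
        self.pages[page_no] = data
        self.last_seq[page_no] = seq
        return True
```

With this rule the final content of a page does not depend on the order in which the parallel data paths happen to deliver their updates.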

In a concrete implementation, the original server and the backup server may transmit memory pages according to the algorithm shown in the following table.

Table 2

Note that the above describes the virtual machine fault-tolerant memory synchronization method after the fault-tolerant stage has formally begun; those skilled in the art will understand that the memory synchronization method provided by the invention can also be used in the data initialization stage.

According to another embodiment of the invention, a virtual machine fault-tolerant memory synchronization system is also provided, comprising:

A preprocessing device, configured to divide each fault-tolerant stage into multiple state synchronization stages of equal length;

A synchronization device, configured to select pages from the dirty memory pages and synchronize them at the end of each state synchronization stage before the fault-tolerant stage ends, and, at the end of the fault-tolerant stage, to synchronize the dirty memory pages that have not yet been synchronized within that fault-tolerant stage.

To verify the effectiveness of the method and system provided by the invention, the inventors ran comparative experiments between a virtual machine fault-tolerance system that uses the scheme of the invention and one that does not. The results show that with the method and system of the invention, the amount of memory synchronization data during fault tolerance is reduced by 15% on average. In addition, when the virtual machine workload is compute-intensive and dominated by in-memory computation, the amount of memory synchronization data during fault tolerance can be reduced by 20-30%.

It should be understood that although this specification is described in terms of various embodiments, not every embodiment contains only one independent technical solution. The specification is written this way only for clarity; those skilled in the art should treat the specification as a whole, and the technical solutions of the various embodiments may be suitably combined to form other embodiments that those skilled in the art can understand.

The foregoing describes merely illustrative specific embodiments of the present invention and is not intended to limit its scope. Any equivalent changes, modifications, and combinations made by those skilled in the art without departing from the concept and principles of the present invention fall within the scope of protection of the invention.

Claims

1. A virtual machine fault-tolerant memory synchronization method, comprising:

Step 1) dividing each fault-tolerant stage into multiple state synchronization stages of equal length, wherein a fault-tolerant stage corresponds to the interval between two checkpoints during fault tolerance;

Step 2) before the fault-tolerant stage ends, at the end of each state synchronization stage, selecting pages from the dirty memory pages and synchronizing them, wherein a dirty memory page is a memory page whose data has been changed;

Step 3) at the end of the fault-tolerant stage, synchronizing the dirty memory pages that have not yet been synchronized within that fault-tolerant stage.

2. The method according to claim 1, wherein in step 2) the dirty memory pages whose predicted usage temperature for the next state synchronization stage does not exceed a predetermined threshold are selected for synchronization.

3. The method according to claim 1 or 2, wherein in each state synchronization stage, if a memory page is accessed, the predicted usage temperature of that memory page for the next state synchronization stage is raised; otherwise the predicted usage temperature of that memory page for the next state synchronization stage is lowered.

4. The method according to claim 3, wherein multiple rounds of scanning are performed in each state synchronization stage, and in each round, if an access to a memory page is detected, the predicted usage temperature of that memory page for the next state synchronization stage is raised in a slow-start manner, and otherwise it is lowered in a fast-recovery manner.

5. The method according to claim 1 or 2, wherein step 3) further comprises:

calculating the usage temperature of a memory page in the fault-tolerant stage;

if the usage temperature in the fault-tolerant stage is higher than a predetermined threshold, raising the historical usage temperature of the memory page, and otherwise lowering the historical usage temperature of the memory page.

6. The method according to claim 5, wherein the historical usage temperature of a memory page is raised in a slow-start manner and lowered in a fast-recovery manner.

7. The method according to claim 5, wherein calculating the usage temperature of a memory page in the fault-tolerant stage comprises: setting the arithmetic mean of the memory page's actual usage counts over the state synchronization stages of the fault-tolerant stage as the usage temperature of the memory page in the fault-tolerant stage.

8. The method according to claim 5, wherein, when a fault-tolerant stage starts, the weighted average of the memory page's usage temperature in the previous fault-tolerant stage and the historical usage temperature of the memory page is taken as the predicted usage temperature of the memory page for the next state synchronization stage.

9. The method according to claim 1 or 2, wherein, before the fault-tolerant stage ends, at the end of each state synchronization stage, synchronization is performed over the data path corresponding to that state synchronization stage; at the end of the fault-tolerant stage, synchronization is performed over a data path different from the aforementioned data path; and the data paths are independent of one another.

10. The method according to claim 9, wherein, when synchronization is performed over a data path, the transmitted data comprises a sequence number indicating temporal order and memory state data, the memory state data comprising a memory page number and memory page data.

11. The method according to claim 9, wherein, when the backup virtual machine receives data transmitted from the original virtual machine, it determines from the sequence number whether the corresponding memory page data is the newest, applies the update if so, and otherwise discards the data.

12. A virtual machine fault-tolerant memory synchronization system, comprising:

a preprocessing device, configured to divide each fault-tolerant stage into multiple state synchronization stages of equal length, wherein a fault-tolerant stage corresponds to the interval between two checkpoints during fault tolerance;

a synchronization device, configured to select pages from the dirty memory pages and synchronize them at the end of each state synchronization stage before the fault-tolerant stage ends, wherein a dirty memory page is a memory page whose data has been changed; and, at the end of the fault-tolerant stage, to synchronize the dirty memory pages that have not yet been synchronized within that fault-tolerant stage.