CN102541686B

CN102541686B - Method for achieving backup and disaster recovery of system by utilizing virtual machine

Info

Publication number: CN102541686B
Application number: CN201110387202.2A
Authority: CN
Inventors: 兰雨晴; 蒋涛; 宋潇豫; 夏颖
Original assignee: China Standard Software Co Ltd
Current assignee: China Standard Software Co Ltd
Priority date: 2011-11-29
Filing date: 2011-11-29
Publication date: 2015-07-01
Anticipated expiration: 2031-11-29
Also published as: CN102541686A

Abstract

The invention provides a method for achieving backup and disaster recovery of a system by using a virtual machine. The method uses the Remus module of the Xen virtual machine to achieve disaster recovery of a computer system, and uses the libvirt library, which is common to virtual machines, to achieve backup of the computer system. Because backup and disaster recovery are performed on the virtual machine that runs the computer system, no additional backup or disaster recovery software needs to be installed, the time and cost of maintaining and upgrading such software are reduced, and problems such as file read conflicts that occur in traditional backup and disaster recovery methods are effectively avoided.

Description

A method for achieving backup and disaster recovery of a system by using a virtual machine

Technical field

The present invention relates to the field of computer systems, and in particular to a method for achieving backup and disaster recovery of a system by using a virtual machine.

Background art

During operation, a computer system and its components can suffer various faults, and these faults can cause loss of the computer system's data. For example, a storage device of a computer system may fail suddenly (for instance because of an unexpected power failure), leaving the data stored on the device unreadable. Software or hardware errors can also corrupt the data on a storage device, and any other computer system or component associated with that storage device may in turn fail because of the corrupted data.

To reduce the risk of data loss, the user of a computer system can make multiple copies of the data and save them on different storage devices. Alternatively, users often install backup software on the computer system, and the backup software backs up the computer system automatically at scheduled times while the system is in use. In many cases, however, one or more applications are in use when the backup program suddenly starts, and these applications may have one or more files open. The backup program is then denied access to those files, and the backup of those files fails.

For this reason, some backup software defines a large number of code libraries for various applications. Through these libraries, the backup program tries to communicate with an application, or uses triggers to make the application commit its data to files, so that the files can be backed up. However, when an application changes (for example, when its version changes), the backup program must change accordingly. In addition, some files (such as the Windows registry) are held open frequently and are therefore difficult to back up.

In many cases, a disaster recovery configuration is used to provide additional protection against data loss caused by faults. These faults arise not only from the computer system itself but also from environmental factors around the system (such as a sudden power outage or a fire). In a disaster recovery configuration, the state of the data is sent periodically from one computer system to another in state detection packets; in some cases the second computer system is geographically remote from the first. If the first computer system fails and becomes unusable, the data remains safely stored on the second computer system. In some cases, when the first computer system fails, the applications that were running on it can also be restarted automatically on the second computer system and continue processing data. However, disaster recovery software running on a computer system encounters problems similar to those of backup software. When an application has a file open or in use and the disaster recovery software tries to read that file to generate a detection packet, a conflict occurs: the application prevents the disaster recovery software from reading the file, and generation of the state detection packet fails. Moreover, if an application on the first computer system is to be restarted on the second computer system, its entire running state must be copied to the second computer system, which is a very complicated process.

Summary of the invention

In view of the defects of existing computer system backup and disaster recovery, the object of the present invention is to use a Xen virtual machine to achieve backup and disaster recovery of a computer system.

A method for achieving backup and disaster recovery of a system by using a virtual machine, wherein a Xen virtual machine is installed on a Linux operating system, the method comprising the following disaster recovery steps:

(21) monitoring the virtual machine and capturing its current state;

(22) copying the image of the virtual machine that requires disaster recovery to a disaster recovery node;

(23) periodically copying the current virtual machine state to the disaster recovery node;

(24) when the virtual machine fails, restoring the virtual machine state on the disaster recovery node.

The method further comprises the following backup steps:

(11) monitoring the virtual machine and capturing its current state;

(12) suspending the virtual machine;

(13) copying the virtual machine image;

(14) resuming the virtual machine.

The operating system of the computer comprises a Domain 0. Domain 0 is a modified Linux kernel and a unique virtual machine running on the Xen hypervisor. It has permission to access physical I/O resources and interacts with the other virtual machines running on the system; it can create multiple virtual machines, manage virtual devices, and perform management tasks such as suspending a virtual machine. In this system, the computer system that the user needs is installed on the virtual machine Domain 1, in which one or more applications can run; the backup program and the policy execution program are installed in the Domain 0 virtual machine.

In step (11), Domain 0 of the Xen virtual machine monitors each virtual machine and periodically obtains its running state.

In step (12), Domain 0 of the Xen virtual machine manages each virtual machine, and suspends a virtual machine when its state needs to be backed up.

In step (14), after step (13) is complete, Domain 0 of the Xen virtual machine resumes the virtual machine.

In step (22), before disaster recovery is carried out, the image of the virtual machine must be copied in advance to the disaster recovery node. When disaster recovery starts, the disaster recovery node creates a new virtual machine from this image. When the original virtual machine fails, the new virtual machine takes over its running state, ensuring that the applications running on the original virtual machine continue to run without interruption.

In step (23), the new virtual machine on the disaster recovery node must stay consistent with the state of the original virtual machine, so the running state of the original virtual machine must be copied to the new virtual machine. The state of the original virtual machine is copied periodically to the new virtual machine in state detection packets, which can also be used to determine whether the original virtual machine has failed.

In step (24), when the periodic detection packets of step (23) are interrupted, the system determines that the original virtual machine has failed and automatically activates the new virtual machine to take over its running state, ensuring that the system keeps running normally.

The present invention backs up and recovers a computer system by means of a Xen virtual machine. During backup and disaster recovery, only the virtual machine that runs the computer system needs to be backed up and recovered; no additional backup or disaster recovery software needs to be installed. This reduces the time and cost of maintaining and upgrading backup and disaster recovery software, and effectively avoids problems such as file read conflicts encountered in traditional backup and disaster recovery methods.

Brief description of the drawings

Fig. 1 is a schematic structural diagram of a single computer system;

Fig. 2 is a detailed flowchart of the backup process of the computer system of Fig. 1;

Fig. 3 is a schematic structural diagram of a dual computer system, in which computer system 2 is the disaster recovery node of computer system 1;

Fig. 4 is a detailed flowchart of the disaster recovery process of the system of Fig. 3.

Detailed description of the embodiments

To make the features and advantages of the present invention clearer, a detailed description is given below with reference to the accompanying drawings. Fig. 1 is a schematic structural diagram of a single computer system after a Xen virtual machine has been installed on a Linux operating system. The system comprises a Domain 0. Domain 0 is a modified Linux kernel and a unique virtual machine running on the Xen hypervisor. It has permission to access physical I/O resources and interacts with the other virtual machines running on the system. It has special administrative privileges: it can create multiple virtual machines, manage virtual devices, and perform management tasks such as suspending a virtual machine. In this system, the computer system that the user needs is installed on the virtual machine Domain 1, in which one or more applications can run.

The backup program and the policy execution program are installed in the Domain 0 virtual machine. The policy execution program can be written in C or in a scripting language; it is mainly responsible for executing the user-defined backup policy, reading the backup policy information and invoking the backup program on Domain 0 at the scheduled times. The backup program can be written in C, which is well suited to calling the functions of the libvirt library (a C library of virtualization tools) and makes operating on virtual machines more convenient. The backup program needs to implement the following actions: monitoring virtual machine state, backing up a virtual machine, resuming a virtual machine, automatically copying virtual machine images, and so on. The specific steps are as follows:

1) As shown in Fig. 2, a virtual machine is backed up according to a defined backup policy. The backup policy defines the start time of the backup program, the number of backups per day, and which virtual machines need to be backed up. Once the backup policy has been formulated, its information is saved in a policy backup file on Domain 0. This file can be an ordinary file or an XML file; storing the policy information in XML is more standardized and is easier for a program to read. The policy execution program is then run. It reads the information in the policy backup file, calculates the times at which the backup program needs to start, and uses a timer mechanism to start the backup program on Domain 0 automatically at each start time.

2) After the backup program on the Domain 0 virtual machine has been started in step 1, the backup program first calls the virDomainSuspend function (a domain control function) in the libvirt library to suspend the virtual machine running on the system, in order to guarantee the integrity of the virtual machine state.

3) After the suspend operation of step 2 is complete, the backup program on the Domain 0 virtual machine copies the image file of the virtual machine to the backup storage device over SSH (Secure Shell protocol).

4) After the virtual machine image has been copied in step 3, the backup program on Domain 0 calls the virDomainResume function in the libvirt library to resume the virtual machine, which then runs normally.

5) The backup program on the Domain 0 virtual machine determines whether all virtual machines have been backed up. If so, it exits; otherwise it selects the next virtual machine and repeats steps 2 to 4.

In step 1, users can define a backup policy that suits their own needs and the hardware available to them, starting the backup program once or several times a day to back up the virtual machines.
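As a rough illustration of step 1, the sketch below parses a backup policy stored as XML and computes the next start time of the backup program. The description does not specify the file layout, so the element and attribute names (`backup-policy`, `start`, `times-per-day`, `vm`), the function names, and the even spacing of the daily runs are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta

# Hypothetical policy layout: the description only says that the policy records
# the start time, the number of backups per day, and the VMs to back up.
POLICY_XML = """
<backup-policy start="02:00" times-per-day="3">
  <vm>domain1</vm>
  <vm>domain2</vm>
</backup-policy>
"""

def parse_policy(text):
    """Return (start_hour, start_minute, runs_per_day, vm_names)."""
    root = ET.fromstring(text)
    hour, minute = (int(part) for part in root.get("start").split(":"))
    runs = int(root.get("times-per-day"))
    vms = [vm.text.strip() for vm in root.findall("vm")]
    return hour, minute, runs, vms

def daily_schedule(day, hour, minute, runs):
    """Start times for one day, assuming runs are spaced evenly over 24 hours."""
    first = day.replace(hour=hour, minute=minute, second=0, microsecond=0)
    step = timedelta(hours=24) / runs
    return [first + i * step for i in range(runs)]

def next_start(now, hour, minute, runs):
    """Earliest scheduled start strictly after `now`."""
    # A day's schedule can spill past midnight, so look at three days.
    candidates = []
    for offset in (-1, 0, 1):
        candidates += daily_schedule(now + timedelta(days=offset), hour, minute, runs)
    return min(t for t in candidates if t > now)
```

A policy execution program built along these lines would sleep until `next_start(...)` and then launch the backup program for each listed virtual machine.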

In step 2, the backup program can suspend the virtual machine through the interface provided by the libvirt virtualization library. Alternatively, the user can suspend the virtual machine from the command line, using either the virsh command line or the command-line tools that ship with Xen.

In step 3, the virtual machine image file is copied to the backup storage device, which is an external storage device, as shown in Fig. 1. To guard against failure of the computer system's own storage disk, the backup storage device must be separate from the storage device that holds the original virtual machine image; the two must not be stored together.
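The suspend, copy, resume cycle of steps 2 to 5 can be sketched as follows. This is a minimal illustration, not the program described in the patent: `FakeDomain` is a hypothetical stand-in for a libvirt domain handle (its `suspend()` and `resume()` methods play the role of `virDomainSuspend` and `virDomainResume`), and `copy_image` is a hypothetical callable representing the SSH copy to the backup storage device. Resuming the domain even when the copy fails is a design choice of this sketch, not something the description states.

```python
class FakeDomain:
    """Stand-in for a libvirt domain handle; records the calls made on it."""

    def __init__(self, name, image_path):
        self.name = name
        self.image_path = image_path
        self.log = []

    def suspend(self):
        # Plays the role of virDomainSuspend.
        self.log.append("suspend")

    def resume(self):
        # Plays the role of virDomainResume.
        self.log.append("resume")

def backup_domain(domain, copy_image):
    """Suspend the domain, copy its image, and always resume it afterwards."""
    domain.suspend()
    try:
        copy_image(domain.image_path)
    finally:
        # Resume even if the copy fails, so the guest is not left paused.
        domain.resume()

def backup_all(domains, copy_image):
    """Back up each domain in turn; return the names whose copy failed."""
    failed = []
    for domain in domains:
        try:
            backup_domain(domain, copy_image)
        except OSError:
            failed.append(domain.name)
    return failed
```

In a real deployment `copy_image` might wrap an `scp` invocation and the domain objects would come from a libvirt connection; both are left abstract here so that the control flow stays visible.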

Fig. 3 is a schematic structural diagram of a dual computer system after Xen virtual machines have been installed on Linux operating systems. Computer systems 1 and 2 each comprise a Domain 0. Domain 0 is a modified Linux kernel and a unique virtual machine running on the Xen hypervisor. It has permission to access physical I/O resources and interacts with the other virtual machines running on the system. It has special administrative privileges: it can create multiple virtual machines, manage virtual devices, and perform management tasks such as suspending a virtual machine. Computer system 2 is the disaster recovery node of computer system 1 and has a running environment similar to that of computer system 1. When the virtual machine Domain 1 on computer system 1 fails, the virtual machine Domain 1 on computer system 2 takes over the applications that were running in Domain 1 on computer system 1, so that the applications keep running without interruption. A periodic state detection program and an image copy program are installed in Domain 0 on computer system 1. A virtual machine disaster recovery program runs in Domain 0 on computer system 2. In a practical deployment, computer system 2, as a disaster recovery node, can provide disaster recovery for multiple system nodes like computer system 1; to simplify the description, however, computer system 2 here provides disaster recovery for computer system 1 only. The specific steps are as follows:

1) As shown in Fig. 4, the system disaster recovery program is started first, and the virtual machine that requires disaster recovery protection is selected. As shown in Fig. 3, the virtual machine to be protected is the virtual machine Domain 1 on computer system 1. After disaster protection is enabled, a new virtual machine Domain 1 is created on computer system 2 to act as the virtual machine disaster recovery node. The automatic image copy program running on Domain 0 of computer system 1 copies the image of the virtual machine Domain 1 running on computer system 1 to the storage device of computer system 2 over SSH. Computer system 2 starts the virtual machine Domain 1 and sets its state to paused.

2) After step 1 is complete, the detection packet sending program is started on Domain 0 of computer system 1. The first transmission must copy the complete state of Domain 1 of computer system 1 (including CPU instructions, disk buffer requests, memory events, network packets, and so on) to Domain 1 on computer system 2, where it is loaded into the memory of that Domain 1.

3) After step 2 is complete, the detection packet sending program on Domain 0 of computer system 1 periodically copies the memory pages that Domain 1 has modified, loads them into a state detection packet, and sends the packet to the virtual machine Domain 1 on computer system 2.

4) The virtual machine Domain 1 on computer system 2 periodically receives the state detection packets and loads the memory page information they carry. The state detection packets also make it possible to determine whether Domain 1 of computer system 1 has failed: when the state packets stop arriving, the virtual machine on the disaster recovery node is activated.
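Steps 2 to 4 amount to one full copy followed by incremental updates. The sketch below models guest memory as a dictionary mapping page numbers to bytes, which is a deliberate simplification: real dirty-page tracking is performed by the hypervisor, and the names used here (`full_checkpoint`, `dirty_checkpoint`, `apply_checkpoint`, `write_page`) are hypothetical.

```python
def full_checkpoint(memory):
    """First transmission: every page of the protected domain."""
    return dict(memory)

def dirty_checkpoint(memory, dirty_pages):
    """Later transmissions: only pages modified since the last checkpoint."""
    packet = {page: memory[page] for page in sorted(dirty_pages)}
    dirty_pages.clear()  # start tracking the next interval
    return packet

def apply_checkpoint(replica, packet):
    """Disaster recovery node: load the received pages into the standby domain."""
    replica.update(packet)

# Primary-side state for the illustration.
memory = {0: b"boot", 1: b"heap", 2: b"data"}
dirty = set()

def write_page(page, value):
    """Guest write: update the page and remember that it changed."""
    memory[page] = value
    dirty.add(page)
```

After the initial full copy, each periodic packet carries only the pages written since the previous one, so the standby's memory converges to the primary's at every checkpoint.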

In step 1, the virtual machine image is copied to the storage device of computer system 2; the path under which the image is stored on computer system 2 must be identical to the path under which it is stored on computer system 1.

As the Xen virtual machine has developed, its performance has steadily improved, and the performance gap between running systems and applications on a Xen virtual machine and running them on a physical machine keeps narrowing. In particular, starting with Xen 4.0, Xen includes the Remus module, which provides hot-standby functionality and is of great help for system disaster recovery. Steps 2 and 3 can be implemented with the Remus module of Xen. Remus can replicate the running state of a virtual machine accurately; it buffers all outgoing network packets and sends them on to their destination at 200 ms intervals.
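The buffering of outgoing network packets can be pictured with the simplified model below. It is not Remus code: the class and method names are hypothetical, and the rule that buffered packets are released once the standby has acknowledged the corresponding checkpoint reflects a common description of how Remus works rather than anything stated in this text.

```python
class OutputBuffer:
    """Holds outbound packets until the checkpoint covering them is acknowledged."""

    def __init__(self):
        self.pending = []   # packets produced in the current interval
        self.released = []  # packets actually sent on to their destination

    def send(self, packet):
        # The guest believes the packet has been sent; it is only queued here.
        self.pending.append(packet)

    def checkpoint_acknowledged(self):
        # The standby now holds the state that produced these packets,
        # so releasing them cannot expose state that might later be lost.
        self.released.extend(self.pending)
        self.pending.clear()

    def primary_failed(self):
        # Unreleased output is discarded; the standby resumes from the
        # last acknowledged checkpoint. Returns the number of dropped packets.
        dropped = len(self.pending)
        self.pending.clear()
        return dropped
```

With a 200 ms checkpoint interval, `checkpoint_acknowledged()` would be called roughly five times per second, which bounds the extra latency seen by external clients.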

Step 4 can be implemented with the virtual machine recovery module that ships with Xen. When no state detection packet from computer system 1 is received within the specified time, the system determines that computer system 1 has failed. Computer system 2 then automatically invokes the virtual machine recovery module to activate the virtual machine, thereby achieving disaster recovery of the system.
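The timeout-based failure decision of step 4 can be sketched as a small detector on the disaster recovery node. The class name, the explicit `now` arguments (used instead of a real clock so the logic is easy to follow), and the `activate_standby` callback are assumptions of this example; the description only says that the standby is activated when no packet arrives within the specified time.

```python
class FailureDetector:
    """Declares the primary failed when no state packet arrives in time."""

    def __init__(self, timeout, now):
        self.timeout = timeout    # the "specified time", in seconds
        self.last_seen = now      # arrival time of the latest state packet
        self.activated = False    # whether the standby has been activated

    def packet_received(self, now):
        self.last_seen = now

    def check(self, now, activate_standby):
        """Activate the standby once if the timeout has elapsed; return the status."""
        if not self.activated and now - self.last_seen > self.timeout:
            activate_standby()
            self.activated = True
        return self.activated
```

The `activated` flag ensures that the standby is brought up only once, even if `check` keeps being called after the primary has gone silent.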

The examples above describe the implementation of each part of the present invention in detail, but the specific implementation of the invention is not limited to them. For those skilled in the art, any obvious change made without departing from the spirit of the method of the invention and the scope of the claims falls within the protection scope of the present invention.

Claims

1. A method for achieving backup and disaster recovery of a system by using a virtual machine, characterized in that: a Xen virtual machine is installed on a Linux operating system, the operating system has carried out a backup operation, and the disaster recovery method comprises the steps of:

before disaster recovery is carried out, copying the image of the virtual machine in advance to a disaster recovery node, wherein, when disaster recovery starts, the disaster recovery node creates a new virtual machine from the image, and, when the original virtual machine fails, the new virtual machine takes over the running state of the original virtual machine, thereby ensuring that the applications running on the original virtual machine continue to run without interruption;

(24) when the virtual machine fails, restoring the virtual machine state on the disaster recovery node;

wherein the backup steps are:

(12) suspending the virtual machine;

(13) copying the virtual machine image;

(14) resuming the virtual machine;

wherein, in step (12), Domain 0 of the Xen virtual machine manages each virtual machine and, when the virtual machine state is to be backed up, calls the domain control function in the C function library of virtualization tools to suspend the virtual machine;

the operating system comprises a Domain 0; Domain 0 is a modified Linux kernel and a unique virtual machine running on the Xen hypervisor, which has permission to access physical I/O resources, interacts with the other virtual machines running on the system, creates multiple virtual machines, manages virtual devices, and performs the management task of suspending virtual machines; in this system, the computer system that the user needs is installed on the virtual machine Domain 1, multiple applications run in Domain 1, and the backup program and the policy execution program are installed in the Domain 0 virtual machine; the policy execution program is mainly responsible for executing the user-defined backup policy, and periodically invokes the backup program on Domain 0 by reading the backup policy information.

2. the method for claim 1, is characterized in that: in step (14), after (13) step completes, is called control domain function in the C function library of virtual instrument recover virtual machine running status by Xen virtual machine territory 0.

3. the method for claim 1, it is characterized in that: in step (23), new virtual machine on disaster recovery node will be consistent with former virtual machine state, therefore the running status of former virtual machine is needed to copy on new virtual machine, periodically former virtual machine state is copied on new virtual machine by state-detection bag, can also judge whether former virtual machine breaks down by state-detection bag.

4. the method for claim 1, it is characterized in that: in step (24), have no progeny when periodically detecting in bag in step (23), system can judge former virtual machine and break down, the new virtual machine of system meeting automatic activation takes over the running status of former virtual machine, thus guarantee system is normally run.

5. the method for claim 1, is characterized in that: in step (13), and the stand-by program on the virtual machine of described territory 0 can be copied on backup storage device by the image file of safety shell protocol mode by this virtual machine.