CN1276349C - Method for mirror backup of cluster platform cross parallel system - Google Patents
- Publication number
- CN1276349C (application number CN 03148518)
- Authority
- CN
- China
- Prior art keywords
- backup
- node
- file
- address
- service
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Hardware Redundancy (AREA)
Abstract
The present invention discloses a method for mirror backup of a cross-platform parallel system in a cluster. The method comprises at least the following steps: after initial setup, start the nodes to be backed up and send them a timed backup command; check whether the scheduled time has arrived, and if so proceed to the next step, otherwise keep waiting; generate the boot image used for remote backup and remotely restart the nodes; each node then automatically performs a system backup according to the system backup script contained in the boot image, and the complete system is copied in parallel to a remote node; after the backup, each node restarts and automatically returns to the system it was running before the backup. The invention addresses the problem of parallel mirror backup in large-scale clusters: during a backup, the hard disks of multiple nodes are backed up at system level simultaneously, and large volumes of disk data can be compressed with a designated compression method.
Description
Technical field
The present invention relates to computer system management technology, and in particular to a method for performing mirror backup in a cross-platform parallel system composed of a cluster of computers.
Background technology
A cluster is a set of mutually independent computers (also called nodes) interconnected by a high-speed network. A cluster system manages the cluster as a single system, making full use of the resources of every computer in the cluster while providing a parallel processing system for complex computation.
Backup of a cluster system has always been an important topic in cluster management. It differs from backing up a single computer node (hereinafter, a node) in three ways: (1) cluster backup is a multi-node backup, covering the data of many nodes; (2) cluster backup is a remote backup carried out over the network; (3) cluster backup must be concurrent, that is, many nodes are backed up at the same time, as if a single node were being backed up.
Cluster backup can be broadly divided into file-level backup and system-level backup. File-level backup covers ordinary file or directory backup. System-level backup covers file system backup, disk partition backup and hard disk backup; hard disk backup is also called "disk mirroring", "disk imaging" or "disk cloning". By how close to real time the data is backed up, backup can also be divided into synchronous and asynchronous backup. Synchronous backup typically relies on technologies such as RAID (redundant array of inexpensive disks) and high availability (HA). Synchronous system backup requires the entire software system to be backed up in real time while it runs; disaster-tolerant backup belongs to this category, and at present only large companies with very strong technical capabilities have mature disaster-tolerant backup technology.
Asynchronous backup places lower demands on timeliness: the state of the target system can be backed up after some delay, or the system can be temporarily stopped while it is backed up.
At present, asynchronous backup of a cluster mainly relies on an "online" mode in which client nodes back up to a backup server node, that is, nodes copy data over the network while they are up and connected. This works well for file-level backup but is almost useless for system-level mirror backup, because system-level backup must preserve the contents of the system boot sector, which ordinary copy commands cannot copy correctly, and because some system files cannot be accessed while the system is running. System-level mirror backup therefore needs a special approach, for example restarting from a system boot disk and performing the mirror backup in "offline" mode ("offline" relative to "online", not truly disconnected). Ghost is the typical example. Its drawback is equally obvious: a system boot disk must first be created on external media (a floppy disk).
Today the BIOS of most motherboards that support SCSI devices includes PXE (Preboot Execution Environment), which simplifies dynamic host address allocation. In addition, a Linux system can be pared down to a very small kernel plus a few supporting files, and on such a minimal operating system ordinary system commands are enough to carry out fairly complex operations such as compression, remote invocation, backup and recovery.
An "offline" backup technique that exploits both this newer hardware capability and the strengths of Linux would be a significant contribution over the prior art, provided that it supports system-level mirror backup, scheduled automatic backup and more than one compression method, works in parallel, and needs no human intervention.
Summary of the invention
The technical problem addressed by the invention is to provide a method for mirror backup of a cross-platform parallel system in a cluster. The method greatly reduces the operational complexity of backup and recovery and performs system backups in parallel, which speeds up backup and improves working efficiency.
The method of the invention for mirror backup of a cross-platform parallel system in a cluster mainly comprises the following steps:
1. After initial setup, start the nodes to be backed up and send the timed backup command;
2. Check whether the scheduled time has arrived; if not, keep waiting; if so, proceed to the next step;
3. Generate the boot image used for remote backup, which contains the system backup script, and remotely restart the nodes;
4. The nodes automatically perform the system backup according to the system backup script in the boot image, and the complete system is copied in parallel to the remote node;
5. After the backup completes, the nodes restart and automatically return to the system that was running before the backup.
The invention achieves parallel system backup of the nodes of a large-scale cluster, drawing on both the advantages of PXE and the characteristics of Linux. Its main advantages and innovations are:
It effectively solves the problem of parallel mirror backup in a large-scale cluster: during a backup, the hard disks of multiple nodes are backed up at system level simultaneously.
Hard disk data, which may be very large, is compressed with a designated compression method.
Whether the hard disk holds a Linux system or a Windows system, restoring from the backup returns the system to its state before the backup, and it works normally.
A timed backup function is included: at a designated time (such as a weekend or holiday) the concurrent system backup of many nodes runs automatically, and the system backup of all those nodes completes with no manual intervention.
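On Linux nodes the description relies on crond for this timed function. As a minimal sketch of what a timed entry could look like, the following builds a line in the standard five-field crontab layout (minute, hour, day of month, month, day of week). The script path is a hypothetical placeholder, not something the patent names.

```python
from datetime import datetime


def crontab_line(when: datetime, command: str) -> str:
    """Return a user-crontab entry that runs `command` at `when`.

    Fields: minute hour day-of-month month day-of-week command.
    The day-of-week field is left as '*' so the date alone decides.
    """
    return f"{when.minute} {when.hour} {when.day} {when.month} * {command}"


# Hypothetical example: trigger the backup at 02:30 on 5 July.
entry = crontab_line(datetime(2003, 7, 5, 2, 30), "/usr/local/bin/start_backup.sh")
print(entry)
```

Because a crontab entry carries no year, a line like this would fire again on the same date every year unless it is removed once the backup has run.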
Description of drawings
Fig. 1 is the overall flow chart of the method of the invention;
Fig. 2 is a flow chart of the remote backup in the method of the invention;
Fig. 3 is a flow chart of node restart after backup in the method of the invention.
Embodiment
For system administrators, backup is a recurring concern. Faced with a large number of servers, backing them up with Ghost or other conventional methods is a heavy workload: nodes are backed up one by one, efficiency is low, and the backup and recovery operations are complicated.
The invention mainly addresses parallel system backup, cross-platform backup, automatic backup and compressed backup of many nodes in a large cluster.
As shown in Fig. 1, after the console sends the system backup command to the nodes in parallel, the boot image of each node, which contains the system backup script, is generated automatically and almost simultaneously at the scheduled time. The nodes then restart almost at once, enter a minimal Linux operating system environment, and can communicate with the backup server. Each node automatically performs the system backup according to the system backup script: the complete system is copied in parallel to the remote backup server, and the backup file can be compressed with a designated compression method during the copy. When the backup finishes, the nodes automatically return to the system they were running before the backup. The whole process needs no human intervention.
When a hard disk is physically damaged or a partition suffers a software fault and the system can no longer boot, recovery with the invention quickly restores the original system, with no need to reinstall the operating system, then reinstall the application software, and finally carry out complex configuration.
Fig. 2 shows the detailed process of remote backup in the method of the invention.
First, restart nodes 1-N. After the power-on self-test, enter the CMOS setup screen of each node and adjust the boot sequence so that booting from "PXE" comes before every other device, ensuring that "PXE" is the first boot device. Then download the pxelinux.0 file and place it in the TFTP (Trivial File Transfer Protocol) root directory of the backup server.
Next, set up the services that the backup relies on:
Enable the background service on nodes 1-N that runs commands on a schedule. On a Linux node, starting the scheduling service only requires running "/etc/rc.d/init.d/crond restart". To have the scheduling service start at every boot, run "/sbin/chkconfig crond on" on the command line as the superuser. On a node running Windows 2000 Server, to allow programs to run at a specified time, start the "Task Scheduler" service under "Services" in "Administrative Tools".
Enable the RSH (remote shell) service on the backup server, and add the IP addresses of nodes 1-N to the RSH configuration file /root/.rhosts so that nodes 1-N can access the backup server through RSH.
Enable the TFTP service on the backup server so that other nodes can download files over the TFTP protocol. Running "/sbin/chkconfig tftp on" and "/etc/rc.d/init.d/xinetd restart" on the backup server's command line as the superuser is sufficient.
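As an illustration of the RSH step above, the sketch below renders the contents of a .rhosts file, in which each line names a trusted host followed by a user allowed to log in from it. The node addresses and the choice of the root user are assumptions made for the example.

```python
def render_rhosts(node_ips, user="root"):
    """Return .rhosts text with one 'host user' line per node."""
    return "".join(f"{ip} {user}\n" for ip in node_ips)


# Hypothetical addresses for nodes 1-3.
rhosts_text = render_rhosts(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print(rhosts_text, end="")
```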
Once the steps above are complete, the actual backup can proceed as follows.
The console sends the system backup command, together with the scheduled time, to nodes 1-N in parallel. On receiving the timed system backup command from the backup server, each node extracts the time and the command, writes them in the scheduler's format into the configuration file for scheduled commands, and restarts the crond service (Windows nodes run the scheduling command directly). As soon as the scheduled time arrives, nodes 1-N simultaneously pass information such as their own MAC address and IP address as parameters and, by remote invocation, carry out the following operations on the backup server.
Mount a shared file, initrd, on a temporary directory to produce a temporary file system that holds the shared boot modules, the communication modules and the essential backup and compression commands. If the temporary directory is already mounted, wait 1 second and try again. The initrd file is generated by a dedicated script and contains the shared modules, commands and library files.
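The retry rule in this step can be sketched as follows. The actual mount call is passed in as a function (`try_mount`, a hypothetical name), and the cap on attempts is an assumption added so the loop cannot spin forever; the patent specifies only the one-second wait.

```python
import time


def mount_with_retry(try_mount, retries=30, delay=1.0, sleep=time.sleep):
    """Call try_mount() until it returns True, pausing `delay` seconds between tries.

    Returns the number of attempts used. Raises TimeoutError if the
    temporary directory is still busy after `retries` attempts.
    """
    for attempt in range(1, retries + 1):
        if try_mount():
            return attempt
        sleep(delay)  # directory is in use by another node's request
    raise TimeoutError(f"mount point still busy after {retries} attempts")
```

Passing `sleep` as a parameter keeps the function easy to exercise without real delays; in production the default `time.sleep` would be used.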
Insert the commands that load the required modules, activate the network card and assign the node's IP address, and back up and compress the node's hard disk through RSH into the bin/init file under the temporary directory, thereby customising the init file executed by the minimal operating system. The last instruction in this file runs a special restart (reboot) command.
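To make the customised init file more concrete, here is a rough sketch that renders one. The patent does not list the exact commands, so `insmod`, `ifconfig`, `dd`, `gzip` and `rsh` are assumed stand-ins for "load module", "activate the network card", "system copy", "compress" and "remote call"; the device, driver and directory names are hypothetical, and a plain `reboot` stands in for the patent's special restart command.

```python
def render_init_script(node, ip, netmask, driver, disk, server, backup_dir):
    """Return the text of an init script for the minimal boot environment."""
    remote_file = f"{backup_dir}/{node}.img.gz"
    lines = [
        "#!/bin/sh",
        # Load the network card driver copied into the image.
        f"insmod {driver}",
        # Bring the interface up with the node's original address.
        f"ifconfig eth0 {ip} netmask {netmask} up",
        # Read the disk, compress it, and stream it to the server over rsh.
        f'dd if={disk} | gzip -c | rsh {server} "dd of={remote_file}"',
        # Stand-in for the patent's special restart command.
        "reboot",
    ]
    return "\n".join(lines) + "\n"


script = render_init_script(
    node="node1", ip="10.0.0.1", netmask="255.255.255.0",
    driver="/lib/e100.o", disk="/dev/hda",
    server="10.0.0.254", backup_dir="/backup",
)
print(script, end="")
```

The sketch hard-codes gzip for brevity; the patent allows the compression method to be chosen, which would mean swapping the middle stage of the pipe.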
Copy the corresponding adapter card driver module into a directory under the temporary directory.
Unmount the temporary file system, compress the customised initrd file, rename the compressed file to a boot image file named after the node, and place it in the TFTP root directory.
Generate a PXE boot configuration file according to the PXE rules; the file contains the name of the boot image file named after the node.
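As a sketch of what such a boot configuration file might contain, the following assumes PXELINUX (the pxelinux.0 loader mentioned earlier) and its convention of naming a per-node configuration file after the ARP type `01` plus the MAC address in lower-case hex with dashes. The kernel and image file names are hypothetical.

```python
def pxelinux_config_name(mac: str) -> str:
    """Per-node file name under pxelinux.cfg/, derived from the MAC address."""
    return "01-" + mac.lower().replace(":", "-")


def render_pxelinux_config(kernel: str, initrd_image: str) -> str:
    """Return a PXELINUX configuration that boots `kernel` with the node's image."""
    return (
        "DEFAULT backup\n"
        "LABEL backup\n"
        f"  KERNEL {kernel}\n"
        f"  APPEND initrd={initrd_image}\n"
    )


name = pxelinux_config_name("00:11:22:AA:BB:CC")
config = render_pxelinux_config("vmlinuz", "node1.img")
print(name)
print(config, end="")
```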
Generate the DHCP (Dynamic Host Configuration Protocol) configuration file "/etc/dhcpd.conf", which contains the IP address and MAC address information of nodes 1-N.
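A per-node entry in that file could look like the output of the sketch below, which assumes the ISC dhcpd `host` declaration syntax. The node name, addresses and the choice of pxelinux.0 as the boot file are illustrative assumptions.

```python
def render_dhcp_host(name, mac, ip, tftp_server, boot_file="pxelinux.0"):
    """Return a dhcpd.conf host block binding `mac` to `ip` for PXE boot."""
    return (
        f"host {name} {{\n"
        f"  hardware ethernet {mac};\n"   # MAC reported by the node
        f"  fixed-address {ip};\n"        # the node's own IP address
        f"  next-server {tftp_server};\n" # TFTP server (the backup server)
        f'  filename "{boot_file}";\n'    # boot loader fetched over TFTP
        "}\n"
    )


block = render_dhcp_host("node1", "00:11:22:aa:bb:cc", "10.0.0.1", "10.0.0.254")
print(block, end="")
```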
Wait for all of nodes 1-N to complete the steps above. If some node cannot complete them for any reason, wait for a set period and then force the process to continue.
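One way to express "wait for everyone, but only for so long" is a bounded wait over parallel tasks. The sketch below is an assumption about how this could be coded, not the patent's implementation; `prepare` is a hypothetical per-node function standing in for the preparation steps above.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def prepare_all(nodes, prepare, timeout):
    """Run prepare(node) for every node in parallel, waiting at most `timeout` seconds.

    Returns (ready, skipped): nodes that finished in time, and nodes that
    either failed or were still running when the wait ended.
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(nodes)))
    futures = {pool.submit(prepare, node): node for node in nodes}
    done, _not_done = wait(futures, timeout=timeout)
    # Do not block on stragglers; their threads finish in the background.
    pool.shutdown(wait=False)
    ready = sorted(futures[f] for f in done if f.exception() is None)
    skipped = sorted(node for node in futures.values() if node not in ready)
    return ready, skipped
```

The caller can then continue with the `ready` nodes and log the `skipped` ones, which matches the patent's rule of forcing the process onward after the waiting period.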
Check whether another DHCP server is running that might cause a conflict; if not, start the DHCP service on the backup server.
Send the restart command to nodes 1-N in parallel.
Fig. 3 is a flow chart of node restart after backup in the method of the invention.
After a node receives the restart command, it shuts down the operating system normally and restarts. Because the first boot item is PXE, the node first enters the PXE environment and broadcasts a message containing the MAC address of its network interface. When the DHCP server (the backup server) receives this MAC address, it checks whether the DHCP configuration file has a matching entry; if so, it finds the IP address matched to the MAC and returns the node's IP address, boot file and other information to the node. The node then derives a lookup name from this address and searches the DHCP server's root directory for the corresponding file; once that file is found, it looks up the boot image file corresponding to its MAC address in the TFTP service's root directory.
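The patent does not spell out the "mapping" the node applies to its address. Assuming it refers to the configuration search that PXELINUX performs, the sketch below lists the file names tried in order: a MAC-based name, then the IP address in upper-case hexadecimal with one digit dropped at a time, then `default`. Newer PXELINUX versions also try a client UUID first, which is omitted here.

```python
def pxelinux_candidates(mac: str, ip: str):
    """Return the pxelinux.cfg/ file names a node would try, in search order."""
    hex_ip = "".join(f"{int(octet):02X}" for octet in ip.split("."))
    names = ["01-" + mac.lower().replace(":", "-")]      # ARP type 01 + MAC
    names += [hex_ip[:n] for n in range(len(hex_ip), 0, -1)]  # C000025B, C000025, ...
    names.append("default")                              # last resort
    return names


candidates = pxelinux_candidates("88:99:AA:BB:CC:DD", "192.0.2.91")
for name in candidates:
    print(name)
```

Shortening the hexadecimal address lets one configuration file serve a whole subnet, while the MAC-based name targets a single node.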
Once the boot image file is found, the DHCP server sends the operating system kernel and the boot image file to the node over TFTP. The node loads the kernel and sets up a simple file system in memory. The operating system then runs the init file first and, after loading the necessary modules, executes the commands in the order defined by the script in the image file.
The script first configures the network interface with the node's original IP address, activates it and sets up the local route; after a few seconds of probing, the operating system has set up the routing table for external communication. The backup task then begins: a system copy command reads the data on the local hard disk, the RSH command sets up a remote call, and the remote call establishes a data channel (a pipe) to the backup server. Data is compressed before it passes through the channel, and the compressed data is stored at the far end of the channel.
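The read, compress and send pipeline can be illustrated with a streaming sketch. It uses Python's zlib as an assumed stand-in for the designated compression method, and in-memory buffers in place of the real disk device and the RSH channel, so that it runs anywhere.

```python
import io
import zlib


def backup_stream(source, sink, chunk_size=1 << 20, level=6):
    """Copy `source` to `sink` chunk by chunk, compressing on the way.

    Returns the number of uncompressed bytes read from the source.
    """
    compressor = zlib.compressobj(level)
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        sink.write(compressor.compress(chunk))
    sink.write(compressor.flush())  # emit whatever is still buffered
    return total


# Stand-ins: `disk` plays the local hard disk, `remote` the far end of the pipe.
disk = io.BytesIO(b"\x00" * 65536 + b"boot sector and data")
remote = io.BytesIO()
copied = backup_stream(disk, remote, chunk_size=4096)
print(copied, len(remote.getvalue()))
```

Processing fixed-size chunks keeps memory use constant regardless of disk size, which matters when the source is a whole hard disk.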
After all data has been read, a message is sent to the remote node reporting that the system backup has finished; the node's MAC address, IP address and related entries are deleted from the DHCP server, and the DHCP service is restarted. The script then runs the modified restart (reboot) command.
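Removing a node's entry from the DHCP configuration might be done as in the sketch below. It assumes the configuration uses simple ISC dhcpd `host` blocks without nested braces, and it only edits the text; restarting the DHCP service is left out.

```python
import re


def remove_dhcp_host(config_text: str, name: str) -> str:
    """Return config_text without the 'host <name> { ... }' block."""
    pattern = re.compile(
        r"^[ \t]*host\s+" + re.escape(name) + r"\s*\{[^}]*\}[ \t]*\n?",
        re.MULTILINE,
    )
    return pattern.sub("", config_text)


# Hypothetical two-node configuration.
conf = (
    "host node1 {\n  hardware ethernet 00:11:22:aa:bb:01;\n  fixed-address 10.0.0.1;\n}\n"
    "host node10 {\n  hardware ethernet 00:11:22:aa:bb:10;\n  fixed-address 10.0.0.10;\n}\n"
)
after = remove_dhcp_host(conf, "node1")
print(after, end="")
```

With the entry gone, the node no longer receives a PXE answer on its next boot, which is what lets it fall back to the hard disk.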
When the node restarts, PXE again looks for a dynamic IP address, but because there is no longer a corresponding DHCP configuration entry, the node does not obtain one. PXE normally has a timeout; once it expires, the node boots from the hard disk as usual, reads the boot information on the disk and enters the normal operating system.
Recovery is very similar to the backup process described above. Apart from using the corresponding restore commands and parameters, recovery differs from backup mainly as follows:
The parameters of the backup command include the name of the hard disk or partition to back up, whereas the corresponding parameter of the restore command is the image file name (or the tape device name);
If the backup was compressed, recovery must use the corresponding decompression command;
After the node has restarted and reached the point where the recovery task begins, the RSH command first sets up a remote call and establishes a data channel (a pipe) to the backup server. At the head of the channel a system copy command reads the remote data, the data is decompressed as it passes through the channel, and at the end of the channel the decompressed data is written to the local hard disk or partition.
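The restore direction of the channel can be sketched in the same streaming style: read compressed data, decompress it in transit, and write the result to the target. zlib is again an assumed stand-in for the chosen compression method, and in-memory buffers replace the remote image file and the local disk.

```python
import io
import zlib


def restore_stream(source, sink, chunk_size=1 << 20):
    """Copy compressed `source` to `sink`, decompressing chunk by chunk.

    Returns the number of decompressed bytes written to the sink.
    """
    decompressor = zlib.decompressobj()
    written = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        data = decompressor.decompress(chunk)
        sink.write(data)
        written += len(data)
    tail = decompressor.flush()  # any bytes still held back
    sink.write(tail)
    return written + len(tail)


# Stand-ins: `image` plays the remote backup file, `target` the local disk.
original = b"partition table" + b"\xff" * 50000
image = io.BytesIO(zlib.compress(original))
target = io.BytesIO()
restored = restore_stream(image, target, chunk_size=512)
print(restored)
```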
In short, the invention greatly simplifies system backup of many nodes in a cluster. It can back up a variety of operating systems in full, that is, it can take a complete mirror backup of multiple disk partitions or of whole disks regardless of which operating system is installed on them. The invention is a cross-platform, asynchronous, system-level mirror backup method suited to clusters.
Finally, it should be noted that the above embodiment only illustrates, and does not limit, the technical solution of the invention. Although the invention has been described in detail with reference to the above embodiment, those of ordinary skill in the art should understand that the invention may still be modified or equivalently replaced, and that any modification or partial replacement that does not depart from the spirit and scope of the invention falls within the scope of the claims.
Claims (6)
1. A method for mirror backup of a cross-platform parallel system in a cluster, characterized in that it comprises the following steps:
Step 1: after initial setup, start the nodes to be backed up and send the timed backup command;
Step 2: check whether the scheduled time has arrived; if not, keep waiting; if so, proceed to the next step;
Step 3: generate the boot image used for remote backup, which contains the system backup script, and remotely restart the nodes;
Step 4: the nodes automatically perform the system backup according to the system backup script in the boot image, and the complete system is copied in parallel to the remote node;
Step 5: after the backup completes, the nodes restart and automatically return to the system that was running before the backup.
2. The method for mirror backup of a cross-platform parallel system in a cluster according to claim 1, characterized in that step 3 further comprises:
enabling the background service on nodes 1-N that runs commands on a schedule;
enabling the remote shell service on the backup server, and adding the IP addresses of nodes 1-N to the remote shell configuration file so that nodes 1-N can access the backup server through the remote shell;
enabling the TFTP service on the backup server so that other nodes can download files over TFTP.
3. The method for mirror backup of a cross-platform parallel system in a cluster according to claim 2, characterized in that the console sends the system backup command, together with the scheduled time, to nodes 1-N in parallel; on receiving the timed system backup command from the backup server, each node extracts the time and the command, writes them in the scheduler's format into the configuration file for scheduled commands, and restarts the related service.
4. The method for mirror backup of a cross-platform parallel system in a cluster according to claim 3, characterized in that the following operations are carried out on the backup server by remote invocation:
mounting a shared file, initrd, on a temporary directory to produce a temporary file system;
inserting the commands that load the required modules, activate the network card and assign the node's IP address, and back up and compress the node's hard disk through RSH into the bin/init file under the temporary directory, thereby customising the init file executed by the minimal operating system;
copying the corresponding adapter card driver module into a directory under the temporary directory;
unmounting the temporary file system, compressing the customised initrd file, renaming the compressed file to a boot image file named after the node, and placing it in the TFTP root directory;
generating a PXE boot configuration file according to the PXE rules, the file containing the name of the boot image file named after the node;
generating a DHCP configuration file containing the IP address and MAC address information of nodes 1-N;
waiting for all of nodes 1-N to complete the steps above, and checking whether another DHCP server is running that might cause a conflict; if not, starting the DHCP service on the backup server;
sending the restart command to nodes 1-N in parallel.
5. The method for mirror backup of a cross-platform parallel system in a cluster according to claim 4, characterized in that, after a node receives the restart command, it shuts down the operating system normally and restarts; the node first enters the PXE environment and broadcasts a message containing the MAC address of its network interface; the DHCP server, which is the backup server, on receiving this MAC address checks whether the DHCP configuration file has a matching entry and, if so, finds the IP address matched to the MAC and returns the node's IP address, boot file and other information to the node; the node then derives a lookup name from this address and searches the DHCP server's root directory for the corresponding file; once that file is found, it looks up the boot image file corresponding to its MAC address in the TFTP service's root directory and carries out the corresponding operations.
6. The method for mirror backup of a cross-platform parallel system in a cluster according to claim 5, characterized in that the DHCP server sends the operating system kernel and the boot image file to the node over TFTP; the script first configures the network interface with the node's original IP address, activates it and sets up the local route; the backup task then begins;
after all data has been read, a message is sent to the remote node reporting that the system backup has finished, the node's MAC address, IP address and related entries are deleted from the DHCP server, and the DHCP service is restarted;
finally, the script runs the modified restart command.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 03148518 CN1276349C (en) | 2003-06-30 | 2003-06-30 | Method for mirror backup of cluster platform cross parallel system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 03148518 CN1276349C (en) | 2003-06-30 | 2003-06-30 | Method for mirror backup of cluster platform cross parallel system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1567198A CN1567198A (en) | 2005-01-19 |
CN1276349C true CN1276349C (en) | 2006-09-20 |
Family
ID=34472299
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 03148518 Expired - Lifetime CN1276349C (en) | 2003-06-30 | 2003-06-30 | Method for mirror backup of cluster platform cross parallel system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN1276349C (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102457541A (en) * | 2010-10-25 | 2012-05-16 | 鸿富锦精密工业(深圳)有限公司 | System and method for avoiding resource competition during starting diskless workstation |
CN102591750A (en) * | 2011-12-31 | 2012-07-18 | 曙光信息产业股份有限公司 | Recovery method of cluster system |
CN102664922A (en) * | 2012-03-30 | 2012-09-12 | 浪潮电子信息产业股份有限公司 | High-speed network starting method based on Linux system |
CN102707968A (en) * | 2012-04-12 | 2012-10-03 | 华平信息技术股份有限公司 | Method and system for generating installation backup system |
CN104407942A (en) * | 2014-11-28 | 2015-03-11 | 上海爱数软件有限公司 | Off-site storage based Linux operation system backup recovery method |
CN106487524B (en) * | 2015-08-27 | 2019-09-13 | 昆达电脑科技(昆山)有限公司 | The method of remote opening |
CN106326051A (en) * | 2016-08-22 | 2017-01-11 | 浪潮电子信息产业股份有限公司 | Method for realizing automatic switchover of OS (Operating System) in PXE (Pre-boot Execution Environment) testing environment |
CN108804253B (en) * | 2017-05-02 | 2021-08-06 | 中国科学院高能物理研究所 | Parallel operation backup method for mass data backup |
CN114079616B (en) * | 2021-11-02 | 2023-11-03 | 中国船舶重工集团公司第七0三研究所 | Redundancy method for database of non-hot standby disk array server |
Also Published As
Publication number | Publication date |
---|---|
CN1567198A (en) | 2005-01-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CX01 | Expiry of patent term | Granted publication date: 20060920 |