CN100589081C - UNIX environment system restoration method - Google Patents


Publication number
CN100589081C
CN100589081C CN200710037953A CN200710037953A CN100589081C CN 100589081 C CN100589081 C CN 100589081C CN 200710037953 A CN200710037953 A CN 200710037953A CN 200710037953 A CN200710037953 A CN 200710037953A CN 100589081 C CN100589081 C CN 100589081C
Authority
CN
China
Prior art keywords
disk array
domain
unix
file system
database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN200710037953A
Other languages
Chinese (zh)
Other versions
CN101261595A (en)
Inventor
辛旻
王磊
陈晓武
许能飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Baosight Software Co Ltd
Original Assignee
Shanghai Baosight Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Baosight Software Co Ltd
Priority to CN200710037953A
Publication of CN101261595A
Application granted
Publication of CN100589081C
Expired - Fee Related
Anticipated expiration


Abstract

The invention discloses a system recovery method for a UNIX environment that keeps the system running normally when the disk array fails, or when a database stored on the disk array crashes or files stored on it are lost or damaged, thereby reducing the impact of such faults on the system and preserving its effective operating rate. The method mainly comprises three steps: synchronization, switching, and recovery. Synchronization is the step of automatically copying the disk array to a local disk on a regular schedule during routine operation; switching is the step of moving the UNIX system from the disk array to the local disk so that it continues running when the disk array fails; and recovery is the step of moving the UNIX system from the local disk back to the disk array and returning to the normal state once the disk array has been restored to normal.

Description

System recovery method in a UNIX environment
Technical field
The present invention relates to a method for recovering a system in a UNIX environment when the system fails.
Background art
As disk array capacity keeps growing, storage virtualization and shared storage have become possible, so that one disk array can be shared by several or even more systems. On the other hand, the consequences of a disk array failure become ever more serious, and it is almost impossible to coordinate a time at which the disk array can be shut down for maintenance. A method is therefore needed that keeps the UNIX system running normally even when the disk array has a problem.
Although the reliability of disk arrays and databases continues to improve, any failure that does occur is far from minor. At present, disk array reliability is generally ensured by using two disk arrays, but this raises cost significantly and reduces performance somewhat. Databases generally use a cluster structure to improve reliability, but this depends on cluster software and on the disk array. Even when both measures are in place, it is still difficult to prevent program files from being damaged or lost, or to prevent faults such as a database crash.
Under present conditions, a disk array failure can only be handled by shutting down for maintenance, a database crash can only be repaired with the system shut down, and damaged or lost files can only be handled by shutting down and either locating the problem step by step or forcibly restoring everything. This causes immeasurable loss to the normal operation of important systems. In particular, with the advent of packaged software, one file system may hold hundreds of thousands or even millions of files, so problems are often hard to locate, and restoring from tape takes a very long time. A method is needed that quickly restores system operation and so buys time for locating the problem and for full recovery.
Summary of the invention
The technical problem to be solved by the present invention is to provide a system recovery method for a UNIX environment that keeps the system running normally when the disk array fails, or when a database stored on the disk array crashes or files are lost or damaged, thereby reducing the impact of these faults on the system and preserving its effective operating rate.
To solve this technical problem, the invention provides a system recovery method for a UNIX environment that mainly comprises three steps: synchronization, switching, and recovery. Synchronization is the step of automatically copying the disk array to the local disk on a regular schedule during routine operation. Switching is the step of moving the UNIX system from the disk array to the local disk so that it continues running when the disk array fails. Recovery is the step of moving the UNIX system from the local disk back to the disk array, returning to the normal state, after the disk array has been restored to normal.
Synchronization is initiated automatically on schedule by a synchronization job set up in advance, and further comprises the following steps:
(1) take a snapshot of the current file system on the disk array;
(2) mount the snapshot obtained in step (1) as a snapshot file system, then pack the snapshot file system and place the package in the backup file system on the local disk, where the local disk is a disk built into the computer on which the UNIX system runs, and where a backup file system and a file system corresponding to the disk array have been created on the local disk;
(3) remove the snapshot of the file system on the disk array;
(4) unpack the file package stored in the backup file system on the local disk, overwriting the file system on the local disk that corresponds to the disk array.
By adopting the above technical scheme, the invention has the following beneficial effects. A local disk of sufficient capacity is built into the computer, and snapshot technology is used during routine operation to synchronize the content of the local disk with the disk array automatically and periodically. When the disk array fails, the database crashes, or files are lost or damaged, the system is switched from the disk array to the local disk and continues running. After the fault has been cleared, the system is returned to its normal state of running on the disk array. The system can thus keep running normally through a major failure, time is gained for troubleshooting, the impact of the fault on the system is kept to a minimum, the effective operating rate of the system is assured, and cost is kept under control.
Brief description of the drawings
The present invention is explained in further detail below with reference to the drawings and embodiments:
Fig. 1 is a flow chart of the method of the invention;
Fig. 2 is a schematic diagram of synchronization according to the invention;
Fig. 3 is a schematic diagram of switching according to the invention;
Fig. 4 is a schematic diagram of recovery according to the invention.
Detailed description of the embodiments
Fig. 1 is a flow chart of the system recovery method in a UNIX environment according to the invention, which mainly comprises three steps: synchronization, switching, and recovery. Synchronization means automatically synchronizing the disk array and the local disk on a regular schedule during routine operation, that is, automatically copying the disk array to the local disk for use when switching. Switching means moving the UNIX system from the disk array to the local disk so that it continues running when the disk array fails. Recovery means moving the UNIX system from the local disk back to the disk array and returning to the normal state after the disk array has been restored to normal. In the present invention these three steps complement one another, and none can be omitted.
In the present invention, described unix system should be supported file system snapshot; Described this domain is the disk that is built in the computer that described unix system moves, and in order to ensure having enough capacity, suggestion uses two blocks of built-in disk mirror images as this domain.In order in described this domain, to deposit synchronizing content, should set up in advance on this this domain and the corresponding file system of disk array, be referred to as local file system in the present invention.In another embodiment, also should set up the backup file system in advance on described this domain, synchronously the time, depositing the file of packing temporarily, thereby can shorten lock in time, guarantee synchronous success; Described backup file system also is used to system to preserve a compress backup, guaranteeing can to return back to the front any one day, and can extract arbitrary file.In order to deposit described compress backup, this backup file system should guarantee that enough spaces are arranged; Disk array should be guaranteed to have the data snapshot of depositing between sync period in enough spaces and be changed, and described data snapshot variation is meant that system occurs in the variation in the disk array.In the present invention, also should use the system supervisor (crontab) of management timing operation in the unix system in system, to set up synchronization job and sync packet removal treatment in advance, wherein synchronization job is used for the daily synchronous working of timing automatic initiation enforcement, the bag that the sync packet removal treatment stays before being used for regularly initiating automatically to implement to clear up by retention strategy.For database, also should use crontab to set up the operation of database output journal in advance, thereby guarantee to spue daily record every certain interval time, make and lose the data that are no more than interval time 
when bust takes place, described interval time, I was made as 1 minute, but be made as generally speaking 5 minutes, the time of recovering with assurance can too much not prolong because journal file quantity.When guaranteeing that disk array damages, this domain can be according to the daily record restore data, and database should be made as dual logging, 1 part leaves on the disk array, 1 part leaves on this domain, wherein in the local log file system link should be set, to guarantee that switching the back daily record is same position.
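The three scheduled jobs described above can be sketched as crontab entries. The following minimal Python sketch only builds the entry text. The script paths, the 02:00 and 04:00 start times, and the function names are assumptions made for illustration and do not come from the patent. The minute list is written out explicitly because some traditional UNIX cron implementations may not accept the `*/5` step form.

```python
from typing import List


def minute_field(interval_minutes: int) -> str:
    """Return the crontab minute field for a job run every `interval_minutes`."""
    if interval_minutes < 1 or 60 % interval_minutes != 0:
        raise ValueError("interval must be a positive divisor of 60")
    if interval_minutes == 1:
        return "*"
    # Explicit list such as 0,5,10,...,55 instead of the */5 step form.
    return ",".join(str(m) for m in range(0, 60, interval_minutes))


def build_crontab(log_interval_minutes: int = 5) -> List[str]:
    """Build entries for the sync, cleanup, and database log output jobs.

    All script paths below are hypothetical placeholders.
    """
    log_minutes = minute_field(log_interval_minutes)
    return [
        # Daily synchronization at night, when the system is idle (assumed 02:00).
        "0 2 * * * /opt/recovery/sync_job.sh",
        # Daily removal of old packages under the retention policy (assumed 04:00).
        "0 4 * * * /opt/recovery/clean_packages.sh",
        # Database log output; the interval bounds the data lost in a sudden failure.
        f"{log_minutes} * * * * /opt/recovery/export_db_logs.sh",
    ]


if __name__ == "__main__":
    for entry in build_crontab():
        print(entry)
```

With the default 5-minute interval, the worst-case data loss after a sudden failure is bounded by roughly 5 minutes of transactions, which is the trade-off against the number of log files to replay that the text describes.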
As shown in Fig. 2, synchronization in the present invention is generally carried out at night when the system is idle, preferably once a day. This keeps the time needed for switching short without making routine synchronization too frequent. Synchronization is initiated automatically on schedule by the synchronization job set up in advance, and further comprises the following steps:
(1) take a snapshot of the current file system on the disk array. Because the UNIX system is an online system whose data files change constantly and which cannot be shut down for synchronization, the present invention uses snapshots to copy the system files on the disk array into the file system on the local disk. Taking a snapshot is very fast, and the AIX system guarantees the consistency of the files at the time of backup; in dozens of tests so far, the Oracle database, Tuxedo, and other software could all be started normally.
(2) pack the snapshot file system and place it on the local disk. That is, mount the snapshot obtained in step (1) as a snapshot file system, then pack the snapshot file system and place the package in the backup file system on the local disk. Packing makes the snapshot file system easier to store and to compress. If load and time permit, or if space in the backup file system is tight, the snapshot file system can also be compressed while it is packed.
(3) remove the snapshot of the file system on the disk array. Because the system keeps running and producing new data during synchronization, the snapshot should be removed immediately after packing is complete.
(4) unpack the files stored in the backup file system on the local disk, that is, unpack the file package and overwrite the corresponding file system on the local disk. If much has changed between synchronizations, the file system should be cleaned first and then unpacked; this cleaning can be done on schedule by the synchronization-package cleanup job set up in advance. Otherwise the package can be unpacked directly.
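The ordering of the four synchronization steps can be sketched as follows. This is a minimal illustration, not the patent's implementation: each step is passed in as a callable so that no particular snapshot or archive command is assumed, and the function and parameter names are hypothetical. Removing the snapshot in a `finally` clause even when packing fails is an added design assumption, motivated by the text's point that the snapshot should not be left in place while the system keeps writing data.

```python
from typing import Callable, List

Step = Callable[[], None]


def synchronize(
    take_snapshot: Step,
    mount_and_pack: Step,
    remove_snapshot: Step,
    unpack_to_local: Step,
    clean_local: Step = lambda: None,
    large_change: bool = False,
) -> None:
    """Run one synchronization cycle in the order given in the text."""
    take_snapshot()              # step (1): snapshot the disk array file system
    try:
        mount_and_pack()         # step (2): mount snapshot, pack into backup file system
    finally:
        remove_snapshot()        # step (3): assumed to run even if packing failed
    if large_change:
        clean_local()            # optional cleaning before unpacking
    unpack_to_local()            # step (4): unpack over the local file system


if __name__ == "__main__":
    log: List[str] = []
    synchronize(
        take_snapshot=lambda: log.append("snapshot"),
        mount_and_pack=lambda: log.append("pack"),
        remove_snapshot=lambda: log.append("remove snapshot"),
        unpack_to_local=lambda: log.append("unpack"),
    )
    print(log)
```

If packing raises an exception, the snapshot is still removed and the exception propagates, so the unpack step is skipped and the previous local copy is left untouched.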
Switching, shown in Fig. 3, is carried out when one of the following situations occurs: the disk array is damaged and is difficult to repair in a short time; the CLUSTER software has failed over successfully but the system is still abnormal; files are suspected to be damaged but are hard to locate; or a full restore from tape is confirmed to be necessary but users cannot wait. It also requires that synchronization is confirmed to have been running normally and that switching was rehearsed before the system went live. In the present invention, switching is realized by the following steps:
(1) comment out the relevant jobs that were set up in advance with crontab. To make recovery quicker, the synchronization job, the synchronization-package cleanup job, and the database log output job must be stopped while the system runs in the switched state.
(2) manually stop the CLUSTER software on both sides in forced (Force) mode. This ensures that the switched state is not disturbed, because it prevents the CLUSTER software from acting automatically. Stopping the CLUSTER software in Force mode also makes it unnecessary to manually change network interface addresses, mount file systems, and so on.
(3) stop all applications on the disk array, including the database. If the system is already abnormal at this point, it cannot be guaranteed that every application can be stopped and unmounted (umount), so processes may be killed (kill) where necessary.
(4) to avoid name conflicts, and to make it easier to repair the system on another machine, unmount the file systems on the disk array and stop using the disk array.
(5) because the system cannot mount two file systems with the same name, change the mount points of the corresponding file systems on the local disk, then mount these file systems again.
(6) recover the database on the local disk. The invention recovers the database from the logs output on schedule by the database log output job set up in advance; because a log is normally output every 5 minutes, this step carries out the recovery once the script is started. It is expected to be the most time-consuming part of the whole switch.
(7) start the other applications on the local disk.
(8) confirm that the system running on the local disk is normal.
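The order of the switching steps matters: for example, the local file systems cannot take over the production mount points until the disk array file systems have been unmounted. A minimal sketch of an ordered runbook follows. The step labels paraphrase the list above, while the `run_in_order` helper, its halt-on-first-failure behavior, and the `execute` callback are assumptions added for illustration.

```python
from typing import Callable, List, Optional, Tuple

SWITCH_STEPS: List[str] = [
    "comment out crontab jobs",
    "force-stop CLUSTER software on both sides",
    "stop applications and database on the disk array",
    "unmount disk array file systems",
    "remount local file systems at the production mount points",
    "recover database on the local disk from exported logs",
    "start other applications on the local disk",
    "confirm the system on the local disk is normal",
]


def run_in_order(
    steps: List[str], execute: Callable[[str], bool]
) -> Tuple[List[str], Optional[str]]:
    """Run steps in order; stop at the first one whose executor returns False.

    Returns the completed steps and the failed step (None if all succeeded).
    """
    done: List[str] = []
    for step in steps:
        if not execute(step):
            return done, step
        done.append(step)
    return done, None


if __name__ == "__main__":
    # Dry run: every step "succeeds" and is only printed.
    completed, failed = run_in_order(SWITCH_STEPS, lambda s: print(s) is None)
    print(len(completed), failed)
```

In practice the `execute` callback for step (3) would need a fallback that kills processes when a normal stop fails, as the text allows.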
Recovery, shown in Fig. 4, is carried out when the following conditions are met: the fault has been confirmed as repaired on another machine or in the development environment; the database has been brought into line with the currently running database 1 to 2 hours in advance, for example by hot standby; a planned downtime window of more than 2 hours is available; and the relevant technical staff are on site. In the present invention, recovery is realized by the following steps:
(1) stop all applications on the local disk, including the database. In this step it is best to ensure that every application stops normally, especially the database.
(2) because the system cannot mount two file systems with the same name, return the local file systems to their original state, that is, change them back to their original mount points and then mount them again.
(3) activate the disk array and mount its file systems.
(4) reverse-synchronize the relevant files. This mainly means copying to the disk array the small number of log files newly produced on the local disk after the recovery preconditions were met.
(5) recover the database on the disk array. Because the database has already been restored to a point shortly before the shutdown, few logs need to be applied and this step is fast.
(6) start the applications on the disk array.
(7) confirm that the system running on the disk array is normal.
(8) restart the CLUSTER software. Because the CLUSTER software was stopped in Force mode during switching, starting it brings up only the CLUSTER software itself.
(9) re-enable the relevant crontab jobs, including the synchronization job, the synchronization-package cleanup job, and the database log output job.
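Step (1) of switching and step (9) of recovery are inverse edits of the same crontab text: one comments the three jobs out, the other restores them. A minimal sketch of that round trip follows. The marker prefix, the function names, and the keyword matching are assumptions for illustration; the sketch works on a string and leaves reading and installing the crontab to the operator.

```python
from typing import Iterable

# Hypothetical marker, so that only lines disabled by the switch are re-enabled
# and ordinary comments in the crontab are left alone.
MARK = "#SWITCHED# "


def disable_jobs(crontab_text: str, keywords: Iterable[str]) -> str:
    """Comment out active entries whose text contains any of the keywords."""
    keys = tuple(keywords)
    out = []
    for line in crontab_text.splitlines():
        active = bool(line.strip()) and not line.lstrip().startswith("#")
        if active and any(k in line for k in keys):
            line = MARK + line
        out.append(line)
    return "\n".join(out)


def enable_jobs(crontab_text: str) -> str:
    """Restore entries previously commented out by disable_jobs."""
    return "\n".join(
        line[len(MARK):] if line.startswith(MARK) else line
        for line in crontab_text.splitlines()
    )


if __name__ == "__main__":
    text = "0 2 * * * /opt/recovery/sync_job.sh\n30 1 * * * /usr/local/bin/other.sh"
    switched = disable_jobs(text, ["sync_job", "clean_packages", "export_db_logs"])
    print(switched)
    print(enable_jobs(switched) == text)
```

Using a dedicated marker rather than a bare `#` keeps the re-enable step from uncommenting lines that were already comments before the switch.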

Claims (7)

1. A system recovery method in a UNIX environment, characterized in that it comprises three steps: synchronization, switching, and recovery; wherein synchronization is the step of automatically copying the disk array to the local disk on a regular schedule during routine operation; switching is the step of moving the UNIX system from the disk array to the local disk so that it continues running when the disk array fails; and recovery is the step of moving the UNIX system from the local disk back to the disk array, returning to the normal state, after the disk array has been restored to normal;
synchronization is initiated automatically on schedule by a synchronization job set up in advance, and further comprises the following steps:
(1) take a snapshot of the current file system on the disk array;
(2) mount the snapshot obtained in step (1) as a snapshot file system, then pack the snapshot file system and place the package in the backup file system on the local disk, where the local disk is a disk built into the computer on which the UNIX system runs, and where a backup file system and a file system corresponding to the disk array have been created on the local disk;
(3) remove the snapshot of the file system on the disk array;
(4) unpack the file package stored in the backup file system on the local disk, overwriting the file system on the local disk that corresponds to the disk array.
2. The system recovery method in a UNIX environment according to claim 1, characterized in that the local disk consists of two built-in disks.
3. The system recovery method in a UNIX environment according to claim 1, characterized in that the snapshot file system is compressed while it is packed in step (2).
4. The system recovery method in a UNIX environment according to claim 1, characterized in that, in step (4), the file package is unpacked after a synchronization-package cleanup job set up in advance has automatically cleaned the file system on the local disk on schedule.
5. The system recovery method in a UNIX environment according to claim 1, characterized in that switching further comprises:
(1) comment out the relevant jobs set up in advance with the UNIX system's scheduler for timed jobs;
(2) stop the CLUSTER software on both sides in forced mode;
(3) stop all applications on the disk array, including the database;
(4) unmount the file systems of the disk array and stop using the disk array;
(5) change the mount point of the file system on the local disk that corresponds to the disk array, and mount that file system again;
(6) recover the database on the local disk;
(7) start the other applications on the local disk;
(8) confirm that the UNIX system running on the local disk is normal.
6. The system recovery method in a UNIX environment according to claim 5, characterized in that step (6) recovers the database from the logs output on schedule by the database log output job.
7. The system recovery method in a UNIX environment according to claim 1, characterized in that recovery further comprises:
(1) stop all applications on the local disk, including the database;
(2) change the file system on the local disk that corresponds to the disk array back to its original mount point, and mount that file system again;
(3) activate the disk array for use and mount the file systems of the disk array;
(4) reverse-synchronize the relevant files;
(5) recover the database on the disk array;
(6) start the applications on the disk array;
(7) confirm that the UNIX system running on the disk array is normal;
(8) restart the CLUSTER software;
(9) re-enable the relevant jobs set up with the UNIX system's scheduler for timed jobs.
CN200710037953A 2007-03-09 2007-03-09 UNIX environment system restoration method Expired - Fee Related CN100589081C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200710037953A CN100589081C (en) 2007-03-09 2007-03-09 UNIX environment system restoration method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200710037953A CN100589081C (en) 2007-03-09 2007-03-09 UNIX environment system restoration method

Publications (2)

Publication Number Publication Date
CN101261595A CN101261595A (en) 2008-09-10
CN100589081C true CN100589081C (en) 2010-02-10

Family

ID=39962061

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200710037953A Expired - Fee Related CN100589081C (en) 2007-03-09 2007-03-09 UNIX environment system restoration method

Country Status (1)

Country Link
CN (1) CN100589081C (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102541685A (en) * 2011-11-16 2012-07-04 中标软件有限公司 Linux system backup method and Linux system repair method
CN103309763A (en) * 2013-07-04 2013-09-18 曙光信息产业(北京)有限公司 Method and device for protection of fault-tolerant mechanism of virtual machine
CN105630848B (en) * 2014-11-25 2020-05-22 中兴通讯股份有限公司 Processing method and device of file system
CN106412061A (en) * 2016-09-28 2017-02-15 上海爱数信息技术股份有限公司 Linux-based log folder remote transmission system
CN114237719B (en) * 2020-09-09 2023-11-28 中国联合网络通信集团有限公司 USB flash disk identification method, system, computer equipment and storage medium

Non-Patent Citations (10)

* Cited by examiner, † Cited by third party
Title
Overview of Digital UNIX cluster system architecture. Wayne M. Cardoza et al. Proceedings of COMPCON '96. 1996 *
UNIX备份与恢复. W. Curtis Preston, 51-56, 273, 机械工业出版社. 2003 *
基于Cluster的多服务器容错与切换技术的研究. 卢燕宁等. 微机发展, 第6期. 2001 *
存储备份技术探析. 韩德志等. 计算机应用研究, 第6期. 2004 *
文件系统SAN备份方案的研究与实现. 王征华等. 微电子学与计算机. 2005 *

Also Published As

Publication number Publication date
CN101261595A (en) 2008-09-10

Similar Documents

Publication Publication Date Title
CN101739313B (en) Method for protecting and restoring continuous data
CN100589081C (en) UNIX environment system restoration method
CN101770410B (en) System reducing method based on client operating system, virtual machine manager and system
CN102521071A (en) Private cloud-based virtual machine maintaining method
CN102681917B (en) A kind of operating system and restorative procedure thereof
US6981177B2 (en) Method and system for disaster recovery
CN103853837B (en) Oracle does not stop the table level back-up restoring method of Production database automatically
CN105677516B (en) A kind of back-up restoring method calculating the high efficient and reliable in storage cloud platform
CN101316184B (en) Disaster tolerance switching method, system and device
CN102792276A (en) Buffer disk in flashcopy cascade
CN103336728A (en) Disk data recovery method
CN102521083A (en) Backup method and system of virtual machine in cloud computing system
CN105550062A (en) Continuous data protection and time point browse recovery based data backflow method
CN102779080B (en) Method for generating snapshot, method and device for data recovery by using snapshot
CN102662751A (en) Method for improving availability of virtual machine system based on thermomigration
JP2006012121A (en) Data backup system and method
CN104461776A (en) Application disaster tolerance method based on CDP and iSCSI virtual disk technology
CN103176831A (en) Virtual machine system and management method thereof
CN102360323A (en) Method and system for self-repairing down of network server
CN109739686A (en) A kind of multiserver heat backup method, system, device and storage medium
CN104142943B (en) A kind of data-base capacity-enlarging method and a kind of database
CN104636218B (en) Data reconstruction method and device
JP6070146B2 (en) Information processing apparatus and backup method
CN110673982A (en) Shared mysql database backup and recovery method and device
CN105591801B (en) A kind of virtual network function VNF fault handling method and VNF management equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20100210