CN103793296A

CN103793296A - Method for assisting in backing-up and copying computer system in cluster

Info

Publication number: CN103793296A
Application number: CN201410006210.1A
Authority: CN
Inventors: 聂磊
Original assignee: Inspur Electronic Information Industry Co Ltd
Current assignee: Inspur Electronic Information Industry Co Ltd
Priority date: 2014-01-07
Filing date: 2014-01-07
Publication date: 2014-05-14

Abstract

The invention provides a method for secondary-backup replication of a computer system in a cluster. The method comprises: setting up a backup replication system in the cluster; arranging the primary copy and each backup copy in a hierarchical structure; when a failure of one of the copies is detected, replacing the failed copy with the copy on the next lower level; and regenerating the affected copy on the lowest level, so that the primary copy, the secondary copy and the secondary-backup copy are re-established. The backup replication system has at least one client, at least one node, a primary copy, a secondary copy and a secondary-backup copy. Compared with the prior art, the method improves environments for mission-critical and real-time applications, and is highly practical, widely applicable and easy to popularize.

Description

A method for secondary-backup replication of a computer system in a cluster

Technical field

The present invention relates to clustered computer system technology, and more specifically to a method for secondary-backup replication of a computer system in a cluster.

Background art

An inherent problem of cluster systems is their potential vulnerability to failures. When a single node in the cluster crashes, the availability of the whole system may be affected. To increase the reliability of the system, redundancy is introduced, usually by replicating components. When a service or process in a distributed system is replicated, the state of every copy must be kept consistent. This consistency is guaranteed by a specific replication protocol. There are different ways of organizing the copies of a process, generally distinguished as active, passive and semi-active replication.

In active replication, also called the state-machine approach, every copy processes the requests from the clients and sends replies. The copies behave independently, and the technique consists of ensuring that all copies receive the requests in the same order. In the event of a crash, this technique has a low response time. However, because all copies process all requests in parallel, it incurs a significant run-time overhead, which makes it an impractical choice for high-availability solutions in commercial applications.

In passive replication, also called primary-backup, one of the copies, called the primary, receives the requests from the clients and returns the responses. The backups interact only with the primary and receive state-update messages from it. If the primary fails, a backup takes over. Unlike active replication, passive replication needs less processing power and makes no assumption about the determinism of request processing. However, the response time increases significantly in the event of a failure, which makes it unsuitable for time-critical applications.

Semi-active replication circumvents the non-determinism problem of active replication in the context of time-critical applications. The technique extends active replication with the concepts of leader and follower. Although all copies actually process the requests, it is the responsibility of the leader to perform the non-deterministic parts of the processing and to inform the followers. The technique is close to active replication, with the difference that non-deterministic processing is possible. However, a significant recovery-time overhead is incurred when the primary copy fails.
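The leader/follower division of labour in semi-active replication can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the class `Replica`, the function `handle` and the use of a random draw as the "non-deterministic part" are hypothetical and do not come from the patent text.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of the replicated process (hypothetical model)."""
    name: str
    log: list = field(default_factory=list)

    def apply(self, request: str, decision: int) -> str:
        # Deterministic part: every copy runs this with identical inputs.
        entry = f"{request}:{decision}"
        self.log.append(entry)
        return entry


def handle(leader: Replica, followers: list, request: str, rng: random.Random) -> str:
    # Only the leader performs the non-deterministic step (here: a random draw).
    decision = rng.randrange(1000)
    reply = leader.apply(request, decision)
    # Followers are informed of the decision instead of making their own.
    for follower in followers:
        follower.apply(request, decision)
    return reply


leader = Replica("leader")
followers = [Replica("follower-1"), Replica("follower-2")]
rng = random.Random(7)
for req in ["open", "write", "close"]:
    handle(leader, followers, req, rng)
```

Because the followers reuse the leader's decision, all logs stay identical even though the processing contains a non-deterministic step.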

Summary of the invention

The technical task of the present invention is to overcome the deficiencies of the prior art and to provide a method for secondary-backup replication of a computer system in a cluster.

The technical solution of the present invention is realized as follows. A method for secondary-backup replication of a computer system in a cluster is implemented by the following process:

A backup replication system is set up in the cluster; the system has at least one client, at least one node, a primary copy, a secondary copy and a secondary-backup copy;

The primary copy and each backup copy are arranged in a hierarchical structure;

When a failure of one of the copies is detected, the failed copy is replaced by the copy on the next lower level;

The affected copy on the lowest level is regenerated by replication, so that the primary copy, the secondary copy and the secondary-backup copy are re-established.

When the failed copy is the secondary copy, the secondary-backup copy is promoted to be the new secondary copy, the group is reconfigured, and a new secondary-backup copy is started.

When the failed copy is the secondary-backup copy, the secondary copy clones itself to form a new secondary-backup copy.

The replicated copy is a single operating system, namely an image of the AIX or Linux operating system.
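The replacement rule above (a failed copy is replaced from the next lower level, and the lowest level is regenerated) can be sketched in Python as follows. This is a simplified model, not the patented implementation: the names `ROLES`, `new_copy`, `build_group` and `fail_over` are hypothetical, and a copy is reduced to a string identifier.

```python
from itertools import count

# Hierarchy levels, highest first (level 0 is the primary copy).
ROLES = ("primary", "secondary", "secondary_backup")
_ids = count(1)


def new_copy() -> str:
    """Create an identifier for a freshly started copy (hypothetical helper)."""
    return f"copy-{next(_ids)}"


def build_group() -> list:
    """Return one copy per hierarchy level."""
    return [new_copy() for _ in ROLES]


def fail_over(group: list, failed_level: int) -> list:
    """Replace the failed copy by shifting every lower level up by one,
    then regenerate the lowest level with a new copy."""
    if not 0 <= failed_level < len(group):
        raise ValueError("unknown hierarchy level")
    survivors = group[:failed_level] + group[failed_level + 1:]
    return survivors + [new_copy()]


group = build_group()
after_primary_failure = fail_over(group, 0)
```

After a primary failure the former secondary copy occupies level 0, the former secondary-backup copy occupies level 1, and a new copy fills the lowest level, so the group always returns to three copies.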

Compared with the prior art, the beneficial effects of the present invention are as follows:

The method of the invention for secondary-backup replication of a computer system in a cluster adopts a semi-active replication arrangement between the primary and secondary copies, combined with a secondary-backup relationship. This provides fast recovery or failover in the cluster system while guaranteeing a low run-time overhead and an instant failover capability. For a cluster or system whose processes are replicated in this way, continuous availability can be guaranteed, and the response and recovery times in the event of a failure are significantly reduced. This improves environments for mission-critical and real-time applications. The method is practical and easy to popularize.

Brief description of the drawings

Fig. 1 is a structural diagram of an embodiment of the invention.

Fig. 2 is a schematic diagram of the clustered computer system of the embodiment, showing the nodes, a client and a communication channel.

Fig. 3 is a flowchart of the handling of a primary-copy failure in an embodiment of the invention.

Fig. 4 is a flowchart of the handling of a failure of the current secondary copy in an embodiment of the invention.

Detailed description of the embodiments

The method of the invention for secondary-backup replication of a computer system in a cluster is described in detail below with reference to the accompanying drawings.

The main object of the invention is a replication scheme, "secondary-backup replication", that makes no assumption about the determinism of request processing while at the same time reducing the run-time and recovery-time overhead. This makes it suitable for high-availability and fault-tolerance management of mission-critical and time-critical applications.

Another object of the invention is a new replication technique for clustered computing systems, referred to as "secondary backup". In this technique, a process or a node in a computer cluster is replicated into a group of three copies, or cloned into three process copies, which participate in the secondary-backup protocol. Besides the classical "primary" and "secondary" roles, the technique introduces a new role, referred to as "secondary backup" or "backup". The secondary-backup copy is a copy of a process of the process group, or of the system, that serves as a hot standby for the secondary copy. The primary and secondary copies participate in a semi-active replication protocol, while a relationship similar to passive replication holds between the secondary copy and the secondary-backup copy.

A further object of the invention is to introduce a low-overhead protocol between the secondary copy and the third copy, that is, the secondary-backup copy. In addition, only one "follower" ever participates in the part of the scheme that uses semi-active replication.

The invention provides a method for secondary-backup replication of a computer system in a cluster, which is implemented as follows:

The replicated copy is a single operating system, namely an image of the AIX or Linux operating system.

Embodiment.

As shown in Fig. 1, the clustered computer system has one or more clients 12a to 12N, a communication system 13 and 14, nodes 16a to 16n, a disk bus 18, and one or more shared disks 20a to 20n. Other clusters in which the invention can be used may look quite different, depending on the number of processors, the network used, the choice of disks, and so on. A client 12 is a processor that can access the nodes 16 over a local area network, such as the private LAN shown at 13 or the public LAN shown at 14. Each client 12 runs a "front end" or client application that queries the server application running on the cluster nodes 16. As also shown in Fig. 1, each node 16 has access to one or more shared external disk devices 20. Each disk device 20 may be physically connected to multiple nodes. The shared disks store mission-critical data and are usually configured for data redundancy. The nodes 16 form the core of the cluster system 10. A node 16 is a processor that runs the high-availability and fault-tolerance management software and the application software.

A new replication management technique, secondary-backup replication, is disclosed for managing a group of process copies in a high-availability distributed system. In secondary-backup replication, one copy serves as a backup of the secondary copy rather than of the primary copy. This differs from the usual primary-backup approach, in which the third copy would back up the primary copy.

Fig. 2 shows the secondary-backup replication group of the cluster, consisting of a client 1 and three copies 4, 5 and 6. Each copy can be regarded as a single process or container running on a single computer system or LPAR image. A copy may also represent a single operating-system image, such as an AIX or Linux image. The three copies 4, 5 and 6 can also be regarded as three independent processes running on one computer system. All client requests are processed by both the primary copy 4 and the secondary copy 5, but only the primary copy 4 is responsible for handling all non-deterministic operations. The secondary copy 5 is then forced to make the same decisions as the primary copy 4. The secondary copy 5 periodically updates the secondary-backup copy 6 by checkpointing its state changes to it, which minimizes the effect of the secondary-backup copy 6 on the run-time overhead of the cluster.
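Normal operation of the three copies can be sketched as follows. The sketch rests on assumptions that are not in the patent text: the class `Member`, the function `run`, the key/value request format and the checkpoint interval `CHECKPOINT_EVERY` are all hypothetical.

```python
import copy
from dataclasses import dataclass, field

CHECKPOINT_EVERY = 2  # hypothetical interval, counted in processed requests


@dataclass
class Member:
    """One copy in the replication group (hypothetical model)."""
    role: str
    state: dict = field(default_factory=dict)
    processed: int = 0

    def process(self, key: str, value: int) -> None:
        self.state[key] = value
        self.processed += 1


def run(requests):
    primary = Member("primary")
    secondary = Member("secondary")
    backup = Member("secondary_backup")
    for i, (key, value) in enumerate(requests, start=1):
        # Both the primary and the secondary copy process every request.
        primary.process(key, value)
        secondary.process(key, value)
        # The secondary-backup copy processes nothing; it only receives a
        # periodic checkpoint of the secondary copy's state.
        if i % CHECKPOINT_EVERY == 0:
            backup.state = copy.deepcopy(secondary.state)
    return primary, secondary, backup


primary, secondary, backup = run([("a", 1), ("b", 2), ("c", 3)])
```

With three requests and a checkpoint after every second one, the secondary-backup copy holds the state as of the second request and has done no request processing of its own, which is what keeps its run-time cost low.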

Normally, the failure of a copy in a group changes the composition of the group and triggers a view change. The failure of a copy, or the resulting loss of data, is handled differently depending on the role of the failed copy. Because the secondary-backup copy 6 does not participate in any group interaction beyond receiving state updates, its failure is completely transparent to the rest of the group.

Fig. 3 is a flowchart of a method in which a failure of the primary copy 4 is detected. In step 9, the failure of the primary copy is detected. In step 10, once the failed primary copy 4 has been detected, the secondary copy 5 takes over instantly and continues the computation, assuming the role of the primary copy 4. At 12, the first thing the secondary copy 5 does is replay any pending events that it received from the failed primary copy 4, which brings it up to date with the last known state of the primary copy 4. The secondary copy 5 then continues execution, synchronizes itself with the secondary-backup copy 6, and processes all subsequent pending events. Then, at 13, the secondary-backup copy 6 is promoted to the new secondary role.
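The takeover sequence for a primary failure can be sketched in Python. This is a minimal model under assumptions: the class `Member`, the function `take_over` and the representation of state as a list of applied events are hypothetical and only illustrate the order of the steps.

```python
from collections import deque


class Member:
    """One copy in the replication group (hypothetical model)."""

    def __init__(self, role: str):
        self.role = role
        self.state = []          # events applied so far
        self.pending = deque()   # events received but not yet applied

    def apply(self, event: str) -> None:
        self.state.append(event)


def take_over(secondary: Member, backup: Member):
    # 1. Replay pending events received from the failed primary copy, so the
    #    secondary copy reaches the primary copy's last known state.
    while secondary.pending:
        secondary.apply(secondary.pending.popleft())
    # 2. The secondary copy continues as the new primary copy.
    secondary.role = "primary"
    # 3. Synchronize the secondary-backup copy, then promote it to secondary.
    backup.state = list(secondary.state)
    backup.role = "secondary"
    return secondary, backup


secondary = Member("secondary")
secondary.state = ["e1", "e2"]
secondary.pending.extend(["e3", "e4"])
backup = Member("secondary_backup")
backup.state = ["e1"]

new_primary, new_secondary = take_over(secondary, backup)
```

Replaying before promotion matters: the secondary-backup copy is synchronized from a copy that already reflects every event the failed primary copy had issued.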

Fig. 4 is a flowchart of a process in which a failure of the current secondary copy 5 is detected. If the current secondary copy 5 fails, the failure is detected at 14. In step 15, the secondary-backup copy 6 promotes itself to the secondary role. If extra resources are available, a new copy is started to take over the role of the secondary-backup copy 6 and the group is reconfigured, which restores the original degree of replication.

A process in which a failure of the secondary-backup copy 6 is detected is as follows. The failure of the secondary-backup copy 6 does not affect the cluster state, because this copy does not participate in the processing of requests and responses. At 18, the secondary copy clones itself, if possible, to establish a new secondary-backup copy 6.
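The two backup-side failure cases (failure of the secondary copy and failure of the secondary-backup copy) both end with a new secondary-backup copy being created only when resources allow. A minimal sketch, assuming a hypothetical `recover` function, a group modelled as a dictionary from role to state, and a hypothetical `spare_nodes` count standing in for "extra resources":

```python
import copy


def recover(group: dict, failed_role: str, spare_nodes: int) -> dict:
    """Return the group after a secondary or secondary-backup failure.

    `group` maps a role name to that copy's state (None = no copy running).
    How a promoted backup catches up from its last checkpoint is not modelled.
    """
    group = dict(group)
    if failed_role == "secondary":
        # The secondary-backup copy promotes itself to the secondary role.
        group["secondary"] = group["secondary_backup"]
        group["secondary_backup"] = None
    elif failed_role == "secondary_backup":
        # Transparent to the rest of the group: only the backup slot empties.
        group["secondary_backup"] = None
    else:
        raise ValueError("this sketch only covers the two backup-side roles")
    if spare_nodes > 0:
        # The secondary copy clones itself to restore the replication degree.
        group["secondary_backup"] = copy.deepcopy(group["secondary"])
    return group


group = {"primary": {"x": 1}, "secondary": {"x": 1}, "secondary_backup": {"x": 1}}
restored = recover(group, "secondary", spare_nodes=1)
degraded = recover(group, "secondary_backup", spare_nodes=0)
```

In the `degraded` case the group keeps running with two copies, which mirrors the "if possible" condition in the text: the primary and secondary copies are unaffected by the loss of the secondary-backup copy.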

The above are only embodiments of the present invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A method for secondary-backup replication of a computer system in a cluster, characterized in that it is implemented as follows:

2. The method for secondary-backup replication of a computer system in a cluster according to claim 1, characterized in that: when the failed copy is the secondary copy, the secondary-backup copy is promoted to be the new secondary copy, the group is reconfigured, and a new secondary-backup copy is started.

3. The method for secondary-backup replication of a computer system in a cluster according to claim 1, characterized in that: when the failed copy is the secondary-backup copy, the secondary copy clones itself to form a new secondary-backup copy.

4. The method for secondary-backup replication of a computer system in a cluster according to claim 2 or 3, characterized in that: the replicated copy is a single operating system, namely an image of the AIX or Linux operating system.