CN103856357A

CN103856357A - Stack system fault processing method and stack system

Info

Publication number: CN103856357A
Application number: CN201410106836.XA
Authority: CN
Inventors: 董琴
Original assignee: Maipu Communication Technology Co Ltd
Current assignee: Maipu Communication Technology Co Ltd
Priority date: 2014-03-21
Filing date: 2014-03-21
Publication date: 2014-06-11
Anticipated expiration: 2034-03-21
Also published as: CN103856357B

Abstract

The invention discloses a stack system fault processing method and a stack system, relating to the technical field of communications. They prevent service interruption and topology oscillation in the stack system and ensure its fault tolerance and stability. The method mainly comprises the following steps: a local member device determines, according to the device information of all member devices in the stack system, whether a faulty device exists; if a faulty device exists, the local member device determines, according to its own device information and that of the master device, whether the local member device itself is faulty; and if it is faulty, the local member device sets itself to the isolated state and forwards the data service inside the stack system. The embodiments of the invention are mainly used to handle faults caused by faulty member devices while physical communication within the stack system is normal.

Description

Stack system fault processing method and stack system

Technical field

The present invention relates to the technical field of communications, and in particular to a stack system fault processing method and a stack system.

Background technology

In communication networks, redundant links are usually designed in to increase reliability. To eliminate the loops that redundant design causes, the prior art proposed stacking technology, which connects multiple network devices (usually switches) through stack cables to form a stack system, so as to provide as many ports as possible in limited space. The network devices in a stack system are called its member devices. As requirements for network stability and device reliability keep rising, the applicant proposed Virtual Switching Technology (VST) on the basis of traditional stacking technology. VST is a technology, based on the stack system, that virtualizes multiple physical devices into a single virtual device.

Specifically, the virtual device formed by VST is equivalent to one switch in the network and connects to peripheral devices through aggregated links. A stack system (Stacking System, SS) is usually formed by multiple identically configured devices connected by stack links, and is presented externally as one larger virtual device. Among the devices participating in stacking, one is the master device (Master) and all the others are slave devices (Slave). The Master is in the active state (Active), acts as manager and controller, and its configuration takes effect; a Slave is in the standby state (Standby), and its configuration does not take effect.

However, a stack system usually requires that only devices of the same series, or devices with the same hardware specification, be stacked together. When a device cannot form the stack normally because its specification differs, fault processing must be performed on it. The existing practice is that, when a faulty device is found, the master device shuts it down, or it is shut down manually, so that the faulty device can be reconfigured or replaced with a new device.

For example, Fig. 1 is a schematic diagram of a typical stack system. The stack system consists of six devices. Suppose device 2 is the Master and devices 1, 3, 4, 5 and 6 are Slaves, and name this stack system SS-0. In this stack system, the hardware specifications of device 4 and device 6 are inconsistent with that of the master device. In this case member devices 4 and 6 cannot join the stack system, and manual intervention is needed to replace the faulty devices.

When the faulty devices are shut down, the service of the whole stack system may be interrupted, and the stack system may even split. Specifically, in Fig. 2, device 4 and device 6 are shut down because their hardware specifications are inconsistent with that of the master device. Once the stack links of device 4 and device 6 are closed, the related stack links are disconnected, and only devices 1, 2 and 3 remain in the original stack system SS-0. Device 5, having lost its connection to the master device, forms a new stack system on its own (named SS-1) and is elected master of SS-1, so its configuration takes effect. The network then contains two stack systems with identical configuration, which causes a multi-active conflict.
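
The split described above can be reproduced with a few lines of Python. This is only an illustrative sketch: it assumes the six devices are cabled as a chain in the order 1-2-3-4-5-6 (an order inferred from the text, since Fig. 1 is not reproduced here), and the names `surviving_segments`, `CHAIN`, `SEGMENTS` and `MULTI_ACTIVE` are hypothetical.

```python
def surviving_segments(chain, closed):
    """Split a chain of device IDs into the runs that remain connected."""
    segments, current = [], []
    for device in chain:
        if device in closed:
            # A closed device breaks the chain at this position.
            if current:
                segments.append(current)
            current = []
        else:
            current.append(device)
    if current:
        segments.append(current)
    return segments


# Assumed cabling order for Fig. 1 (an assumption, not taken from the figure).
CHAIN = [1, 2, 3, 4, 5, 6]
SEGMENTS = surviving_segments(CHAIN, closed={4, 6})

# Each surviving segment elects its own master, so more than one segment
# means more than one active configuration (the multi-active conflict).
MULTI_ACTIVE = len(SEGMENTS) > 1
```

Under that assumption, closing devices 4 and 6 leaves one segment holding devices 1, 2 and 3 and a second segment holding only device 5, which matches SS-0 and SS-1 in the text.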

In the course of implementing the fault processing described above, the inventor found at least the following problem in the prior art: when several devices in a stack system are shut down because of faults, the service of the whole stack system may be interrupted and the stack topology may even oscillate, so stability is poor.

Summary of the invention

Embodiments of the invention provide a stack system fault processing method and a stack system that avoid service interruption or topology oscillation in the stack system and ensure its fault tolerance and stability.

To achieve the above object, embodiments of the invention adopt the following technical solutions:

One aspect of the invention provides a stack system fault processing method, comprising:

A local member device determines, according to the device information of all member devices in the stack system, whether a faulty device exists;

If a faulty device exists, the local member device determines, according to its own device information and the device information of the master device, whether the local member device itself is faulty;

If the local member device is faulty, it sets itself to the isolated state and forwards the data service inside the stack system.

Another aspect of the invention provides a stack system, comprising:

Multiple member devices, one of which is the master device, each of the multiple member devices being able to act as the local member device;

The local member device is configured to: determine, according to the device information of all member devices in the stack system, whether a faulty device exists;

With the stack system fault processing method and stack system provided by the embodiments of the invention, when a faulty device in the stack system is inconsistent with the master device in hardware specification or hardware form, the faulty device is allowed to join the stack system in the isolated state, provided the stack links can be connected normally, and data service inside the stack system is forwarded through the hardware forwarding table. The stack system can therefore form normally, which avoids service interruption or topology oscillation and ensures the fault tolerance and stability of the stack system.

Brief description of the drawings

To describe the technical solutions in the embodiments of the invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a schematic diagram of the composition of a stack system;

Fig. 2 is a schematic diagram of the stack system fault processing method described in the background;

Fig. 3 is a flow chart of a stack system fault processing method in Embodiment 1 of the invention;

Fig. 4 is a flow chart of a stack system fault processing method in Embodiment 2 of the invention;

Fig. 5 is a schematic diagram of the composition of a stack system in Embodiment 3 of the invention.

Embodiment

The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments that a person of ordinary skill in the art obtains from these embodiments without creative effort fall within the protection scope of the invention.

Embodiment 1

This embodiment of the invention provides a stack system fault processing method. As shown in Fig. 3, the method may comprise:

101. The local member device determines, according to the device information of all member devices in the stack system, whether a faulty device exists.

The stack system comprises multiple member devices, only one of which is the master device; the configuration of the master device takes effect and realizes the external function of the stack system. The member devices other than the master device are standby devices and are in the standby state. The configuration of a standby device does not take effect. Only when the master device fails, or for some other reason cannot perform the external function of the stack system, is one of the standby devices selected to replace it as master device.

In this embodiment, the local member device may be any one of the member devices in the stack system, whether the master device or a standby device.

Determining whether a faulty device exists according to the device information of all member devices in the stack system comprises: obtaining the device information of all member devices in the stack system; judging whether the device information of every member device is consistent with the device information of the master device; if it is, determining that no faulty device exists in the stack system; if the device information of at least one member device is inconsistent with that of the master device, determining that a faulty device exists in the stack system.

102. If a faulty device exists, the local member device determines, according to its own device information and the device information of the master device, whether the local member device itself is faulty.

Specifically, whether the local member device is faulty is determined as follows: if the local member device's own device information is consistent with that of the master device, the local member device is not the faulty one; if it is inconsistent with that of the master device, the local member device is determined to be faulty.
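
Steps 101 and 102 both reduce to comparing device information against the master device's. The sketch below illustrates that comparison under stated assumptions: the three fields of `DeviceInfo` follow the kinds of information listed in Embodiment 2 (software version, hardware version, hardware specification), while the field names, the function names and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceInfo:
    """Device information exchanged between members (field names are assumed)."""
    software_version: str
    hardware_version: str
    hardware_spec: str


def find_faulty_devices(members, master_id):
    """Step 101: return the IDs whose information differs from the master's."""
    master_info = members[master_id]
    return sorted(dev for dev, info in members.items() if info != master_info)


def local_is_faulty(members, master_id, local_id):
    """Step 102: the local device is faulty if it differs from the master."""
    return members[local_id] != members[master_id]


# Hypothetical stack resembling the background example: devices 4 and 6 differ.
GOOD = DeviceInfo("v1.0", "hw-A", "spec-48p")
ODD = DeviceInfo("v1.0", "hw-A", "spec-24p")
STACK = {1: GOOD, 2: GOOD, 3: GOOD, 4: ODD, 5: GOOD, 6: ODD}
```

With device 2 as master, `find_faulty_devices(STACK, 2)` reports devices 4 and 6, and each member can call `local_is_faulty` with its own ID to learn whether it is one of them.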

In this embodiment, if no faulty device exists in the stack system, a master device can be elected normally, the other devices act as standby devices, and together they perform the external function of the whole stack system. The stack system can take on the function of a device such as a virtual switch or a controller; this embodiment does not limit the specific function of the stack system or the specific structure and service function of the member devices.

103. If the local member device is faulty, it sets itself to the isolated state and forwards the data service inside the stack system.

Setting itself to the isolated state and forwarding the data service inside the stack system comprises: setting the application state of the local member device to isolated; computing the working topology and setting the hardware forwarding table; closing all service ports of the local member device; and establishing an internal management channel from the local member device to the master device, so that the local member device can be managed by the master device.

Further, if the local member device is not faulty, then some other member device in the stack system is faulty. The local member device may further determine whether it is the master device. If the local member device is the master device, the master device sets the faulty device to the isolated state; if the local member device is not the master device, it sets the faulty device to the isolated state, computes the working topology, and sets the hardware forwarding table.

The master device setting the faulty device to the isolated state comprises: setting the application state of the faulty device to isolated; computing the working topology and setting the hardware forwarding table; and establishing an internal management channel from the master device to the faulty device, so that the faulty device can be managed.

With the stack system fault processing method provided by this embodiment of the invention, when a faulty device in the stack system is inconsistent with the master device in hardware specification or hardware form, the faulty device is allowed to join the stack system in the isolated state, provided the stack links can be connected normally, and data service inside the stack system is forwarded through the hardware forwarding table. The stack system can therefore form normally, which avoids service interruption or topology oscillation and ensures the fault tolerance and stability of the stack system.

Embodiment 2

This embodiment of the invention provides a stack system fault processing method, described below taking an arbitrary member device in the stack system as an example. As shown in Fig. 4, the method comprises:

201. Obtain the device information of all member devices in the stack system.

When stack information is exchanged, the device information of all stack members needs to be collected, including software version information, hardware version information, hardware specification information and so on. In this embodiment, each member device may first determine the master device and the standby devices according to a predefined role election rule. The role election rule specifies how the member devices select the most suitable one as master device according to the device information of all members.

202. Judge whether the device information of every member device is consistent with the device information of the master device. If it is, perform step 203; if the device information of at least one member device is inconsistent with that of the master device, perform step 204.

203. Determine that no faulty device exists in the stack system.

If no faulty device exists in the stack system, normal election and confirmation can be carried out: a master device is elected, the other devices act as standby devices, and together they perform the external function of the whole stack system.

204. According to its own device information and the device information of the master device, the local member device determines whether it is itself faulty. If the local member device is the faulty device, perform step 205; if it is not, perform step 209.

For example, still taking the stack system SS-0 shown in Fig. 1, the stack system consists of six devices, of which device 2 is the Master and devices 1, 3, 4, 5 and 6 are Slaves. The member device information of device 4 and device 6 is inconsistent with that of the master device. By operating procedure, the process is divided into stack system fault detection, master device isolation processing, local member device isolation processing, non-local member device isolation processing, and fault repair processing. Steps 201-204 are the fault detection flow, steps 205-208 are the local member device isolation flow, steps 210-212 are the master device isolation flow, and steps 213-214 are the non-local member device isolation flow. Fault repair may be automatic repair managed by the stack system or manual repair; this embodiment does not limit this.
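
The branching among these flows can be made concrete by listing which step numbers a given member device would walk through. The function below is a sketch of that control flow only; its name and boolean parameters are hypothetical, and the step numbers are those of Fig. 4 as described in this embodiment.

```python
def executed_steps(fault_exists, local_is_faulty, local_is_master):
    """Return the step numbers one member device executes, in order."""
    steps = [201, 202]                 # collect information, compare to master
    if not fault_exists:
        return steps + [203]           # no faulty device: normal election
    steps.append(204)                  # is the local device itself faulty?
    if local_is_faulty:
        return steps + [205, 206, 207, 208]   # local isolation flow
    steps.append(209)                  # is the local device the master?
    if local_is_master:
        return steps + [210, 211, 212]        # master isolation flow
    return steps + [213, 214]                 # non-faulty standby flow
```

In the SS-0 example, device 4 (faulty) would follow steps 205-208, device 2 (master) would follow steps 210-212, and device 1 (a non-faulty standby device) would follow steps 213-214.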

205. The faulty device sets the application state of the local member device to isolated.

The faulty device only sets its application state to isolated. It keeps the stack links connected and is allowed to transparently forward the data service inside the stack system.

206. The faulty device computes the working topology and sets the hardware forwarding table.

After entering the isolated state, the faulty device computes and configures the hardware forwarding table of the stack links as usual, to ensure that service data inside the stack system is transparently forwarded correctly. Here, the stack state refers to the master and standby roles, and to which devices are faulty and which are not.

207. The faulty device closes all service ports of the local member device.

The faulty device closes its own service ports to avoid abnormal forwarding of service traffic, and is forbidden to send service requests to the master device.

208. The faulty device establishes an internal management channel from the local member device to the master device, so that it can be managed by the master device.

The faulty device is allowed to establish an internal management channel from itself to the master device and to be managed in a limited way. Limited management means that the management channel is used mainly for fault repair and permits management operations such as software version upgrade and restart.
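
Steps 205-208 can be pictured as state changes on the faulty device itself. The following is a minimal sketch, not the patented implementation: the class `LocalMember`, its attributes and the operation names in `LIMITED_MANAGEMENT_OPS` are assumptions made for illustration (the text names software version upgrade and restart only as examples of permitted operations). Step 206, the forwarding table computation, is left out here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Assumed whitelist; the text gives upgrade and restart only as examples.
LIMITED_MANAGEMENT_OPS = frozenset({"upgrade_software", "restart"})


@dataclass
class LocalMember:
    device_id: int
    service_ports: Dict[str, bool] = field(default_factory=dict)  # name -> up?
    app_state: str = "normal"
    stack_links_up: bool = True
    management_channel_to: Optional[int] = None

    def isolate(self, master_id: int) -> None:
        self.app_state = "isolated"               # step 205
        # Stack links are deliberately left up so internal traffic can transit.
        for port in self.service_ports:           # step 207
            self.service_ports[port] = False
        self.management_channel_to = master_id    # step 208

    def may_send_service_request(self) -> bool:
        """An isolated device must not send service requests to the master."""
        return self.app_state != "isolated"

    def accept_management(self, operation: str) -> bool:
        """While isolated, accept only the limited set of repair operations."""
        if self.app_state != "isolated":
            return True
        return operation in LIMITED_MANAGEMENT_OPS
```

For instance, after `LocalMember(4, {"ge0/1": True}).isolate(2)` the device still has its stack links up, has no enabled service port, and accepts `"restart"` but not an arbitrary configuration operation.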

209. Determine whether the local member device is the master device. If the local member device is the master device, perform step 210; if it is not, perform step 213.

210. The master device sets the application state of the faulty device to isolated.

The master device does not notify the application modules that this faulty member has joined, and sets the application state of the faulty device to isolated. It does not allow the service boards of the faulty device to be loaded, so that service configuration is not delivered to the faulty device. It also ignores service requests from the faulty device, such as synchronization requests.

211. The master device computes the working topology and sets the hardware forwarding table.

After the stack state of all member devices is confirmed, the master device computes the working topology and sets the hardware forwarding table of the stack links, to ensure that hardware forwarding inside the stack system is correct.

212. The master device establishes an internal management channel from the master device to the faulty device, so that the faulty device can be managed.

An internal management channel from the master device to the faulty device is allowed to be established, so that the isolated member device can be managed in a limited way.
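
From the master device's side, steps 210 and 212 amount to bookkeeping about which members receive configuration and whose requests are honoured. The sketch below models only that bookkeeping; `MasterView` and its method names are hypothetical, and the return values `"ignored"` and `"accepted"` are placeholders rather than any real protocol response.

```python
class MasterView:
    """The master device's record of member states (illustrative only)."""

    def __init__(self, member_ids):
        self.app_state = {dev: "normal" for dev in member_ids}
        self.management_channels = set()

    def isolate(self, faulty_id):
        self.app_state[faulty_id] = "isolated"     # step 210
        self.management_channels.add(faulty_id)    # step 212

    def config_targets(self):
        """Members that may receive service configuration."""
        return sorted(d for d, s in self.app_state.items() if s == "normal")

    def handle_service_request(self, sender_id, request):
        """Ignore service requests (for example sync) from isolated members."""
        if self.app_state[sender_id] == "isolated":
            return "ignored"
        return "accepted"
```

Applied to SS-0, isolating devices 4 and 6 leaves devices 1, 2, 3 and 5 as configuration targets, while a synchronization request from device 4 is ignored.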

213. The non-faulty member device sets the faulty device to the isolated state.

The non-faulty member device here is one for which the preceding steps have ruled out the possibility that the local member device is the faulty device, and which is not the master device. In other words, the non-faulty member device is a standby device that has no fault.

214. The non-faulty member device computes the working topology and sets the hardware forwarding table.
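
The text does not say how the working topology is turned into a hardware forwarding table. As one possible illustration, and purely as an assumption, the sketch below uses a breadth-first search over the stack links to record, for each reachable member, the neighbor through which it is reached. The point it demonstrates is that isolated devices stay in the working topology as transit nodes. The function name and the link list are hypothetical.

```python
from collections import deque


def build_forwarding_table(local_id, links):
    """Map each reachable member to the neighbor used as next hop (BFS)."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    next_hop = {}
    visited = {local_id}
    queue = deque()
    # Direct neighbors are their own next hop.
    for neighbor in sorted(adjacency.get(local_id, ())):
        visited.add(neighbor)
        next_hop[neighbor] = neighbor
        queue.append(neighbor)
    # Farther members inherit the next hop of the node they were found from.
    while queue:
        node = queue.popleft()
        for nxt in sorted(adjacency[node]):
            if nxt not in visited:
                visited.add(nxt)
                next_hop[nxt] = next_hop[node]
                queue.append(nxt)
    return next_hop


# Assumed chain cabling 1-2-3-4-5-6; devices 4 and 6 are isolated but linked.
CHAIN_LINKS = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
```

On device 3, the table built from `CHAIN_LINKS` reaches device 5 through isolated device 4. If device 4's links were removed instead, device 5 would not appear in the table at all, which is the split the isolation mechanism is meant to avoid.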

For example, still taking the stack system shown in Fig. 1, one way of repairing faults in a chain stack topology is explained. In a chain stack topology, if a member device on the chain is faulty, replacing it would split the stack system. With the fault processing method of the invention, the following repair procedure can be used in a chain stack topology, avoiding oscillation of the stack system or service interruption. First, the stack system forms, with device 2 as its master device; device 4 and device 6 are isolated as usual because their member device information is inconsistent with that of the master device. For the repair, a new group of stack links is first connected between device 1 and device 6, changing the chain stack topology into a ring stack topology. Then member device 4 is repaired, so that after repair it can join the stack system normally. Member device 6 is repaired next, so that after repair it can join the stack system normally. The stack system then contains only member devices with consistent specifications, can form normally, and performs its external function.
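
Why the extra link between device 1 and device 6 matters can be checked with a small connectivity test. This sketch assumes the chain is cabled 1-2-3-4-5-6, which is consistent with the text but not confirmed by the figure; `is_connected`, `survives_replacement` and the link lists are hypothetical names.

```python
def is_connected(devices, links):
    """Return True if all given devices can reach each other over the links."""
    devices = set(devices)
    if not devices:
        return True
    adjacency = {d: set() for d in devices}
    for a, b in links:
        if a in devices and b in devices:
            adjacency[a].add(b)
            adjacency[b].add(a)
    stack = [next(iter(devices))]
    seen = set(stack)
    while stack:
        node = stack.pop()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen == devices


DEVICES = range(1, 7)
CHAIN = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]   # assumed chain order
RING = CHAIN + [(6, 1)]                            # new link between 1 and 6


def survives_replacement(links, replaced):
    """Do the remaining devices stay in one stack while one is swapped out?"""
    remaining = [d for d in DEVICES if d != replaced]
    return is_connected(remaining, links)
```

Taking device 4 out of the chain separates devices 5 and 6 from the rest, whereas in the ring the remaining five devices stay connected, so devices 4 and 6 can be replaced one at a time without splitting the stack.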

It can be understood that, in a stack system whose stack connections are normal, the stack system can still form normally even if the software version information, hardware version information or hardware specification information of a faulty member device is inconsistent with that of the master device. During fault repair, if faulty devices need to be replaced and the stack system is a chain stack system, the chain stack system is changed manually into a ring stack system, and the faulty stack member devices are then replaced one by one. This does not affect service forwarding in the stack system and does not split the stack system, so the whole stack system is more stable and reliable. The invention is broadly applicable and suits any stack system environment in which the stack links can be connected normally.

Embodiment 3

This embodiment of the invention provides a stack system. As shown in Fig. 5, the system comprises:

Multiple member devices, for example member device 1, member device 2, member device 3, member device 4 ... member device N. One of the multiple member devices is the master device, for example member device 2 in Fig. 5. Each of the multiple member devices can act as the local member device.

Further, the local member device is also configured to:

Obtain the device information of all member devices in the stack system;

Judge whether the device information of every member device is consistent with the device information of the master device;

If the device information of every member device is consistent with that of the master device, determine that no faulty device exists in the stack system;

If the device information of at least one member device is inconsistent with that of the master device, determine that a faulty device exists in the stack system.

Further, if the local member device is the faulty device, the local member device is also configured to:

Set the application state of the local member device to isolated;

Compute the working topology and set the hardware forwarding table;

Close all service ports of the local member device;

Establish an internal management channel from the local member device to the master device, so that it can be managed by the master device.

Further, if the local member device is not the faulty device, the local member device is also configured to:

Determine whether the local member device is the master device;

If the local member device is the master device, the master device sets the faulty device to the isolated state;

If the local member device is not the master device, set the faulty device to the isolated state, compute the working topology, and set the hardware forwarding table.

Further, if the local member device is the master device, the local member device is also configured to:

Set the application state of the faulty device to isolated;

Compute the working topology and set the hardware forwarding table;

Establish an internal management channel from the master device to the faulty device, so that the faulty device can be managed.

It should be noted that each member device in this embodiment can be used to implement the stack system fault processing method of Embodiments 1 and 2. For the specific description of some functional units in this embodiment, refer to the corresponding content in the method embodiments, which is not repeated here.

With the stack system provided by this embodiment of the invention, when a faulty device in the stack system is inconsistent with the master device in hardware specification or hardware form, the faulty device is allowed to join the stack system in the isolated state, provided the stack links can be connected normally, and data service inside the stack system is forwarded through the hardware forwarding table. The stack system can therefore form normally, which avoids service interruption or topology oscillation and ensures the fault tolerance and stability of the stack system.

From the above description of the embodiments, those skilled in the art can clearly understand that the invention can be implemented by software plus the necessary general-purpose hardware, or of course by hardware alone, although in many cases the former is the better implementation. Based on this understanding, the part of the technical solution of the invention that in essence contributes to the prior art can be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer's floppy disk, hard disk or optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute the method described in each embodiment of the invention.

The above is only a specific embodiment of the invention, but the protection scope of the invention is not limited to it. Any change or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the invention shall fall within the protection scope of the invention. Therefore, the protection scope of the invention shall be defined by the protection scope of the claims.

Claims

1. A stack system fault processing method, characterized by comprising:

2. The stack system fault processing method according to claim 1, characterized in that determining whether a faulty device exists according to the device information of all member devices in the stack system comprises:

Obtaining the device information of all member devices in the stack system;

3. The stack system fault processing method according to claim 1, characterized in that setting itself to the isolated state and forwarding the data service inside the stack system comprises:

Setting the application state of the local member device to isolated;

Computing the working topology and setting the hardware forwarding table;

Closing all service ports of the local member device;

4. The stack system fault processing method according to claim 1, characterized in that, after determining whether the local member device is faulty, the method further comprises:

If the local member device is not faulty, determining whether the local member device is the master device;

5. The stack system fault processing method according to claim 4, characterized in that the master device setting the faulty device to the isolated state comprises:

Setting the application state of the faulty device to isolated;

Computing the working topology and setting the hardware forwarding table;

6. A stack system, characterized by comprising:

7. The stack system according to claim 6, characterized in that the local member device is also configured to:

Obtain the device information of all member devices in the stack system;

8. The stack system according to claim 6, characterized in that, if the local member device is the faulty device, the local member device is also configured to:

Set the application state of the local member device to isolated;

Compute the working topology and set the hardware forwarding table;

Close all service ports of the local member device;

9. The stack system according to claim 6, characterized in that, if the local member device is not the faulty device, the local member device is also configured to:

10. The stack system according to claim 9, characterized in that, if the local member device is the master device, the local member device is also configured to:

Set the application state of the faulty device to isolated;

Compute the working topology and set the hardware forwarding table;