CN117155763A - Method and device for processing switch faults, storage medium and electronic equipment - Google Patents

Method and device for processing switch faults, storage medium and electronic equipment

Info

Publication number
CN117155763A
Authority
CN
China
Prior art keywords
switch
information
target
target switch
meta
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311222585.7A
Other languages
Chinese (zh)
Inventor
刘海军
高铭
杨光熠
王辉
王炜煜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Bilibili Technology Co Ltd
Original Assignee
Shanghai Bilibili Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Bilibili Technology Co Ltd
Priority to CN202311222585.7A
Publication of CN117155763A
Legal status: Pending

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/0654: Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0659: Management of faults, events, alarms or notifications using network fault recovery by isolating or reconfiguring faulty entities
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Hardware Redundancy (AREA)

Abstract

The present disclosure provides a method, apparatus, computer program product, non-transitory computer readable storage medium, and electronic device for processing a switch failure. The method comprises the following steps: receiving alarm information indicating that a target switch in a switch cluster has failed, wherein the alarm information comprises a switch identifier of the target switch; acquiring meta information of the target switch according to the switch identifier of the target switch, wherein the meta information comprises at least one of a device model and a network structure; determining a corresponding isolation command according to the meta information of the target switch; and sending the isolation command to the configuration issuing system to isolate the target switch from the switch cluster. Embodiments of the disclosure can process switch faults automatically, which helps to improve processing efficiency and reduce labor cost.

Description

Method and device for processing switch faults, storage medium and electronic equipment
Technical Field
The present disclosure relates generally to the field of computer technology, and more particularly, to a method, apparatus, computer program product, non-transitory computer readable storage medium, and electronic device for processing a switch failure.
Background
This section is intended to introduce a few aspects of the art that may be related to various aspects of the present disclosure that are described and/or claimed below. This section is believed to help provide background information to facilitate a better understanding of various aspects of the disclosure. It should therefore be understood that these statements are to be read in this light, and not as admissions of prior art.
Switch failures severely degrade network quality and are readily noticed on the service side, and among switch failures, hardware faults have the highest incidence. Hardware faults occur at random times and are difficult to predict or prevent, yet they must be handled as soon as they occur. Manual handling is prone to omissions and delays, so an automatic isolation system is urgently needed. At present, the industry has no mature system that automatically handles switch hardware failures.
Therefore, there is a need to propose a new solution to alleviate or solve at least one of the above-mentioned problems.
Disclosure of Invention
The disclosure aims to provide a method, a device, a computer program product, a non-transitory computer readable storage medium and an electronic device for processing a switch fault, so as to automatically process the switch fault, improve processing efficiency and reduce labor cost.
According to a first aspect of the present disclosure, there is provided a method for processing a switch failure, applied to a switch cluster, the switch cluster including at least two interconnected switches and a configuration issuing system for managing the switches, the method comprising: receiving alarm information indicating that a target switch in the switch cluster fails, wherein the alarm information comprises a switch identifier of the target switch; acquiring meta information of the target switch according to the switch identification of the target switch, wherein the meta information comprises at least one of a device model and a network structure; determining a corresponding isolation command according to the meta information of the target switch; and sending the isolation command to the configuration issuing system so as to isolate the target switch from the switch cluster.
According to a second aspect of the present disclosure, there is provided a processing apparatus for a switch failure, applied to a switch cluster including at least two interconnected switches and a configuration issuing system for managing the switches, the apparatus comprising: the receiving module is used for receiving alarm information representing that a target switch in the switch cluster fails, wherein the alarm information comprises a switch identifier of the target switch; the acquisition module is used for acquiring meta information of the target switch according to the switch identification of the target switch, wherein the meta information comprises at least one of a device model and a network structure; the determining module is used for determining corresponding isolation commands according to the meta information of the target switch; and the sending module is used for sending the isolation command to the configuration issuing system so as to isolate the target switch from the switch cluster.
According to a third aspect of the present disclosure, there is provided a computer program product comprising program code instructions which, when the program product is executed by a computer, cause the computer to perform the method according to the first aspect of the present disclosure.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the method according to the first aspect of the present disclosure.
According to a fifth aspect of the present disclosure, there is provided an electronic device comprising: a processor, a memory in electronic communication with the processor; and instructions stored in the memory and executable by the processor to cause the electronic device to perform the method according to the first aspect of the present disclosure.
In the embodiment of the disclosure, after the alarm information is received, the corresponding isolation command is determined according to the meta information of the target switch, and the isolation command is sent to the configuration issuing system so as to isolate the target switch from the switch cluster, so that the fault of the switch can be automatically processed, the processing efficiency is improved, and the labor cost is reduced.
It should be understood that what is described in this section is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used solely to determine the scope of the claimed subject matter.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present disclosure, and a person of ordinary skill in the art may obtain other drawings from them without inventive effort. Throughout the drawings, identical reference numerals designate similar, but not necessarily identical, elements.
FIG. 1 illustrates a system architecture diagram of one embodiment of a method of handling a switch failure according to the present disclosure;
FIG. 2 illustrates a flow chart of one embodiment of a method of handling a switch failure according to the present disclosure;
FIGS. 3A and 3B illustrate a specific example of a method of handling a switch failure according to the present disclosure;
FIG. 4 illustrates an exemplary block diagram of one embodiment of a processing apparatus for a switch failure according to the present disclosure;
FIG. 5 illustrates a schematic diagram of an example electronic device 500 that may be used to implement embodiments of the present disclosure.
Detailed description of the preferred embodiments
The present disclosure will be described more fully hereinafter with reference to the accompanying drawings. However, the present disclosure may be embodied in many alternate forms and should not be construed as limited to the embodiments set forth herein. Thus, while the disclosure is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that there is no intention to limit the disclosure to the specific forms disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the disclosure as defined by the appended claims.
It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another element. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the teachings of the present disclosure.
Some examples are described herein in connection with block diagrams and/or flow charts, wherein each block represents a portion of circuit elements, module, or code that comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in other implementations, the functions noted in the blocks may occur out of the order noted. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
Reference herein to "according to an embodiment" or "in an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one implementation of the disclosure. The appearances of the phrase "according to an embodiment" or "in an embodiment" in various places herein are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments necessarily mutually exclusive of other embodiments.
Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of a method, apparatus, terminal device, and storage medium of processing a switch failure of the present disclosure may be applied.
As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as a voice interaction type application, a video conference type application, a short video social type application, a web browser application, a shopping type application, a search type application, an instant messaging tool, a mailbox client, social platform software, etc., may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices with microphones and speakers, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 players, portable computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the above-listed electronic devices and may be implemented as a plurality of software programs or software modules, or as a single software program or software module. The present disclosure is not specifically limited in this respect.
The server 105 may be a server providing various services, and for example, the server 105 may be a background server processing a processing request of a switch failure transmitted by the terminal devices 101, 102, 103.
In some cases, the method for processing a switch failure provided by the present disclosure may be performed by the terminal devices 101, 102, 103, and correspondingly, the processing apparatus for a switch failure may also be disposed in the terminal devices 101, 102, 103, where the system architecture 100 may not include the server 105.
In some cases, the method for processing the switch failure provided by the present disclosure may be executed by the server 105, and correspondingly, the processing apparatus for processing the switch failure may also be disposed in the server 105, where the system architecture 100 may not include the terminal devices 101, 102, 103.
In some cases, the method of handling switch failures provided by the present disclosure may be performed jointly by the terminal devices 101, 102, 103 and the server 105. Accordingly, the processing means for the switch failure may also be provided in the terminal devices 101, 102, 103 and the server 105, respectively.
The server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster formed by a plurality of servers, or as a single server. When the server 105 is software, it may be implemented as a plurality of software programs or software modules (e.g., to provide distributed services), or as a single software program or software module. The present disclosure is not specifically limited in this respect.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
FIG. 2 illustrates a flow chart of one embodiment of a method of handling a switch failure according to the present disclosure. The method in this embodiment may be implemented by the terminal device in fig. 1, or by the server in fig. 1, or by both the terminal device and the server in fig. 1.
The method 200 in this embodiment is applied to a switch cluster. The switch cluster includes at least two interconnected switches and a configuration issuing system for managing the switches. The switches in the switch cluster described above may be interconnected in a stacked manner, a cascaded manner, or other manner. In alternative embodiments, the switches in the switch cluster described above may be interconnected in a stacked fashion.
As shown in fig. 2, the method 200 includes the steps of:
Step 210, receiving alarm information indicating that a target switch in the switch cluster has failed, where the alarm information includes a switch identification of the target switch.
In this embodiment, the switch identification uniquely identifies the switch and may be a number or another form of identifier. Illustratively, the switch identification is a port number (Port ID) of the switch.
In this embodiment, a dedicated alarm system may be provided for collecting and transmitting alarm information. The alarm system is, for example, an NMS (Network Management System), which serves as an operation and maintenance center responsible for equipment fault diagnosis, operation and maintenance, and network operation and management of the wireless access system, and which provides data and statistics for network management and planning. When a switch fails or its state changes, the alarm system generates and transmits corresponding alarm information. The alarm information may be classified into various alarm types, such as fault alarm information (an alarm generated due to a hardware device failure or an abnormality of an important function), recovery alarm information (an alarm generated when a failed device or abnormal function returns to normal), and event alarm information (a reminder alarm).
In an alternative embodiment, the execution body of the method 200 may filter all received alarm information to select the fault alarm information. Accordingly, step 210 may further include: first, receiving alarm information; second, determining whether the alarm information is fault alarm information according to the alarm type of the alarm information; and third, parsing the switch identification of the target switch from the alarm information in the case where the alarm information is fault alarm information.
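The three sub-steps of step 210 can be sketched as follows. This is a minimal illustration only: the `Alarm` record, the string labels for the alarm types, and the function name `extract_target_switch` are assumptions introduced here and do not appear in the disclosure.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for the three alarm types named in the text;
# the disclosure does not specify how alarm types are encoded.
FAULT = "fault"
RECOVERY = "recovery"
EVENT = "event"


@dataclass(frozen=True)
class Alarm:
    alarm_type: str  # one of FAULT, RECOVERY, EVENT
    switch_id: str   # identification of the switch the alarm refers to


def extract_target_switch(alarm: Alarm) -> Optional[str]:
    """Return the switch identification only for fault alarms.

    Recovery and event alarms are filtered out and yield None.
    """
    if alarm.alarm_type != FAULT:
        return None
    return alarm.switch_id
```

Under these assumptions, a recovery or event alarm never reaches the later steps, so only fault alarms can trigger an isolation.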
Step 220, obtaining meta information of the target switch according to the switch identification of the target switch, wherein the meta information comprises at least one of a device model number and a network structure.
In the present embodiment, meta information of the switch is information describing an inherent attribute of the switch, which includes at least one of a device model number and a network structure of the switch. The above-mentioned equipment model is, for example, "A model of A manufacturer" or "B model of B manufacturer". The network structure is, for example, a cascade mode, a stack mode, a port aggregation mode, or a hierarchical mode.
In this embodiment, meta information of each switch in the switch cluster may be stored in a meta information database in advance. Accordingly, step 220 may further comprise: first, querying a preset meta information database according to the switch identification of the target switch, where the meta information database stores the correspondence between switch identifications and switch meta information; and second, obtaining the meta information of the target switch according to the correspondence between switch identifications and switch meta information.
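The lookup in step 220 might look like the sketch below, where an in-memory dictionary stands in for the meta information database. The `SwitchMeta` record, the sample identifiers, and the sample model and structure strings are placeholders assumed for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class SwitchMeta:
    device_model: str       # e.g. "model A of manufacturer A"
    network_structure: str  # e.g. "stack", "cascade"


# In-memory stand-in for the preset meta information database.
# All keys and values here are illustrative placeholders.
META_DB: Dict[str, SwitchMeta] = {
    "sw-01": SwitchMeta("manufacturer-A/model-A", "stack"),
    "sw-02": SwitchMeta("manufacturer-B/model-B", "cascade"),
}


def get_meta(switch_id: str,
             db: Dict[str, SwitchMeta] = META_DB) -> Optional[SwitchMeta]:
    """Look up the meta information stored for a switch identification."""
    return db.get(switch_id)
```

Returning `None` for an unknown identification leaves it to the caller to decide how to handle a switch that was never registered, a case the disclosure does not address.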
Step 230, determining a corresponding isolation command according to the meta information of the target switch.
In this embodiment, the isolation command is used to take the corresponding switch off line.
In this embodiment, switches of different device models and network structures correspond to different isolation commands.
Illustratively, the following table lists the isolation commands corresponding to switches of different device models and network structures.
In this embodiment, the isolation commands corresponding to switches of different device models and network structures may be stored in an isolation command database in advance. Accordingly, step 230 may further include: first, querying a preset isolation command database according to the meta information of the target switch, where the isolation command database stores the correspondence between switch meta information and switch isolation commands, and different switch meta information corresponds to different switch isolation commands; and second, obtaining the isolation command of the target switch according to the correspondence between switch meta information and switch isolation commands.
In this embodiment, the meta information database and the isolation command database may be the same database, or may be different databases, or may be the same type of database, or may be different types of databases.
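One way to picture the isolation command database of step 230 is a mapping keyed by the pair (device model, network structure). The sketch below assumes that key shape; the command strings are deliberately opaque placeholders, since the actual vendor commands are not given in the text.

```python
from typing import Dict, Optional, Tuple

# Key: (device model, network structure). The key shape is an assumption.
MetaKey = Tuple[str, str]

# Stand-in for the isolation command database. The command strings are
# placeholders, not real vendor CLI commands.
ISOLATION_COMMANDS: Dict[MetaKey, str] = {
    ("manufacturer-A/model-A", "stack"): "<isolation command A/stack>",
    ("manufacturer-A/model-A", "cascade"): "<isolation command A/cascade>",
    ("manufacturer-B/model-B", "stack"): "<isolation command B/stack>",
}


def isolation_command_for(
    device_model: str,
    network_structure: str,
    table: Dict[MetaKey, str] = ISOLATION_COMMANDS,
) -> Optional[str]:
    """Return the isolation command matching the given meta information."""
    return table.get((device_model, network_structure))
```

Because the table is passed in as a parameter, the same function works whether the meta information and the isolation commands live in one database or in two.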
Step 240, send the isolation command to the configuration issuing system to isolate the target switch from the switch cluster.
In this embodiment, after receiving the isolation command, the configuration issuing system performs the corresponding isolation operation to isolate the target switch from the switch cluster (that is, to take it offline).
In the embodiment of the disclosure, after the alarm information is received, the corresponding isolation command is determined according to the meta information of the target switch, and the isolation command is sent to the configuration issuing system so as to isolate the target switch from the switch cluster, so that the fault of the switch can be automatically processed, the processing efficiency is improved, and the labor cost is reduced.
In an alternative embodiment, before sending the isolation command to the configuration issuing system, the redundant device information of the target switch may be acquired, where the redundant device information indicates whether the switch has a redundant device. In order to maintain the stability of the network, in the switch cluster, the switches are backed up, and the devices playing a role in backup are redundant devices.
In the above embodiment, whether the target switch has the redundant device may be determined according to the redundant device information of the target switch, and in the case where the target switch has the redundant device, the isolation command is sent to the configuration issuing system. The mode can avoid great influence on the network environment caused by the switch off-line.
In the above-described embodiment, the manual processing notification may be transmitted to the target terminal device in the case where the target switch does not have a redundant device.
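The redundancy check described above acts as a gate in front of step 240. A minimal sketch follows, assuming the sending and notification channels are supplied as callables; the names `send_command` and `notify_manual` and the returned status strings are hypothetical.

```python
from typing import Callable


def isolate_if_redundant(
    switch_id: str,
    isolation_command: str,
    has_redundant_device: bool,
    send_command: Callable[[str, str], None],
    notify_manual: Callable[[str], None],
) -> str:
    """Send the isolation command only when a redundant device exists.

    Without a redundant device the switch stays online and a manual
    processing notification is sent instead.
    """
    if has_redundant_device:
        send_command(switch_id, isolation_command)
        return "isolation_sent"
    notify_manual(switch_id)
    return "manual_processing"
```

Keeping the two outcomes in one function makes it explicit that a switch without a backup is never taken offline automatically.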
In an alternative embodiment, after receiving the alarm information indicating that the target switch in the switch cluster has failed, the operation information of the target switch may be obtained according to the switch identification of the target switch, where the operation information is used in repairing the target switch. The operation information of the target switch is, for example, one or more of a port state, a device aggregation port state, device routes, a device ARP address table, and a stack/DRNI state.
In an alternative embodiment, the step of obtaining the operation information of the target switch may further include the steps of: firstly, determining an operation information acquisition instruction corresponding to a target switch according to meta information of the target switch; and secondly, sending an operation information acquisition instruction to the target switch and receiving operation information returned by the target switch.
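The two sub-steps for obtaining operation information can be sketched as below. The per-model instruction table, the placeholder instruction strings, and the `run_on_switch` callable are all assumptions; the actual collection instructions depend on the device model and are not reproduced here.

```python
from typing import Callable, Dict, List

# Stand-in mapping from device model to operation information collection
# instructions. The instruction strings are placeholders, not real commands.
COLLECTION_INSTRUCTIONS: Dict[str, List[str]] = {
    "manufacturer-A/model-A": ["<collect port state>", "<collect stack state>"],
    "manufacturer-B/model-B": ["<collect port state>"],
}


def collect_operation_info(
    switch_id: str,
    device_model: str,
    run_on_switch: Callable[[str, str], str],
    instructions: Dict[str, List[str]] = COLLECTION_INSTRUCTIONS,
) -> Dict[str, str]:
    """Run each collection instruction for the model and gather the replies.

    Returns a mapping from instruction to the output returned by the switch;
    an unknown device model yields an empty mapping.
    """
    return {
        instruction: run_on_switch(switch_id, instruction)
        for instruction in instructions.get(device_model, [])
    }
```

The collected mapping can then be stored for later use when the target switch is repaired.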
In the above embodiment, the collection instruction of the switch operation information is also related to the device model of the switch. Illustratively, the following table lists the operating information collection instructions corresponding to switches of different device models:
FIGS. 3A and 3B illustrate a specific example of a method of handling a switch failure according to the present disclosure.
Fig. 3A illustrates a system architecture for implementing the method of handling a switch failure of the present disclosure. As shown in fig. 3A, the alarm system collects switch device fault information and pushes alarms; it pushes fault isolation alarm information to the automatic fault isolation system. The isolation system comprises an alarm analysis module, an information acquisition module, and an isolation operation module. The alarm analysis module receives and parses alarm information, determines whether the alarm type is a switch device fault alarm, and extracts the faulty device information from the alarm. The information acquisition module collects the operation information and operation log information of the faulty device, as well as information such as the device model, the network structure, and whether a redundant device exists. The isolation operation module decides on and executes the isolation action: it determines whether to perform the isolation operation according to the device information collected by the information acquisition module and selects the isolation command according to the device model. The configuration issuing system issues switch commands: it receives and executes the switch commands sent by the isolation system and returns the execution results.
Fig. 3B illustrates a specific flow of a method of handling a switch failure according to the present disclosure. As shown in fig. 3B, the NMS alarm management (i.e., the alarm system) transmits device fault alarm information to the fault isolation server (i.e., the isolation system). The fault isolation server sends a corresponding alarm notification to the network maintenance personnel, saves the basic alarm information through GraphQL (i.e., the database), and obtains the device model and network structure of the faulty switch. The fault isolation server sends an instruction for collecting the operation information of the faulty switch to the NMS configuration issuing (i.e., the configuration issuing system) and receives the returned operation information. The fault isolation server stores the operation information in GraphQL for use in repairing the faulty switch. The fault isolation server then determines whether the faulty switch has a redundant device. If the faulty switch has a redundant device, the fault isolation server issues the isolation command to the NMS configuration issuing and receives the returned isolation result. If the faulty switch does not have a redundant device, a corresponding alarm notification is sent to the network maintenance personnel. Finally, the fault isolation server stores the isolation result in GraphQL and notifies the network maintenance personnel of the isolation result.
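The flow of fig. 3B can be summarized in a single orchestration function. The sketch below is written under several assumptions: the `Store` class is an in-memory stand-in for the database, the `issue` callable represents the configuration issuing system, `notify` represents the notification channel to maintenance personnel, and the command tables hold placeholder strings. None of these names come from the disclosure.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Store:
    """In-memory stand-in for the database shown in fig. 3B."""
    # switch_id -> (device model, network structure, has redundant device)
    meta: Dict[str, Tuple[str, str, bool]]
    records: List[Tuple[str, str, object]] = field(default_factory=list)

    def save(self, kind: str, switch_id: str, payload: object) -> None:
        self.records.append((kind, switch_id, payload))


def process_fault_alarm(
    switch_id: str,
    store: Store,
    issue: Callable[[str, str], str],  # configuration issuing system
    notify: Callable[[str], None],     # message to maintenance personnel
    collect_cmds: Dict[str, str],
    isolate_cmds: Dict[Tuple[str, str], str],
) -> str:
    """Run the fig. 3B flow for one fault alarm and return the result."""
    # Notify maintainers and record the basic alarm information.
    notify(f"fault alarm received for {switch_id}")
    store.save("alarm", switch_id, None)

    # Obtain device model, network structure, and redundancy information.
    model, structure, redundant = store.meta[switch_id]

    # Collect and store operation information for later repair.
    operation_info = issue(switch_id, collect_cmds[model])
    store.save("operation_info", switch_id, operation_info)

    # Isolate only when a redundant device exists.
    if redundant:
        result = issue(switch_id, isolate_cmds[(model, structure)])
    else:
        notify(f"{switch_id} has no redundant device; manual handling needed")
        result = "not isolated"

    # Store the result and report it.
    store.save("isolation_result", switch_id, result)
    notify(f"isolation result for {switch_id}: {result}")
    return result
```

Passing the external systems in as callables keeps the sketch independent of any particular NMS or database interface, which the disclosure leaves unspecified.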
Fig. 4 illustrates an exemplary block diagram of a processing device for a switch failure according to an embodiment of the present disclosure. As shown in fig. 4, the apparatus 400 for processing a switch failure includes: a receiving module 410, configured to receive alarm information indicating that a target switch in the switch cluster fails, where the alarm information includes a switch identifier of the target switch; an obtaining module 420, configured to obtain meta information of the target switch according to a switch identifier of the target switch, where the meta information includes at least one of a device model and a network structure; a determining module 430, configured to determine a corresponding isolation command according to meta information of the target switch; and a sending module 440, configured to send the isolation command to the configuration issuing system, so as to isolate the target switch from the switch cluster.
It should be appreciated that the various modules of the apparatus 400 shown in fig. 4 may correspond to the various steps in the method 200 described with reference to fig. 2. Thus, the operations, features and advantages described above with respect to method 200 apply equally to apparatus 400 and the modules comprised thereby. For brevity, certain operations, features and advantages are not described in detail herein.
In an alternative embodiment, the receiving module 410 is further configured to: receiving alarm information; judging whether the alarm information is fault alarm information or not according to the alarm type of the alarm information; and under the condition that the alarm information is fault alarm information, analyzing the switch identification of the target switch from the alarm information.
In an alternative embodiment, the acquisition module 420 is further configured to: inquiring a preset meta-information database according to the switch identification of the target switch, wherein the meta-information database stores the corresponding relation between the switch identification and the switch meta-information; and obtaining the meta information of the switch according to the corresponding relation between the switch identification and the switch meta information.
In an alternative embodiment, the determining module 430 is further configured to: inquiring a preset isolation command database according to the meta information of the target switch, wherein the isolation command database stores the corresponding relation between the switch meta information and switch isolation commands, and different switch meta information corresponds to different switch isolation commands; and obtaining the isolation command of the corresponding target switch according to the corresponding relation between the switch meta information and the switch isolation command.
In an alternative embodiment, the acquisition module 420 is further configured to: and obtaining redundant equipment information of the target switch, wherein the redundant equipment information indicates whether the switch has redundant equipment or not. The sending module 440 is further configured to: determining whether the target switch has redundant equipment according to the redundant equipment information of the target switch; and sending the isolation command to the configuration issuing system when the target switch has redundant equipment.
In an alternative embodiment, the sending module 440 is further configured to: and in the case that the target switch does not have redundant equipment, sending a manual processing notification to the target terminal equipment.
In an alternative embodiment, the acquisition module 420 is further configured to acquire operation information of the target switch according to the switch identifier of the target switch, wherein the operation information is used for repairing the target switch.
In an alternative embodiment, the acquisition module 420 is further configured to: determine an operation information acquisition instruction corresponding to the target switch according to the meta information of the target switch; send the operation information acquisition instruction to the target switch; and receive the operation information returned by the target switch.
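Collecting operation information can be sketched as selecting the acquisition instructions that match the device model and running each one against the switch. Everything named here is hypothetical: `DIAGNOSTIC_INSTRUCTIONS`, `collect_operation_info`, and the `run_on_switch` transport callable are assumptions, and the instruction strings are placeholders rather than real device commands.

```python
from typing import Callable, Dict, List

# Hypothetical mapping: device model -> operation information acquisition
# instructions. The strings are placeholders, not real device commands.
DIAGNOSTIC_INSTRUCTIONS: Dict[str, List[str]] = {
    "model-A": ["<placeholder: read logs>", "<placeholder: read port status>"],
    "model-B": ["<placeholder: read logs>"],
}


def collect_operation_info(
    switch_id: str,
    device_model: str,
    run_on_switch: Callable[[str, str], str],
) -> Dict[str, str]:
    """Send each acquisition instruction to the switch and gather the replies.

    run_on_switch(switch_id, instruction) is an assumed transport that
    returns the text the switch sends back for one instruction.
    """
    instructions = DIAGNOSTIC_INSTRUCTIONS.get(device_model, [])
    return {
        instruction: run_on_switch(switch_id, instruction)
        for instruction in instructions
    }
```

The returned mapping could then be attached to the repair ticket so that the person fixing the switch has the relevant state at hand.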
In an alternative embodiment, the switches in the switch cluster described above are interconnected in a stacked fashion.
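Taken together, the receiving, acquisition, determining, and sending modules form a short pipeline from alarm to isolation. The sketch below shows one possible orchestration under stated assumptions: the alarm is a mapping with `type` and `switch_id` keys, `FAULT_ALARM_TYPES` is an invented set of alarm types treated as faults, and the three collaborators are passed in as callables. None of these names come from the disclosure.

```python
from typing import Any, Callable, List, Mapping, Optional

# Hypothetical set of alarm types that count as fault alarms.
FAULT_ALARM_TYPES = frozenset({"fault"})


def handle_alarm(
    alarm: Mapping[str, str],
    lookup_meta: Callable[[str], Any],
    lookup_isolation_command: Callable[[Any], List[str]],
    send_to_config_system: Callable[[str, List[str]], None],
) -> Optional[str]:
    """Run the alarm-to-isolation flow for one alarm.

    Returns the identifier of the isolated switch, or None when the
    alarm is not a fault alarm and is therefore ignored.
    """
    # Receiving step: keep only fault alarms and parse the switch identifier.
    if alarm.get("type") not in FAULT_ALARM_TYPES:
        return None
    switch_id = alarm["switch_id"]

    # Acquisition step: switch identifier -> meta information.
    meta = lookup_meta(switch_id)

    # Determining step: meta information -> isolation command.
    command = lookup_isolation_command(meta)

    # Sending step: hand the command to the configuration issuing system.
    send_to_config_system(switch_id, command)
    return switch_id
```

Passing the lookups in as callables keeps the flow independent of how the databases and the configuration issuing system are actually implemented.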
Fig. 5 illustrates a schematic block diagram of an example electronic device 500 that may be used to implement embodiments of the present disclosure. The electronic device 500, which may be a server or a client of the present disclosure, is an example of a hardware device to which aspects of the present disclosure may be applied. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein. As shown in Fig. 5, the electronic device 500 includes a computing unit 501 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 502 or a computer program loaded from a storage unit 508 into a Random Access Memory (RAM) 503. The RAM 503 can also store various programs and data required for the operation of the device 500. The computing unit 501, ROM 502, and RAM 503 are connected to each other by a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504. Various components in the device 500 are connected to the I/O interface 505, including: an input unit 506 such as a keyboard, a mouse, etc.; an output unit 507 such as various types of displays, speakers, and the like; a storage unit 508 such as a magnetic disk, an optical disk, or the like; and a communication unit 509 such as a network card, modem, wireless communication transceiver, etc.
The communication unit 509 allows the device 500 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 501 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 501 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 501 performs the methods and processes described above, for example, the method of handling switch failures. For example, in some embodiments, the method of handling switch failures may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into the RAM 503 and executed by the computing unit 501, one or more steps of the above-described method of handling switch failures may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured to perform the method of handling switch failures by any other suitable means (e.g., by means of firmware).
The various illustrative logics, logical blocks, modules, circuits, and algorithm processes described in connection with the aspects disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. The interchangeability of hardware and software has been described generally in terms of functionality, and is illustrated in the various illustrative components, blocks, modules, circuits, and processes described above. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system.
The hardware and data processing apparatus used to implement the various illustrative logics, logical blocks, modules, and circuits described in connection with the aspects disclosed herein may be implemented or performed with a general purpose single or multi-chip processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor or any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration. In some aspects, particular processes and methods may be performed by circuitry specific to a given function.
In one or more aspects, the functions described may be implemented in hardware, digital electronic circuitry, computer software, firmware (including the structures disclosed in this specification and their equivalents), or in any combination thereof. Aspects of the subject matter described in this specification can also be implemented as one or more computer programs, i.e., one or more modules of computer program instructions, encoded on a computer storage medium for execution by, or to control the operation of, data processing apparatus.
If implemented in software, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code. The processes of the methods or algorithms disclosed herein may be implemented in processor-executable software modules, which may reside on a computer-readable medium. Computer-readable media include both computer storage media and communication media, including any medium that can transfer a computer program from one place to another. A storage medium may be any available medium that can be accessed by a computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Further, any connection is properly termed a computer-readable medium. Disk and disc, as used herein, include compact discs (CDs), laser discs, optical discs, digital versatile discs (DVDs), floppy disks, and Blu-ray discs, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media. Additionally, the operations of a method or algorithm may reside as one or any combination or set of codes and instructions on a machine-readable medium and computer-readable medium, which may be incorporated into a computer program product.
The embodiments in this disclosure are described in a related manner: for parts that are identical or similar across embodiments, reference may be made from one embodiment to another, and each embodiment focuses on its differences from the others. In particular, the apparatus, device, computer-readable storage medium, and computer program product embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments.

Claims (12)

1. A method for handling a switch failure, applied to a switch cluster, the switch cluster comprising at least two interconnected switches and a configuration issuing system for managing the switches, the method comprising:
receiving alarm information indicating that a target switch in the switch cluster has failed, wherein the alarm information comprises a switch identifier of the target switch;
acquiring meta information of the target switch according to the switch identifier of the target switch, wherein the meta information comprises at least one of a device model and a network structure;
determining a corresponding isolation command according to the meta information of the target switch; and
sending the isolation command to the configuration issuing system so as to isolate the target switch from the switch cluster.
2. The method of claim 1, wherein the receiving alarm information indicating that a target switch in the switch cluster has failed comprises:
receiving alarm information;
determining, according to the alarm type of the alarm information, whether the alarm information is fault alarm information; and
in the case that the alarm information is fault alarm information, parsing the switch identifier of the target switch from the alarm information.
3. The method of claim 1, wherein the acquiring meta information of the target switch according to the switch identifier of the target switch comprises:
querying a preset meta-information database according to the switch identifier of the target switch, wherein the meta-information database stores the correspondence between switch identifiers and switch meta information; and
obtaining the meta information of the target switch according to the correspondence between switch identifiers and switch meta information.
4. The method of claim 1, wherein the determining a corresponding isolation command according to the meta information of the target switch comprises:
querying a preset isolation command database according to the meta information of the target switch, wherein the isolation command database stores the correspondence between switch meta information and switch isolation commands, and different switch meta information corresponds to different switch isolation commands; and
obtaining the isolation command of the target switch according to the correspondence between switch meta information and switch isolation commands.
5. The method of claim 1, wherein before the sending the isolation command to the configuration issuing system, the method further comprises:
obtaining redundant equipment information of the target switch, wherein the redundant equipment information indicates whether the target switch has redundant equipment; and
the sending the isolation command to the configuration issuing system comprises:
determining whether the target switch has redundant equipment according to the redundant equipment information of the target switch; and
in the case that the target switch has redundant equipment, sending the isolation command to the configuration issuing system.
6. The method of claim 1, wherein after the receiving alarm information indicating that a target switch in the switch cluster has failed, the method further comprises:
acquiring operation information of the target switch according to the switch identifier of the target switch, wherein the operation information is used for repairing the target switch.
7. The method of claim 6, wherein the acquiring operation information of the target switch comprises:
determining an operation information acquisition instruction corresponding to the target switch according to the meta information of the target switch; and
sending the operation information acquisition instruction to the target switch and receiving the operation information returned by the target switch.
8. The method of any of claims 1-7, wherein switches in the switch cluster are interconnected in a stacked manner.
9. An apparatus for handling a switch failure, applied to a switch cluster, the switch cluster comprising at least two interconnected switches and a configuration issuing system for managing the switches, the apparatus comprising:
a receiving module configured to receive alarm information indicating that a target switch in the switch cluster has failed, wherein the alarm information comprises a switch identifier of the target switch;
an acquisition module configured to acquire meta information of the target switch according to the switch identifier of the target switch, wherein the meta information comprises at least one of a device model and a network structure;
a determining module configured to determine a corresponding isolation command according to the meta information of the target switch; and
a sending module configured to send the isolation command to the configuration issuing system so as to isolate the target switch from the switch cluster.
10. A computer program product comprising program code instructions which, when the program product is executed by a computer, cause the computer to carry out the method of any one of claims 1-8.
11. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-8.
12. An electronic device, comprising:
a processor;
a memory in electronic communication with the processor; and
instructions stored in the memory and executable by the processor to cause the electronic device to perform the method of any one of claims 1-8.
CN202311222585.7A 2023-09-20 2023-09-20 Method and device for processing switch faults, storage medium and electronic equipment Pending CN117155763A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311222585.7A CN117155763A (en) 2023-09-20 2023-09-20 Method and device for processing switch faults, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN117155763A true CN117155763A (en) 2023-12-01

Family

ID=88908049

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311222585.7A Pending CN117155763A (en) 2023-09-20 2023-09-20 Method and device for processing switch faults, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN117155763A (en)

Similar Documents

Publication Publication Date Title
CN110704231A (en) Fault processing method and device
CN106933843B (en) Database heartbeat detection method and device
CN111176879A (en) Fault repairing method and device for equipment
WO2017080161A1 (en) Alarm information processing method and device in cloud computing
CN111858240B (en) Monitoring method, system, equipment and medium of distributed storage system
CN112131050B (en) Disaster recovery switching method and device, storage medium and computer equipment
CN112468361A (en) Network connection state monitoring method and device, electronic equipment and storage medium
CN102355368A (en) Fault processing method of network equipment and system
CN112217847A (en) Micro service platform, implementation method thereof, electronic device and storage medium
CN111010318A (en) Method and system for discovering loss of connection of terminal equipment of Internet of things and equipment shadow server
CN114244683A (en) Event classification method and device
CN112711493A (en) Scenario root cause analysis application
CN117155763A (en) Method and device for processing switch faults, storage medium and electronic equipment
CN111698301A (en) Service management method, device and storage medium for ensuring service continuation
CN116302716A (en) Cluster deployment method and device, electronic equipment and computer readable medium
CN105550065A (en) Database server communication management method and device
CN112491464B (en) Distributed fault real-time monitoring and standby equipment switching method for satellite communication
CN113946474A (en) Efficient disaster tolerance protection method and disaster tolerance processing system for storage system
CN114356615A (en) Solution method for self-healing of rail transit software and application fault based on Internet of things
CN109271531B (en) Data management center based on operation and maintenance knowledge graph
CN111614501A (en) Monitoring method and system
CN114528156A (en) Database switching method of heterogeneous disaster tolerance scheme, electronic device and medium
CN115150253B (en) Fault root cause determining method and device and electronic equipment
CN116991579A (en) Migration method and device of edge computing node, storage medium and electronic equipment
CN116841834A (en) State adjustment method and device, storage medium and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination