CN113873560A - Network fault processing method and device - Google Patents

Publication number
CN113873560A
Authority
CN
China
Prior art keywords
network
fault
quality
wifi
strategy module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111149116.8A
Other languages
Chinese (zh)
Inventor
王孝明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd
Priority to CN202111149116.8A
Publication of CN113873560A

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W24/00: Supervisory, monitoring or testing arrangements
    • H04W24/04: Arrangements for maintaining operational condition
    • H04W24/08: Testing, supervising or monitoring using real traffic

Abstract

The disclosure relates to a method and device for processing a network fault, in the technical field of communications. The processing method comprises the following steps: a fault monitoring policy module monitors service quality indicators of WiFi and its bearer network in real time; the fault monitoring policy module determines whether a service quality indicator exceeds the end-to-end quality index range of the WiFi-related service; when the quality index range is exceeded, the fault monitoring policy module performs fault location analysis to determine whether the fault is located in the bearer network or the access network, or in a key device of the IP metropolitan area network; and the fault monitoring policy module processes the network fault according to the result of the fault location analysis.

Description

Network fault processing method and device
Technical Field
The present disclosure relates to the field of communications technologies, and in particular, to a method and an apparatus for processing a network fault, and a non-volatile computer-readable storage medium.
Background
WiFi (Wireless Fidelity) technology has gradually shifted from an enterprise-level technology to a carrier-level technology. Large-capacity AC (Access Controller) deployment offers fewer nodes to deploy, centralized construction, security, and a flat network architecture, and can meet the requirements of large-scale WiFi network construction.
In the related art, large-capacity ACs from multiple manufacturers are deployed centrally so that APs (wireless Access Points) from various manufacturers in different areas can be connected flexibly, improving the operation and maintenance efficiency of the WiFi network and the utilization rate of AC equipment.
Disclosure of Invention
The inventors of the present disclosure found the following problem in the above related art: key indicators are collected by different departments and the underlying data is not sufficiently centralized, so faults cannot be processed quickly.
In view of this, the present disclosure provides a technical solution for processing a network fault, which can quickly process the fault.
According to some embodiments of the present disclosure, a method for processing a network fault is provided, including: a fault monitoring policy module monitors service quality indicators of WiFi and its bearer network in real time; the fault monitoring policy module determines whether a service quality indicator exceeds the end-to-end quality index range of the WiFi-related service; when the quality index range is exceeded, the fault monitoring policy module performs fault location analysis to determine whether the fault is located in the bearer network or the access network, or in a key device of the IP (Internet Protocol) metropolitan area network; and the fault monitoring policy module processes the network fault according to the result of the fault location analysis.
In some embodiments, the real-time monitoring of the service quality indicators of WiFi and its bearer network by the fault monitoring policy module includes: sending a plurality of test data streams to each edge probe through a core probe arranged on the fault monitoring policy module side, each edge probe being arranged at a respective device of WiFi and its bearer network; and receiving a plurality of returned test data packets, the test data packets carrying the service quality indicators.
In some embodiments, the service quality indicators include a network quality indicator of each path of the bearer network in each test period, and the fault location analysis performed by the fault monitoring policy module includes: determining that the fault is located in the bearer network or the access network when the network quality indicator exceeds the quality index range.
In some embodiments, the network quality indicator includes at least one of latency and packet loss rate.
In some embodiments, the service quality indicators include a status indicator of the key device, and the fault location analysis performed by the fault monitoring policy module includes: determining that the fault is located in the key device of the IP metropolitan area network when the status indicator exceeds the quality index range.
In some embodiments, the status indicators include at least one of CPU (central processing unit) utilization, memory utilization, software failures.
In some embodiments, the processing of the network fault by the fault monitoring policy module according to the result of the fault location analysis includes: when the fault is located in a key device of the IP metropolitan area network, remotely restarting the key device before it goes down.
In some embodiments, the processing of the network fault by the fault monitoring policy module according to the result of the fault location analysis includes: when the fault is located in the bearer network or the access network, switching the bearer path to another path with better network quality than the current path.
In some embodiments, switching the bearer path to another path with better network quality than the current path includes: triggering the bearer network or the access network, by changing the default gateway or increasing the network channel cost of the link, to switch the bearer path to another path with better network quality than the current path.
According to other embodiments of the present disclosure, there is provided a processing apparatus for a network fault, the processing apparatus being disposed in a fault monitoring policy module and including: a monitoring unit, configured to monitor service quality indicators of WiFi and its bearer network in real time; a judging unit, configured to determine whether a service quality indicator exceeds the end-to-end quality index range of the WiFi-related service; an analysis unit, configured to perform fault location analysis when the quality index range is exceeded, so as to determine whether the fault is located in the bearer network or the access network, or in a key device of the IP metropolitan area network; and a processing unit, configured to process the network fault according to the result of the fault location analysis.
In some embodiments, the monitoring unit sends a plurality of test data streams to each edge probe through a core probe arranged on the fault monitoring policy module side, each edge probe being arranged at a respective device of WiFi and its bearer network; and receives a plurality of returned test data packets, the test data packets carrying the service quality indicators.
In some embodiments, the service quality indicator includes a network quality indicator of each path of the bearer network in each test period, and the analysis unit determines that the fault is located in the bearer network or the access network when the network quality indicator exceeds a quality indicator range.
In some embodiments, the network quality indicator includes at least one of latency and packet loss rate.
In some embodiments, the service quality indicator includes a status indicator of the critical device, and the analysis unit determines that the fault is located in the critical device of the IP metropolitan area network if the status indicator exceeds a quality indicator range.
In some embodiments, the status indicators include at least one of CPU (central processing unit) utilization, memory utilization, software failures.
In some embodiments, the processing unit remotely restarts the critical device before the critical device is down in the event of a failure of the critical device located in the IP metropolitan area network.
In some embodiments, the processing unit switches the bearer path to another path having a better network quality than the current path in case the failure is located in the bearer network or the access network.
In some embodiments, the processing unit triggers the bearer network or the access network by changing a default gateway or increasing a network channel cost of a link, and switches the bearer path to another path with better network quality than the current path.
According to still other embodiments of the present disclosure, there is provided a network failure processing apparatus, including: a memory; and a processor coupled to the memory, the processor configured to perform the method of handling a network failure in any of the above embodiments based on instructions stored in the memory.
According to still further embodiments of the present disclosure, there is provided a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method of handling a network failure in any of the above embodiments.
According to still further embodiments of the present disclosure, there is provided a system for processing a network failure, including: a fault monitoring policy module, configured to execute the processing method in any of the above embodiments; a core probe arranged on the fault monitoring policy module side; and edge probes, each arranged at a respective device of WiFi and its bearer network.
In the above embodiments, a fault monitoring policy module is introduced into the network to monitor service quality indicators in a unified manner; faults can then be located and analyzed according to the monitored service quality indicators, so that they can be processed quickly.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description, serve to explain the principles of the disclosure.
The present disclosure can be more clearly understood from the following detailed description with reference to the accompanying drawings, in which:
fig. 1 illustrates a flow diagram of some embodiments of a method of handling a network failure of the present disclosure;
fig. 2 illustrates a schematic diagram of some embodiments of a method of handling a network failure of the present disclosure;
FIG. 3 illustrates a flow diagram of further embodiments of a method of handling a network failure of the present disclosure;
fig. 4 illustrates a block diagram of some embodiments of a network failure handling apparatus of the present disclosure;
FIG. 5 illustrates a block diagram of further embodiments of a network failure handling apparatus of the present disclosure;
FIG. 6 illustrates a block diagram of yet further embodiments of a network failure handling apparatus of the present disclosure;
fig. 7 illustrates a block diagram of some embodiments of a network failure handling system of the present disclosure.
Detailed Description
Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail, but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
As described above, the key devices of the operator's WiFi service, such as the switch, router, BAS (Broadband Access Server, also called Broadband Remote Access Server), AC, Portal, and Server of the IP metropolitan area network, are maintained by different departments, and key indicators are collected by those different departments, so the underlying data is not sufficiently centralized.
As a result, the operating data of the various devices cannot be comprehensively grasped in real time, which hinders network optimization and management from a global perspective, and no end-to-end management and control capability is available. Once a fault occurs, because the data stream traverses equipment belonging to multiple departments, it is difficult for any single department to find the fault, the fault cannot be processed quickly, and customer satisfaction is hard to improve.
To address this technical problem, the present disclosure introduces a fault monitoring policy module. The module collects service quality indicators such as network delay, packet loss rate, device memory utilization, and CPU utilization, and compares them with a preset allowable range (the quality index range). Based on the comparison result, it can distinguish whether the fault is caused by degraded quality indicators of the AP, the AC, or the device ports and links of the wired access network and the IP bearer network.
Thus, the fault can be found before the user notices it, and targeted countermeasures can be taken. For example, the fault can be cleared proactively and quickly by actively issuing relevant instructions, remotely restarting devices, and similar methods, which greatly reduces service interruption time. The technical solution of the present disclosure can be realized, for example, by the following embodiments.
Fig. 1 illustrates a flow diagram of some embodiments of a method of handling a network failure of the present disclosure.
As shown in fig. 1, in step 110, the fault monitoring policy module monitors the service quality indicators of the WiFi and its bearer network in real time.
In some embodiments, the fault monitoring policy module monitors the service quality indicators of WiFi and its bearer network in real time among the AC, the switch, and the BAS.
In some embodiments, the fault monitoring policy module sends a plurality of test data streams to each edge probe through a core probe arranged on the fault monitoring policy module side, each edge probe being arranged at a respective device of WiFi and its bearer network; and receives a plurality of returned test data packets, the test data packets carrying the service quality indicators. For example, the devices include a plurality of switches, routers, BASs, ACs, Portals, and Servers.
In some embodiments, the test data stream is an RTP (Real-time Transport Protocol) data stream that simulates WiFi traffic sent from the core probe to each edge probe. By configuring and sending each test data stream, the edge probe returns each test data packet to the core probe. Therefore, the network quality of the bearer network path reflected by each test data stream in each test period can be obtained.
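As an illustration of how the core probe might turn the returned test packets of one test period into network quality indicators, the following is a minimal Python sketch. The `ProbeEcho` record layout and the `summarize_period` function are hypothetical names introduced here, not part of the disclosure, and the delay is assumed to be a round-trip value measured at the core probe.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ProbeEcho:
    """One test packet sent by the core probe (hypothetical record layout)."""
    seq: int
    sent_ms: float
    received_ms: Optional[float]  # None if the edge probe's reply never arrived


def summarize_period(echoes: Sequence[ProbeEcho]) -> dict:
    """Return mean round-trip delay (ms) and packet loss rate for one test period."""
    if not echoes:
        raise ValueError("a test period needs at least one test packet")
    # Round-trip times of the packets that were actually returned.
    rtts = [e.received_ms - e.sent_ms for e in echoes if e.received_ms is not None]
    lost = len(echoes) - len(rtts)
    return {
        "delay_ms": sum(rtts) / len(rtts) if rtts else None,
        "loss_rate": lost / len(echoes),
    }


period = [
    ProbeEcho(seq=1, sent_ms=0.0, received_ms=12.0),
    ProbeEcho(seq=2, sent_ms=10.0, received_ms=24.0),
    ProbeEcho(seq=3, sent_ms=20.0, received_ms=None),  # lost packet
    ProbeEcho(seq=4, sent_ms=30.0, received_ms=40.0),
]
summary = summarize_period(period)
print(summary)
```

With the sample data, three of four packets return, so the sketch reports a mean delay over those three and a loss rate of one in four.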
In step 120, the fault monitoring policy module determines whether the service quality indicator exceeds the end-to-end quality indicator range of the WiFi related service.
In some embodiments, an end-to-end quality index range of the WiFi internet related service is preset. For example, the quality index range includes at least one of delay and packet loss rate of networks in different network segments, and an allowable range of at least one of memory and CPU utilization rate of wireless devices such as APs and ACs. And judging whether the service quality indexes of the WiFi and the bearer network thereof are within the quality index range.
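The range check described above can be sketched as a simple lookup against preset allowable ranges. This is only an illustration: the indicator names and every numeric limit below are assumptions made for the example, since the disclosure does not give concrete values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Closed allowable interval for one indicator."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Hypothetical allowable ranges; real values would be set by the operator.
QUALITY_INDEX_RANGE = {
    "delay_ms": Range(0.0, 50.0),
    "loss_rate": Range(0.0, 0.01),
    "cpu_util": Range(0.0, 0.85),
    "mem_util": Range(0.0, 0.90),
}


def out_of_range(indicators: dict) -> list:
    """Return the sorted names of indicators that fall outside their allowable range."""
    return sorted(
        name
        for name, value in indicators.items()
        if name in QUALITY_INDEX_RANGE and not QUALITY_INDEX_RANGE[name].contains(value)
    )


violations = out_of_range({"delay_ms": 72.0, "loss_rate": 0.002, "cpu_util": 0.91})
print(violations)
```

An empty result corresponds to the "continue monitoring" branch; any non-empty result would lead to fault location analysis.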
In step 130, when the quality index range is exceeded, the fault monitoring policy module performs fault location analysis to determine whether the fault is located in the bearer network or the access network or in the key device of the IP metropolitan area network. For example, the critical devices include at least one of a switch, router, BAS, AC, Portal, Server.
In some embodiments, if the service quality indicators of WiFi and its bearer network are within the quality index range, monitoring continues; if the quality index range is exceeded, the WiFi or IP professional network management system is notified.
In some embodiments, the service quality indicators include a network quality indicator of each path of the bearer network in each test period. When the network quality indicator exceeds the quality index range, the fault is determined to be located in the bearer network or the access network. For example, the network quality indicator includes at least one of delay and packet loss rate.
In some embodiments, the quality of service indicator comprises a status indicator of the critical device. And determining that the fault is located in the key equipment of the IP metropolitan area network under the condition that the state index exceeds the quality index range. For example, the status indicator includes at least one of CPU utilization, memory utilization, and software failure.
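The two location rules above (network quality indicators point to the bearer or access network, status indicators point to a key device) can be sketched as a small classifier. The indicator names and the `locate_fault` function are hypothetical; the sketch returns a set because both kinds of indicator could be out of range at the same time.

```python
from enum import Enum
from typing import Iterable, Set


class FaultLocation(Enum):
    BEARER_OR_ACCESS_NETWORK = "bearer_or_access_network"
    KEY_DEVICE = "key_device"


# Hypothetical indicator names, grouped as in the text.
NETWORK_INDICATORS = {"delay_ms", "loss_rate"}
STATUS_INDICATORS = {"cpu_util", "mem_util", "software_fault"}


def locate_fault(violations: Iterable[str]) -> Set[FaultLocation]:
    """Map the names of out-of-range indicators to the fault locations they imply."""
    names = set(violations)
    locations = set()
    if names & NETWORK_INDICATORS:
        locations.add(FaultLocation.BEARER_OR_ACCESS_NETWORK)
    if names & STATUS_INDICATORS:
        locations.add(FaultLocation.KEY_DEVICE)
    return locations


print(locate_fault(["delay_ms"]))
print(locate_fault(["cpu_util", "loss_rate"]))
```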
In step 140, the fault monitoring policy module processes the network fault according to the result of the fault location analysis.
In some embodiments, in the event that the failure is located on a critical device of the IP metropolitan area network, the critical device is restarted remotely before the critical device goes down.
In some embodiments, in the event of a failure in the bearer or access network, the bearer path is switched to another path having a better network quality than the current path. For example, by changing the default gateway or increasing the network channel cost of the link, the bearer network or the access network is triggered, and the bearer path is switched to another path with better network quality than the current path.
In some embodiments, the relevant policies are set through the wireless and data professional network management systems. For example, changing the channel cost of an optical fiber link or network device in the "sub-health" state (exceeding the quality index range) triggers the route switching condition of the bearer network, achieving fast protection and recovery of the network link.
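To see why raising the cost of a degraded link causes the bearer path to move, consider the toy model below. It assumes the network selects the lowest-total-cost path, as a link-state routing protocol would; the disclosure does not name a routing protocol, and the topology (`AC`, `SW1`, `SW2`, `BAS`) and all cost values are invented for the example.

```python
import heapq


def shortest_path(graph, src, dst):
    """Dijkstra over a dict-of-dicts cost graph; returns (total_cost, path)."""
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link_cost in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    return float("inf"), []


def raise_link_cost(graph, a, b, new_cost):
    """Set the cost of the bidirectional link a-b, e.g. for a 'sub-health' link."""
    graph[a][b] = new_cost
    graph[b][a] = new_cost


# Hypothetical topology: two paths from the AC to the BAS.
graph = {
    "AC": {"SW1": 10, "SW2": 10},
    "SW1": {"AC": 10, "BAS": 10},
    "SW2": {"AC": 10, "BAS": 20},
    "BAS": {"SW1": 10, "SW2": 20},
}

before = shortest_path(graph, "AC", "BAS")
raise_link_cost(graph, "SW1", "BAS", 100)  # link SW1-BAS is degraded
after = shortest_path(graph, "AC", "BAS")
print(before, after)
```

The degraded link stays in service and remains usable as a fallback; it is simply made unattractive, which matches the idea of switching before an actual interruption occurs.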
Fig. 2 illustrates a schematic diagram of some embodiments of a method of handling network failures of the present disclosure.
As shown in fig. 2, the fault monitoring policy module senses the conditions of the IP MAN (Internet Protocol Metropolitan Area Network), the BAS, and the AC at the WiFi service level; the fault monitoring policy module simulates WiFi service characteristics to send test data streams, and then evaluates the quality of the bearer network and the access network against the requirement of satisfying customer perception of the WiFi service.
A core probe is deployed on the fault monitoring policy module side, and edge probes are deployed on the core-side devices of WiFi and its bearer network (such as the switch, router, BAS, and AC).
Each test data stream is configured by a fault monitoring policy module. The test data stream is an RTP data stream simulating WiFi service sent from the core probe to each edge probe.
The edge probe returns each test data packet to the core probe, thereby obtaining the network quality of the bearer network path reflected by each test data stream in each test period (e.g. 50 ms). For example, the network may be tested for delay, packet loss rate, and other indicators.
The fault monitoring policy module compares the returned test data packets with the index ranges in an expert experience base to quickly perform fault location analysis. For example, if the fault belongs to the wireless network, such as excessively high CPU or memory utilization or a software fault, the module works in coordination with the wireless professional network management system to proactively take measures such as a remote restart before downtime occurs, so that wireless devices such as the AP quickly return to normal.
When the quality of the bearer network degrades to the baseline of the service requirement (for example, parameters such as network delay and packet loss rate exceed the allowable range), the network device or link is determined to be in a "sub-health" state (that is, the user's service experience is affected, but the service is not interrupted). The fault monitoring policy module raises an alarm to the service platform and the alarm center, automatically triggers the bearer network or the access network, for example by increasing the network channel cost, and proactively (before device downtime or link interruption occurs) switches the bearer path to a path of good quality, thereby ensuring the end-to-end quality of the WiFi service.
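The distinction between a healthy link, a "sub-health" link, and an interrupted link can be sketched as a three-way classification. The function name, the limits, and the rule that total packet loss counts as an interruption are assumptions made for illustration only.

```python
from typing import Optional


def classify_link(delay_ms: Optional[float], loss_rate: float,
                  max_delay_ms: float = 50.0, max_loss_rate: float = 0.01) -> str:
    """Classify one link from its measured delay and packet loss rate."""
    # No returned packets at all: treated here as an interrupted service.
    if delay_ms is None or loss_rate >= 1.0:
        return "interrupted"
    # Degraded but still carrying traffic: the "sub-health" state.
    if delay_ms > max_delay_ms or loss_rate > max_loss_rate:
        return "sub-health"
    return "healthy"


print(classify_link(20.0, 0.0))
print(classify_link(80.0, 0.0))
print(classify_link(None, 1.0))
```

Only the "sub-health" result would trigger the proactive path switch; the aim is to act on it before the link reaches the interrupted state.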
Fig. 3 illustrates a flow diagram of further embodiments of a method of handling a network failure of the present disclosure.
As shown in fig. 3, an end-to-end quality indicator range of the WiFi related service is preset. For example, the quality index range includes at least one of delay and packet loss rate of networks in different network segments, and an allowable range of at least one of memory and CPU utilization rate of wireless devices such as APs and ACs.
The fault monitoring policy module monitors the service quality indicators of WiFi and its bearer network in real time among the AC, the switch, and the BAS.
In some embodiments, the test data stream is an RTP (Real-time Transport Protocol) data stream that simulates WiFi traffic sent from the core probe to each edge probe. By configuring and sending each test data stream, the edge probe returns each test data packet to the core probe. Therefore, the network quality of the bearer network path reflected by each test data stream in each test period can be obtained.
It is then determined whether the service quality indicators of WiFi and its bearer network are within the quality index range. If they are, monitoring continues.
If they exceed the quality index range, the degraded quality indicators are recorded and an alarm is sent to the WiFi or IP professional network management system.
The fault is then analyzed and located. If the fault lies in the bearer network or the access network, the default gateway or the channel cost of the link is changed, and the link and route are switched.
If the fault is an AC or AP fault, the CPU and/or memory utilization of the AP and the AC is compared with the corresponding thresholds. If a threshold is exceeded, the device is remotely restarted before the failure occurs.
In some embodiments, after the failure is handled, a determination is made as to whether the failure is recovered. If so, ending the fault processing; and if not, informing the operation and maintenance personnel to process.
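The flow of fig. 3 as described in the text (check, alarm, locate, handle, verify recovery, escalate) can be sketched as one monitoring cycle. This is a simplified illustration rather than the disclosed implementation: the `run_cycle` function, the indicator names, the upper-limit-only thresholds, and the injected callbacks are all hypothetical.

```python
NETWORK_INDICATORS = {"delay_ms", "loss_rate"}
DEVICE_INDICATORS = {"cpu_util", "mem_util"}


def run_cycle(indicators, limits, actions, recheck):
    """Run one monitoring cycle and return the outcome as a string.

    indicators: measured values keyed by indicator name
    limits:     upper limits keyed by the same names (hypothetical values)
    actions:    callbacks 'alarm', 'switch_path', 'remote_restart', 'notify_ops'
    recheck:    callable returning fresh indicators after the fault is handled
    """
    degraded = {name for name, value in indicators.items() if value > limits[name]}
    if not degraded:
        return "continue_monitoring"
    actions["alarm"](sorted(degraded))       # record and alarm
    if degraded & NETWORK_INDICATORS:        # bearer or access network fault
        actions["switch_path"]()
    if degraded & DEVICE_INDICATORS:         # AC or AP fault
        actions["remote_restart"]()
    if all(value <= limits[name] for name, value in recheck().items()):
        return "recovered"
    actions["notify_ops"]()                  # hand over to operations staff
    return "escalated"


log = []
actions = {
    name: (lambda *args, name=name: log.append(name))
    for name in ("alarm", "switch_path", "remote_restart", "notify_ops")
}
limits = {"delay_ms": 50.0, "loss_rate": 0.01, "cpu_util": 0.85}
result = run_cycle(
    {"delay_ms": 80.0, "loss_rate": 0.0, "cpu_util": 0.4},
    limits,
    actions,
    recheck=lambda: {"delay_ms": 20.0, "loss_rate": 0.0, "cpu_util": 0.4},
)
print(result, log)
```

Passing the actions in as callbacks keeps the decision logic separate from whatever interface the professional network management systems actually expose.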
In the above embodiments, the fault monitoring policy module enables WiFi network optimization and management from a global perspective, and provides end-to-end management and control across the wireless and data professional networks.
Without changing the overall architecture of the existing telecom WiFi networking, a fault monitoring policy module is added, so that it can be distinguished whether a fault is caused by degraded quality indicators of the AP, the AC, or the device ports and links of the wired access network and the IP bearer network. Targeted processing measures can then be taken. By actively issuing relevant instructions, remotely restarting devices, and similar methods, fault recovery is achieved proactively and quickly, thereby reducing service interruption time.
The fault monitoring policy module perceives both service and network quality, and associates service quality with network links.
This overcomes the limitation that each professional network management system can manage only its own equipment: devices from different manufacturers, networks, and platforms can be monitored, and faults can be located and cleared quickly, achieving coordination between wired and wireless networks.
Fig. 4 illustrates a block diagram of some embodiments of a network failure handling apparatus of the present disclosure.
As shown in fig. 4, the processing apparatus 4 for network failure is disposed in the fault monitoring policy module, and includes: a monitoring unit 41, configured to monitor the service quality indicators of WiFi and its bearer network in real time; a judging unit 42, configured to determine whether a service quality indicator exceeds the end-to-end quality index range of the WiFi-related service; an analysis unit 43, configured to perform fault location analysis when the quality index range is exceeded, so as to determine whether the fault is located in the bearer network or the access network, or in a key device of the IP metropolitan area network; and a processing unit 44, configured to process the network fault according to the result of the fault location analysis.
In some embodiments, the monitoring unit 41 sends a plurality of test data streams to each edge probe through a core probe disposed at the side of the fault monitoring policy module, where each edge probe is disposed at each device of the WiFi and its bearer network; and receiving a plurality of returned test data packets, wherein the plurality of test data packets comprise service quality indexes.
In some embodiments, the service quality indicator includes a network quality indicator of each path of the bearer network in each test period, and the analyzing unit 43 determines that the fault is located in the bearer network or the access network when the network quality indicator exceeds the quality indicator range.
In some embodiments, the network quality indicator includes at least one of latency and packet loss rate.
In some embodiments, the service quality indicator includes a status indicator of the critical device, and the analyzing unit 43 determines that the fault is located in the critical device of the IP metropolitan area network if the status indicator exceeds the quality indicator range.
In some embodiments, the status indicators include at least one of CPU (central processing unit) utilization, memory utilization, software failures.
In some embodiments, the processing unit 44 remotely restarts critical devices before they are down in the event of a failure of a critical device located in the IP metropolitan area network.
In some embodiments, the processing unit 44 switches the bearer path to another path with better network quality than the current path in case the failure is located in the bearer network or the access network.
In some embodiments, the processing unit 44 triggers the bearer network or the access network by changing the default gateway or increasing the network channel cost of the link, and switches the bearer path to another path with better network quality than the current path.
Fig. 5 shows a block diagram of further embodiments of a network failure handling apparatus of the present disclosure.
As shown in fig. 5, the network failure processing apparatus 5 of the embodiment includes: a memory 51 and a processor 52 coupled to the memory 51, the processor 52 being configured to execute a method for processing a network failure in any one embodiment of the present disclosure based on instructions stored in the memory 51.
The memory 51 may include, for example, a system memory, a fixed nonvolatile storage medium, and the like. The system memory stores, for example, an operating system, an application program, a Boot Loader, a database, and other programs.
In some embodiments, the memory may further store the normal performance indicators of the different services corresponding to each device and link of WiFi and its bearer network, as well as the "trigger conditions" for processing such as network routing changes and remote device restart.
Fig. 6 illustrates a block diagram of still further embodiments of a network failure handling apparatus of the present disclosure.
As shown in fig. 6, the network failure processing device 6 of the embodiment includes: a memory 610 and a processor 620 coupled to the memory 610, wherein the processor 620 is configured to execute the method for processing the network failure in any of the foregoing embodiments based on instructions stored in the memory 610.
The memory 610 may include, for example, system memory, fixed non-volatile storage media, and the like. The system memory stores, for example, an operating system, an application program, a Boot Loader, and other programs.
The processing device 6 for network failure may further include an input-output interface 630, a network interface 640, a storage interface 650, and the like. These interfaces 630, 640, 650 and the connections between the memory 610 and the processor 620 may be through a bus 660, for example. The input/output interface 630 provides a connection interface for input/output devices such as a display, a mouse, a keyboard, a touch screen, a microphone, and a sound box. The network interface 640 provides a connection interface for various networking devices. The storage interface 650 provides a connection interface for external storage devices such as an SD card and a usb disk.
Fig. 7 illustrates a block diagram of some embodiments of a network failure handling system of the present disclosure.
As shown in fig. 7, the network failure processing system 7 includes: a fault monitoring policy module 71 for executing the processing method in any of the above embodiments; a core probe 72 disposed on the fault monitoring policy module side; and edge probes 73 disposed at the devices of the WiFi and its bearer network.
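The way the fault monitoring policy module 71 might turn probe measurements into a fault location and a handling action can be sketched as follows. This is a hedged illustration, not the implementation of the disclosure: the function names, the default thresholds, the action labels, and the choice to check the device state before the path quality are all assumptions.

```python
from enum import Enum


class FaultLocation(Enum):
    NONE = "none"
    BEARER_OR_ACCESS = "bearer_or_access_network"
    KEY_DEVICE = "ip_man_key_device"


def locate_fault(path_delay_ms: float, path_loss_rate: float,
                 cpu_util: float, mem_util: float,
                 max_delay_ms: float = 50.0, max_loss_rate: float = 0.01,
                 max_util: float = 0.9) -> FaultLocation:
    """Classify a fault from path quality and key-device state (thresholds are hypothetical)."""
    # Assumption: an overloaded key device is reported first, even if the path also degrades.
    if cpu_util > max_util or mem_util > max_util:
        return FaultLocation.KEY_DEVICE
    if path_delay_ms > max_delay_ms or path_loss_rate > max_loss_rate:
        return FaultLocation.BEARER_OR_ACCESS
    return FaultLocation.NONE


def choose_action(location: FaultLocation) -> str:
    """Map a fault location to a handling action (labels are hypothetical)."""
    return {
        FaultLocation.KEY_DEVICE: "remote_restart",
        FaultLocation.BEARER_OR_ACCESS: "switch_path",
        FaultLocation.NONE: "no_action",
    }[location]
```

For example, a measurement of 120 ms delay with normal device utilization would be classified as a bearer or access network fault and mapped to a path switch, whereas a CPU utilization of 95% would be mapped to a remote restart of the key device.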
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media having computer-usable program code embodied therein, including but not limited to disk storage, CD-ROM, optical storage, and the like.
So far, the processing method of the network failure, the processing apparatus of the network failure, and the nonvolatile computer readable storage medium according to the present disclosure have been described in detail. Some details that are well known in the art have not been described in order to avoid obscuring the concepts of the present disclosure. It will be fully apparent to those skilled in the art from the foregoing description how to practice the presently disclosed embodiments.
The method and system of the present disclosure may be implemented in a number of ways. For example, the methods and systems of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
Although some specific embodiments of the present disclosure have been described in detail by way of example, it should be understood by those skilled in the art that the foregoing examples are for purposes of illustration only and are not intended to limit the scope of the present disclosure. It will be appreciated by those skilled in the art that modifications may be made to the above embodiments without departing from the scope and spirit of the present disclosure. The scope of the present disclosure is defined by the appended claims.

Claims (13)

1. A method for processing a network fault, comprising:
monitoring, by a fault monitoring policy module, a service quality index of WiFi and its bearer network in real time;
determining, by the fault monitoring policy module, whether the service quality index exceeds an end-to-end quality index range of a WiFi-related service;
when the quality index range is exceeded, performing, by the fault monitoring policy module, fault location analysis to determine whether the fault is located in the bearer network or an access network, or in a key device of an IP metropolitan area network; and
processing, by the fault monitoring policy module, the network fault according to a result of the fault location analysis.
2. The processing method according to claim 1, wherein the monitoring, by the fault monitoring policy module, of the service quality index of the WiFi and its bearer network in real time comprises:
sending, by the fault monitoring policy module, a plurality of test data streams to each edge probe through a core probe disposed on the fault monitoring policy module side, each edge probe being disposed at a respective device of the WiFi and its bearer network; and
receiving a plurality of returned test data packets, wherein the plurality of test data packets comprise the service quality index.
3. The processing method according to claim 1, wherein the service quality index comprises a network quality index of each path of the bearer network in each test period, and
the fault location analysis performed by the fault monitoring policy module comprises:
determining that the fault is located in the bearer network or the access network when the network quality index exceeds the quality index range.
4. The processing method according to claim 3, wherein the network quality index comprises at least one of a delay and a packet loss rate.
5. The processing method according to claim 1, wherein the service quality index comprises a state index of the key device, and
the fault location analysis performed by the fault monitoring policy module comprises:
determining that the fault is located in the key device of the IP metropolitan area network when the state index exceeds the quality index range.
6. The processing method according to claim 5, wherein the state index comprises at least one of a CPU utilization, a memory utilization, and a software failure.
7. The processing method according to any one of claims 1 to 6, wherein the processing, by the fault monitoring policy module, of the network fault according to the result of the fault location analysis comprises:
when the fault is located in the key device of the IP metropolitan area network, remotely restarting the key device before the key device goes down.
8. The processing method according to any one of claims 1 to 6, wherein the processing, by the fault monitoring policy module, of the network fault according to the result of the fault location analysis comprises:
when the fault is located in the bearer network or the access network, switching the bearer path to another path whose network quality is better than that of the current path.
9. The processing method according to claim 8, wherein the switching of the bearer path to another path whose network quality is better than that of the current path comprises:
triggering the bearer network or the access network, by changing a default gateway or increasing the network channel cost of a link, to switch the bearer path to another path whose network quality is better than that of the current path.
10. A processing device for a network fault, wherein the processing device is disposed in a fault monitoring policy module and comprises:
a monitoring unit configured to monitor a service quality index of WiFi and its bearer network in real time;
a judging unit configured to determine whether the service quality index exceeds an end-to-end quality index range of a WiFi-related service;
an analysis unit configured to perform fault location analysis when the quality index range is exceeded, so as to determine whether the fault is located in the bearer network or an access network, or in a key device of an IP metropolitan area network; and
a processing unit configured to process the network fault according to a result of the fault location analysis.
11. A system for processing a network fault, comprising:
a fault monitoring policy module for executing the processing method of any one of claims 1 to 9;
a core probe disposed on the fault monitoring policy module side; and
edge probes, each disposed at a respective device of the WiFi and its bearer network.
12. A processing device for a network fault, wherein the processing device is disposed in a fault monitoring policy module and comprises:
a memory; and
a processor coupled to the memory, the processor configured to perform the method of handling a network fault of any of claims 1-9 based on instructions stored in the memory.
13. A non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of handling a network fault of any one of claims 1-9.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111149116.8A CN113873560A (en) 2021-09-29 2021-09-29 Network fault processing method and device

Publications (1)

Publication Number Publication Date
CN113873560A true CN113873560A (en) 2021-12-31

Family

ID=78992445

Country Status (1)

Country Link
CN (1) CN113873560A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023221547A1 (en) * 2022-05-20 2023-11-23 中兴通讯股份有限公司 Device fault detection method and device, and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102055626A (en) * 2010-12-31 2011-05-11 北京中创信测科技股份有限公司 Internet protocol (IP) network quality detecting method and system
CN105471620A (en) * 2015-11-12 2016-04-06 广州市柏特科技有限公司 Broadband intelligent terminal embedded network analysis and diagnosis device and method thereof
US20190007847A1 (en) * 2017-06-28 2019-01-03 Thomson Licensing Method of communication failure reporting and corresponding apparatus
CN109347688A (en) * 2018-11-26 2019-02-15 锐捷网络股份有限公司 A kind of method and apparatus of positioning failure in a wireless local area network
US20200112905A1 (en) * 2018-10-09 2020-04-09 At&T Intellectual Property I, L.P. Routing optimization based on historical network measures
CN111371648A (en) * 2020-03-03 2020-07-03 北京百度网讯科技有限公司 Monitoring method and device for global fault of virtual gateway cluster
CN212034350U (en) * 2020-06-19 2020-11-27 成都玖翼通讯科技有限公司 Communication performance monitoring and fault positioning device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
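Wang Xiaoming (王孝明); Zhang Lianying (张连营): "Active protection solution for IP networks based on service monitoring" (基于业务监控的IP网络主动保护解决方案), Study on Optical Communications (光通信研究), no. 04, 10 August 2010 (2010-08-10) *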

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination