CN113497721B - Network fault positioning method and device - Google Patents

Publication number
CN113497721B
Authority
CN
China
Prior art keywords
network
abnormal
traced
network equipment
abnormality
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010202402.5A
Other languages
Chinese (zh)
Other versions
CN113497721A (en)
Inventor
张洪林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Sichuan Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Sichuan Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and China Mobile Group Sichuan Co Ltd
Priority to CN202010202402.5A
Publication of CN113497721A
Application granted
Publication of CN113497721B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0677 Localisation of faults
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/12 Discovery or management of network topologies
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
    • Y04S10/52 Outage or fault management, e.g. fault detection or location

Abstract

The application discloses a network fault location method and device, and relates to the technical field of cloud computing. The method and device monitor the operation states of a plurality of network devices at preset time intervals within a current target time period; determine, from the plurality of network devices, the network devices to be traced whose operation state is abnormal at the same acquisition time point; and determine the target network device in which the network fault occurred according to the abnormal network devices to be traced and an abnormal network topology relationship established in advance from a historical target time period. The target network device in which the network fault occurred is thereby located automatically, without manual participation, which saves time and labor costs.

Description

Network fault positioning method and device
Technical Field
The present disclosure relates to the field of cloud computing technologies, and in particular, to a method and an apparatus for locating network faults.
Background
Cloud computing is a type of distributed computing in which a large data-processing program is decomposed, through the network "cloud", into many small programs; these are processed and analyzed by a system composed of multiple servers, and the results are returned to the user. A network device operating in a cloud computing scenario is prone to faults because it must perform a large amount of data analysis and interaction. When alarms of the cloud computing system are collected, a plurality of network devices with abnormal operation may be reported at once, and workers must promptly locate the faulty network device (that is, the source of the abnormality) and repair it so that the whole cloud computing service can operate normally.
In the prior art, a worker usually has to check, one by one, the network devices reported as operating abnormally in order to find the faulty network device (that is, the source of the abnormality), which consumes a great deal of labor and time.
Disclosure of Invention
In view of this, the embodiments of the present application provide a network fault location method and device, so as to alleviate the problem that locating a faulty network device (that is, the root cause of an abnormality) requires a great deal of labor and time.
In a first aspect, an embodiment of the present application provides a network fault location method, where the method includes:
monitoring the operation states of a plurality of network devices at preset time intervals within a current target time period;
determining, from the plurality of network devices, the network devices to be traced whose operation state is abnormal at the same acquisition time point;
and determining the target network device in which the network fault occurred according to the abnormal network devices to be traced and an abnormal network topology relationship established in advance from a historical target time period.
In a second aspect, an embodiment of the present application further provides a network fault location device, where the device includes:
an operation state analysis unit configured to monitor the operation states of a plurality of network devices at preset time intervals within a current target time period;
an abnormal device determining unit configured to determine, from the plurality of network devices, the network devices to be traced whose operation state is abnormal at the same acquisition time point;
a fault locating unit configured to determine the target network device in which the network fault occurred according to the abnormal network devices to be traced and an abnormal network topology relationship established in advance from a historical target time period.
In a third aspect, an embodiment of the present application further provides an electronic device, including:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the network fault location method according to the first aspect of the embodiment of the present application.
At least one of the technical solutions adopted in the embodiments of the present application can achieve the following beneficial effects: the operation states of a plurality of network devices are monitored at preset time intervals within a current target time period; the network devices to be traced whose operation state is abnormal at the same acquisition time point are determined from the plurality of network devices; and the target network device in which the network fault occurred is determined according to the abnormal network devices to be traced and an abnormal network topology relationship established in advance from a historical target time period. The target network device in which the network fault occurred is thereby located automatically, without manual participation, which saves time and labor costs.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
FIG. 1 is a flowchart of a network fault location method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of the interaction between an electronic device provided in an embodiment of the present application and a plurality of network devices;
FIG. 3 is a schematic diagram of the architecture of an abnormal network topology according to an embodiment of the present application;
FIG. 4 is a flowchart of a network fault location method according to an embodiment of the present application;
FIG. 5 is a functional block diagram of a network fault location device according to an embodiment of the present application;
FIG. 6 is a functional block diagram of a network fault location device according to an embodiment of the present application;
FIG. 7 is a circuit connection block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
To make the purposes, technical solutions and advantages of the present application clearer, the technical solutions of the present application are described clearly and completely below with reference to specific embodiments of the present application and the corresponding drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by one of ordinary skill in the art from the present disclosure without creative effort fall within the scope of the present disclosure.
The following describes in detail the technical solutions provided by the embodiments of the present application with reference to the accompanying drawings.
Referring to FIG. 1, an embodiment of the present application provides a network fault location method applied to an electronic device 100. The electronic device 100 may be, but is not limited to, a monitoring server. As shown in FIG. 2, the electronic device 100 is communicatively connected to a plurality of network devices 200 and can monitor their operation states. The network devices 200 may be switches, routers, data processing servers, and so on. The method comprises the following steps:
S11: the operation states of the plurality of network devices 200 are monitored at preset time intervals within the current target time period.
The current target time period may be, but is not limited to, 9:00-11:00, 13:00-14:00, 21:00-22:00, etc. each day, and the preset interval may be 5 minutes, 10 minutes, 20 minutes, etc., which is not limited herein.
S12: the network devices 200 to be traced, whose operation state is abnormal at the same acquisition time point, are determined from the plurality of network devices 200.
For example, assuming that the current target time period is 9:00-11:00 each day and the preset interval is 10 minutes, there are 13 acquisition time points, namely 9:00, 9:10, 9:20, 9:30, 9:40, 9:50, 10:00, 10:10, 10:20, 10:30, 10:40, 10:50, and 11:00. "The same acquisition time point" refers to any one of these time points.
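The acquisition time points for a given target time period and preset interval can be enumerated mechanically. The following is a minimal sketch, assuming both endpoints of the period are acquisition time points; the function name `acquisition_points` and its parameters are hypothetical and do not appear in the application.

```python
from datetime import datetime, timedelta


def acquisition_points(start: str, end: str, interval_minutes: int) -> list[str]:
    """Enumerate acquisition time points from start to end, both inclusive.

    Assumption: times are given as "H:MM" strings within a single day.
    """
    fmt = "%H:%M"
    current = datetime.strptime(start, fmt)
    last = datetime.strptime(end, fmt)
    step = timedelta(minutes=interval_minutes)

    points = []
    while current <= last:
        # Format without a leading zero on the hour, e.g. "9:00".
        points.append(f"{current.hour}:{current.minute:02d}")
        current += step
    return points


# Example: a 9:00-11:00 target time period sampled every 10 minutes.
points = acquisition_points("9:00", "11:00", 10)
print(points)
```

Devices whose abnormal states are observed under the same entry of `points` are the ones treated as abnormal "at the same acquisition time point".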
S13: the target network device 200 in which the network fault occurred is determined according to the abnormal network devices 200 to be traced and an abnormal network topology relationship established in advance from a historical target time period.
If the current target time period is 9:00-11:00 each day, the historical target time period is 9:00-11:00 on each day within a preset span (e.g., 1 month or 3 months) before the current day. The abnormal network topology relationship records, for each network device 200 that was abnormal in the past, the network devices 200 whose abnormalities were associated with it, and carries the probability that each network device 200 causes other network devices 200 to become abnormal.
The process of establishing the abnormal network topology relationship may include:
For the network devices 200 found to be abnormal at the same acquisition time point, their number S_A is counted. At the same time, the associated abnormal device set ψ_{A.a} of each abnormal network device 200 A.a is obtained, together with the overall associated abnormal set ψ_A formed from the sets ψ_{A.a} of all abnormal network devices 200 collected at that acquisition time point. The conditional probability P(B|A) that each abnormal network device 200 collected at that acquisition time point causes any other network device 200 to become abnormal is then calculated in turn, using the following formula:
[formula not reproduced in the source text]
where S_{B|A} is the number of associated abnormal network devices 200 contained in the associated abnormal set of the abnormal network devices 200 collected at the same acquisition time point; S_A is the number of abnormal network devices 200 collected at the same acquisition time point; α is a preset condition threshold; and P(B|A) is the resulting probability value. The conditional probabilities that the abnormal network devices 200 collected at the same acquisition time point cause any other network device 200 to become abnormal are organized into a conditional probability matrix:
[matrix not reproduced in the source text]
An adjacency matrix is then generated from the conditional probability matrix as follows: for any element P(B|A) in the conditional probability matrix, if P(B|A) is greater than a confidence threshold, the corresponding element in the adjacency matrix is set to 1; otherwise it is set to 0. In addition, for every abnormal network device 200 collected at the same acquisition time point, the element corresponding to P(A|A) in the adjacency matrix is set to 0. If the element of the adjacency matrix corresponding to P(B|A) is 1, the abnormal network device 200 collected at that acquisition time point is considered to cause the other network device 200 to become abnormal. The network topology relationship of the abnormal network devices 200 collected at the same acquisition time point is obtained from the adjacency matrix.
Loop detection is then performed and any loops in the network topology relationship are eliminated. Loops may be eliminated as follows: a loop size threshold is set; when the number of abnormal elements forming a loop is less than or equal to the threshold, the abnormal elements in the loop are marked and merged into one logical abnormal element; otherwise, with reference to the conditional probability matrix, the edges forming the loop are set to 0 in the adjacency matrix one by one, in ascending order of conditional probability, until the loop is eliminated. The established network topology relationship may be, but is not limited to, that shown in FIG. 3, in which each circle represents a network device.
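The thresholding and loop-elimination steps described above can be illustrated with a short sketch. It takes an already computed conditional probability matrix as input, because the formula for P(B|A) is not reproduced here. It assumes that entry `[i][j]` holds the probability that device `i` causes device `j` to become abnormal, and it covers only the branch that removes the lowest-probability edge of a loop; merging small loops into a logical abnormal element is not shown. All function names are hypothetical.

```python
def build_adjacency(cond_prob, confidence_threshold):
    """Set entry to 1 where P(B|A) exceeds the threshold; the diagonal P(A|A) is 0."""
    n = len(cond_prob)
    return [
        [1 if i != j and cond_prob[i][j] > confidence_threshold else 0
         for j in range(n)]
        for i in range(n)
    ]


def find_cycle(adj):
    """Return the edges (i, j) of one directed loop, or None if there is none."""
    n = len(adj)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = [WHITE] * n
    stack = []  # nodes on the current depth-first search path

    def dfs(u):
        color[u] = GRAY
        stack.append(u)
        for v in range(n):
            if not adj[u][v]:
                continue
            if color[v] == GRAY:
                # v is on the current path, so the path from v to u plus u->v is a loop.
                nodes = stack[stack.index(v):]
                return list(zip(nodes, nodes[1:] + [v]))
            if color[v] == WHITE:
                found = dfs(v)
                if found:
                    return found
        stack.pop()
        color[u] = BLACK
        return None

    for start in range(n):
        if color[start] == WHITE:
            found = dfs(start)
            if found:
                return found
    return None


def break_cycles(adj, cond_prob):
    """Remove the lowest-probability edge of each loop until no loop remains."""
    adj = [row[:] for row in adj]  # work on a copy
    while (cycle := find_cycle(adj)) is not None:
        i, j = min(cycle, key=lambda edge: cond_prob[edge[0]][edge[1]])
        adj[i][j] = 0
    return adj


# Illustrative values only; they are not taken from the application.
cond_prob = [
    [1.0, 0.9, 0.2],
    [0.1, 1.0, 0.8],
    [0.7, 0.3, 1.0],
]
adjacency = build_adjacency(cond_prob, confidence_threshold=0.5)
loop_free = break_cycles(adjacency, cond_prob)
print(loop_free)
```

In this example the thresholded graph contains the loop 0 -> 1 -> 2 -> 0, and the edge 2 -> 0 is dropped because it has the smallest conditional probability among the loop's edges.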
In this network fault location method, the operation states of a plurality of network devices 200 are monitored at preset time intervals within a current target time period; the network devices 200 to be traced whose operation state is abnormal at the same acquisition time point are determined from the plurality of network devices 200; and the target network device 200 in which the network fault occurred is determined according to the abnormal network devices 200 to be traced and the abnormal network topology relationship established in advance from the historical target time period. The target network device 200 is thereby located automatically, without manual participation, which saves time and labor costs.
Optionally, as shown in fig. 4, S13 includes:
S131: the abnormal network device 200 to be traced is mapped into the abnormal network topology relationship.
For example, if the abnormal network device 200 to be traced is network device 200A, the network device 200A is located in the abnormal network topology relationship, so that an association is established between the network device 200A and the other abnormal network devices 200.
S132: the abnormal network devices 200 adjacent to the abnormal network device 200 to be traced are determined from the abnormal network topology relationship.
In the abnormal network topology relationship, an abnormal network device 200 that has a direct association with the network device 200A is an abnormal network device 200 adjacent to the abnormal network device 200 to be traced.
S133: the probability that the abnormality of the network device 200 to be traced was caused by each adjacent network device 200 is determined.
Specifically, S133 may include determining this probability according to a formula [formula not reproduced in the source text], where S_{B|A} is the number of network devices 200 adjacent to the abnormal network device 200 to be traced; S_A is the number of abnormal network devices 200 to be traced; α is a preset condition threshold; and P(B|A) is the resulting probability value.
S134: if the probability is greater than a preset threshold, the adjacent network device 200 is determined to be a new abnormal network device 200 to be traced.
S135: the method returns to the step of determining the probability that the abnormality of the network device 200 to be traced was caused by an adjacent network device 200, until the probability is smaller than the preset threshold.
S136: all the network devices 200 newly determined as abnormal network devices 200 to be traced are taken as the target network devices 200 in which the network fault occurred.
Optionally, the abnormal network devices 200 to be traced are determined by screening them out of the running network devices 200 according to a preset abnormality filtering algorithm. For example, the running network devices 200 whose data processing amount at the same acquisition time point is greater than a preset first threshold and whose survival time is less than a preset second threshold are screened out and determined to be the abnormal network devices 200 to be traced.
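The example filter above can be sketched as a simple predicate over the samples collected at one acquisition time point. The `DeviceSample` record, its field names, and the threshold values below are assumptions made for illustration; the application does not specify units or data structures.

```python
from dataclasses import dataclass


@dataclass
class DeviceSample:
    """One device's metrics at a single acquisition time point (hypothetical record)."""
    device_id: str
    data_volume: float    # data processing amount; unit is an assumption
    survival_time: float  # "survival time" metric; unit is an assumption


def filter_abnormal(samples, first_threshold, second_threshold):
    """Return IDs of devices whose data volume is above the first threshold
    and whose survival time is below the second threshold."""
    return [
        s.device_id
        for s in samples
        if s.data_volume > first_threshold and s.survival_time < second_threshold
    ]


# Illustrative samples for one acquisition time point.
samples = [
    DeviceSample("switch-1", data_volume=950.0, survival_time=12.0),
    DeviceSample("router-1", data_volume=400.0, survival_time=10.0),
    DeviceSample("server-1", data_volume=980.0, survival_time=300.0),
]
print(filter_abnormal(samples, first_threshold=900.0, second_threshold=60.0))
```

Only `switch-1` satisfies both conditions here, so it alone would be passed on as a network device to be traced.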
Referring to FIG. 5, the embodiment of the present application further provides a network fault location device 500 applied to the electronic device 100. The electronic device 100 may be, but is not limited to, a monitoring server. As shown in FIG. 2, the electronic device 100 is communicatively connected to a plurality of network devices 200 and can monitor their operation states. The network devices 200 may be switches, routers, data processing servers, and so on. It should be noted that the basic principle and technical effects of the network fault location device 500 provided in this embodiment are the same as those of the above embodiment; for brevity, anything not mentioned in this embodiment can be found in the corresponding content of the above embodiment. The device 500 comprises an operation state analysis unit 501, an abnormal device determining unit 502, and a fault locating unit 503, wherein:
The operation state analysis unit 501 is configured to monitor the operation states of the plurality of network devices 200 at preset time intervals within the current target time period.
The abnormal device determining unit 502 is configured to determine, from the plurality of network devices 200, the network devices 200 to be traced whose operation state is abnormal at the same acquisition time point.
The fault locating unit 503 is configured to determine the target network device 200 in which the network fault occurred according to the abnormal network devices 200 to be traced and the abnormal network topology relationship established in advance from the historical target time period.
When executed, the network fault location device 500 may perform the following functions: monitoring the operation states of the plurality of network devices 200 at preset time intervals within the current target time period; determining, from the plurality of network devices 200, the network devices 200 to be traced whose operation state is abnormal at the same acquisition time point; and determining the target network device 200 in which the network fault occurred according to the abnormal network devices 200 to be traced and the abnormal network topology relationship established in advance from the historical target time period. The target network device 200 is thereby located automatically, without manual participation, which saves time and labor costs.
Optionally, as shown in FIG. 6, the fault locating unit 503 includes a relationship mapping module 601, a device determining module 602, a probability determining module 603, a target updating module 604, a process returning module 605, and a fault locating module 606.
The relationship mapping module 601 is configured to map the abnormal network device 200 to be traced into the abnormal network topology relationship.
The device determining module 602 is configured to determine, from the abnormal network topology relationship, the abnormal network devices 200 adjacent to the abnormal network device 200 to be traced.
The probability determining module 603 is configured to determine the probability that the abnormality of the network device 200 to be traced was caused by each adjacent network device 200.
The probability determining module 603 is specifically configured to determine this probability according to a formula [formula not reproduced in the source text], where S_{B|A} is the number of network devices 200 adjacent to the abnormal network device 200 to be traced; S_A is the number of abnormal network devices 200 to be traced; α is a preset condition threshold; and P(B|A) is the resulting probability value.
The target updating module 604 is configured to determine the adjacent network device 200 to be a new abnormal network device 200 to be traced when the probability is greater than a preset threshold.
The process returning module 605 is configured to return to the step of determining the probability that the abnormality of the network device 200 to be traced was caused by an adjacent network device 200, until the probability is smaller than the preset threshold.
The fault locating module 606 is configured to take all the network devices 200 newly determined as abnormal network devices 200 to be traced as the target network devices 200 in which the network fault occurred.
The abnormal device determining unit 502 is specifically configured to screen the abnormal network devices 200 to be traced out of the running network devices 200 according to a preset abnormality filtering algorithm.
For example, the abnormal device determining unit 502 is configured to screen out, from the running network devices 200, the network devices 200 whose data processing amount at the same acquisition time point is greater than a preset first threshold and whose survival time is less than a preset second threshold, and to determine them to be the abnormal network devices 200 to be traced.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application. Referring to FIG. 7, at the hardware level, the electronic device includes a processor and, optionally, an internal bus, a network interface, and a memory. The memory may include internal memory, such as random-access memory (RAM), and may further include non-volatile memory, such as at least one disk memory. The electronic device may of course also include hardware required for other services.
The processor, network interface, and memory may be interconnected by the internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. Buses may be classified as address buses, data buses, control buses, and so on. For ease of illustration, only one bidirectional arrow is shown in FIG. 7, but this does not mean that there is only one bus or only one type of bus.
The memory is used for storing programs. Specifically, a program may include program code, and the program code includes computer operating instructions. The memory may include internal memory and non-volatile storage, and provides instructions and data to the processor.
The processor reads the corresponding computer program from the non-volatile storage into the internal memory and then runs it, forming the network fault location device at the logical level. The processor executes the programs stored in the memory and is specifically configured to perform the following operations:
monitoring the operation states of a plurality of network devices at preset time intervals within a current target time period;
determining, from the plurality of network devices, the network devices to be traced whose operation state is abnormal at the same acquisition time point;
and determining the target network device in which the network fault occurred according to the abnormal network devices to be traced and an abnormal network topology relationship established in advance from a historical target time period.
The method performed by the network fault location device disclosed in the embodiment shown in FIG. 1 of the present application may be applied to, or implemented by, a processor. The processor may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the embodiments of the present application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory, and the processor reads the information in the memory and, in combination with its hardware, performs the steps of the above method.
The electronic device may also execute the method of FIG. 1 and implement the functions of the network fault location device in the embodiment shown in FIG. 1, which are not described again here.
Of course, other implementations, such as a logic device or a combination of hardware and software, are not excluded from the electronic device of the present application, that is, the execution subject of the following processing flow is not limited to each logic unit, but may be hardware or a logic device.
The present embodiments also provide a computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a portable electronic device comprising a plurality of application programs, enable the portable electronic device to perform the method of the embodiment of fig. 1, and in particular to:
monitoring the operation states of a plurality of network devices at preset time intervals within a current target time period;
determining, from the plurality of network devices, the network devices to be traced whose operation state is abnormal at the same acquisition time point;
and determining the target network device in which the network fault occurred according to the abnormal network devices to be traced and an abnormal network topology relationship established in advance from a historical target time period.
In summary, the foregoing description is only a preferred embodiment of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present application should be included in the protection scope of the present application.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
In this specification, the embodiments are described in a progressive manner. Identical and similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the corresponding parts of the description of the method embodiments.

Claims (8)

1. A method for locating a network fault, the method comprising:
monitoring the running states of a plurality of network devices at preset time intervals within a current target time period;
determining, from the plurality of network devices, the network devices to be traced whose running states are abnormal at the same acquisition time point;
determining a target network device in which a network fault occurs, according to the abnormal network devices to be traced and an abnormal network topology relationship established in advance from a historical target time period;
wherein determining the target network device in which the network fault occurs, according to the abnormal network devices to be traced and the abnormal network topology relationship established in advance from the historical target time period, comprises:
mapping the abnormal network devices to be traced onto the abnormal network topology relationship;
determining, from the abnormal network topology relationship, the abnormal network devices adjacent to the abnormal network device to be traced;
determining the probability that the abnormality of the network device to be traced is caused by the adjacent network device;
if the probability is greater than a preset threshold, determining the adjacent network device as a new abnormal network device to be traced;
returning to the step of determining the probability that the abnormality of the network device to be traced is caused by the adjacent network device, until the probability is smaller than the preset threshold;
and taking all network devices determined as new abnormal network devices to be traced as the target network devices in which the network fault occurs.
2. The method of claim 1, wherein determining the probability that the abnormality of the network device to be traced is caused by the adjacent network device comprises:
determining, according to the formula, the probability that the abnormality of the network device to be traced is caused by the adjacent network device, wherein $S_{B|A}$ is the number of abnormal network devices adjacent to the abnormal network device to be traced; $S_A$ is the number of abnormal network devices to be traced; $\alpha$ is a preset condition threshold; and $P(B|A)$ is the probability value.
3. The method of claim 1, wherein determining the abnormal network devices to be traced comprises: screening the abnormal network devices to be traced out of the running network devices according to a preset anomaly filtering algorithm.
4. The method of claim 3, wherein screening the abnormal network devices to be traced out of the running network devices according to the preset anomaly filtering algorithm comprises:
screening out, from the running network devices, the network devices whose data processing amount is greater than a preset first threshold and whose survival time is less than a preset second threshold at the same acquisition time point, and determining them as the abnormal network devices to be traced.
5. A network fault locating apparatus, the apparatus comprising:
an operation state analysis unit configured to monitor the running states of a plurality of network devices at preset time intervals within a current target time period;
an abnormal device determining unit configured to determine, from the plurality of network devices, the network devices to be traced whose running states are abnormal at the same acquisition time point;
a fault locating unit configured to determine a target network device in which a network fault occurs, according to the abnormal network devices to be traced and an abnormal network topology relationship established in advance from a historical target time period; wherein the abnormal network topology relationship is established based on the conditional probability that an abnormality of one network device is caused by another network device, computed over network devices historically found abnormal at the same acquisition time point;
wherein the fault locating unit comprises:
a relation mapping module configured to map the abnormal network devices to be traced onto the abnormal network topology relationship;
a device determining module configured to determine, from the abnormal network topology relationship, the abnormal network devices adjacent to the abnormal network device to be traced;
a probability determining module configured to determine the probability that the abnormality of the network device to be traced is caused by the adjacent network device;
a target updating module configured to determine the adjacent network device as a new abnormal network device to be traced if the probability is greater than a preset threshold;
a process returning module configured to return to the step of determining the probability that the abnormality of the network device to be traced is caused by the adjacent network device, until the probability is smaller than the preset threshold;
and a fault locating module configured to take all network devices determined as new abnormal network devices to be traced as the target network devices in which the network fault occurs.
6. The apparatus of claim 5, wherein the probability determining module is specifically configured to determine, according to the formula, the probability that the abnormality of the network device to be traced is caused by the adjacent network device, wherein $S_{B|A}$ is the number of abnormal network devices adjacent to the abnormal network device to be traced; $S_A$ is the number of abnormal network devices to be traced; $\alpha$ is a preset condition threshold; and $P(B|A)$ is the probability value.
7. The apparatus of claim 5, wherein the abnormal device determining unit is specifically configured to screen the abnormal network devices to be traced out of the running network devices according to a preset anomaly filtering algorithm.
8. The apparatus of claim 7, wherein the abnormal device determining unit is specifically configured to screen out, from the running network devices, the network devices whose data processing amount is greater than a preset first threshold and whose survival time is less than a preset second threshold at the same acquisition time point, and to determine them as the abnormal network devices to be traced.
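The trace-back loop recited in claim 1 can be illustrated with a minimal Python sketch. The formula of claim 2 is not reproduced in the text, so the sketch makes a loudly labelled assumption: the probability that an adjacent device caused the abnormality is estimated as the share of historical acquisition time points with the traced device abnormal at which the adjacent device was abnormal too. The abnormal network topology relationship is assumed to be an adjacency dictionary, and the history a list of sets of abnormal devices, one set per time point. All names (`cause_probability`, `trace_back`, `adjacency`, `history`) are hypothetical.

```python
from collections import deque


def cause_probability(history, traced, neighbor):
    """ASSUMPTION, not the claimed formula: co-occurrence ratio.

    Among historical time points where `traced` was abnormal, return the
    fraction at which `neighbor` was abnormal as well.
    """
    with_traced = [abnormal for abnormal in history if traced in abnormal]
    if not with_traced:
        return 0.0
    both = sum(neighbor in abnormal for abnormal in with_traced)
    return both / len(with_traced)


def trace_back(start_devices, adjacency, history, threshold):
    """Follow adjacent devices while the cause probability exceeds the threshold.

    Returns every device that was promoted to "new device to be traced",
    in the order of discovery.
    """
    targets = []
    seen = set(start_devices)
    queue = deque(start_devices)
    while queue:
        traced = queue.popleft()
        for neighbor in adjacency.get(traced, ()):
            if neighbor in seen:
                continue  # each device is traced at most once
            if cause_probability(history, traced, neighbor) > threshold:
                seen.add(neighbor)
                targets.append(neighbor)
                queue.append(neighbor)
    return targets
```

The loop stops along a path as soon as no adjacent device exceeds the threshold, which mirrors the termination condition of the claim. Tracking visited devices is a design choice of the sketch that prevents endless cycling when the topology contains loops.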
CN202010202402.5A 2020-03-20 2020-03-20 Network fault positioning method and device Active CN113497721B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010202402.5A CN113497721B (en) 2020-03-20 2020-03-20 Network fault positioning method and device

Publications (2)

Publication Number Publication Date
CN113497721A CN113497721A (en) 2021-10-12
CN113497721B true CN113497721B (en) 2023-08-01

Family

ID=77994361

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010202402.5A Active CN113497721B (en) 2020-03-20 2020-03-20 Network fault positioning method and device

Country Status (1)

Country Link
CN (1) CN113497721B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115857461B (en) * 2023-03-02 2023-05-09 东莞正大康地饲料有限公司 Online monitoring method and system for production of premixed feed for piglets

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010134862A (en) * 2008-12-08 2010-06-17 Nec Corp Log analysis system, method, and program
CN103986604A (en) * 2014-05-23 2014-08-13 华为技术有限公司 Method and device for locating network fault
CN108600009A (en) * 2018-04-25 2018-09-28 北京思特奇信息技术股份有限公司 A kind of network alarm root localization method based on alarm data analysis
CN109298704A (en) * 2018-08-31 2019-02-01 江苏方天电力技术有限公司 A kind of industrial failure path retroactive method and system based on Bayesian network

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10015089B1 (en) * 2016-04-26 2018-07-03 Sprint Communications Company L.P. Enhanced node B (eNB) backhaul network topology mapping
CN106656588A (en) * 2016-12-12 2017-05-10 国网北京市电力公司 Fault locating method and device for intelligent substation
CN108306748B (en) * 2017-01-12 2021-03-30 阿里巴巴集团控股有限公司 Network fault positioning method and device and interaction device
CN108737147B (en) * 2017-04-25 2021-09-03 中国移动通信集团广东有限公司 Network alarm event processing method and device
US10802942B2 (en) * 2018-12-28 2020-10-13 Intel Corporation Methods and apparatus to detect anomalies of a monitored system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Research on algorithm for peer-to-peer overlay network topology automatic-restoration; Bie Zhi, Wang Fan; 2016 First IEEE ICCCI; full text *
Behavior correlation analysis of complex network attacks; Chen Ruidong; China Excellent Master's Theses Database; full text *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant