CN114172796B - Fault positioning method and related device for communication network - Google Patents

Publication number
CN114172796B
Authority
CN
China
Prior art keywords
network
switches
flow
switch
index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111603641.2A
Other languages
Chinese (zh)
Other versions
CN114172796A (en)
Inventor
霍江游
张勇
李骢
许广洋
徐晨灿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202111603641.2A
Publication of CN114172796A
Application granted
Publication of CN114172796B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0677 Localisation of faults
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements
    • H04L49/55 Prevention, detection or correction of errors
    • H04L49/555 Error detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present application provides a fault locating method and a related device for a communication network, which can be used in the field of information security or in other fields. In the technical solution provided by the application, the communication network comprises a first end-side device, a second end-side device, and N switches, where the first end-side device and the second end-side device transmit a data flow through the N switches and N is a positive integer. The network traffic of the data flow is acquired as it passes through each of S switches among the N switches, where S is a positive integer less than or equal to N. P switches at which the network traffic of the data flow is abnormal are determined according to the network traffic of the data flow at each of the S switches, where P is a positive integer less than or equal to S. The network fault in the communication network is then located according to the position of each of the P switches in the communication network and the network traffic information of each of the P switches.

Description

Fault positioning method and related device for communication network
Technical Field
The present disclosure relates to the field of information security technologies, and in particular, to a fault locating method and related device for a communication network.
Background
The essence of a network is communication: it provides a channel that ensures information is sent from the source to the destination without errors. A network is composed of nodes and the links connecting them, and a node may be a computer, a switch, a router, or a mobile terminal. While the network is running, nodes at each level can automatically switch paths and establish data exchange channels, but network faults such as packet loss or delay can occur in the process. To keep the network running normally, faults must be handled in time, and the first step in resolving a network fault is to locate it.
Therefore, how to locate the failure of the communication network is a problem to be solved.
Disclosure of Invention
The present application provides a fault locating method and a related device for a communication network, which enable real-time locating of network faults.
In a first aspect, the present application provides a fault locating method for a communication network, where the communication network includes a first end-side device, a second end-side device, and N switches, the first end-side device and the second end-side device transmit a data flow through the N switches, and N is a positive integer. The method includes: acquiring the network traffic of the data flow as it passes through each of S switches among the N switches, where S is a positive integer less than or equal to N; determining, according to the network traffic of the data flow at each of the S switches, P switches at which the network traffic of the data flow is abnormal, where P is a positive integer less than or equal to S; and locating a network fault in the communication network according to the position of each of the P switches in the communication network and the network traffic information of the data flow at each of the P switches, where the network traffic information includes the value of each index of at least one communication index.
In this method, the first end-side device and the second end-side device in the communication network transmit a data flow through N switches. P switches at which the network traffic of the data flow is abnormal are determined from the network traffic acquired as the data flow passes through each of S switches among the N switches, and the network fault is located according to the position of each of the P switches in the communication network and the network traffic information of each of the P switches, where N, S, and P are positive integers and N ≥ S ≥ P. This solves the problem of locating network faults in the communication network. In addition, as soon as the acquired network traffic shows that an abnormal switch exists among the switches the data flow passes through, the fault locating procedure is triggered, which improves the efficiency and timeliness of fault locating in the communication network.
In one possible implementation, acquiring the network traffic of the data flow as it passes through each of the S switches among the N switches includes: acquiring the network traffic collected by each of S traffic probes to obtain the network traffic of the data flow at each of the S switches, where the S traffic probes are in one-to-one correspondence with the S switches and each traffic probe collects the network traffic passing through its corresponding switch.
In this implementation, S traffic probes in one-to-one correspondence with the S switches collect the network traffic of the data flow at each of the S switches, which improves the efficiency of obtaining the network traffic of the data flow at each switch.
In one possible implementation, determining the P switches at which the network traffic of the data flow is abnormal, and the network traffic information of each of the P switches, according to the network traffic of the data flow at each of the S switches includes: analyzing the network traffic of the data flow at each of the S switches to obtain the network traffic information of the data flow at each of the S switches; and evaluating that network traffic information against a preset network traffic evaluation standard to obtain, among the S switches, the P switches at which the network traffic of the data flow is abnormal and the network traffic information of each of the P switches, where the network traffic evaluation standard indicates the health value of each index of the at least one communication index.
In this implementation, the network traffic acquired at each switch is analyzed to obtain the network traffic information of the data flow at that switch, and this information is evaluated against a preset network traffic evaluation standard that indicates the health value of each index of at least one communication index. This yields the P abnormal switches and their network traffic information, and improves the accuracy of identifying switches at which the network traffic of the data flow is abnormal.
In one possible implementation, locating the network fault in the communication network according to the position of each of the P switches in the communication network and the network traffic information of each of the P switches includes: when P is greater than 2, if a first value of a first index in first network traffic information corresponding to a first switch among the P switches is not equal to a second value of the first index in second network traffic information corresponding to a second switch, and the first index is an abnormal index, determining that the network fault corresponding to the first index occurs on the transmission path between the first switch and the second switch.
In this implementation, when more than 2 switches show abnormal network traffic for the data flow, and the value of an abnormal first index differs between the first switch and the second switch among those switches, the network fault corresponding to the first index is determined to occur on the transmission path between the first switch and the second switch, which improves the accuracy of fault locating in the communication network.
In one possible implementation, locating the network fault in the communication network according to the position of each of the P switches in the communication network and the network traffic information of each of the P switches includes: if a third switch among the P switches is the switch closest to the first end-side device among the S switches, determining that the network fault corresponding to a second index occurs on the transmission path before the third switch, where the second index is the abnormal index in the network traffic information corresponding to the third switch.
In this implementation, if a third switch among the abnormal switches is, of all the switches whose network traffic is acquired, the one closest to the first end-side device, the network fault corresponding to the second index is determined to occur on the transmission path before the third switch, where the second index is the abnormal index in the network traffic information corresponding to the third switch. This improves the accuracy of fault locating in the communication network.
In one possible implementation, the data flow is one of a plurality of network flows in the communication network, and accordingly the method further includes: if the faulty transmission paths corresponding to all of the data flows in the plurality of network flows contain the same communication device, determining that this communication device is faulty.
In this implementation, when the data flow is one of a plurality of network flows in the communication network and the faulty transmission paths of all those data flows contain the same communication device, that device is determined to be faulty, which improves the accuracy of fault locating in the communication network.
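The shared-device rule above amounts to a set intersection over the faulty path segments. The following is a minimal sketch under that reading; the function name `common_faulty_devices` and the device labels are hypothetical and do not come from the application.

```python
from typing import Iterable, List, Set


def common_faulty_devices(faulty_paths: Iterable[List[str]]) -> Set[str]:
    """Return the devices that appear on every faulty path segment."""
    paths = [set(path) for path in faulty_paths]
    if not paths:
        return set()
    return set.intersection(*paths)


# Hypothetical faulty segments, one per data flow (labels are illustrative).
segments = [
    ["sw1", "sw2", "sw3"],
    ["sw2", "sw4"],
    ["sw5", "sw2"],
]
suspects = common_faulty_devices(segments)
print(suspects)
```

Here only `sw2` lies on all three faulty segments, so it would be the device flagged under this interpretation.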
In one possible implementation, the at least one communication index includes one or more of the following: traffic volume, number of concurrent connections, link establishment ratio, connection no-response rate, connection failure rate, link termination ratio, retransmission rate, packet loss rate, delay, response time, response rate, or server zero-window count.
In a second aspect, the present application provides a fault locating device for a communication network, where the communication network includes a first end-side device, a second end-side device, and N switches, the first end-side device and the second end-side device transmit a data flow through the N switches, and N is a positive integer. The device includes: an acquisition module, configured to acquire the network traffic of the data flow as it passes through each of S switches among the N switches, where S is a positive integer less than or equal to N; a determining module, configured to determine, according to the network traffic of the data flow at each of the S switches, P switches at which the network traffic of the data flow is abnormal, where P is a positive integer less than or equal to S; and a locating module, configured to locate a network fault in the communication network according to the position of each of the P switches in the communication network and the network traffic information of each of the P switches, where the network traffic information includes the value of each index of at least one communication index.
In one possible implementation, the acquisition module is specifically configured to: acquire the network traffic collected by each of S traffic probes to obtain the network traffic of the data flow at each of the S switches, where the S traffic probes are in one-to-one correspondence with the S switches and each traffic probe collects the network traffic passing through its corresponding switch.
In one possible implementation, the determining module is specifically configured to: analyze the network traffic of the data flow at each of the S switches to obtain the network traffic information of the data flow at each of the S switches; and evaluate that network traffic information against a preset network traffic evaluation standard to obtain, among the S switches, the P switches at which the network traffic of the data flow is abnormal and the network traffic information of each of the P switches, where the network traffic evaluation standard indicates the health value of each index of the at least one communication index.
In one possible implementation, the locating module is specifically configured to: when P is greater than 2, if a first value of a first index in first network traffic information corresponding to a first switch among the P switches is not equal to a second value of the first index in second network traffic information corresponding to a second switch, and the first index is an abnormal index, determine that the network fault corresponding to the first index occurs on the transmission path between the first switch and the second switch.
In one possible implementation, the locating module is specifically configured to: if a third switch among the P switches is the switch closest to the first end-side device among the S switches, determine that the network fault corresponding to a second index occurs on the transmission path before the third switch, where the second index is the abnormal index in the network traffic information corresponding to the third switch.
In one possible implementation, the data flow is one of a plurality of network flows in the communication network, and correspondingly the locating module is further configured to: if the faulty transmission paths corresponding to all of the data flows in the plurality of network flows contain the same communication device, determine that this communication device is faulty.
In one possible implementation, the at least one communication index includes one or more of the following: traffic volume, number of concurrent connections, link establishment ratio, connection no-response rate, connection failure rate, link termination ratio, retransmission rate, packet loss rate, delay, response time, response rate, or server zero-window count.
The advantages of the second aspect and the various possible implementations of the second aspect may be referred to in the first aspect and the various possible implementations of the first aspect, and are not described here again.
In a third aspect, the present application provides a fault locating device for a communication network. The device may include a processor coupled to a memory, where the memory is configured to store program code and the processor is configured to execute the program code in the memory to implement the method of the first aspect or any one of its implementations.
Optionally, the apparatus may further comprise the memory.
In a fourth aspect, the present application provides a chip comprising at least one processor and a communication interface, the communication interface and the at least one processor being interconnected by a wire, the at least one processor being configured to execute a computer program or instructions to perform a method as described in the first aspect or any one of the possible implementations thereof.
In a fifth aspect, the present application provides a computer readable medium storing program code for execution by a device, the program code comprising instructions for performing the method of the first aspect or any one of the possible implementations thereof.
In a sixth aspect, the present application provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method according to the first aspect or any one of the possible implementations thereof.
In a seventh aspect, the present application provides a computing device comprising at least one processor and a communication interface, the communication interface and the at least one processor being interconnected by a wire, the communication interface being in communication with a target system, the at least one processor being configured to execute a computer program or instructions to perform a method as described in the first aspect or any one of the possible implementations thereof.
In an eighth aspect, the present application provides a computing system comprising at least one processor and a communication interface, the communication interface and the at least one processor being interconnected by a line, the communication interface being in communication with a target system, the at least one processor being configured to execute a computer program or instructions to perform a method as described in the first aspect or any one of the possible implementations thereof.
Drawings
FIG. 1 is a schematic diagram of a system architecture according to an embodiment of the present application;
FIG. 2 is a flowchart of a fault locating method for a communication network according to an embodiment of the present application;
FIG. 3 is a flowchart of a fault locating method for a communication network according to an embodiment of the present application;
FIG. 4 is a schematic block diagram of a fault locating device for a communication network according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a fault locating device for a communication network according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art from the present disclosure without creative effort fall within the scope of the present disclosure.
It should be noted that the fault locating method and related device disclosed in the present application may be used in the field of information security or in any other field; the present application does not limit the field of application.
FIG. 1 is a schematic diagram of a system architecture according to an embodiment of the present application. As shown in FIG. 1, the network fault locating system 100 includes a communication network 110 and a network fault locating server 120.
The communication network 110 may include a first end-side device 111, a second end-side device 112, and switch 1, switch 2, …, switch N. The first end-side device 111 and the second end-side device 112 exchange data through the switches; there may be one or more switches, which is not limited in this application.
The communication network 110 may comprise further end-side devices, the first end-side device 111 and the second end-side device 112 being only any two of the plurality of end-side devices in the communication network 110. The end-side devices may be clients and servers. Typically, the first end-side device is a client and the second end-side device is a server.
The network fault location server 120 is a device for providing network fault location in a communication network. The network fault location server 120 may be a blade server, a rack-mounted server, or the like, and the network fault location server 120 may also be a server cluster deployed in the cloud, which is not limited in this application.
It is to be understood that the system architecture shown in FIG. 1 is only one example of the network fault locating system provided herein. In other embodiments of the present application, the network fault locating system 100 may include more or fewer components than shown, combine certain components, split certain components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of the two.
FIG. 2 is a flowchart of a fault locating method for a communication network according to an embodiment of the present application. As shown in FIG. 2, the method includes at least S201 to S203.
S201: Acquire the network traffic of the data flow as it passes through each of S switches among the N switches, where S is a positive integer less than or equal to N.
The communication network comprises a first end-side device, a second end-side device, and X switches. The first end-side device and the second end-side device transmit the data flow through N of the X switches, where N is a positive integer and X is a positive integer greater than or equal to N.
The communication network may include more end-side devices; the first end-side device and the second end-side device are simply any two of them. An end-side device may be a client or a server.
When S is equal to N, the network traffic of the data flow is acquired at every switch it passes through; when S is smaller than N, the network traffic is acquired at only some of the switches the data flow passes through.
The network traffic of the data flow at each of the S switches among the N switches can be acquired in the following ways:
In one possible implementation, S traffic probes are deployed in the communication network in one-to-one correspondence with the S switches. Each traffic probe collects the network traffic passing through its corresponding switch, and the network traffic collected by the S probes gives the network traffic of the data flow at each of the S switches.
Illustratively, the traffic probe may be a sniffer probe.
In another possible implementation, the network traffic of the data flow at each of the S switches is collected through the telemetry function of the S switches in the communication network.
Illustratively, the telemetry function of each of the S switches in the communication network is enabled, and the switches are used together to collect the traffic exchanged between servers.
In yet another possible implementation, the communication traffic between servers and within servers is collected by a Transmission Control Protocol (TCP) collection script running on the servers.
The server executes the TCP collection script at a preset period and, at a random time within a time window, reports the network traffic of the data flow to the collector corresponding to each of the S switches over the Hypertext Transfer Protocol (HTTP).
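The periodic, randomized reporting described above can be illustrated by computing the report times only; the collection itself and the HTTP upload are omitted. This is a minimal sketch, and `report_times`, the 60-second period, and the 10-second window are assumptions chosen for illustration, not values from the application.

```python
import random
from typing import List


def report_times(period_s: float, window_s: float, cycles: int,
                 rng: random.Random) -> List[float]:
    """Return one report time per cycle: cycle start plus a random offset in the window."""
    if window_s > period_s:
        raise ValueError("the reporting window must fit inside the period")
    return [i * period_s + rng.uniform(0.0, window_s) for i in range(cycles)]


rng = random.Random(0)  # seeded only so that the sketch is repeatable
times = report_times(period_s=60.0, window_s=10.0, cycles=3, rng=rng)
print(times)
```

Spreading the reports across a window in this way is one plausible reason for the random timing: it keeps many servers from hitting the collectors at the same instant.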
S202: Determine, according to the network traffic of the data flow at each of the S switches, P switches at which the network traffic of the data flow is abnormal, where P is a positive integer less than or equal to S.
A switch at which the network traffic of the data flow is abnormal is a switch whose acquired network traffic shows a network fault such as packet loss or delay.
In one possible implementation, the network traffic of the data flow at each of the S switches is analyzed to obtain the network traffic information of the data flow at each of the S switches. That information is then evaluated against a preset network traffic evaluation standard to obtain the P switches among the S switches at which the network traffic of the data flow is abnormal, together with the network traffic information of each of the P switches, where the network traffic evaluation standard indicates the health value of each index of at least one communication index.
As one example, the network traffic information includes the value of each index of the at least one communication index, and the network traffic evaluation standard indicates the health value of each of those indexes. The communication indexes may include traffic volume, number of concurrent connections, link establishment ratio, connection no-response rate, connection failure rate, link termination ratio, retransmission rate, packet loss rate, delay, response time, response rate, server zero-window count, and so on.
As an example, based on the multiple acquisition sources and on how network faults arise, 12 indexes in 4 dimensions (business trend, link establishment/teardown, transmission performance, and load interaction) are selected to build a traffic health index system. Data from each acquisition mode is abstracted and converted according to this index system, which masks the differences between data sources and decouples data acquisition from the upper-layer analysis functions.
In the traffic health index system, the business trend dimension comprises two indexes: traffic volume and number of concurrent connections. The link establishment/teardown dimension comprises four indexes: link establishment ratio, connection no-response rate, connection failure rate, and link termination ratio. The transmission performance dimension comprises three indexes: retransmission rate, packet loss rate, and delay. The load interaction dimension comprises three indexes: response time, response rate, and server zero-window count.
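The four dimensions and twelve indexes listed above can be written down as a simple lookup table. The sketch below is illustrative only: the identifier spellings (for example `server_zero_window_count`) are hypothetical renderings of the translated index names, and `dimension_of` is a helper introduced here, not part of the application.

```python
from typing import Dict, List, Optional

# Hypothetical identifiers for the 4 dimensions and 12 indexes named in the text.
HEALTH_INDEX_SYSTEM: Dict[str, List[str]] = {
    "business_trend": ["traffic_volume", "concurrent_connections"],
    "link_establishment_teardown": [
        "link_establishment_ratio",
        "connection_no_response_rate",
        "connection_failure_rate",
        "link_termination_ratio",
    ],
    "transmission_performance": ["retransmission_rate", "packet_loss_rate", "delay"],
    "load_interaction": ["response_time", "response_rate", "server_zero_window_count"],
}


def dimension_of(index_name: str) -> Optional[str]:
    """Return the dimension an index belongs to, or None if it is unknown."""
    for dimension, indexes in HEALTH_INDEX_SYSTEM.items():
        if index_name in indexes:
            return dimension
    return None


total_indexes = sum(len(indexes) for indexes in HEALTH_INDEX_SYSTEM.values())
print(total_indexes, dimension_of("packet_loss_rate"))
```

Converting every acquisition source (probe, telemetry, TCP script) into records keyed by these index names is one way to realize the decoupling between acquisition and analysis that the text describes.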
Once the index system is built, the network traffic can be analyzed with an intelligent algorithm. The algorithm learns an evaluation standard for network traffic health from a large amount of monitoring data and, combined with expert rules, can identify network traffic anomalies with high accuracy. The system can also inspect all network traffic in real time, detect anomalies, and output prompts that help with troubleshooting.
Illustratively, the intelligent algorithm analyzes a large amount of monitored network traffic data to obtain the network traffic evaluation standard, which indicates the health value of each of at least one index in the traffic health index system.
As an example, evaluating the network traffic information of the data flow at each of the S switches against the preset network traffic evaluation standard to obtain the P abnormal switches includes: comparing the actual value of each communication index in the network traffic information of the data flow at each of the S switches with the health value of the same index in the preset network traffic evaluation standard, and recording a switch as one at which the network traffic of the data flow is abnormal when the actual value of a communication index in its network traffic information is smaller than the corresponding health value.
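The comparison step can be sketched as a per-switch, per-index check. The sketch follows the rule exactly as stated above (an index is abnormal when its actual value is below the health value); the function name `abnormal_switches`, the switch labels, and the numbers are hypothetical.

```python
from typing import Dict, List

Metrics = Dict[str, float]


def abnormal_switches(observed: Dict[str, Metrics], health: Metrics) -> Dict[str, List[str]]:
    """Map each abnormal switch to the indexes whose actual value is below the health value."""
    result: Dict[str, List[str]] = {}
    for switch, metrics in observed.items():
        bad = [name for name, value in metrics.items()
               if name in health and value < health[name]]
        if bad:
            result[switch] = bad
    return result


# Hypothetical health value and observations for a single index.
health_values = {"response_rate": 0.95}
observations = {
    "sw1": {"response_rate": 0.99},
    "sw2": {"response_rate": 0.80},
    "sw3": {"response_rate": 0.95},
}
flagged = abnormal_switches(observations, health_values)
print(flagged)
```

The direction of the comparison is taken from the text as written; for an index where larger values indicate worse health, an implementation would presumably need a per-index comparison direction, which is an assumption beyond what the application states here.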
S203, positioning network faults in the communication network according to the position of each switch in the communication network and the network flow information of the data flow in each switch in the P switches, wherein the network flow information comprises the value of each index in at least one communication index.
In one possible implementation, when P is greater than 2, that is, when more than two of the switches traversed by the data flow between the first end-side device and the second end-side device show abnormal network traffic information, if the first value of a first index in the first network traffic information corresponding to a first switch among the P switches is not equal to the second value of the first index in the second network traffic information corresponding to a second switch, and the first index is the index in which the abnormality occurs, it is determined that the network fault corresponding to the first index occurs on the transmission path between the first switch and the second switch.
That the network fault corresponding to the first index occurs on the transmission path between the first switch and the second switch means that one or more communication devices or links on that transmission path have the network fault corresponding to the first index; the transmission path includes the first switch and the second switch themselves.
In another possible implementation, if a third switch among the P switches is the switch closest to the first end-side device among the S switches, it is determined that the network fault corresponding to a second index occurs on the transmission path before the third switch, where the second index is the index that is abnormal in the network traffic information corresponding to the third switch.
Here, the switch closest to the first end-side device means the switch nearest to the first end-side device along the transmission path of the data flow, not the switch nearest to it in physical distance.
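The rule for P greater than 2 can be sketched as a scan over the abnormal switches in path order. This is an illustrative sketch only: the function name `faulty_segments` is hypothetical, and pairing each switch with its immediate successor is an assumption, since the text only speaks of a first and a second switch.

```python
from typing import Dict, List, Tuple


def faulty_segments(ordered_switches: List[str],
                    index_values: Dict[str, int]) -> List[Tuple[str, str]]:
    """Return the switch pairs between which the value of the abnormal index changes."""
    # The text applies this rule only when P is greater than 2.
    if len(ordered_switches) <= 2:
        return []
    segments = []
    # Assumption: compare each switch with the next one along the path.
    for first, second in zip(ordered_switches, ordered_switches[1:]):
        if index_values[first] != index_values[second]:
            segments.append((first, second))
    return segments


# Made-up values of one abnormal index (e.g. packet loss) at P = 3 switches.
path = ["sw1", "sw2", "sw3"]
values = {"sw1": 0, "sw2": 0, "sw3": 1}
segments = faulty_segments(path, values)
```

A returned pair covers both switches and everything between them, in line with the statement that the transmission path includes the first switch and the second switch.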
As an example, suppose the data flow passes, in order, through a first end-side device, a first switch, a second switch, and a second end-side device, and the first index is packet loss. When the first value of the first index in the first network traffic information corresponding to the first switch is 0 and the second value of the first index in the second network traffic information corresponding to the second switch is 1, the data flow loses packets while flowing from the first switch to the second switch. When the first value and the second value are both 1, the data flow loses packets while flowing from the first end-side device to the first switch. When the first value and the second value are both 0 and the second end-side device reports packet loss, the data flow loses packets while flowing from the second switch to the second end-side device.
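The three cases of this example amount to a small decision table, sketched below. The function name `locate_packet_loss` and the returned strings are hypothetical; the 0/1 encoding of the packet loss index is taken from the example.

```python
from typing import Optional


def locate_packet_loss(first_value: int, second_value: int,
                       receiver_reports_loss: bool) -> Optional[str]:
    """Locate packet loss on the path: device 1 -> switch 1 -> switch 2 -> device 2."""
    if first_value == 0 and second_value == 1:
        return "first switch -> second switch"
    if first_value == 1 and second_value == 1:
        return "first end-side device -> first switch"
    if first_value == 0 and second_value == 0 and receiver_reports_loss:
        return "second switch -> second end-side device"
    # Combinations not covered by the example are left undecided.
    return None
```

For instance, `locate_packet_loss(0, 1, False)` returns `"first switch -> second switch"`, while a combination the example does not mention, such as `(1, 0)`, returns `None`.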
As another example, suppose the data flow passes, in order, through a first end-side device, a first switch, a second switch, and a second end-side device, and the first index is delay, for which the network traffic information contains two indices: link establishment delay and server response delay. When the link establishment delay and the server response delay in the first network traffic information corresponding to the first switch are both 0, and both are 1 in the second network traffic information corresponding to the second switch, the delay arises while the data flow travels from the first switch to the second switch. When both values are 1 at the first switch and both values are 1 at the second switch, the delay arises while the data flow travels from the first end-side device to the first switch. When both values are 0 at the first switch and both values are 0 at the second switch, the delay arises at the second end-side device.
As yet another example, suppose the data flow travels from the first end-side device to the second end-side device through a plurality of switches, but only the third network traffic information of a third switch among them is collected, and the second index is packet loss. When the value of client packet loss in the third network traffic information is 1, packet loss occurs while the data flow travels from the first end-side device to the third switch; when the value of server packet loss in the third network traffic information is 1, packet loss occurs while return packets travel from the second end-side device to the third switch.
As yet another example, suppose the data flow travels from the first end-side device to the second end-side device through a plurality of switches, but only the third network traffic information of a third switch among them is collected, and the second index is delay. When the value of the server response delay in the third network traffic information is 1 and the value of the link establishment delay is 0, the delay occurs on the first end-side device side; when the value of the server response delay and the value of the link establishment delay are both 1, the delay occurs on the second end-side device side.
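The two single-probe examples above can be written as one lookup over the third switch's network traffic information. This is a sketch that mirrors the stated rules one for one; the dictionary keys and the function name `single_probe_findings` are hypothetical.

```python
from typing import Dict, List


def single_probe_findings(info: Dict[str, int]) -> List[str]:
    """Interpret the traffic information of the only probed (third) switch."""
    findings = []
    # Packet loss rules: client-side and server-side loss are judged separately.
    if info.get("client_packet_loss") == 1:
        findings.append("packet loss: first end-side device -> third switch")
    if info.get("server_packet_loss") == 1:
        findings.append("packet loss: second end-side device -> third switch (return)")
    # Delay rules, exactly as stated in the example.
    server_delay = info.get("server_response_delay", 0)
    link_delay = info.get("link_setup_delay", 0)
    if server_delay == 1 and link_delay == 0:
        findings.append("delay: first end-side device side")
    elif server_delay == 1 and link_delay == 1:
        findings.append("delay: second end-side device side")
    return findings
```

Calling `single_probe_findings({"client_packet_loss": 1})` yields a single finding for the segment between the first end-side device and the third switch.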
In yet another possible implementation, the data flow is one of a plurality of network flows in the communication network, and if the faulty transmission paths corresponding to all the data flows among the plurality of network flows contain a same communication device, it is determined that this communication device has failed.
As an example, when the number of network flows in the communication network that experience packet loss within a time window exceeds a preset threshold, a large number of data flows have suffered concentrated packet loss in a short time. If the transmission paths on which packet loss occurs for all of these data flows contain a same communication device, it is determined that packet loss occurs at that communication device, which may be an access switch, an aggregation switch, a server, or the like.
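The shared-device rule can be sketched as a set intersection over the faulty paths observed in one time window. All names here (`common_faulty_devices`, the device labels, the event layout) are hypothetical, and the window handling is an assumption about how the threshold check might be applied.

```python
from typing import List, Set, Tuple


def common_faulty_devices(events: List[Tuple[float, List[str]]],
                          window_start: float, window_seconds: float,
                          threshold: int) -> Set[str]:
    """Return the devices shared by every faulty path seen inside the window."""
    in_window = [path for ts, path in events
                 if window_start <= ts < window_start + window_seconds]
    # The rule only applies when the number of lossy flows exceeds the threshold.
    if not in_window or len(in_window) <= threshold:
        return set()
    common = set(in_window[0])
    for path in in_window[1:]:
        common &= set(path)
    return common


# Made-up events: (timestamp in seconds, devices on the faulty path).
events = [
    (0.0, ["acc1", "agg1", "srvA"]),
    (1.0, ["acc2", "agg1", "srvB"]),
    (2.0, ["acc3", "agg1", "srvA"]),
    (100.0, ["acc4", "agg2", "srvC"]),  # outside the window below
]
suspects = common_faulty_devices(events, window_start=0.0,
                                 window_seconds=10.0, threshold=2)
```

In this made-up data the three flows inside the window share only the aggregation switch `agg1`, so it is the single suspect.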
According to the technical scheme, network faults in the communication network are positioned according to the position of the switch with abnormal network flow information in the switch through which the data flow flows in the communication network and the network flow information of each switch in the switch with abnormal network flow information, so that real-time automatic investigation of abnormal network faults in the communication network is realized, the efficiency of network fault positioning is improved, time is saved, and human resources are saved.
Fig. 3 is a flow chart of a fault locating method of a communication network according to an embodiment of the present application. As shown in fig. 3, the method includes at least S301 to S304.
S301, obtaining network flow when the data flow flows through each of S switches in the N switches, wherein S is a positive integer less than or equal to N.
It should be noted that S301 may refer to S201, and will not be described herein.
S302, detecting network flow when the data flow flows through each of the S switches to obtain network flow information when the data flow flows through each of the S switches.
As an example, the network traffic information includes a value of each of at least one communication indicator, which may include traffic, number of concurrent connections, set-up link ratio, connection no-answer ratio, connection failure ratio, termination link ratio, retransmission ratio, packet loss ratio, time delay, response time, response ratio, number of service 0 windows, and the like.
It should be noted that, the method for detecting the network traffic of the data flow flowing through the switch to obtain the network traffic information of the data flow flowing through the switch may refer to the existing method for obtaining the network traffic information according to the network traffic, which is not described herein.
S303, judging the network flow information when the data flow flows through each of the S switches based on a preset network flow evaluation standard, so as to obtain P switches with abnormal network flow of the data flow in the S switches and the network flow information of each of the P switches, wherein P is a positive integer less than or equal to S.
A switch with abnormal network traffic of the data flow is a switch whose collected network traffic exhibits a network fault such as packet loss or delay. The network traffic evaluation standard indicates a health value for each of the at least one communication index.
In one possible implementation, based on the multi-source nature of the acquisition modes and on network fault principles, 12 indices are selected across 4 dimensions: business trend, link establishment/teardown, transmission performance, and load interaction. These 12 indices form a traffic health index system. Data from each acquisition mode is abstracted and converted according to this index system, so that differences between data sources are masked and data acquisition is decoupled from the upper-layer analysis functions.
In the established traffic health index system, the business trend dimension comprises two indices: traffic and number of concurrent connections. The link establishment/teardown dimension comprises four indices: link establishment ratio, connection no-response rate, connection failure rate, and link termination ratio. The transmission performance dimension comprises three indices: retransmission rate, packet loss rate, and delay. The load interaction dimension comprises three indices: response time, response rate, and number of service 0 windows.
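The index system and the conversion of source-specific data into it can be sketched as follows. The identifiers (`HEALTH_INDEX_SYSTEM`, `normalize`, and all field names) are hypothetical; only the four dimensions and twelve indices come from the text.

```python
from typing import Dict, List

# Four dimensions and twelve indices, as listed in the text
# (the snake_case names are illustrative).
HEALTH_INDEX_SYSTEM: Dict[str, List[str]] = {
    "business_trend": ["traffic", "concurrent_connections"],
    "link_setup_teardown": ["link_setup_ratio", "connection_no_response_rate",
                            "connection_failure_rate", "link_termination_ratio"],
    "transmission_performance": ["retransmission_rate", "packet_loss_rate", "delay"],
    "load_interaction": ["response_time", "response_rate",
                         "service_zero_window_count"],
}

KNOWN_INDICES = {name for names in HEALTH_INDEX_SYSTEM.values() for name in names}


def normalize(raw: Dict[str, float], field_map: Dict[str, str]) -> Dict[str, float]:
    """Convert one acquisition mode's record into the common index names."""
    out = {}
    for source_field, value in raw.items():
        canonical = field_map.get(source_field, source_field)
        # Fields outside the index system are dropped, so upper-layer
        # analysis never sees source-specific data.
        if canonical in KNOWN_INDICES:
            out[canonical] = value
    return out


# Made-up record from one hypothetical acquisition mode.
record = normalize({"pkt_loss": 0.02, "rtt_ms": 12.0, "vendor_x": 1.0},
                   {"pkt_loss": "packet_loss_rate", "rtt_ms": "delay"})
```

Each acquisition mode would supply its own `field_map`, which is one way to keep data acquisition decoupled from the analysis that consumes the normalized records.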
After the characteristic index system is built, the network flow condition can be analyzed by using an intelligent algorithm, the intelligent algorithm forms an evaluation standard for the network flow health condition by learning a large amount of monitoring data, and the intelligent algorithm can identify the abnormality of the network flow with high accuracy by matching with expert rules; meanwhile, the system can detect the total network traffic in real time, find abnormality and output prompt information to help troubleshooting.
Illustratively, the intelligent algorithm is used to analyze the plurality of monitored network traffic data to obtain a network traffic evaluation criterion, where the network traffic evaluation criterion indicates a health value of each of at least one indicator in the traffic health indicator system.
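The text does not say which intelligent algorithm is used. As a stand-in only, the sketch below derives a health value per index from historical monitoring data as the mean minus `k` standard deviations; this statistical baseline, the function name `learn_health_values`, and the sample data are all assumptions and not the method of the present application.

```python
import statistics
from typing import Dict, List


def learn_health_values(history: Dict[str, List[float]],
                        k: float = 3.0) -> Dict[str, float]:
    """Derive a lower-bound health value per index from monitoring samples."""
    standard = {}
    for name, samples in history.items():
        mean = statistics.fmean(samples)
        spread = statistics.pstdev(samples)
        # Stand-in rule: values more than k standard deviations below
        # the historical mean are treated as unhealthy.
        standard[name] = mean - k * spread
    return standard


# Made-up monitoring history for one index.
standard = learn_health_values({"response_rate": [0.98, 0.99, 0.97, 0.98, 0.98]})
```

The resulting dictionary has the same shape as a network traffic evaluation standard that assigns one health value to each index.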
As an example, judging the network traffic information of the data flow at each of the S switches against a preset network traffic evaluation standard, so as to obtain the P switches among the S switches at which the network traffic of the data flow is abnormal, includes: comparing the actual value of each communication index in the network traffic information collected when the data flow passes through each of the S switches with the health value of the same communication index in the preset network traffic evaluation standard; and recording each switch whose network traffic information contains a communication index with an actual value smaller than the corresponding health value as a switch at which the network traffic of the data flow is abnormal.
S304, positioning network faults in the communication network according to the position of each switch in the communication network and the network flow information of the data flow of each switch in the P switches.
It should be noted that S304 may refer to S203, and will not be described herein.
According to the technical scheme, the index system is reasonably set, the network communication quality is analyzed, the traditional problems of access failure, slow access and the like are refined into a plurality of granularities such as abnormal session establishment, abnormal transmission delay and abnormal transmission packet loss, the requirements of different flow models and different technical stacks on network monitoring are met, the fault sensing capability is improved, and meanwhile, the fault positioning accuracy of the communication network is further improved.
Fig. 4 is a schematic structural diagram of a fault locating device of a communication network according to an embodiment of the present application. As shown in fig. 4, the apparatus 400 may include an acquisition module 401, a determination module 402, and a positioning module 403.
Any of the acquisition module, the determination module, and the positioning module in the embodiments of the present application may be implemented in whole or in part by software and/or hardware. The part implemented in software may run on a processor to realize the corresponding functions, and the part implemented in hardware may be a component of the processor.
The apparatus 400 may be used to implement the methods shown in fig. 2 or 3.
Fig. 5 is a schematic structural diagram of a fault locating device of a communication network according to an embodiment of the present application. The apparatus 500 shown in fig. 5 may be used to perform the method described in any of the previous embodiments.
As shown in fig. 5, the apparatus 500 of the present embodiment includes: memory 501, processor 502, communication interface 503, and bus 504. The memory 501, the processor 502, and the communication interface 503 are communicatively connected to each other via a bus 504.
The memory 501 may be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 501 may store a program; when the program stored in the memory 501 is executed by the processor 502, the processor 502 is configured to perform the steps of the method shown in fig. 2 or fig. 3.
The processor 502 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, configured to execute the relevant programs so as to implement the fault locating method of the communication network in the method embodiments of the present application.
The processor 502 may also be an integrated circuit chip with signal processing capabilities. In implementation, various steps of methods of various embodiments of the present application may be performed by integrated logic circuitry in hardware or by instructions in software in processor 502.
The processor 502 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The steps of a method disclosed in connection with the embodiments of the present application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as a random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory 501; the processor 502 reads information in the memory 501 and, in combination with its hardware, performs the functions necessary for carrying out the methods in the embodiments of the present application, for example the steps/functions of the embodiments shown in fig. 2 or fig. 3.
The communication interface 503 may use a transceiving apparatus such as, but not limited to, a transceiver to enable communication between the apparatus 500 and other devices or communication networks.
Bus 504 may include a path to transfer information between various components of apparatus 500 (e.g., memory 501, processor 502, communication interface 503).
It should be understood that the apparatus 500 shown in the embodiments of the present application may be an electronic device, or may be a chip configured in an electronic device.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the above embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired or wireless (e.g., infrared, microwave, etc.) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that contains one or more sets of available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a solid state disk.
It should be understood that the term "and/or" merely describes an association between associated objects and indicates that three relationships may exist. For example, A and/or B may mean: A alone, both A and B, or B alone, where A and B may each be singular or plural. In addition, the character "/" herein generally indicates an "or" relationship between the associated objects, but may also indicate an "and/or" relationship, depending on the context.
In the present application, "at least one" means one or more, and "a plurality" means two or more. "At least one of the following" or a similar expression means any combination of the listed items, including any combination of single items or plural items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or plural.
It should be understood that, in various embodiments of the present application, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic thereof, and should not constitute any limitation on the implementation process of the embodiments of the present application.
The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes and substitutions are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (9)

1. A fault location method for a communication network, wherein the communication network includes a first end-side device, a second end-side device, and N switches, and the first end-side device and the second end-side device transmit data streams through the N switches, where N is a positive integer, the method includes:
acquiring network flow when the data flow flows through each of S switches in the N switches, wherein S is a positive integer less than or equal to N;
determining P switches with abnormal network flow of the data flow according to the network flow when the data flow flows through each switch in the S switches, wherein P is a positive integer less than or equal to S;
positioning a network fault in the communication network according to the position of each switch in the communication network and the network flow information of the data flow in each switch in the P switches, wherein the network flow information comprises the value of each index in at least one communication index;
the obtaining network traffic when the data stream flows through each of the S switches of the N switches includes:
acquiring network flow acquired by each flow probe in S flow probes, and acquiring network flow when the data flow flows through each switch in the S switches, wherein the S flow probes are in one-to-one correspondence with the S switches, and each flow probe in the S flow probes is used for acquiring network flow flowing through a corresponding switch in the S switches;
the determining P switches with abnormal network traffic of the data flow according to the network traffic of the data flow flowing through each switch in the S switches, and the network traffic information of each switch in the P switches, includes:
detecting the network flow when the data flow flows through each of the S switches to obtain the network flow information when the data flow flows through each of the S switches;
judging the network flow information when the data flow flows through each switch in the S switches based on a preset network flow evaluation standard, and obtaining P switches with abnormal network flow of the data flow in the S switches and the network flow information of each switch in the P switches, wherein the network flow evaluation standard indicates the health value of each index in the at least one communication index.
2. The method of claim 1, wherein said locating network faults in said communications network based on the location of said each of said P switches in said communications network and network traffic information of said data flow at each of said P switches comprises:
when P is greater than 2, if the first value of a first index in the first network flow information corresponding to a first switch in the P switches is not equal to the second value of the first index in the second network flow information corresponding to a second switch, and the first index is the index in which the abnormality occurs, determining that the network fault corresponding to the first index occurs on the transmission path between the first switch and the second switch.
3. The method of claim 1, wherein said locating network faults in said communications network based on the location of said each of said P switches in said communications network and network traffic information of said data flow at each of said P switches comprises:
if a third switch in the P switches is the switch closest to the first end side device in the S switches, determining that a network fault corresponding to a second index occurs on a transmission path before the third switch, where the second index is an index of abnormality in network traffic information corresponding to the third switch.
4. The method of claim 1, wherein the data stream is one of a plurality of network streams in the communication network, and wherein the method further comprises:
if the faulty transmission paths corresponding to all the data streams in the plurality of network streams contain a same communication device, determining that the same communication device has failed.
5. The method of claim 1, wherein the at least one communication indicator comprises one or more of: flow, number of concurrent connections, rate of established links, rate of connection no response, rate of connection failure, rate of terminated links, retransmission rate, rate of packet loss, time delay, response time, rate of response, or number of service 0 windows.
6. A fault locating device of a communication network, wherein the communication network includes a first end-side device, a second end-side device, and N switches, the first end-side device and the second end-side device perform data stream transmission through the N switches, and N is a positive integer, the device includes:
the acquisition module is used for acquiring the network flow when the data flow flows through each of S switches in the N switches, wherein S is a positive integer less than or equal to N;
the determining module is used for determining P switches with abnormal network flow of the data flow according to the network flow when the data flow flows through each switch in the S switches, wherein P is a positive integer less than or equal to S;
the positioning module is used for positioning network faults in the communication network according to the position of each switch in the communication network and the network flow information of the data flow in each switch in the P switches, and the network flow information comprises the value of each index in at least one communication index;
the acquisition module is specifically configured to acquire network traffic acquired by each of S flow probes, so as to obtain network traffic when the data flow flows through each of the S switches, where the S flow probes are in one-to-one correspondence with the S switches, and each of the S flow probes is configured to acquire network traffic flowing through a corresponding switch of the S switches;
the positioning module is specifically configured to detect a network flow when the data flow flows through each switch of the S switches, so as to obtain network flow information when the data flow flows through each switch of the S switches; judging the network flow information when the data flow flows through each switch in the S switches based on a preset network flow evaluation standard, and obtaining P switches with abnormal network flow of the data flow in the S switches and the network flow information of each switch in the P switches, wherein the network flow evaluation standard indicates the health value of each index in the at least one communication index.
7. A fault locating device for a communication network, comprising: a memory and a processor;
the memory is used for storing program instructions;
the processor is configured to invoke program instructions in the memory to perform the method of any of claims 1 to 5.
8. A chip comprising at least one processor and a communication interface, the communication interface and the at least one processor being interconnected by wires, the at least one processor being configured to execute a computer program or instructions to perform the method of any of claims 1-5.
9. A computer readable medium, characterized in that the computer readable medium stores a program code for computer execution, the program code comprising instructions for performing the method according to any of claims 1 to 5.
CN202111603641.2A 2021-12-24 2021-12-24 Fault positioning method and related device for communication network Active CN114172796B (en)

Publications (2)

Publication Number Publication Date
CN114172796A CN114172796A (en) 2022-03-11
CN114172796B true CN114172796B (en) 2024-01-30

Family

ID=80488121

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant