CN117880055A

CN117880055A - Network fault diagnosis method, device, equipment and medium based on transmission layer index

Info

Publication number: CN117880055A
Application number: CN202410277601.0A
Authority: CN
Inventors: 苑志超; 铁智慧; 刘奎; 封兴东
Original assignee: Primate Intelligent Technology Hangzhou Co ltd
Current assignee: Primate Intelligent Technology Hangzhou Co ltd
Priority date: 2024-03-12
Filing date: 2024-03-12
Publication date: 2024-04-12
Anticipated expiration: 2044-03-12

Abstract

The invention relates to the technical field of artificial intelligence and provides a network fault diagnosis method, device, equipment and medium based on transport layer indexes. First, data packets are collected in real time by a probe deployed in bypass mode on the network side, so that normal communication over the original service connections is not affected. Second, each transport layer index is configured according to the transport layer network attributes, and candidate abnormal connections are detected in real time based on the configured indexes; because transport layer indexes more faithfully reflect the user experience at the terminal, such as stalling and high delay, anomaly detection is more reliable. Third, the fault root cause is predicted by a pre-constructed network fault diagnosis model, and fault handling measures are determined with the aid of a knowledge graph, so that root causes and handling measures are predicted automatically by artificial intelligence means and network anomalies are handled more efficiently.

Description

Network fault diagnosis method, device, equipment and medium based on transmission layer index

Technical Field

The present invention relates to the field of artificial intelligence technologies, and in particular, to a method, an apparatus, a device, and a medium for diagnosing a network failure based on a transmission layer index.

Background

With the wide application of technologies such as 5G (5th Generation Mobile Communication Technology), artificial intelligence, big data and cloud computing, and with the accelerating pace of digital transformation, a series of new business scenarios has emerged in vertical industries such as smart cities, smart healthcare, smart education and the industrial internet. These scenarios place higher requirements on network characteristics such as availability, reliability, delay and bandwidth, which conventional network capabilities and operation and maintenance modes can no longer meet.

Therefore, in 2019 the TM Forum (TeleManagement Forum) proposed the "autonomous network" project to comprehensively improve the Self-X capabilities of the network (i.e., self-monitoring/reporting, self-diagnosis, self-repair, self-optimization, etc.) and to provide a Zero-X experience (i.e., zero-wait, zero-touch, zero-fault, etc.) for vertical industries and consumer users.

However, the conventional network operation and maintenance mode lacks the means to discover network faults promptly, analyze their root causes automatically and resolve them. Faults are discovered only when customer complaints are received, after which the causes are analyzed manually based on past experience and remedial measures are taken. The main problems are as follows:

1) Poor real-time performance. The network is monitored by periodically sending detection packets or by statistical analysis of historical data, so network problems cannot be monitored and discovered in real time;

2) Narrow coverage. Sampling cannot capture all abnormal network conditions;

3) End user experience is not directly reflected. Methods such as sending network detection packets do not collect real-time network statistics on the end user's actual data connections, so they cannot directly reflect the real user experience;

4) Over-reliance on manual work and personal experience, which is inefficient.

Disclosure of Invention

In view of the foregoing, it is necessary to provide a method, an apparatus, a device, and a medium for diagnosing a network failure based on a transport layer index, which are aimed at solving the problems that the real-time performance of the network failure diagnosis is poor, the coverage area is narrow, and the cause and the treatment measures of the network failure cannot be automatically predicted.

A network failure diagnosis method based on a transport layer index, the network failure diagnosis method based on the transport layer index comprising:

responding to a fault diagnosis instruction of a target network, bypassing and deploying a probe on an access network side of the target network, and acquiring a target data packet in real time by utilizing the probe;

Configuring each transport layer index according to the transport layer network attribute, and calculating the index value of each transport layer index in real time according to the target data packet;

detecting candidate abnormal connection in real time according to index values of indexes of each transmission layer;

filtering instantaneous disturbance from the candidate abnormal connection to obtain a target abnormal connection, and carrying out network abnormal alarm according to the target abnormal connection;

acquiring an index value of each transmission layer index associated with the target abnormal connection in a preset time period before alarming as a target transmission layer index value, acquiring an index value of each key performance index associated with the target abnormal connection in the preset time period before alarming as a key performance index value, acquiring an index value of each key quality index associated with the target abnormal connection in the preset time period before alarming as a key quality index value, acquiring link detection data during alarming, and acquiring network equipment monitoring data associated with the target abnormal connection during alarming;

inputting the target transmission layer index value, the key performance index value, the key quality index value, the link detection data and the network equipment monitoring data into a pre-trained network fault diagnosis model to obtain a target fault root cause;

traversing a pre-constructed knowledge graph using the target fault root cause to obtain the fault handling measures corresponding to the target fault root cause.

According to a preferred embodiment of the present invention, deploying the probe in bypass mode on the access network side of the target network and collecting the target data packet in real time by using the probe includes:

acquiring an IP address, a port number and a protocol type corresponding to the target network;

the probe is deployed on network equipment at the access network side in a software mode, and the target data packet is obtained from a network card of the network equipment in real time according to the IP address, the port number and the protocol type; or alternatively

The probe is deployed on network routing and/or switching equipment at the access network side in a hardware server mode, and the target data packet is obtained from the network routing and/or switching equipment in real time according to the IP address, the port number and the protocol type; the hardware server performs optical fiber communication with the network route and/or the switching equipment, and the network route and/or the switching equipment mirrors the target data packet to the hardware server through an optical fiber.

According to a preferred embodiment of the present invention, the configuring each transport layer indicator according to the transport layer network attribute includes:

for each connection in the target network, acquiring a delay sequence of each connection, and carrying out segmentation processing on the delay sequence according to a preset time interval to obtain each delay period of the delay sequence;

calculating the average delay of each delay period according to the preset time interval;

acquiring a configured high delay threshold;

when the average delay of the delay period is detected to be larger than the high delay threshold value, accumulating the reporting times by 1;

acquiring the total reporting times obtained by final accumulation and the total number of the delay time periods;

calculating the quotient of the total reporting times and the total number to obtain a first configuration index;

acquiring a current reporting period;

calculating, for each connection, the proportion of the reporting period during which data packets are congested, to obtain a second configuration index;

acquiring the data receiving and transmitting rate, packet loss rate, delay jitter and delay;

and combining the first configuration index, the second configuration index, the data receiving and transmitting rate, the packet loss rate, the delay jitter and the delay to obtain each transmission layer index.

According to a preferred embodiment of the present invention, the detecting candidate abnormal connections in real time according to the index value of each transport layer index includes:

normalizing the index value of each transmission layer index to obtain the index value of the first configuration index as a first index value, the index value of the second configuration index as a second index value, the index value of the data receiving and transmitting rate as a third index value, the index value of the packet loss rate as a fourth index value, the index value of the delay jitter as a fifth index value and the index value of the delay as a sixth index value;

acquiring a first coefficient corresponding to the first configuration index, a second coefficient corresponding to the second configuration index, a third coefficient corresponding to the data receiving and transmitting rate, a fourth coefficient corresponding to the packet loss rate, a fifth coefficient corresponding to the delay jitter and a sixth coefficient corresponding to the delay;

calculating the product of the first index value corresponding to each connection and the first coefficient to obtain a first numerical value corresponding to each connection;

calculating the product of a second index value corresponding to each connection and the second coefficient to obtain a second numerical value corresponding to each connection;

Calculating the product of a third index value corresponding to each connection and the third coefficient to obtain a third numerical value corresponding to each connection;

calculating the product of a fourth index value corresponding to each connection and the fourth coefficient to obtain a fourth numerical value corresponding to each connection;

calculating the product of a fifth index value corresponding to each connection and the fifth coefficient to obtain a fifth numerical value corresponding to each connection;

calculating the product of a sixth index value corresponding to each connection and the sixth coefficient to obtain a sixth numerical value corresponding to each connection;

calculating the sum of a first value, a second value, a third value, a fourth value, a fifth value and a sixth value corresponding to each connection to obtain a connection quality quantized value corresponding to each connection;

acquiring a connection quality threshold;

when the connection quality quantized value corresponding to the detected connection is larger than the connection quality threshold, determining the detected connection as the candidate abnormal connection;

wherein the sum of the first coefficient, the second coefficient, the third coefficient, the fourth coefficient, the fifth coefficient and the sixth coefficient is 1.

According to a preferred embodiment of the present invention, the filtering the transient disturbance from the candidate abnormal connection to obtain the target abnormal connection includes:

detecting, as an abnormal count, the number of times each connection is determined to be a candidate abnormal connection within a configured duration;

acquiring a count threshold;

and when the abnormal count corresponding to a detected connection is smaller than the count threshold, filtering the detected connection out of the candidate abnormal connections.

According to a preferred embodiment of the present invention, after the target root cause of the fault is obtained, the method further includes:

when the target fault root cause is predicted incorrectly, labeling the actual fault cause of the target abnormal connection;

adding the target transmission layer index value, the key performance index value, the key quality index value, the link detection data, the network equipment monitoring data and the marked fault cause to a training data set;

acquiring a pre-configured training period;

and retraining the network fault diagnosis model with the training data set according to the training period, so as to optimize the model.

According to a preferred embodiment of the present invention, before the traversing by using the target fault root cause in the pre-constructed knowledge graph, the method further includes:

crawling network failure data to establish an initial dataset;

preprocessing the initial data set to obtain a target data set;

Identifying each entity in the target data set and the relation among the entities to obtain each fault mode, each fault handling measure and the relation among each fault mode and each fault handling measure;

determining each fault mode and each fault handling measure as a node, and connecting the nodes according to the relation between each fault mode and each fault handling measure to obtain an initial graph;

acquiring network data, and carrying out knowledge extraction on the network data to obtain child nodes corresponding to each node;

adding the child nodes corresponding to each node to the initial graph to obtain the knowledge graph;

and periodically updating the knowledge graph according to the newly added fault root cause and the corresponding fault processing measures.
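The graph construction and traversal steps above can be illustrated with a minimal sketch. It assumes the knowledge graph is held as a plain adjacency dictionary and that traversal is a breadth-first walk from the root-cause node; the patent does not specify a storage format or traversal order, and the function names and the fault and measure strings in the usage example are hypothetical.

```python
from collections import deque


def build_graph(edges):
    """Build an adjacency dict from (parent, child) pairs.

    A pair may link a fault mode to a handling measure, or a node to a
    child node obtained by knowledge extraction (assumed representation).
    """
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])
    return graph


def handling_measures(graph, root_cause):
    """Breadth-first traversal from the root cause; returns reachable nodes."""
    if root_cause not in graph:
        return []
    seen = {root_cause}
    queue = deque([root_cause])
    found = []
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                queue.append(nxt)
    return found


# Hypothetical fault mode and measures, for illustration only.
edges = [
    ("access congestion", "expand access bandwidth"),
    ("access congestion", "enable traffic shaping"),
    ("enable traffic shaping", "limit per-user peak rate"),
]
graph = build_graph(edges)
measures = handling_measures(graph, "access congestion")
```

Periodic updating then amounts to calling `build_graph` again, or appending new pairs, whenever a newly labeled root cause and its handling measure become available.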

A network failure diagnosis apparatus based on a transport layer index, the network failure diagnosis apparatus based on a transport layer index comprising:

the acquisition unit is used for responding to a fault diagnosis instruction of the target network, bypassing and deploying a probe at an access network side of the target network, and acquiring a target data packet in real time by utilizing the probe;

the computing unit is used for configuring each transmission layer index according to the transmission layer network attribute and computing the index value of each transmission layer index in real time according to the target data packet;

The detection unit is used for detecting candidate abnormal connection in real time according to the index value of each transmission layer index;

the filtering unit is used for filtering the instantaneous disturbance from the candidate abnormal connection to obtain a target abnormal connection and carrying out network abnormal alarm according to the target abnormal connection;

an obtaining unit, configured to obtain, as a target transport layer index value, an index value of each transport layer index associated with the target abnormal connection within a preset duration before an alarm, obtain, as a key performance index value, an index value of each key performance index associated with the target abnormal connection within the preset duration before the alarm, obtain, as a key quality index value, an index value of each key quality index associated with the target abnormal connection within the preset duration before the alarm, obtain link detection data during the alarm, and obtain network device monitoring data associated with the target abnormal connection during the alarm;

the diagnosis unit is used for inputting the target transmission layer index value, the key performance index value, the key quality index value, the link detection data and the network equipment monitoring data into a pre-trained network fault diagnosis model to obtain a target fault root cause;

and the traversing unit is used for traversing a pre-constructed knowledge graph using the target fault root cause to obtain the fault handling measures corresponding to the target fault root cause.

A computer device, the computer device comprising:

a memory storing at least one instruction; and

And the processor executes the instructions stored in the memory to realize the network fault diagnosis method based on the transmission layer index.

A computer-readable storage medium having stored therein at least one instruction for execution by a processor in a computer device to implement the transport layer indicator-based network failure diagnosis method.

According to the above technical scheme, first, data packets are collected in real time by a probe deployed in bypass mode on the network side, so that normal communication over the original service connections is not affected. Second, each transport layer index is configured according to the transport layer network attributes, and candidate abnormal connections are detected in real time based on the configured indexes; because transport layer indexes more faithfully reflect the user experience at the terminal, such as stalling and high delay, anomaly detection is more reliable. Third, the fault root cause is predicted by a pre-constructed network fault diagnosis model, and fault handling measures are determined with the aid of a knowledge graph, so that root causes and handling measures are predicted automatically by artificial intelligence means and network anomalies are handled more efficiently.

Drawings

Fig. 1 is a flow chart of a network fault diagnosis method based on the transport layer index according to a preferred embodiment of the present invention.

Fig. 2 is a functional block diagram of a network fault diagnosis apparatus based on the transport layer index according to a preferred embodiment of the present invention.

Fig. 3 is a schematic structural diagram of a computer device implementing the network fault diagnosis method based on the transport layer index according to a preferred embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.

Fig. 1 is a flowchart of a network fault diagnosis method based on the transmission layer index according to a preferred embodiment of the present invention. The order of the steps in the flowchart may be changed and some steps may be omitted according to various needs.

The network fault diagnosis method based on the transport layer index is applied to one or more computer devices. A computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device and the like.

The computer device may be any electronic product that can interact with a user in a human-computer manner, such as a personal computer, tablet computer, smart phone, personal digital assistant (Personal Digital Assistant, PDA), game console, interactive internet protocol television (Internet Protocol Television, IPTV), smart wearable device, etc.

The computer device may also include a network device and/or a user device. The network device includes, but is not limited to, a single network server, a server group composed of a plurality of network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing.

The server may be an independent server, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and basic cloud computing services such as big data and artificial intelligence platforms.

Here, artificial intelligence (AI) refers to the theories, methods, techniques and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use that knowledge to obtain optimal results.

Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.

The network in which the computer device is located includes, but is not limited to, the internet, a wide area network, a metropolitan area network, a local area network, a virtual private network (Virtual Private Network, VPN), and the like.

And S10, responding to a fault diagnosis instruction of a target network, bypassing and deploying a probe at an access network side of the target network, and acquiring a target data packet in real time by using the probe.

In this embodiment, the fault diagnosis instruction may be triggered by operation and maintenance personnel, or may be triggered automatically when access to the network is detected.

In this embodiment, deploying the probe in bypass mode on the access network side of the target network and collecting the target data packet in real time by using the probe includes:

acquiring an IP (Internet Protocol) address, a port number and a protocol type corresponding to the target network;

Network problems are concentrated mainly on the access network side, so the probe is deployed in bypass mode on the access network side of the target network (for example, at base station equipment of a mobile communication network or at the aggregation switching equipment of a residential optical fiber network). For example, weak signal strength or small coverage of cellular data can cause network faults; likewise, where traffic from individual households is aggregated into the access network, insufficient access bandwidth easily leads to congestion during peak periods and an excessively slow network access rate.

Network faults rarely occur in the backbone network and the bearer network, so probes do not need to be deployed there, which reduces the cost of network fault detection.

In the above embodiment, because the probe is deployed in bypass mode on the access network side, collecting data packets does not affect normal communication over the original service connections.
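As an illustration of selecting target data packets by IP address, port number and protocol type, the sketch below builds a capture filter expression. It assumes the probe accepts filters in the pcap-filter syntax used by libpcap-based tools; the patent does not name a capture mechanism, and `build_capture_filter` is a hypothetical helper.

```python
def build_capture_filter(ip: str, port: int, protocol: str) -> str:
    """Return a pcap-filter style expression for one monitored service.

    Assumption: only TCP and UDP are of interest, since the method
    works on transport layer indexes.
    """
    proto = protocol.lower()
    if proto not in ("tcp", "udp"):
        raise ValueError(f"unsupported protocol: {protocol}")
    return f"{proto} and host {ip} and port {port}"


# Example with a made-up address and port.
capture_filter = build_capture_filter("10.0.0.8", 443, "TCP")
```

Because the probe only reads mirrored or copied traffic, applying such a filter narrows what is analyzed without touching the service connections themselves.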

S11, configuring each transport layer index according to the transport layer network attribute, and calculating the index value of each transport layer index in real time according to the target data packet.

In this embodiment, the configuring each transport layer indicator according to the transport layer network attribute includes:

acquiring a configured high delay threshold;

Acquiring a current reporting period;

The preset time interval and the high delay threshold can be configured according to actual use scenes.

For example: for any connection in the target network, the delay sequence of the connection is acquired and split into segments of N seconds each to obtain M delay periods. The average delay of each delay period is the sum of the delays in that period divided by the number of delay samples in the N seconds, and one report is recorded whenever the average delay of a delay period is greater than the high delay threshold X. The total number of reports Y over all delay periods is counted, and Y/M is calculated as the first configuration index, which reflects the high-delay reporting rate. If the current reporting period is E and the connection is congested for a duration F within one reporting period, F/E is calculated as the second configuration index, which reflects the proportion of time spent in congestion. The data receiving and transmitting rate, the packet loss rate, the delay jitter and the delay are acquired at the same time and, together with the first configuration index and the second configuration index, form the transport layer indexes.

The first configuration index and the second configuration index are network parameters defined after analyzing a large amount of network data and verified by testing, so they truly reflect user experience at the terminal such as stalling and high delay. For example, some connections with a low average delay but a high high-delay reporting rate still exhibit stalling at the user terminal. The first configuration index and the second configuration index are therefore of great significance for optimizing the user's experience of the network.
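The Y/M and F/E calculations described above can be sketched as follows. This is a minimal illustration that assumes delay samples arrive as (timestamp, delay) pairs and that a delay period containing no samples is simply not counted; the function names and the sample values are hypothetical.

```python
def high_delay_report_rate(samples, interval_s, high_delay_threshold):
    """First configuration index: Y / M.

    samples: list of (timestamp_s, delay) pairs for one connection.
    The sequence is split into periods of interval_s seconds (N); a period
    is reported when its average delay exceeds the threshold (X).
    """
    if not samples:
        return 0.0
    start = samples[0][0]
    periods = {}
    for ts, delay in samples:
        index = int((ts - start) // interval_s)
        periods.setdefault(index, []).append(delay)
    reports = sum(
        1 for delays in periods.values()
        if sum(delays) / len(delays) > high_delay_threshold
    )
    return reports / len(periods)


def congestion_ratio(congested_s, reporting_period_s):
    """Second configuration index: F / E."""
    return congested_s / reporting_period_s


# Three 2-second periods with average delays 20, 120 and 20 (made-up values).
samples = [(0, 10), (1, 30), (2, 100), (3, 140), (4, 20), (5, 20)]
first_index = high_delay_report_rate(samples, interval_s=2, high_delay_threshold=50)
second_index = congestion_ratio(congested_s=12, reporting_period_s=60)
```

In this example the overall average delay is moderate, yet one period in three is reported, which is the kind of intermittent stalling the first configuration index is meant to expose.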

The data receiving and transmitting rate refers to the data receiving and transmitting rate of the service connection of the terminal user.

The packet loss rate refers to the packet loss rate of a data packet sent in service connection of a terminal user.

The delay jitter refers to the jitter of the delay of a data packet sent in the service connection of the terminal user.

The delay refers to the delay of a data packet sent in the end user's service connection, that is, the time from when the packet is sent until the corresponding ACK (acknowledgement) is received.

In the above embodiment, each transport layer index can be configured in a targeted manner according to the transport layer network attributes, so that the transport layer network quality of each end user connection is calculated and counted in real time according to these indexes. Transport layer indexes offer strong real-time performance and wide coverage and directly reflect user experience, and can therefore effectively improve the accuracy and efficiency of network fault diagnosis.
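The delay and delay jitter definitions above can be sketched as follows. The sketch assumes TCP-style cumulative acknowledgements, ACK records sorted by time, and jitter measured as the mean absolute difference between consecutive delays; the patent does not fix a jitter formula, so that choice and the function names are assumptions.

```python
from statistics import mean


def ack_delays(sent, acks):
    """Delay per sent segment: time from sending to the first covering ACK.

    sent: list of (seq_end, send_time); acks: list of (ack_no, ack_time),
    sorted by ack_time. Segments that are never acknowledged are skipped.
    """
    delays = []
    for seq_end, t_sent in sent:
        for ack_no, t_ack in acks:
            if ack_no >= seq_end and t_ack >= t_sent:
                delays.append(t_ack - t_sent)
                break
    return delays


def delay_jitter(delays):
    """Mean absolute difference between consecutive delays (assumed definition)."""
    if len(delays) < 2:
        return 0.0
    return mean(abs(b - a) for a, b in zip(delays, delays[1:]))


# Times in milliseconds; values are made up for illustration.
sent = [(100, 0), (200, 10), (300, 20)]
acks = [(100, 50), (300, 90)]
delays = ack_delays(sent, acks)
jitter = delay_jitter(delays)
```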

In this embodiment, after each transport layer indicator is configured specifically according to the transport layer network attribute, the indicator value of each transport layer indicator may be further calculated in real time for use in subsequent network diagnosis.

And S12, detecting candidate abnormal connection in real time according to the index value of each transmission layer index.

In this embodiment, the detecting the candidate abnormal connection in real time according to the index value of each transport layer index includes:

acquiring a connection quality threshold;

The connection quality threshold can be configured according to the requirement of the actual scene on the network quality.

In the above embodiment, the index value of each transport layer index is normalized so that indexes with different units are placed on the same scale. The normalized index values are then combined in a weighted sum using the coefficient of each transport layer index, and the result is compared with the threshold, so that network quality is assessed across several dimensions at once.
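The normalization and weighted-sum detection can be sketched as follows. The patent does not state which normalization is used, so min-max scaling across connections is an assumption here, as is the premise that every index is oriented so that a larger value means worse quality. For brevity the example uses two indexes rather than the six in the method.

```python
def min_max_normalize(values):
    """Scale values to [0, 1]; a constant column maps to 0.0 (assumed rule)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def quality_scores(connections, weights):
    """connections: {conn_id: {index_name: raw_value}}.

    weights: {index_name: coefficient}; the coefficients must sum to 1.
    Returns the connection quality quantized value per connection.
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("coefficients must sum to 1")
    ids = list(connections)
    scores = {cid: 0.0 for cid in ids}
    for name, weight in weights.items():
        normalized = min_max_normalize([connections[cid][name] for cid in ids])
        for cid, value in zip(ids, normalized):
            scores[cid] += weight * value
    return scores


def candidate_abnormal(scores, threshold):
    """Connections whose quantized value exceeds the connection quality threshold."""
    return [cid for cid, score in scores.items() if score > threshold]


# Made-up raw values for three connections.
connections = {
    "a": {"loss": 0.0, "delay": 10},
    "b": {"loss": 0.1, "delay": 110},
    "c": {"loss": 0.05, "delay": 60},
}
scores = quality_scores(connections, {"loss": 0.5, "delay": 0.5})
candidates = candidate_abnormal(scores, threshold=0.6)
```

An index where larger is better, such as the data receiving and transmitting rate, would need to be inverted before it is passed in under this convention.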

S13, filtering instantaneous disturbance from the candidate abnormal connection to obtain a target abnormal connection, and carrying out network abnormal alarm according to the target abnormal connection.

In this embodiment, the filtering the transient disturbance from the candidate abnormal connection to obtain the target abnormal connection includes:

acquiring a count threshold;

The configured duration can be set according to factors such as the reporting period.

For example: when a connection is determined to be an abnormal connection 5 times within the specified time and the count threshold is 10, the threshold is not reached, so the anomaly is likely caused by a transient disturbance rather than a real network anomaly, and the connection can be filtered out.

By the above embodiment, erroneous judgment due to instantaneous disturbance can be avoided.
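A minimal sketch of this filtering step is shown below. It assumes the detector emits one connection identifier each time a connection is flagged as a candidate within the configured duration; `filter_transient` and the identifiers are hypothetical.

```python
from collections import Counter


def filter_transient(candidate_events, count_threshold):
    """Keep connections flagged at least count_threshold times.

    candidate_events: connection ids, one entry per time the connection
    was determined to be a candidate abnormal connection in the window.
    Connections with fewer flags are treated as transient disturbances.
    """
    counts = Counter(candidate_events)
    return {conn for conn, n in counts.items() if n >= count_threshold}


# "a" is flagged 5 times and "b" 10 times within the configured duration.
events = ["a"] * 5 + ["b"] * 10
targets = filter_transient(events, count_threshold=10)
```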

In this embodiment, the node CPU (Central Processing Unit) occupancy rate, node memory occupancy rate and node disk occupancy rate may also be monitored at the same time. Specifically, a network anomaly is determined when the node CPU occupancy rate reaches 90% for a duration of 30 seconds, when the node memory occupancy rate reaches 90% for a duration of 30 seconds, or when the node disk occupancy rate reaches 90% for a duration of 30 seconds.

In addition, different alarm levels can be configured for different anomalies. For example, a primary network anomaly alarm is issued when the node CPU occupancy rate reaches 90% for 30 seconds, when the node memory occupancy rate reaches 90% for 30 seconds, when the node disk occupancy rate reaches 90% for 30 seconds, or when a connection is determined to be an abnormal connection 10 or more times in succession within the specified time. Further, when the node CPU occupancy rate, the node memory occupancy rate or the node disk occupancy rate reaches 100%, the primary network anomaly alarm can be upgraded to a secondary network anomaly alarm.

Through the above embodiment, anomaly alarms of different levels can be raised according to the type and impact of the anomaly.
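The alarm levels in the example above can be sketched as a small decision function. The input layout (usage percentage plus how long it has stayed at or above the threshold) and the function name are assumptions; the 90%, 30-second, 10-times and 100% figures are the example values from the text.

```python
def alarm_level(resources, abnormal_count, count_threshold=10,
                usage_threshold=90.0, hold_seconds=30.0):
    """Return 0 (no alarm), 1 (primary) or 2 (secondary).

    resources: {name: (usage_percent, seconds_at_or_above_threshold)}
    for CPU, memory and disk (assumed input layout).
    abnormal_count: times the connection was determined abnormal
    within the specified time.
    """
    sustained = any(
        usage >= usage_threshold and held >= hold_seconds
        for usage, held in resources.values()
    )
    primary = sustained or abnormal_count >= count_threshold
    if not primary:
        return 0
    # A primary alarm is upgraded when any resource is fully occupied.
    saturated = any(usage >= 100.0 for usage, _ in resources.values())
    return 2 if saturated else 1


# Made-up node readings.
level_primary = alarm_level({"cpu": (95, 40), "mem": (50, 0), "disk": (60, 0)}, 0)
level_secondary = alarm_level({"cpu": (100, 40), "mem": (50, 0), "disk": (60, 0)}, 0)
level_none = alarm_level({"cpu": (95, 10), "mem": (50, 0), "disk": (60, 0)}, 3)
```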

S14, acquiring an index value of each transmission layer index associated with the target abnormal connection in a preset time period before alarming as a target transmission layer index value, acquiring an index value of each key performance index associated with the target abnormal connection in the preset time period before alarming as a key performance index value, acquiring an index value of each key quality index associated with the target abnormal connection in the preset time period before alarming as a key quality index value, acquiring link detection data during alarming, and acquiring network equipment monitoring data associated with the target abnormal connection during alarming.

For example: the preset duration may be 5 seconds.

In this embodiment, the key performance indicators refer to indicators regarding network hardware performance and network transmission quality.

For example: the key performance indicators may include:

RSRP (Reference Signal Received Power): a signal strength metric in LTE (Long Term Evolution) networks;

RSRQ (Reference Signal Received Quality): calculated from the ratio of RSRP to the channel bandwidth and used to evaluate the reception quality of the user equipment;

SINR (Signal to Interference plus Noise Ratio): used to measure the quality of the signal;

average user throughput: the amount of data that a user can, on average, receive from or transmit to the network over a given period of time.

In this embodiment, the key quality indicator mainly focuses on the quality of the user experience level.

For example: the key quality indicators may include:

VoLTE (Voice over LTE) call success rate: measuring the success rate of calls placed over VoLTE;

VoLTE audio quality: measuring the audio quality of VoLTE calls;

success rate of data session establishment: measuring the success rate of establishing data connection between the user equipment and the network;

network delay: measuring the time for data to be transferred from the source point to the target point;

data transmission speed: the speed at which data is transferred from the network to the user equipment is measured.

In this embodiment, the link detection data refers to the delay and packet loss rate at each hop along the route.

In this embodiment, the network device monitoring data refers to data such as memory usage, CPU usage, throughput, and the number of connections of the network device.

S15, inputting the target transport layer index value, the key performance index value, the key quality index value, the link detection data and the network device monitoring data into a pre-trained network fault diagnosis model to obtain a target fault root cause.

In this embodiment, the network fault diagnosis model may be an artificial intelligence model (such as an XGBoost model) trained on data such as historical data recorded when the network failed and the corresponding fault handling results.

Through this embodiment, the root cause of a network fault can be predicted automatically by the pre-trained artificial intelligence model without relying on manual experience, which improves both the efficiency and the accuracy of fault root cause detection.
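Before the five inputs of S15 can be passed to a tree model such as XGBoost, they have to be flattened into a fixed-length feature vector. The text does not say how this is done; the sketch below is one plausible arrangement, and every function name, feature name, and summary statistic in it is an assumption made for illustration.

```python
from statistics import mean


def summarize(name, series):
    """Collapse a time series into fixed-size statistics (assumed choice: mean, max, last)."""
    if not series:
        return {f"{name}_mean": 0.0, f"{name}_max": 0.0, f"{name}_last": 0.0}
    return {
        f"{name}_mean": float(mean(series)),
        f"{name}_max": float(max(series)),
        f"{name}_last": float(series[-1]),
    }


def build_features(transport, kpi, kqi, link_hops, device):
    """Merge the five S15 inputs into one flat dict of named features.

    transport, kpi, kqi: dicts mapping an index name to its values in the pre-alarm window.
    link_hops: list of dicts with "delay_ms" and "loss_rate" for each hop.
    device: dict of device monitoring readings taken at alarm time.
    """
    features = {}
    for group in (transport, kpi, kqi):
        for name, series in group.items():
            features.update(summarize(name, series))
    # Represent the route by its worst hop (an assumed simplification).
    features["link_max_delay_ms"] = max((h["delay_ms"] for h in link_hops), default=0.0)
    features["link_max_loss_rate"] = max((h["loss_rate"] for h in link_hops), default=0.0)
    for name, value in device.items():
        features[f"device_{name}"] = float(value)
    return features


def to_vector(features, columns):
    """Order features by the column list fixed at training time; absent features become 0.0."""
    return [features.get(column, 0.0) for column in columns]
```

The resulting vector would then be handed to the trained classifier, whose output label is taken as the target fault root cause. Keeping the column order in a list saved alongside the model avoids silent misalignment between training and inference.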

In this embodiment, after the target fault root cause is obtained, the method further includes:

acquiring a pre-configured training period;

Through this embodiment, the model can be optimized periodically using the faults that were predicted incorrectly together with their labeled root causes, which expands the coverage of the training set and improves the accuracy of model prediction.
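The periodic optimization described above can be pictured as a small scheduler that collects mispredicted faults and releases them as a training batch once the training period has elapsed. The intermediate steps are not spelled out in this section, so the sketch below is only an assumption about how they might fit together; `RetrainScheduler` and its methods are hypothetical names.

```python
class RetrainScheduler:
    """Collects mispredicted faults and decides when a retraining round is due."""

    def __init__(self, period_s, start_time=0.0):
        self.period_s = period_s          # the pre-configured training period
        self.last_trained = start_time
        self.pending = []                 # (feature_vector, labeled_root_cause)

    def record(self, features, predicted, confirmed):
        """Keep only cases where the prediction disagrees with the labeled root cause."""
        if predicted != confirmed:
            self.pending.append((features, confirmed))

    def due(self, now):
        """A round is due once the period has elapsed and there is something to learn from."""
        return bool(self.pending) and now - self.last_trained >= self.period_s

    def take_batch(self, now):
        """Hand over the collected samples and start a new period."""
        batch, self.pending = self.pending, []
        self.last_trained = now
        return batch
```

The batch returned by `take_batch` would be appended to the existing training set before the model is retrained.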

S16, traversing a pre-constructed knowledge graph using the target fault root cause to obtain the fault handling measures corresponding to the target fault root cause.

In this embodiment, before the pre-constructed knowledge graph is traversed using the target fault root cause, the method further includes:

crawling network failure data to establish an initial dataset;

preprocessing the initial data set to obtain a target data set;

For example: the network failure data may be data collected from network failure related databases, log files, technical documents, forums, and other related sources, including information on failure phenomena, failure types, failure causes, solutions, configuration information, device types, and the like.

Preprocessing the initial data set means cleaning the data, for example removing duplicate data, correcting erroneous data, and filling in missing values, and standardizing the data format so that data from different sources is consistent.
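The cleaning steps named above (format standardization, filling missing values, removing duplicates) can be sketched as below. The record layout, the field names, and the `"unknown"` placeholder are assumptions for illustration; correcting erroneous data depends on domain rules and is left out.

```python
def preprocess(records, required=("fault_type", "cause", "solution")):
    """Return cleaned, de-duplicated copies of the crawled fault records."""
    seen = set()
    cleaned = []
    for record in records:
        # Standardize the format: lower-case keys, collapse whitespace in values.
        normalized = {
            key.strip().lower(): " ".join(str(value).split())
            for key, value in record.items()
            if value is not None
        }
        # Fill missing or empty required fields with a placeholder.
        for field in required:
            if not normalized.get(field):
                normalized[field] = "unknown"
        # Remove records that are identical after normalization.
        fingerprint = tuple(sorted(normalized.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(normalized)
    return cleaned
```

Duplicates are detected only after normalization, so two records from different sources that differ in spacing or key capitalization collapse into one.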

The relationship between the fault mode and the fault handling measures can be determined through the identification of the entities and the relationship between the entities.

By extracting knowledge from the network data, useful information can be extracted from text data to enrich the graph. For example: the specific contents of a failure mode can be obtained.

The knowledge graph can be perfected by continuous updating, so that the query efficiency and the accuracy are further improved.

It should be noted that if there is no matching solution in the knowledge graph, a solution may be produced through manual intervention, and after the fault is resolved the knowledge graph is updated with the manual solution so as to improve it.
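The lookup with manual fallback described above can be sketched as follows. The graph is reduced to a dictionary of labeled edges, and the relation name `handled_by`, the class, and the `manual_solver` callback are hypothetical. For brevity the sketch writes the manual solution back immediately, whereas the text updates the graph after the fault has been resolved.

```python
class KnowledgeGraph:
    """Minimal labeled-edge store standing in for the knowledge graph."""

    def __init__(self):
        self._edges = {}  # (source, relation) -> list of targets

    def add(self, source, relation, target):
        targets = self._edges.setdefault((source, relation), [])
        if target not in targets:
            targets.append(target)

    def measures_for(self, root_cause):
        """Follow the 'handled_by' edges that leave the root cause node."""
        return list(self._edges.get((root_cause, "handled_by"), []))


def resolve(graph, root_cause, manual_solver):
    """Return handling measures, asking an operator when the graph has no match."""
    measures = graph.measures_for(root_cause)
    if measures:
        return measures
    manual_measure = manual_solver(root_cause)
    # Simplification: stored right away instead of after the fault is confirmed resolved.
    graph.add(root_cause, "handled_by", manual_measure)
    return [manual_measure]
```

Once a manual solution has been stored, the same root cause is answered from the graph on its next occurrence without operator involvement.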

By the embodiment, the solution to the network fault can be automatically provided, so that the fault processing efficiency is improved.

In this embodiment, the target fault root cause and the corresponding fault handling measures may also be displayed on a user interaction interface, so that the user may easily access and use the knowledge graph.

User feedback can also be acquired through the user interaction interface so as to further optimize the knowledge graph and the fault detection scheme according to the user feedback.

This embodiment monitors and diagnoses network faults in real time based on transport layer indexes and artificial intelligence, and recommends handling measures at the same time. It gives the network self-monitoring and self-diagnosis capabilities, improves the efficiency of network fault detection and of network operation and maintenance, and promotes the practical deployment of autonomous networks.

Fig. 2 is a functional block diagram of a preferred embodiment of the network fault diagnosis apparatus based on the transport layer index according to the present invention. The network fault diagnosis apparatus 11 based on the transport layer index comprises a collecting unit 110, a calculating unit 111, a detecting unit 112, a filtering unit 113, an obtaining unit 114, a diagnosing unit 115 and a traversing unit 116. A module/unit referred to in the present invention is a series of computer program segments that are stored in a memory, can be executed by a processor, and perform a fixed function. In the present embodiment, the functions of the respective modules/units will be described in detail in the following embodiments.

The collecting unit 110 is configured to deploy a probe in bypass mode on an access network side of a target network in response to a fault diagnosis instruction for the target network, and collect a target data packet in real time by using the probe;

the calculating unit 111 is configured to configure each transport layer indicator according to the transport layer network attribute, and calculate an index value of each transport layer indicator in real time according to the target data packet;

the detecting unit 112 is configured to detect candidate abnormal connection in real time according to an index value of each transport layer index;

the filtering unit 113 is configured to filter the transient disturbance from the candidate abnormal connection to obtain a target abnormal connection, and perform a network abnormality alarm according to the target abnormal connection;

The obtaining unit 114 is configured to obtain, as a target transport layer index value, an index value of each transport layer index associated with the target abnormal connection within a preset duration before an alarm, obtain, as a key performance index value, an index value of each key performance index associated with the target abnormal connection within the preset duration before the alarm, obtain, as a key quality index value, an index value of each key quality index associated with the target abnormal connection within the preset duration before the alarm, obtain link detection data during the alarm, and obtain network device monitoring data associated with the target abnormal connection during the alarm;

the diagnosing unit 115 is configured to input the target transport layer index value, the key performance index value, the key quality index value, the link detection data, and the network device monitoring data into a pre-trained network fault diagnosis model, to obtain a target fault root cause;

the traversing unit 116 is configured to traverse a pre-constructed knowledge graph by using the target fault root cause, so as to obtain a fault handling measure corresponding to the target fault root cause.

The computer device 1 may comprise a memory 12, a processor 13 and a bus, and may further comprise a computer program stored in the memory 12 and executable on the processor 13, such as a network failure diagnosis program based on transport layer metrics.

It will be appreciated by those skilled in the art that the schematic diagram is merely an example of the computer device 1 and does not constitute a limitation of it. The computer device 1 may have a bus-type structure or a star-type structure, may comprise more or fewer hardware or software components than illustrated, or may arrange its components differently; for example, the computer device 1 may further comprise an input-output device, a network access device, etc.

It should be noted that the computer device 1 is only an example; other existing or future electronic products that can be adapted to the present invention are also included within the scope of protection of the present invention and are incorporated herein by reference.

The memory 12 includes at least one type of readable storage medium including flash memory, a removable hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, etc. The memory 12 may in some embodiments be an internal storage unit of the computer device 1, such as a removable hard disk of the computer device 1. The memory 12 may in other embodiments also be an external storage device of the computer device 1, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the computer device 1. Further, the memory 12 may also include both an internal storage unit and an external storage device of the computer device 1. The memory 12 may be used not only for storing application software installed in the computer device 1 and various types of data, such as codes of network failure diagnosis programs based on the transmission layer index, etc., but also for temporarily storing data that has been output or is to be output.

The processor 13 may be comprised of integrated circuits in some embodiments, for example, a single packaged integrated circuit, or may be comprised of multiple integrated circuits packaged with the same or different functions, including one or more central processing units (Central Processing unit, CPU), microprocessors, digital processing chips, graphics processors, a combination of various control chips, and the like. The processor 13 is a Control Unit (Control Unit) of the computer device 1, connects the respective components of the entire computer device 1 using various interfaces and lines, executes various functions of the computer device 1 and processes data by running or executing programs or modules stored in the memory 12 (for example, executing a network failure diagnosis program based on a transport layer index, etc.), and calls data stored in the memory 12.

The processor 13 executes the operating system of the computer device 1 and various types of applications installed. The processor 13 executes the application program to implement the steps of the above-described embodiments of the network fault diagnosis method based on the transport layer indicator, such as the steps shown in fig. 1.

Illustratively, the computer program may be partitioned into one or more modules/units that are stored in the memory 12 and executed by the processor 13 to complete the present invention. The one or more modules/units may be a series of computer readable instruction segments capable of performing the specified functions, which instruction segments describe the execution of the computer program in the computer device 1. For example, the computer program may be divided into a collecting unit 110, a calculating unit 111, a detecting unit 112, a filtering unit 113, an obtaining unit 114, a diagnosing unit 115, and a traversing unit 116.

The integrated units implemented in the form of software functional modules described above may be stored in a computer readable storage medium. The software functional module is stored in a storage medium, and includes several instructions for causing a computer device (which may be a personal computer, a computer device, or a network device, etc.) or a processor (processor) to execute the portions of the network fault diagnosis method based on the transport layer indicator according to the embodiments of the present invention.

The modules/units integrated in the computer device 1 may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as separate products. Based on this understanding, the present invention may also be implemented by a computer program for instructing a relevant hardware device to implement all or part of the procedures of the above-mentioned embodiment method, where the computer program may be stored in a computer readable storage medium and the computer program may be executed by a processor to implement the steps of each of the above-mentioned method embodiments.

Wherein the computer program comprises computer program code which may be in source code form, object code form, executable file or some intermediate form etc. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM), a random access Memory, or the like.

Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created from the use of blockchain nodes, and the like.

The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, encryption algorithm and the like. The Blockchain (Blockchain), which is essentially a decentralised database, is a string of data blocks that are generated by cryptographic means in association, each data block containing a batch of information of network transactions for verifying the validity of the information (anti-counterfeiting) and generating the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.

The bus may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, among others. The bus may be classified as an address bus, a data bus, a control bus, etc. For ease of illustration, only one straight line is shown in fig. 3, but this does not mean that there is only one bus or only one type of bus. The bus is arranged to enable connection and communication between the memory 12 and at least one processor 13 or the like.

Although not shown, the computer device 1 may further comprise a power source (such as a battery) for powering the various components. Preferably, the power source may be logically connected to the at least one processor 13 via a power management means, so that charge management, discharge management, and power consumption management are achieved through the power management means. The power supply may also include one or more of any of a direct current or alternating current power supply, recharging device, power failure detection circuit, power converter or inverter, power status indicator, etc. The computer device 1 may further include various sensors, Bluetooth modules, Wi-Fi modules, etc., which will not be described in detail herein.

Further, the computer device 1 may also comprise a network interface, optionally comprising a wired interface and/or a wireless interface (e.g. a Wi-Fi interface, a Bluetooth interface, etc.), typically used for establishing a communication connection between the computer device 1 and other computer devices.

The computer device 1 may optionally further comprise a user interface, which may be a display or an input unit such as a keyboard, and may also be a standard wired interface or a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be referred to as a display screen or display unit, as appropriate, for displaying information processed in the computer device 1 and for displaying a visual user interface.

It should be understood that the embodiments described are for illustrative purposes only, and the scope of the patent application is not limited by this configuration.

Fig. 3 shows only a computer device 1 with components 12-13, it being understood by those skilled in the art that the structure shown in fig. 3 is not limiting of the computer device 1 and may include fewer or more components than shown, or may combine certain components, or a different arrangement of components.

In connection with fig. 1, the memory 12 in the computer device 1 stores a plurality of instructions to implement a network fault diagnosis method based on a transport layer index, and the processor 13 can execute the plurality of instructions to implement:

Specifically, the specific implementation method of the above instructions by the processor 13 may refer to the description of the relevant steps in the corresponding embodiment of fig. 1, which is not repeated herein.

The data used in this application were obtained legally.

In the several embodiments provided in the present invention, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be other manners of division when actually implemented.

The invention is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional modules in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units can be realized in the form of hardware, or in the form of hardware plus software functional modules.

It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.

The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.

Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. A plurality of units or means stated in the invention may also be implemented by one unit or means, through software or hardware. The terms first, second, etc. are used to denote names and do not indicate any particular order.

Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims

1. The network fault diagnosis method based on the transmission layer index is characterized by comprising the following steps of:

2. The network fault diagnosis method based on the transport layer index according to claim 1, wherein the bypassing the probe on the access network side of the target network and collecting the target data packet in real time by using the probe comprises:

3. The network failure diagnosis method based on the transport layer index according to claim 1, wherein configuring each transport layer index according to the transport layer network attribute comprises:

acquiring a configured high delay threshold;

acquiring a current reporting period;

4. The network failure diagnosis method based on the transport layer indicator according to claim 3, wherein the detecting candidate abnormal connections in real time according to the indicator value of each transport layer indicator comprises:

acquiring a connection quality threshold;

5. The method for diagnosing a network failure based on a transport layer indicator according to claim 1, wherein said filtering transient disturbances from said candidate abnormal connections to obtain a target abnormal connection comprises:

acquiring a frequency threshold;

6. The network fault diagnosis method based on the transport layer index according to claim 1, wherein after the target fault root is obtained, the method further comprises:

acquiring a pre-configured training period;

7. The network fault diagnosis method based on the transport layer index according to claim 1, wherein the method further comprises, before traversing in a pre-constructed knowledge graph using the target fault root cause:

crawling network failure data to establish an initial dataset;

preprocessing the initial data set to obtain a target data set;

8. A network failure diagnosis apparatus based on a transport layer index, the network failure diagnosis apparatus based on a transport layer index comprising:

9. A computer device, the computer device comprising:

a memory storing at least one instruction; and

a processor executing instructions stored in the memory to implement the transport layer indicator-based network failure diagnosis method according to any one of claims 1 to 7.

10. A computer-readable storage medium, characterized by: the computer-readable storage medium has stored therein at least one instruction that is executed by a processor in a computer device to implement the transport layer indicator-based network failure diagnosis method of any one of claims 1 to 7.