CN114676157A - Internet access quality monitoring analysis method, system, medium, and program - Google Patents

Internet access quality monitoring analysis method, system, medium, and program

Info

Publication number
CN114676157A
Authority
CN
China
Prior art keywords
path
equipment
node
theta
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011555719.3A
Other languages
Chinese (zh)
Other versions
CN114676157B (en
Inventor
廖文昭
闻华
张玲
陆豪
钱雁
赵红蕾
任臣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd
Priority to CN202011555719.3A
Publication of CN114676157A
Application granted
Publication of CN114676157B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present application relates to an internet access quality detection analysis method, system, medium, and program. The method comprises the following steps: initiating probing of monitored nodes by deploying a number of distributed probes on the internet, the probing including a ping from the probe to the monitored node and a tracert from the probe to the monitored node, the tracert returning the addresses of the nodes traversed along the path; completing missing node addresses in the path returned by the tracert through a path completion algorithm; loading data, comprising the connectivity (on/off) and delay data of the path and the addresses of the nodes traversed along the path, into a data model through ETL (extract, transform, load), and applying arithmetic averaging or exponential smoothing to the loaded data to obtain baseline data of the path; shortlisting the top N suspected fault nodes through a network poor-quality common path algorithm, where N ≥ 3; and determining the index weights Θ = (θ1, …, θn) of each suspected fault node through a machine learning algorithm, forming a feature vector X = (x1, …, xn), processing the feature values through y = Sigmoid(θ1·x1 + … + θn·xn), and taking the three nodes with the highest y values as the fault nodes.

Description

Internet access quality monitoring analysis method, system, medium, and program
Technical Field
The application belongs to the field of data communication, and particularly relates to an internet access quality monitoring and analysis method, system, medium and program that combine big data analysis with PING and traceroute probing through algorithms.
Background
In recent years, with the rapid development of the internet, users are no longer satisfied with mere network availability and have further requirements for network quality and user experience.
Microsoft Pingmesh (a data center network latency measurement and analysis technology) currently runs in the data centers of large customers such as Microsoft, Tencent and Alibaba. The technology can collect quality data between any two servers in a large-scale data center and is used by internet companies to monitor their networks.
Pingmesh largely meets the network monitoring needs of internet enterprises, but the following shortcomings remain: the number of probing points is large, since a probing point must be deployed on every server; the volume of delay data is large, with data on the order of millions of megabytes analyzed and processed every day; and the operator network is a black box, so poor quality can be perceived but its location cannot be determined.
To solve these problems, operators provide end-to-end monitoring of internet access quality for data centers, integrate data from multiple parties (self-deployed probes and accessed data), add unified ETL (extract, transform, load) processing, and, when poor quality is perceived, use the operator's own network topology to intelligently locate the poor-quality points and so solve practical operation and maintenance problems.
In the prior art, massive data can be accessed uniformly and processed in a standardized manner. However, in the actual application of an operator's data center network management platform, data is accessed from multiple parties and the size of the data sources is unstable, which affects the accuracy of the intelligent analysis results to some degree.
Therefore, a method and a system are needed that improve the coverage and accuracy of monitoring for customer-reported faults in data center networks.
Disclosure of Invention
In view of the above technical problems, the present application proposes an internet access quality monitoring analysis method, system, medium, and program.
According to an aspect of the present application, there is provided an internet access quality detection analysis method including: initiating probing of monitored nodes by deploying a large number of distributed probes on the internet, wherein the probing comprises ping probing from a probe to the monitored node and tracert probing from the probe to the monitored node, the ping probing returns the connectivity (on/off) and delay data of a path, the n flows that cannot be pinged through for three consecutive periods are recorded, and the tracert returns the addresses of the nodes traversed along the path; completing missing node addresses in the path returned by the tracert through a path completion algorithm; loading data, comprising the connectivity and delay data of the path and the addresses of the nodes traversed along the path, into a data model through ETL, and applying arithmetic averaging or exponential smoothing to the loaded data to obtain baseline data of the path; shortlisting the top N suspected fault nodes through a network poor-quality common path algorithm, where N ≥ 3; and determining the index weights Θ = (θ1, θ2, θ3, …, θn) of each suspected fault node through a machine learning algorithm, calculating each feature value of each suspected fault node to form a feature vector X = (x1, x2, x3, …, xn), processing the feature values through the Sigmoid function y = Sigmoid(θ1·x1 + θ2·x2 + θ3·x3 + … + θn·xn), and taking the three nodes with the highest y values (TOP3) as the fault nodes.
According to an example embodiment, the method further comprises issuing monitoring alarms and fault root cause analysis (RCA) reports based on the determined TOP3 fault nodes.
According to an example embodiment, the network poor-quality common path algorithm comprises: periodically recording and updating the IP addresses of the intermediate nodes traversed from any source address {a1, a2, ..., an} to any destination address {b1, b2, ..., bm}; finding, from the records, the paths corresponding to the n flows that cannot be pinged through; converting the set of node IP addresses into a set of devices according to a configuration management database, which records the IP, board and port data of each monitored device; and counting the probability that a given device appears on the paths from source address to destination address.
According to an example embodiment, the path completion algorithm comprises an adjacent-device path information completion algorithm based on a 30-bit mask, comprising: traversing all unknown devices and taking the next hop of each; if the next hop is also an unknown device, completion is not possible; if the next hop is the last hop in the path, completion is not needed; if the next hop is a device with a known port IP, converting the next-hop port IP into binary, swapping bits 31 and 32 of the binary representation, and converting the result back into dotted-decimal notation as the port IP of the unknown device; and querying the configuration management database with that port IP to obtain the corresponding device IP.
According to an example embodiment, the path completion algorithm further comprises a spaced-device path information completion algorithm based on device information, comprising: traversing the previous hop, and the hop before it, of each device completed by the adjacent-device path information completion algorithm based on a 30-bit mask; if the previous hop is an unknown device and the hop before it is a known device, then: querying all port IPs of the completed device in the configuration management database, taking the adjacent port IP corresponding to each port IP, and looking up the device IP of each adjacent port IP, thereby obtaining the set of devices adjacent to the completed device; querying all port IPs of the known device two hops back in the configuration management database, taking the adjacent port IP corresponding to each port IP, and looking up the device IP of each adjacent port IP, thereby obtaining the set of devices adjacent to that known device; and taking the intersection of the two sets as the missing node device in the path.
According to an example embodiment, obtaining the baseline data of the path further comprises: where the network is hierarchical, obtaining baseline data of the path computed separately for each level of the hierarchy.
According to an example embodiment, the probing period of the ping is on the order of minutes and the probing period of the tracert is on the order of hours.
According to another aspect of the present application, there is provided an electronic device including: one or more processors, and a memory coupled with the one or more processors, the memory storing computer-readable program instructions that, when executed by the one or more processors, cause the one or more processors to perform an internet access quality detection analysis method in accordance with the present application.
According to yet another aspect of the present application, there is provided a non-transitory computer readable medium having instructions stored thereon for execution by a processor to perform an internet access quality detection analysis method according to the present application.
According to yet another aspect of the present application, a computer program product is provided, comprising a computer program which, when executed by a processor, performs the internet access quality detection analysis method according to the present application.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can derive other drawings from them.
For a better understanding of the present disclosure, and to show how the same may be carried into effect, reference will now be made, by way of example, to the accompanying drawings, in which:
FIG. 1 shows a block diagram of an electronic device for implementing an internet access quality monitoring analysis system according to an embodiment of the present disclosure;
FIG. 2 illustrates an exemplary Internet monitoring architecture diagram, according to an embodiment of the disclosure;
FIG. 3 illustrates an exemplary path completion schematic diagram according to an embodiment of the present disclosure;
FIG. 4 shows a flow diagram of an exemplary internet access quality monitoring analysis method according to an embodiment of the present disclosure.
Note that like reference numerals refer to corresponding parts throughout the drawings.
Detailed Description
The following detailed description is made with reference to the accompanying drawings and is provided to assist in a comprehensive understanding of various exemplary embodiments of the disclosure. The following description includes various details to aid understanding, but these details are to be regarded as examples only and are not intended to limit the disclosure, which is defined by the appended claims and their equivalents. The words and phrases used in the following description are used only to provide a clear and consistent understanding of the disclosure. In addition, descriptions of well-known structures, functions, and configurations may be omitted for clarity and conciseness. Those of ordinary skill in the art will recognize that various changes and modifications of the examples described herein can be made without departing from the spirit and scope of the disclosure.
Fig. 1 is an exemplary configuration block diagram illustrating an electronic device 100 according to an embodiment of the present disclosure. The electronic device 100 may be used to implement an internet access quality monitoring and analysis system according to the present application.
As shown in fig. 1, the electronic device 100 includes a user interface 20, a network interface 21, a power supply 22, an external network interface 23, a memory 24, and a processor 26. The user interface 20 may include, but is not limited to, buttons, a keyboard, a keypad, an LCD, a CRT, TFTs, LEDs, HD, or other similar display devices, including display devices having touch screen capabilities to enable interaction between a user and the gateway device. In some embodiments, the user interface 20 may be used to present a Graphical User Interface (GUI) to receive user input.
The network interface 21 may include various network cards and circuitry implemented in software and/or hardware to enable communication with user devices using wired or wireless protocols. The wired communication protocol is, for example, any one or more of an Ethernet protocol, a MoCA specification protocol, a USB protocol, or other wired communication protocol. The wireless protocol is, for example, any IEEE 802.11 Wi-Fi protocol, Bluetooth Low Energy (BLE), or other short range protocol operating according to a wireless technology standard, for exchanging data over short ranges using any licensed or unlicensed frequency band, such as the Citizens Broadband Radio Service (CBRS) band, the 2.4GHz band, the 5GHz band, the 6GHz band, or the 60GHz band, the RF4CE protocol, the ZigBee protocol, the Z-Wave protocol, or the IEEE 802.15.4 protocol. Where the network interface 21 uses a wireless protocol, in some embodiments, the network interface 21 may also include one or more antennas (not shown) or circuit nodes for coupling to one or more antennas. The electronic device 100 may provide an internal network to the user device through the network interface 21.
The power supply 22 provides power to the internal components of the electronic device 100 through the internal bus 27. The power source 22 may be a self-contained power source, such as a battery pack, whose interface is powered by a charger connected to an outlet (e.g., directly or through other equipment). The power source 22 may also include a rechargeable battery, such as a NiCd, NiMH, Li-ion or Li-pol battery, which may be removable for replacement. The external network interface 23 may include various network cards and circuitry implemented in software and/or hardware to enable communication between the electronic device 100 and a provider of an external network, such as an internet service provider or a Multiple System Operator (MSO).
Memory 24 comprises a single memory or one or more memories or storage locations including, but not limited to, Random Access Memory (RAM), Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), Read Only Memory (ROM), EPROM, EEPROM, flash memory, logic blocks of an FPGA, a hard disk, or any other layers of a memory hierarchy. The memory 24 may be used to store any type of instructions, software, or algorithms, including software 25 for controlling the general functions and operations of the electronic device 100.
The processor 26 controls the general operation of the electronic device 100 and performs management functions related to other devices in the network, such as user equipment. The processor 26 may include, but is not limited to, a CPU, hardware microprocessor, hardware processor, multi-core processor, single-core processor, micro-controller, Application Specific Integrated Circuit (ASIC), DSP, or other similar processing device capable of executing any type of instructions, algorithms, or software for controlling the operation and function of the electronic device 100 according to embodiments described in this disclosure. The processor 26 may be various implementations of digital circuitry, analog circuitry, or mixed signal (a combination of analog and digital) circuitry that perform functions in a computing system. The processor 26 may include, for example, a system such as an Integrated Circuit (IC), a portion or circuitry of an individual processor core, an entire processor core, an individual processor, a programmable hardware device such as a Field Programmable Gate Array (FPGA), and/or multiple processors.
The internal bus 27 may be used to establish communications between components (e.g., 20-22, 24, and 26) of the electronic device 100.
Although electronic device 100 is described using specific components, in alternative embodiments, different components may be present in electronic device 100. For example, the electronic device 100 may include one or more additional controllers, memories, network interfaces, external network interfaces, and/or user interfaces. Additionally, one or more of the components may not be present in the electronic device 100. Further, in some embodiments, electronic device 100 may include one or more components not shown in fig. 1. Additionally, although separate components are shown in fig. 1, in some embodiments some or all of a given component may be integrated into one or more of the other components in electronic device 100. Further, any combination of analog and/or digital circuitry may be used to implement the circuits and components in electronic device 100.
FIG. 2 illustrates an exemplary Internet monitoring architecture diagram according to embodiments of the application. As shown in fig. 2, a large number of distributed probes are deployed in the plane of the data center network, at the exit of the machine room, at the user end, at each network node in the metropolitan area network, and so on, to initiate monitoring of the data center network and implement link-level coverage. Factors to be considered in order to achieve this include the size, topology, routing setup, etc. of the probed network.
As shown in fig. 2, the internet monitoring architecture diagram includes an internet access quality monitoring analysis system according to the present application, which may include a data receiving module, a path completion module, an ETL module, a baseline module, an evidence chain module, and an application module. Wherein the data receiving module may be implemented by a network interface 21 as shown in fig. 1; the path completion module, ETL module, baseline module, evidence chain module, and application module may all be implemented by a processor 26 as shown in fig. 1.
Periodic ping tests are run from the distributed nodes against the probed target addresses (the ping test period is at the minute level; according to actual measurements the impact on network load is small, and the CPU utilization of the peer server is below 1%). The system screens and counts the test results over three consecutive periods and records the n flows that could not be pinged through in those three periods.
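The screening step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the `results` layout and the reading of "three consecutive periods" as "the three most recent periods all failed" are assumptions.

```python
def flows_down_for_consecutive_periods(results, periods=3):
    """Return the flows whose last `periods` ping results all failed.

    `results` (hypothetical layout) maps a (source, destination) flow to a
    list of booleans ordered oldest to newest, where True means the ping
    succeeded in that period.
    """
    failed = []
    for flow, history in results.items():
        recent = history[-periods:]
        # Require a full window so that flows with too little history are skipped.
        if len(recent) == periods and not any(recent):
            failed.append(flow)
    return sorted(failed)


results = {
    ("a1", "b1"): [True, False, False, False],  # down for the last three periods
    ("a1", "b2"): [False, False, True],         # recovered in the latest period
    ("a2", "b1"): [False, False],               # not enough history yet
}
print(flows_down_for_consecutive_periods(results))
```

The returned list plays the role of the "n flows" that are later matched against the recorded paths.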
Meanwhile, the addresses of the intermediate nodes traversed from any source address {a1, a2, ..., an} to any destination address {b1, b2, ..., bm} are periodically recorded and updated by means such as tracert, with a path update period on the order of hours. The recorded paths take the form shown in Table 1.
Table 1: Addresses of the intermediate nodes traversed from source addresses {a1, a2, ..., an} to destination addresses {b1, b2, ..., bm}
Path        Intermediate nodes
a1->b1      c11  d11
a1->b2      c12  d12
an->bm      anm  bnm
Using the records in Table 1, the paths corresponding to the aforementioned n flows are found. The set of node IP addresses is converted into a set of devices using the configuration management database (CMDB), which records the IP, board, port and other data of each monitored device, and the probability that a given device appears on the paths from source address to destination address is counted.
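The common-path counting described above can be illustrated with a short sketch. The data layouts (`paths`, `ip_to_device`) and the interpretation of "probability" as the fraction of failed paths that contain a device are assumptions made for the example, not details given in the text.

```python
from collections import Counter


def common_path_candidates(failed_flows, paths, ip_to_device, top_n=3):
    """Rank devices by how often they appear on the paths of failed flows.

    `paths` maps a flow to the list of intermediate node IPs recorded by
    tracert; `ip_to_device` stands in for the CMDB lookup from a node IP to
    the device that owns it. Both layouts are hypothetical.
    """
    if not failed_flows:
        return []
    counts = Counter()
    for flow in failed_flows:
        # Count each device at most once per path.
        devices = {ip_to_device[ip] for ip in paths.get(flow, []) if ip in ip_to_device}
        counts.update(devices)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(device, hits / len(failed_flows)) for device, hits in ranked[:top_n]]


paths = {
    ("a1", "b1"): ["10.0.0.1", "10.0.1.1"],
    ("a1", "b2"): ["10.0.0.1", "10.0.2.1"],
    ("a2", "b1"): ["10.0.3.1", "10.0.1.1"],
}
ip_to_device = {"10.0.0.1": "R1", "10.0.1.1": "R2", "10.0.2.1": "R3", "10.0.3.1": "R4"}
print(common_path_candidates(list(paths), paths, ip_to_device))
```

Ties are broken by device name here only to keep the output deterministic.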
Through the data receiving module, the system receives the data collected by the distributed probes, including the connectivity (on/off) and delay data of the path returned by the ping test from the probe to the monitored node, and the addresses of the nodes traversed along the path returned by the tracert from the probe to the monitored node.
The path completion module completes the missing node addresses in the path returned by the tracert through a path completion algorithm.
The ETL module loads data, including the connectivity and delay data of the path and the addresses of the nodes traversed along the path, into a data model (e.g., a specific data structure), and applies arithmetic averaging or exponential smoothing to the loaded data to obtain baseline data of the path. Where the network is hierarchical (e.g., metropolitan area network, same domain, wide area network, etc.), the baseline data is also computed per level accordingly.
The evidence chain module shortlists the top N suspected fault nodes through the network poor-quality common path algorithm, where N ≥ 3. Then the index weights Θ = (θ1, θ2, θ3, …, θn) of each suspected fault node are determined by a machine learning algorithm, each feature value of each suspected fault node is calculated to form a feature vector X = (x1, x2, x3, …, xn), the feature values are processed by the Sigmoid function y = Sigmoid(θ1·x1 + θ2·x2 + θ3·x3 + … + θn·xn), and, for example, the three nodes with the highest y values (TOP3) are taken as the fault nodes. Here, TOP3 is merely an example.
The application module issues monitoring alarms and fault root cause analysis (RCA) reports according to the locating result.
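The two baseline options named above, arithmetic mean and exponential smoothing, can be sketched as follows. The function names and the smoothing factor `alpha` are illustrative assumptions; the text does not specify a smoothing factor or a window length.

```python
def arithmetic_mean_baseline(delays):
    """Baseline as the plain average of the delay samples of one path."""
    return sum(delays) / len(delays)


def exponential_smoothing_baseline(delays, alpha=0.3):
    """Baseline by simple exponential smoothing, oldest sample first.

    `alpha` (assumed value) weights the newest sample; the remainder is
    carried over from the previous baseline.
    """
    baseline = delays[0]
    for delay in delays[1:]:
        baseline = alpha * delay + (1 - alpha) * baseline
    return baseline


delays_ms = [10.0, 12.0, 11.0, 30.0]  # hypothetical delay samples for one path
print(arithmetic_mean_baseline(delays_ms))
print(exponential_smoothing_baseline(delays_ms, alpha=0.5))
```

Exponential smoothing reacts to recent samples more strongly than the arithmetic mean, which is one reason a deployment might prefer it for paths whose normal delay drifts over time.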
During network path collection, some node addresses cannot be obtained because of timeouts or network configuration, and those nodes are shown in the unknown state "Tmax". The unknown state affects the subsequent locating of common poor-quality points, so an automatic path completion algorithm is needed to backfill and complete the unknown ports as far as possible.
The path probing process performs a TRACEROUTE (i.e., tracert) from the source end to the destination end. Path completion can be performed for ports whose intermediate hop is unknown (as shown in FIG. 3). Taking the path completion schematic diagram of FIG. 3 as an example, when port X and port Z are known, port Y can be derived from port Z, and the result is verified through port X to ensure that port X and port Y belong to network element B.
The path completion algorithm according to the present application includes adjacent-device routing information completion based on a 30-bit mask and spaced-device routing information completion based on device information.
The main implementation process of adjacent-device routing information completion based on a 30-bit mask is as follows: traverse all unknown devices and take the next hop of each; if the next hop is also an unknown device, completion is not possible; if the next hop is the last hop in the trace path, completion is not needed (the unknown equipment is a router on the flight communication side); only when the next hop is a device with a known port IP is the next-hop port IP converted into binary, bits 31 and 32 of the binary representation swapped, and the result converted back into dotted-decimal notation, which gives the port IP of the unknown device; that port IP is then queried in the configuration management database to obtain the corresponding device IP.
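The bit swap works because a point-to-point link numbered from a /30 subnet has exactly two usable host addresses, whose last two bits are 01 and 10; swapping those bits turns one end of the link into the other. A minimal sketch follows, in which the function names, the use of `None` for an unknown hop and the `port_to_device` dictionary standing in for the configuration management database are all assumptions.

```python
import ipaddress


def peer_ip_in_slash30(port_ip):
    """Swap bits 31 and 32 (the two lowest bits) of an IPv4 address."""
    addr = int(ipaddress.IPv4Address(port_ip))
    low2 = addr & 0b11
    swapped = ((low2 & 0b01) << 1) | (low2 >> 1)
    return str(ipaddress.IPv4Address((addr & 0xFFFFFFFC) | swapped))


def complete_adjacent(path, port_to_device):
    """Complete unknown hops (None) from the port IP of the following hop.

    Returns {hop index: (derived port IP, device)} for hops that could be
    completed; `port_to_device` is a hypothetical stand-in for the CMDB.
    """
    completed = {}
    last = len(path) - 1
    for i, hop in enumerate(path):
        if hop is not None or i == last:
            continue
        next_hop = path[i + 1]
        # Unknown next hop: cannot complete. Next hop is the last hop: not needed.
        if next_hop is None or i + 1 == last:
            continue
        port_ip = peer_ip_in_slash30(next_hop)
        device = port_to_device.get(port_ip)
        if device is not None:
            completed[i] = (port_ip, device)
    return completed


path = ["10.0.0.1", None, "10.0.1.2", "10.0.2.1"]
print(complete_adjacent(path, {"10.0.1.1": "B"}))
```

In this example the unknown second hop is resolved to port 10.0.1.1, the /30 peer of the third hop's port 10.0.1.2.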
The main implementation process of spaced-device routing information completion based on device information is as follows: for each device (C) completed by the 30-bit-mask-based adjacent-device routing information completion, traverse its previous hop (B) and the hop before that (A); when the previous hop B is an unknown device and the hop before it, A, is a known device (no completion is done in other cases): query all port IPs of device C in the configuration management database, take the adjacent port IP corresponding to each IP (using the adjacent-device lookup based on a 30-bit mask), and look up the device IP of each adjacent port IP to obtain the set link-C of device names; query all port IPs of device A in the configuration management database, take the adjacent port IP corresponding to each IP (using the same lookup), and look up the device IP of each adjacent port IP to obtain the set link-A of device names; and take the intersection of the two sets link-A and link-C, which is device B.
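The intersection step can be sketched as follows. The two dictionaries (`device_ports`, `port_to_device`) are hypothetical stand-ins for configuration management database queries, and the helper repeats the 30-bit-mask peer calculation so that the example is self-contained.

```python
import ipaddress


def peer_ip(port_ip):
    """Other end of a /30 point-to-point link: swap the two lowest bits."""
    addr = int(ipaddress.IPv4Address(port_ip))
    low2 = addr & 0b11
    swapped = ((low2 & 0b01) << 1) | (low2 >> 1)
    return str(ipaddress.IPv4Address((addr & 0xFFFFFFFC) | swapped))


def neighbour_devices(device, device_ports, port_to_device):
    """Devices that own the /30 peer of any port of `device`."""
    found = set()
    for port_ip in device_ports.get(device, ()):
        owner = port_to_device.get(peer_ip(port_ip))
        if owner is not None and owner != device:
            found.add(owner)
    return found


def complete_spaced(device_a, device_c, device_ports, port_to_device):
    """Candidates for the unknown hop B between known A and completed C."""
    link_a = neighbour_devices(device_a, device_ports, port_to_device)
    link_c = neighbour_devices(device_c, device_ports, port_to_device)
    return link_a & link_c


# Hypothetical topology: A - B - C, plus unrelated neighbours D and E.
device_ports = {
    "A": ["10.0.0.1", "10.0.9.1"],
    "B": ["10.0.0.2", "10.0.1.1"],
    "C": ["10.0.1.2", "10.0.8.1"],
    "D": ["10.0.9.2"],
    "E": ["10.0.8.2"],
}
port_to_device = {ip: dev for dev, ips in device_ports.items() for ip in ips}
print(complete_spaced("A", "C", device_ports, port_to_device))
```

If the intersection contains more than one device, the missing hop is ambiguous; how such a case is resolved is not stated in the text.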
As described above, when multiple link failures occur (for example, when the PING test fails or the PING test result exceeds the baseline threshold), weighted statistics are computed over the frequency with which devices appear in the link paths collected in the previous period, and the device that appears most frequently is selected as the most probable fault point. However, fault location based on the common path point algorithm usually yields several suspected degraded devices, and their degradation counts often do not clearly separate them. To solve this problem, an intelligent fault point refining algorithm is introduced. For devices identified as suspected fault points, it makes a comprehensive judgment over multiple index dimensions, such as PING delay, packet loss rate, degradation count, degradation ratio and other quality-related data indexes from the network management system. Machine learning is used to train the weights of the indexes in the different dimensions, forming an intelligent fault point refining model based on multi-dimensional index evaluation. This refines the suspected fault points, eliminates non-fault points, and further improves fault location accuracy.
Specifically, in the prior art the weights are set without any basis. Here, a device health scoring model is established in which each index corresponds to one weight (the feature weight referred to below). Historical data are tracked and manually labeled to determine which indexes show problems under fault conditions (the fault signature), and the weights of the indexes are then determined by machine learning on these data, so that the probability of a fault can be predicted from changes in the indexes.
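The text does not name the learning algorithm. Since the score is a Sigmoid of a weighted sum, one natural choice is logistic regression, and the sketch below assumes it, trained by batch gradient descent on manually labeled history. The learning rate, epoch count and sample data are all illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_weights(samples, labels, lr=0.5, epochs=500):
    """Fit feature weights by batch gradient descent on the logistic loss.

    `samples` are normalized feature vectors of devices from historical data
    and `labels` are the manual marks (1 = faulty, 0 = healthy). No intercept
    is fitted, so that the model is y = Sigmoid(theta1*x1 + ... + thetan*xn).
    """
    n = len(samples[0])
    theta = [0.0] * n
    for _ in range(epochs):
        grad = [0.0] * n
        for x, y in zip(samples, labels):
            err = sigmoid(sum(t * xi for t, xi in zip(theta, x))) - y
            for j in range(n):
                grad[j] += err * x[j]
        theta = [t - lr * g / len(samples) for t, g in zip(theta, grad)]
    return theta


# Hypothetical labeled history: two faulty devices, two healthy ones.
samples = [[0.9, 0.8, 0.9], [0.8, 1.0, 0.7], [0.1, 0.2, 0.0], [0.0, 0.1, 0.1]]
labels = [1, 1, 0, 0]
print(train_weights(samples, labels))
```

Without an intercept, a device whose features are all zero scores exactly 0.5, so the y values are better read as a ranking than as calibrated probabilities.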
Intelligent fault point refining first traverses the common suspected fault points: it parses the list of suspected fault devices output by the preceding process and computes a poor-quality probability for each device. For each device it calculates every feature value xi to form the device's feature vector X = (x1, x2, x3, …, xn), where n is the number of features or indexes, and normalizes all feature values to the range 0 to 1. It then takes the learned feature weight vector Θ = (θ1, θ2, θ3, …, θn) and calculates y = Sigmoid(θ1·x1 + θ2·x2 + θ3·x3 + … + θn·xn). The y value reflects the device's poor-quality probability, and the Sigmoid function guarantees that it lies in (0, 1). Finally, the devices are sorted by y value from high to low; the higher a device ranks, the higher its poor-quality probability. The top N devices by y value are returned, where N is generally 3 but is not limited thereto.
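The scoring and ranking step can be sketched in a few lines. The weight values and device names below are made up for illustration, and the feature vectors are assumed to be already normalized to the range 0 to 1.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rank_suspects(features, weights, top_n=3):
    """Score each suspected device with y = Sigmoid(theta . x) and keep the top N.

    `features` maps a device name to its normalized feature vector; `weights`
    is the learned weight vector of the same length.
    """
    scores = {
        device: sigmoid(sum(t * x for t, x in zip(weights, xs)))
        for device, xs in features.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_n]


features = {
    "R1": [0.9, 1.0, 0.8],
    "R2": [0.1, 0.2, 0.1],
    "R3": [0.5, 0.5, 0.5],
    "R4": [0.0, 0.0, 0.0],
}
weights = [2.0, 1.0, 3.0]  # hypothetical learned weights
for device, y in rank_suspects(features, weights):
    print(device, round(y, 3))
```

Because the Sigmoid function is monotonic, the ranking is the same as ranking by the weighted sum itself; the Sigmoid only maps the score into (0, 1).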
In embodiments of the present application, the device may be characterized by the following three features (i.e., n = 3):
1) The average packet loss rate, in this period, of all links that contain the device
The formula describes: avg (packet loss rate ping this period)
2) The maximum packet loss rate, in this period, of all links that contain the device
The formula describes: max (packet loss rate ping this period)
3) The proportion of poor-quality links among the links that contain the device in this period
The formula describes: number of bad links containing the device / total number of links containing the device.
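The three features listed above can be computed as in the sketch below. The input layout, one (packet loss rate, poor-quality flag) pair per link that contains the device in the period, is an assumption for illustration.

```python
def device_features(links):
    """Return [mean loss, max loss, poor-link ratio] for one device.

    `links` is a non-empty list of (packet_loss_rate, is_poor) pairs, one per
    link that contains the device in the current period (hypothetical layout).
    """
    losses = [loss for loss, _ in links]
    mean_loss = sum(losses) / len(losses)
    max_loss = max(losses)
    poor_ratio = sum(1 for _, is_poor in links if is_poor) / len(links)
    return [mean_loss, max_loss, poor_ratio]


links = [(0.0, False), (0.5, True), (1.0, True), (0.5, False)]
print(device_features(links))
```

All three values already lie between 0 and 1 when the packet loss rate is expressed as a fraction, so no further normalization is needed for these particular features.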
Fig. 4 shows a flowchart 400 of an exemplary internet access quality monitoring analysis method according to an embodiment of the application. The internet access quality monitoring analysis method shown in fig. 4 may be performed by the electronic device shown in fig. 1 and the internet access quality monitoring analysis system shown in fig. 2.
As shown in fig. 4, at step S410, detection of the monitored nodes is initiated by a large number of distributed probes deployed on the internet. The detection includes a ping test from each probe to the monitored node and a tracert from each probe to the monitored node. The ping test returns the on-off and delay data of the path, and the n flows that fail to ping through for three consecutive periods are recorded; the detection period of the ping test is on the order of minutes. The tracert returns the addresses of the nodes traversed along the path; its detection period is on the order of hours.
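The rule of recording flows that fail to ping through for three consecutive periods can be sketched as a per-flow failure streak counter. The class name, the `(probe, target)` flow key, and the method names are assumptions for illustration; how ping results are actually collected is outside the scope of this sketch.

```python
from collections import defaultdict


class PingFailureTracker:
    """Track consecutive ping failures per flow (hypothetical helper)."""

    def __init__(self, threshold=3):
        self.threshold = threshold       # consecutive failed periods required
        self._streak = defaultdict(int)  # flow -> current failure streak

    def record(self, flow, reachable):
        """Record the ping result of one flow for one detection period."""
        if reachable:
            self._streak[flow] = 0       # any success resets the streak
        else:
            self._streak[flow] += 1

    def failed_flows(self):
        """Return the flows whose failure streak has reached the threshold."""
        return sorted(f for f, s in self._streak.items() if s >= self.threshold)
```

A flow whose results are fail, success, fail, fail is not reported, because the streak restarts after the successful period.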
At step S420, missing node addresses in the path returned by the tracert are filled in by a path completion algorithm. The path completion algorithm includes the adjacent-device path information completion algorithm based on a 30-bit mask and the alternate-device path information completion algorithm based on device information, both described above.
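A minimal sketch of both completion steps follows, under stated assumptions. The path is assumed to be a list of port IPs with `None` for hops that did not answer, and the configuration management database is modeled as two plain dictionaries (`port_to_device`, `device_ports`); these names and layouts are hypothetical. In a /30 subnet the two usable host addresses end in binary 01 and 10, so swapping the last two bits of one port IP gives the port IP at the other end of the link. The `neighbor_devices` intersection reflects one reading of the alternate-device step, not a confirmed implementation.

```python
import ipaddress


def peer_in_slash30(port_ip):
    """Return the other usable host address in the /30 subnet of port_ip."""
    addr = int(ipaddress.IPv4Address(port_ip))
    if addr & 0b11 not in (0b01, 0b10):
        raise ValueError(f"{port_ip} is not a usable host address in a /30 subnet")
    # For host bits 01 and 10, swapping the two bits equals flipping both.
    return str(ipaddress.IPv4Address(addr ^ 0b11))


def complete_adjacent(path, port_to_device):
    """Map the index of each completable unknown hop (None) to a device ID."""
    completed = {}
    for i, hop in enumerate(path):
        if hop is not None or i + 1 >= len(path):
            continue  # known hop, or no next hop to infer from
        nxt = path[i + 1]
        if nxt is None:
            continue  # next hop is also unknown: cannot complete
        if i + 1 == len(path) - 1:
            continue  # next hop is the last hop: treated as not needing completion
        try:
            peer = peer_in_slash30(nxt)
        except ValueError:
            continue
        device = port_to_device.get(peer)  # CMDB lookup: port IP -> device
        if device is not None:
            completed[i] = device
    return completed


def neighbor_devices(device, device_ports, port_to_device):
    """Devices reachable from `device` over /30 links, per the CMDB dictionaries."""
    neighbors = set()
    for port_ip in device_ports.get(device, []):
        try:
            owner = port_to_device.get(peer_in_slash30(port_ip))
        except ValueError:
            continue
        if owner is not None and owner != device:
            neighbors.add(owner)
    return neighbors


def devices_between(dev_a, dev_b, device_ports, port_to_device):
    """Candidate missing devices: neighbors shared by dev_a and dev_b."""
    return (neighbor_devices(dev_a, device_ports, port_to_device)
            & neighbor_devices(dev_b, device_ports, port_to_device))
```

For instance, `peer_in_slash30("10.0.1.2")` gives `"10.0.1.1"`, which is then looked up in the database to identify the silent hop. If the intersection in `devices_between` contains more than one device, the sketch cannot choose among them and returns all candidates.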
At step S430, data including the on-off and delay data of the path and the addresses of the nodes traversed along the path are loaded into the data model through ETL, and arithmetic averaging or exponential smoothing is applied to the loaded data to obtain baseline data for the path. Optionally, where the network is hierarchical, obtaining the baseline data comprises computing baseline data for the path layer by layer.
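The two baseline options named above can be sketched as follows. The function names and the smoothing factor `alpha` are assumptions for illustration; the text does not state which smoothing factor or window is used.

```python
def arithmetic_baseline(samples):
    """Baseline as the plain arithmetic mean of the samples (e.g. delay in ms)."""
    if not samples:
        raise ValueError("at least one sample is required")
    return sum(samples) / len(samples)


def exponential_baseline(samples, alpha=0.3):
    """Baseline by simple exponential smoothing, oldest sample first.

    alpha (hypothetical default) weights the newest sample; values closer to 1
    make the baseline follow recent measurements more quickly.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    baseline = samples[0]
    for s in samples[1:]:
        baseline = alpha * s + (1 - alpha) * baseline
    return baseline
```

With delay samples 10, 20, 30 the arithmetic baseline is 20, while exponential smoothing with `alpha=0.5` gives 22.5, since later samples weigh more.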
At step S440, the TOP N fault nodes are extracted by the network poor-quality common path algorithm, where N ≥ 3. According to an embodiment of the application, the network poor-quality common path algorithm is as follows: periodically record and update the IP addresses of the intermediate nodes traversed from any source address in {a1, a2, ..., an} to any destination address in {b1, b2, ..., bm}; from these records, find the paths corresponding to the n flows that cannot be pinged; convert the set of node IP addresses into a set of devices according to the configuration management database, which records the IP, board card, and port data of each monitored device; and count the probability that a given device appears in the paths from source address to destination address.
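The counting step of the common path algorithm can be sketched as below. The inputs are assumptions for the example: `failed_paths` holds the recorded node IP lists of the flows that could not be pinged, and `ip_to_device` stands in for the configuration management database lookup. The probability is taken here as the fraction of failed paths that contain the device, which is one plausible reading of the text.

```python
from collections import Counter


def common_path_suspects(failed_paths, ip_to_device, top_n=3):
    """Rank devices by how often they appear on the failed paths.

    failed_paths: list of paths, each a list of node IP addresses.
    ip_to_device: dict from node IP to device ID (hypothetical CMDB view).
    Returns up to top_n (device, probability) pairs, most frequent first.
    """
    counts = Counter()
    for path in failed_paths:
        # Convert node IPs to devices; a set counts each device once per path.
        devices = {ip_to_device[ip] for ip in path if ip in ip_to_device}
        counts.update(devices)
    total = len(failed_paths)
    # Sort by descending count, then by device ID for a stable order on ties.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(device, n / total) for device, n in ranked[:top_n]]
```

A device that lies on every failed path gets probability 1.0 and is the strongest common suspect; IP addresses missing from the database are ignored in this sketch.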
At step S450, the relevant indicator weights Θ = (θ1, θ2, θ3, ..., θn) of each fault node are determined by a machine learning algorithm, each feature value of each fault node is calculated to form a feature vector X = (x1, x2, x3, ..., xn), the feature values are processed by the Sigmoid function y = Sigmoid(θ1·x1 + θ2·x2 + θ3·x3 + ... + θn·xn), and the TOP 3 nodes by y value are taken as the fault nodes. It should be understood that taking the TOP 3 by y value is only an example; in practice, the top N by y value may be taken for any N.
At step S460, monitoring alarms and fault Root Cause Analysis (RCA) reports are issued based on the determined TOP 3 fault nodes.
The method builds on the mature pingmesh technique and exploits the clear topology of operator networks such as IDC networks, metropolitan area networks, and backbone networks. It introduces analysis algorithms such as the common poor-quality point algorithm, optimized from algorithms used by internet vendors, and offers the following advantages: segmented poor-quality localization, mass probe deployment, path data completion, baselining, and evidence-chain-guided diagnosis of quality problems.
1) Segmented poor-quality localization: the internet access quality of the data center is divided into several segments, and probe dial testing and quality and path data collection are carried out for each segment.
2) Mass probe deployment: network probe monitoring is built at path-coverage level, and standardized data are collected and processed.
3) Path completion: a path completion algorithm fills in missing path information in the network path data.
4) Baseline: a quality baseline is established using the common poor-quality point model and a traversal algorithm.
5) Evidence chain: segmented quality monitoring and poor-quality localization output evidence of network quality fluctuation, and evidence from multiple paths is combined into an evidence chain for the data center's internet access quality.
This function has been in production use in the IDC network management application of China Telecom's Shanghai company since April 2019. Monitoring covers more than 90% of the problems that customers report as faults in the IDC network, with monitoring accuracy above 95%, achieving the goal of finding problems before users do.
The present disclosure may be implemented as any combination of apparatus, systems, integrated circuits, and computer programs on non-transitory computer readable media. One or more controllers may be implemented as an Integrated Circuit (IC), an Application Specific Integrated Circuit (ASIC), or a large scale integrated circuit (LSI), a system LSI, a super LSI, or an ultra LSI package that performs some or all of the functions described in this disclosure.
The present disclosure includes the use of software, applications, computer programs or algorithms. Software, applications, computer programs, or algorithms may be stored on a non-transitory computer readable medium to cause a computer, such as one or more processors, to perform the steps described above and depicted in the figures. For example, the one or more memories store software or algorithms in executable instructions and the one or more processors may associate a set of instructions to execute the software or algorithms to provide network configuration information management functionality for a network access device according to embodiments described in the present disclosure.
Software and computer programs (which may also be referred to as programs, software applications, components, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural, object-oriented, functional, logical, or assembly language, or machine language. The term "computer-readable medium" refers to any computer program product, apparatus or device, such as magnetic disks, optical disks, solid state storage devices, memories, and Programmable Logic Devices (PLDs), used to provide machine instructions or data to a programmable data processor, including a computer-readable medium that receives machine instructions as a computer-readable signal.
By way of example, computer-readable media can comprise Dynamic Random Access Memory (DRAM), Random Access Memory (RAM), Read Only Memory (ROM), electrically erasable read only memory (EEPROM), compact disk read only memory (CD-ROM) or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to carry or store desired computer-readable program code in the form of instructions or data structures and which can be accessed by a general-purpose or special-purpose computer or a general-purpose or special-purpose processor. Disk or disc, as used herein, includes Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk and blu-ray disc where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above are also included within the scope of computer-readable media.
Additionally, the above description provides examples, and does not limit the scope, applicability, or configuration set forth in the claims. Changes may be made in the function and arrangement of elements discussed without departing from the spirit and scope of the disclosure. Various embodiments may omit, substitute, or add various procedures or components as appropriate. For example, features described with respect to certain embodiments may be combined in other embodiments.

Claims (11)

1. An internet access quality detection analysis method, comprising:
initiating detection of a monitored node by a large number of distributed probes deployed on the internet, wherein the detection comprises a ping test from a probe to the monitored node and a tracert from the probe to the monitored node, the ping test returns on-off and delay data of the path, the n flows that fail to ping through for three consecutive periods are recorded, and the tracert returns the addresses of the nodes traversed along the path;
completing missing node addresses in the path returned by the tracert through a path completion algorithm;
loading data comprising the on-off and delay data of the path and the addresses of the nodes traversed along the path into a data model through ETL, and applying arithmetic averaging or exponential smoothing to the loaded data to obtain baseline data for the path;
extracting the TOP N fault nodes through a network poor-quality common path algorithm, wherein N is greater than or equal to 3; and
determining relevant indicator weights Θ = (θ1, θ2, θ3, ..., θn) of each fault node through a machine learning algorithm, calculating each feature value of each fault node to form a feature vector X = (x1, x2, x3, ..., xn), processing the feature values through the Sigmoid function y = Sigmoid(θ1·x1 + θ2·x2 + θ3·x3 + ... + θn·xn), and taking the TOP 3 nodes by y value as the fault nodes.
2. The internet access quality detection analysis method of claim 1, further comprising issuing monitoring alarms and fault Root Cause Analysis (RCA) reports based on the determined TOP 3 fault nodes.
3. The internet access quality detection analysis method of claim 1, wherein the network poor-quality common path algorithm comprises:
periodically recording and updating the IP addresses of the intermediate nodes traversed from any source address in {a1, a2, ..., an} to any destination address in {b1, b2, ..., bm};
finding, from the records, the paths corresponding to the n flows that cannot be pinged;
converting the set of node IP addresses into a set of devices according to a configuration management database, wherein the configuration management database records the IP, board card, and port data of each monitored device; and
counting the probability that a given device appears in the paths from source address to destination address.
4. The internet access quality detection analysis method of claim 3, wherein the path completion algorithm comprises an adjacent-device path information completion algorithm based on a 30-bit mask, comprising:
traversing all unknown devices and taking the next hop of each;
if the next hop is also an unknown device, completion is not possible; if the next hop is the last hop in the path, completion is not needed; if the next hop is a device with a known port IP, converting the port IP of the next hop into its binary representation, swapping bits 31 and 32 of the binary representation, and converting the result back to dotted-decimal notation as the port IP of the unknown device; and
querying the configuration management database with that port IP to obtain the corresponding device IP.
5. The internet access quality detection analysis method of claim 4, wherein the path completion algorithm further comprises an alternate-device path information completion algorithm based on device information, comprising:
traversing the previous hop and the hop before the previous hop of each device completed by the adjacent-device path information completion algorithm based on a 30-bit mask;
if the previous hop is an unknown device and the hop before it is a known device, then:
querying all port IPs of the completed device in the configuration management database, taking the adjacent port IP corresponding to each port IP, and querying the device IP of each adjacent port IP, thereby obtaining the set of devices adjacent to the completed device;
querying all port IPs of the known device two hops back in the configuration management database, taking the adjacent port IP corresponding to each port IP, and querying the device IP of each adjacent port IP, thereby obtaining the set of devices adjacent to that known device; and
taking the intersection of the two sets as the missing node device in the path.
6. The internet access quality detection analysis method of claim 1, wherein obtaining baseline data for the path further comprises: where the network is hierarchical, obtaining baseline data for the path computed layer by layer.
7. The internet access quality detection analysis method of claim 1, wherein the ping detection period is of the order of minutes and the tracert detection period is of the order of hours.
8. An electronic device, comprising:
one or more processors, and
a memory coupled with the one or more processors, the memory storing computer-readable program instructions that, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-7.
9. An internet access quality detection analysis system comprising means for performing the steps of the method of any one of claims 1-7.
10. A non-transitory computer readable medium having instructions stored thereon for execution by a processor to perform the steps of the method of any of claims 1-7.
11. A computer program product comprising a computer program which, when executed by a processor, performs the steps of the method according to any one of claims 1-7.
CN202011555719.3A 2020-12-24 2020-12-24 Internet access quality monitoring and analyzing method, system, medium and program Active CN114676157B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011555719.3A CN114676157B (en) 2020-12-24 2020-12-24 Internet access quality monitoring and analyzing method, system, medium and program

Publications (2)

Publication Number Publication Date
CN114676157A true CN114676157A (en) 2022-06-28
CN114676157B CN114676157B (en) 2024-08-13

Family

ID=82071159

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011555719.3A Active CN114676157B (en) 2020-12-24 2020-12-24 Internet access quality monitoring and analyzing method, system, medium and program

Country Status (1)

Country Link
CN (1) CN114676157B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030204619A1 (en) * 2002-04-26 2003-10-30 Bays Robert James Methods, apparatuses and systems facilitating determination of network path metrics
US20070064715A1 (en) * 2002-07-25 2007-03-22 Avaya, Inc. Method and apparatus for the assessment and optimization of network traffic
CN101635656A (en) * 2008-07-26 2010-01-27 华为技术有限公司 Fault detection method in layered ordered address packet network, system and equipment
US20100061272A1 (en) * 2008-09-04 2010-03-11 Trilliant Networks, Inc. System and method for implementing mesh network communications using a mesh network protocol
CN111224842A (en) * 2019-12-31 2020-06-02 大唐软件技术股份有限公司 Internet service quality monitoring method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XIAOJIANG DU等: "Improving Onboard Internet Services for High-Speed Vehicles by Multipath Transmission in Heterogeneous Wireless Networks", 《IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY》, vol. 65, no. 12, 16 June 2016 (2016-06-16), pages 9493, XP011636813, DOI: 10.1109/TVT.2016.2581020 *
黄山: "基于动态二进制程序切片技术的软件攻击诊断", 《中国优秀硕士学位论文全文数据库 信息科技辑》, 15 July 2012 (2012-07-15), pages 138 - 366 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115766396A (en) * 2022-10-28 2023-03-07 苏州浪潮智能科技有限公司 Control cluster fault detection method and device, electronic equipment and readable storage medium
CN117880055A (en) * 2024-03-12 2024-04-12 灵长智能科技(杭州)有限公司 Network fault diagnosis method, device, equipment and medium based on transmission layer index
CN117880055B (en) * 2024-03-12 2024-05-31 灵长智能科技(杭州)有限公司 Network fault diagnosis method, device, equipment and medium based on transmission layer index

Also Published As

Publication number Publication date
CN114676157B (en) 2024-08-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant