CN114676157A - Internet access quality monitoring analysis method, system, medium, and program
- Publication number
- CN114676157A; CN202011555719.3A
- Authority
- CN
- China
- Prior art keywords
- path
- equipment
- node
- theta
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/254—Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Medical Informatics (AREA)
- Mathematical Physics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The present application relates to an Internet access quality detection analysis method, system, medium, and program. The method comprises: initiating probing of monitored nodes by deploying a number of distributed probes on the Internet, the probing including a ping from the probe to the monitored node and a tracert from the probe to the monitored node, the tracert returning the addresses of the nodes traversed along the path; completing missing node addresses in the path returned by the tracert through a path completion algorithm; loading data comprising the on-off (reachability) and delay data of the path and the addresses of the nodes traversed along the path into a data model through ETL (extract, transform, load), and applying arithmetic averaging or exponential smoothing to the loaded data to obtain baseline data of the path; nominating the top-N candidate fault nodes through a network poor-quality common path algorithm, where N is greater than or equal to 3; and determining the relevant index weights $\Theta = (\theta_1, \dots, \theta_n)$ of each candidate fault node through a machine learning algorithm, forming a feature vector $X = (x_1, \dots, x_n)$, processing the feature values through $y = \mathrm{Sigmoid}(\theta_1 x_1 + \dots + \theta_n x_n)$, and taking the three nodes with the highest y values (TOP3) as the fault nodes.
Description
Technical Field
The application belongs to the field of data communication, and particularly relates to an Internet access quality monitoring and analysis method, system, medium, and program that combine big data analysis with PING + Trace probing through algorithms.
Background
In recent years, with the rapid development of the Internet, users are no longer satisfied with mere network availability and have further requirements for network quality and user experience. Microsoft's Pingmesh (data center network latency measurement and analysis) technology is currently operated in the data centers of large customers such as Microsoft, Tencent, and Alibaba. The technology can collect quality data between any two servers of a large-scale data center and is used by Internet companies to monitor their networks.
Pingmesh largely meets Internet enterprises' need for network monitoring, but the following shortcomings remain: the number of probing points is large, since a dial-test point must be deployed on every server; the volume of delay data is large, with data on the order of millions of megabytes analyzed and processed every day; and for the operator's black-box network, poor quality can be perceived but its location cannot be pinpointed.
To solve these problems, operators provide end-to-end monitoring of Internet access quality for data centers, integrate multi-party data (self-deployed probes and accessed data), add unified ETL (extract, transform, load) processing, and, when poor quality is perceived, use the operator's network topology to intelligently locate the poor-quality points, thereby solving practical operation and maintenance problems.
In the prior art, massive data can be accessed uniformly and processed in a standardized manner. However, in the actual use of an operator's data center network management platform, because data from multiple parties is accessed, the scale of the data sources is unstable, which affects the accuracy of the intelligent analysis results to some extent.
Therefore, a method and a system are needed that improve the coverage and accuracy of monitoring for customer-reported faults in the data center network.
Disclosure of Invention
In view of the above technical problems, the present application proposes an internet access quality monitoring analysis method, system, medium, and program.
According to an aspect of the present application, there is provided an internet access quality detection analysis method including: initiating probing of monitored nodes by deploying a large number of distributed probes on the Internet, wherein the probing comprises ping probing from a probe to the monitored node and tracert probing from the probe to the monitored node, the ping probing returns the on-off (reachability) and delay data of a path, the n flows that cannot be pinged in three consecutive periods are recorded, and the tracert returns the addresses of the nodes traversed along the path; completing missing node addresses in the path returned by the tracert through a path completion algorithm; loading data comprising the on-off and delay data of the path and the addresses of the nodes traversed along the path into a data model through ETL, and applying arithmetic averaging or exponential smoothing to the loaded data to obtain baseline data of the path; nominating the top-N candidate fault nodes through a network poor-quality common path algorithm, where N is greater than or equal to 3; and determining the relevant index weights $\Theta = (\theta_1, \theta_2, \theta_3, \dots, \theta_n)$ of each candidate fault node through a machine learning algorithm, calculating each feature value of each candidate fault node to form a feature vector $X = (x_1, x_2, x_3, \dots, x_n)$, processing the feature values through the Sigmoid function $y = \mathrm{Sigmoid}(\theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3 + \dots + \theta_n x_n)$, and taking the three nodes with the highest y values (TOP3) as the fault nodes.
According to an example embodiment, the method further comprises issuing monitoring alarms and a fault Root Cause Analysis (RCA) report based on the determined TOP3 fault nodes.
According to an example embodiment, the network poor-quality common path algorithm comprises: periodically recording and updating the IP addresses of the intermediate nodes traversed from any source address in {a1, a2, ..., an} to any destination address in {b1, b2, ..., bm}; finding, from the records, the paths corresponding to the n flows that cannot be pinged; converting the set of node IP addresses into a set of devices according to the information in a configuration management database, wherein the configuration management database records the IP, board card, and port data of each monitored device; and counting the probability that a given device appears in the paths from the source addresses to the destination addresses.
According to an example embodiment, the path completion algorithm comprises an adjacent device path information completion algorithm based on a 30-bit mask, comprising: traversing all unknown devices and taking the next hop of each; if the next hop is also an unknown device, the device cannot be completed; if the next hop is the last hop in the path, no completion is needed; if the next hop is a device with a known port IP, converting the next-hop port IP into its binary representation, swapping bits 31 and 32 of the binary representation, and converting the result back to dotted-decimal notation to serve as the port IP of the unknown device; and querying the configuration management database with that port IP to obtain the corresponding device IP.
According to an example embodiment, the path completion algorithm further comprises an interval device path information completion algorithm based on device information, comprising: for each device completed by the adjacent device path information completion algorithm based on a 30-bit mask, examining its previous hop and the hop before that; if the previous hop is an unknown device and the hop before it is a known device, then: querying all port IPs of the completed device in the configuration management database, taking the adjacent port IP corresponding to each port IP, and looking up the device IP of each adjacent port IP, thereby obtaining the set of devices adjacent to the completed device; querying all port IPs of the known device two hops back in the configuration management database, taking the adjacent port IP corresponding to each port IP, and looking up the device IP of each adjacent port IP, thereby obtaining the set of devices adjacent to that known device; and taking the intersection of the two sets as the missing node device in the path.
According to an example embodiment, obtaining the baseline data of the path further comprises: where the network is hierarchical, computing the baseline data of the path separately for each level of the hierarchy.
According to an example embodiment, the probing period of the ping is on the order of minutes and the probing period of the tracert is on the order of hours.
According to another aspect of the present application, there is provided an electronic device including: one or more processors, and a memory coupled with the one or more processors, the memory storing computer-readable program instructions that, when executed by the one or more processors, cause the one or more processors to perform an internet access quality detection analysis method in accordance with the present application.
According to yet another aspect of the present application, there is provided a non-transitory computer readable medium having instructions stored thereon for execution by a processor to perform an internet access quality detection analysis method according to the present application.
According to yet another aspect of the present application, a computer program product is provided, comprising a computer program which, when executed by a processor, performs the internet access quality detection analysis method according to the present application.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them.
For a better understanding of the present disclosure, and to show how the same may be carried into effect, reference will now be made, by way of example, to the accompanying drawings, in which:
FIG. 1 shows a block diagram of an electronic device for implementing an internet access quality monitoring analysis system according to an embodiment of the present disclosure;
FIG. 2 illustrates an exemplary Internet monitoring architecture diagram, according to an embodiment of the disclosure;
FIG. 3 illustrates an exemplary path completion schematic diagram according to an embodiment of the present disclosure;
FIG. 4 shows a flow diagram of an exemplary internet access quality monitoring analysis method according to an embodiment of the present disclosure.
Note that like reference numerals refer to corresponding parts throughout the drawings.
Detailed Description
The following detailed description is made with reference to the accompanying drawings and is provided to assist in a comprehensive understanding of various exemplary embodiments of the disclosure. The following description includes various details to aid understanding, but these details are to be regarded as examples only and are not intended to limit the disclosure, which is defined by the appended claims and their equivalents. The words and phrases used in the following description are used only to provide a clear and consistent understanding of the disclosure. In addition, descriptions of well-known structures, functions, and configurations may be omitted for clarity and conciseness. Those of ordinary skill in the art will recognize that various changes and modifications of the examples described herein can be made without departing from the spirit and scope of the disclosure.
Fig. 1 is an exemplary configuration block diagram illustrating an electronic device 100 according to an embodiment of the present disclosure. The electronic device 100 may be used to implement an internet access quality monitoring and analysis system according to the present application.
As shown in fig. 1, the electronic device 100 includes a user interface 20, a network interface 21, a power supply 22, an external network interface 23, a memory 24, and a processor 26. The user interface 20 may include, but is not limited to, buttons, a keyboard, a keypad, an LCD, a CRT, TFTs, LEDs, HD, or other similar display devices, including display devices having touch screen capabilities to enable interaction between a user and the electronic device 100. In some embodiments, the user interface 20 may be used to present a Graphical User Interface (GUI) to receive user input.
The network interface 21 may include various network cards and circuitry implemented in software and/or hardware to enable communication with user devices using wired or wireless protocols. The wired communication protocol is, for example, any one or more of an Ethernet protocol, a MoCA specification protocol, a USB protocol, or other wired communication protocol. The wireless protocol is, for example, any IEEE 802.11 Wi-Fi protocol, Bluetooth Low Energy (BLE), or other short range protocol operating according to a wireless technology standard, for exchanging data over short ranges using any licensed or unlicensed frequency band, such as the Citizens Broadband Radio Service (CBRS) band, the 2.4GHz band, the 5GHz band, the 6GHz band, or the 60GHz band, the RF4CE protocol, the ZigBee protocol, the Z-Wave protocol, or the IEEE 802.15.4 protocol. Where the network interface 21 uses a wireless protocol, in some embodiments, the network interface 21 may also include one or more antennas (not shown) or circuit nodes for coupling to one or more antennas. The electronic device 100 may provide an internal network to the user device through the network interface 21.
The power supply 22 provides power to the internal components of the electronic device 100 through the internal bus 27. The power source 22 may be a self-contained power source, such as a battery pack, whose interface is powered by a charger connected to an outlet (e.g., directly or through other equipment). The power source 22 may also include a rechargeable battery, such as a NiCd, NiMH, Li-ion or Li-pol battery, which may be removable for replacement. The external network interface 23 may include various network cards and circuitry implemented in software and/or hardware to enable communication between the electronic device 100 and a provider of an external network, such as an internet service provider or a Multiple System Operator (MSO).
The processor 26 controls the general operation of the electronic device 100 and performs management functions related to other devices in the network, such as user equipment. The processor 26 may include, but is not limited to, a CPU, hardware microprocessor, hardware processor, multi-core processor, single-core processor, micro-controller, Application Specific Integrated Circuit (ASIC), DSP, or other similar processing device capable of executing any type of instructions, algorithms, or software for controlling the operation and function of the electronic device 100 according to embodiments described in this disclosure. The processor 26 may be various implementations of digital circuitry, analog circuitry, or mixed signal (a combination of analog and digital) circuitry that perform functions in a computing system. The processor 26 may include, for example, a system such as an Integrated Circuit (IC), a portion or circuitry of an individual processor core, an entire processor core, an individual processor, a programmable hardware device such as a Field Programmable Gate Array (FPGA), and/or multiple processors.
The internal bus 27 may be used to establish communications between components (e.g., 20-22, 24, and 26) of the electronic device 100.
Although electronic device 100 is described using specific components, in alternative embodiments, different components may be present in electronic device 100. For example, the electronic device 100 may include one or more additional controllers, memories, network interfaces, external network interfaces, and/or user interfaces. Additionally, one or more of the components may not be present in the electronic device 100. Further, in some embodiments, electronic device 100 may include one or more components not shown in fig. 1. Additionally, although separate components are shown in fig. 1, in some embodiments some or all of a given component may be integrated into one or more of the other components in electronic device 100. Further, any combination of analog and/or digital circuitry may be used to implement the circuits and components in electronic device 100.
FIG. 2 illustrates an exemplary Internet monitoring architecture diagram according to embodiments of the application. As shown in fig. 2, a large number of distributed probes are deployed in the plane of the data center network, at the exit of the machine room, at the user end, at each network node in the metropolitan area network, and so on, to initiate monitoring of the data center network and implement link-level coverage. Factors to be considered in order to achieve this include the size, topology, routing setup, etc. of the probed network.
As shown in fig. 2, the internet monitoring architecture diagram includes an internet access quality monitoring analysis system according to the present application, which may include a data receiving module, a path completion module, an ETL module, a baseline module, an evidence chain module, and an application module. Wherein the data receiving module may be implemented by a network interface 21 as shown in fig. 1; the path completion module, ETL module, baseline module, evidence chain module, and application module may all be implemented by a processor 26 as shown in fig. 1.
Periodic ping tests are performed on the probed target addresses from the distributed nodes (the ping test period is at the minute level; actual measurement shows that the impact on network load is small and that the CPU utilization of the peer server stays below 1%). The system screens and counts the test results over three consecutive periods and records the n flows that could not be pinged in those three periods.
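The three-period screening described above could be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of the application: the record layout `(period, src, dst, reachable)` and the function name `flows_failing_consecutive` are hypothetical, and the sketch assumes every flow reports one result in every period.

```python
from collections import defaultdict


def flows_failing_consecutive(results, window=3):
    """Return the (src, dst) flows whose latest `window` periods all failed.

    results: iterable of (period, src, dst, reachable) tuples (assumed layout).
    """
    history = defaultdict(dict)
    for period, src, dst, reachable in results:
        history[(src, dst)][period] = reachable

    failing = set()
    for flow, by_period in history.items():
        latest = sorted(by_period)[-window:]
        # A flow is recorded only if it has a full window and no success in it.
        if len(latest) == window and not any(by_period[p] for p in latest):
            failing.add(flow)
    return failing


results = [
    (1, "a1", "b1", False), (2, "a1", "b1", False), (3, "a1", "b1", False),
    (1, "a1", "b2", False), (2, "a1", "b2", True), (3, "a1", "b2", False),
]
failed = flows_failing_consecutive(results)
```

Here only the flow a1->b1 is recorded, because a1->b2 succeeded once within the window.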
Meanwhile, the addresses of the intermediate nodes traversed from any source address in {a1, a2, ..., an} to any destination address in {b1, b2, ..., bm} are periodically recorded and updated by means such as tracert, as shown in Table 1; the path update period is on the order of hours.
Table 1: Addresses of the intermediate nodes traversed from the source addresses {a1, a2, ..., an} to the destination addresses {b1, b2, ..., bm}
Path | Hop 1 | Hop 2 | …
a1->b1 | c11 | d11 | …
a1->b2 | c12 | d12 | …
… | … | … | …
an->bm | cnm | dnm | …
The paths corresponding to the aforementioned n flows are found from the records in Table 1. The set of node IP addresses is converted into a set of devices according to the configuration management database (CMDB), which records data such as the IP, board card, and port of each monitored device, and the probability that a given device appears in the paths from the source addresses to the destination addresses is counted.
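The counting step above can be illustrated with a short Python sketch. It is an illustration under assumptions rather than the method as implemented: `paths` stands in for the records of Table 1, `ip_to_device` stands in for a CMDB lookup, and `rank_common_devices` is a hypothetical name. The "probability" is taken here as the fraction of failed flows whose path contains the device.

```python
from collections import Counter


def rank_common_devices(failed_flows, paths, ip_to_device, top_n=3):
    """Rank devices by the share of failed flows whose path contains them."""
    counts = Counter()
    for flow in failed_flows:
        hop_ips = paths.get(flow, [])
        # Convert node IPs to devices; count each device once per flow.
        devices = {ip_to_device[ip] for ip in hop_ips if ip in ip_to_device}
        counts.update(devices)

    total = len(failed_flows)
    if total == 0:
        return []
    # Sort by count (descending), then by name for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(device, n / total) for device, n in ranked[:top_n]]


failed_flows = [("a1", "b1"), ("a1", "b2")]
paths = {
    ("a1", "b1"): ["10.0.0.1", "10.0.0.5"],
    ("a1", "b2"): ["10.0.0.1", "10.0.0.9"],
}
ip_to_device = {"10.0.0.1": "R1", "10.0.0.5": "R2", "10.0.0.9": "R3"}
suspects = rank_common_devices(failed_flows, paths, ip_to_device)
```

Device R1 lies on both failed paths and therefore ranks first with a share of 1.0.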
The system receives, through the data receiving module, the various data collected by the distributed probes, including the on-off and delay data of the path returned by the ping test from the probe to the monitored node and the addresses of the nodes traversed along the path returned by the tracert from the probe to the monitored node.
The path completion module completes the missing node addresses in the path returned by the tracert through a path completion algorithm.
The ETL module loads data including the on-off and delay data of the path and the addresses of the nodes traversed along the path into a data model (e.g., a specific data structure), and applies arithmetic averaging or exponential smoothing to the loaded data to obtain baseline data of the path. Where the network is hierarchical (e.g., metropolitan area network, same domain, wide area network, etc.), the baseline data is also computed per level accordingly.
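The two baseline options named above can be sketched in a few lines of Python. This is only an illustration: the smoothing factor `alpha` and the function names are assumptions, since the application does not specify a smoothing factor or how the smoothed series is initialized (the first sample is used here).

```python
def arithmetic_baseline(delays):
    """Baseline as the arithmetic mean of the delay samples of one path."""
    return sum(delays) / len(delays)


def exponential_baseline(delays, alpha=0.3):
    """Baseline by simple exponential smoothing.

    alpha is an assumed smoothing factor in (0, 1]; larger values weight
    recent samples more heavily.
    """
    baseline = delays[0]
    for delay in delays[1:]:
        baseline = alpha * delay + (1 - alpha) * baseline
    return baseline


delays_ms = [10.0, 20.0, 30.0]
mean_baseline = arithmetic_baseline(delays_ms)
smoothed_baseline = exponential_baseline(delays_ms, alpha=0.5)
```

With `alpha=0.5` the smoothed value moves from 10.0 to 15.0 and then to 22.5, so it tracks the recent rise in delay more closely than the mean of 20.0.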
The evidence chain module nominates the top-N candidate fault nodes through the network poor-quality common path algorithm, where N is greater than or equal to 3. Then the relevant index weights $\Theta = (\theta_1, \theta_2, \theta_3, \dots, \theta_n)$ of each candidate fault node are determined by a machine learning algorithm, each feature value of each candidate fault node is calculated to form a feature vector $X = (x_1, x_2, x_3, \dots, x_n)$, the feature values are processed by the Sigmoid function $y = \mathrm{Sigmoid}(\theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3 + \dots + \theta_n x_n)$, and, for example, the three nodes with the highest y values (TOP3) are taken as the fault nodes. Here, TOP3 is merely an example.
The application module issues monitoring alarms and a fault Root Cause Analysis (RCA) report according to the location result.
During network path acquisition, some node addresses cannot be reached because of timeouts or network configuration, and they are displayed in the unknown state 'Tmax'. The unknown state impairs the subsequent common poor-quality point location function, so an automatic path completion algorithm is needed to backfill and complete the unknown ports as far as possible.
The path probing process performs TRACEROUTE (i.e., tracert) from the source end to the destination end. For ports for which the middle hop is unknown (as shown in fig. 3), path completion may be performed. Taking the "path completion schematic diagram" shown in fig. 3 as an example, when the port X and the port Z are known, the port Y can be derived according to the port Z, and the accuracy verification is performed through the port X, so that the port X and the port Y are ensured to belong to the network element B.
The path completion algorithm according to the present application includes adjacent device route information completion based on a 30-bit mask and spaced device route information completion based on device information.
The adjacent device routing information completion based on a 30-bit mask is mainly implemented as follows: traverse all unknown devices and take the next hop of each. If the next hop is also an unknown device, the device cannot be completed. If the next hop is the last hop in the trace path, no completion is needed (the unknown equipment is a router on the flight communication side). Only when the next hop is a device with a known port IP is the next-hop port IP converted into its binary representation and bits 31 and 32 of that representation swapped (the two usable host addresses of a subnet with a 30-bit mask differ in exactly these two bits); the result, converted back to dotted-decimal notation, is the port IP of the unknown device. The port IP is then looked up in the configuration management database to obtain the corresponding device IP.
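The bit swap described above can be sketched with Python's standard `ipaddress` module. This is a minimal illustration under assumptions: `port_to_device` is a hypothetical dictionary standing in for the configuration management database, unknown hops are represented as `None`, and the guard against network or broadcast addresses is an addition made here, not something the application states.

```python
import ipaddress


def slash30_peer(port_ip):
    """Return the other host address in the /30 subnet of port_ip, or None."""
    value = int(ipaddress.IPv4Address(port_ip))
    low2 = value & 0b11
    if low2 in (0b00, 0b11):
        # Network or broadcast address of a /30: no peer (guard added here).
        return None
    swapped = ((low2 & 0b01) << 1) | ((low2 & 0b10) >> 1)  # swap bits 31, 32
    return str(ipaddress.IPv4Address((value & ~0b11) | swapped))


def complete_adjacent(path, port_to_device):
    """Map the index of each completable unknown hop to (port IP, device)."""
    completed = {}
    last = len(path) - 1
    for i, hop in enumerate(path):
        if hop is not None or i >= last:
            continue
        if i + 1 == last:          # next hop is the last hop: not needed
            continue
        next_hop = path[i + 1]
        if next_hop is None:       # next hop also unknown: cannot complete
            continue
        peer = slash30_peer(next_hop)
        if peer in port_to_device:
            completed[i] = (peer, port_to_device[peer])
    return completed


path = ["10.0.0.1", None, "10.0.1.6", "10.0.2.9"]
port_to_device = {"10.0.1.5": "B"}
filled = complete_adjacent(path, port_to_device)
```

In this example the unknown second hop is resolved to port 10.0.1.5 on device B, because 10.0.1.5 and 10.0.1.6 are the two host addresses of the same /30 subnet.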
The interval device routing information completion based on device information is mainly implemented as follows: for each device (C) completed by the 30-bit-mask-based adjacent device routing information completion, examine its previous hop (B) and the hop before that (A). Completion is performed only when the previous hop B is an unknown device and the hop before it, A, is a known device (in other cases no completion is done): query all port IPs of device C in the configuration management database, take the adjacent port IP corresponding to each port IP (using the 30-bit-mask-based adjacent device lookup), and look up the device IP of each adjacent port IP to obtain the set link-C of devices adjacent to C; query all port IPs of device A in the configuration management database, take the adjacent port IP corresponding to each port IP (using the same lookup), and look up the device IP of each adjacent port IP to obtain the set link-A of devices adjacent to A; and take the intersection of the two sets link-A and link-C, which is device B.
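The set intersection described above could be sketched as follows in Python. The dictionaries `device_ports` and `port_to_device` are hypothetical stand-ins for configuration management database queries, and the function names are introduced only for illustration.

```python
import ipaddress


def slash30_peer(port_ip):
    """Other host address in the /30 subnet of port_ip (swap the last two bits)."""
    value = int(ipaddress.IPv4Address(port_ip))
    low2 = value & 0b11
    swapped = ((low2 & 0b01) << 1) | ((low2 & 0b10) >> 1)
    return str(ipaddress.IPv4Address((value & ~0b11) | swapped))


def neighbor_devices(device, device_ports, port_to_device):
    """Devices owning the /30 peer of any port of `device`."""
    neighbors = set()
    for port_ip in device_ports.get(device, []):
        owner = port_to_device.get(slash30_peer(port_ip))
        if owner is not None and owner != device:
            neighbors.add(owner)
    return neighbors


def complete_interval_device(dev_a, dev_c, device_ports, port_to_device):
    """Candidates for the unknown hop B between known A and completed C."""
    link_a = neighbor_devices(dev_a, device_ports, port_to_device)
    link_c = neighbor_devices(dev_c, device_ports, port_to_device)
    return link_a & link_c


device_ports = {
    "A": ["10.0.0.1"],
    "B": ["10.0.0.2", "10.0.1.5"],
    "C": ["10.0.1.6", "10.0.3.1"],
    "D": ["10.0.3.2"],
}
port_to_device = {ip: dev for dev, ips in device_ports.items() for ip in ips}
candidates = complete_interval_device("A", "C", device_ports, port_to_device)
```

Device D is adjacent to C but not to A, so it drops out of the intersection and only B remains. If the intersection held more than one device, the hop would stay ambiguous; how that case is handled is not stated in the application.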
As described above, when multiple link failures occur (for example, when the PING test fails or the PING test result exceeds the baseline threshold), weighted statistics are computed over the frequency with which devices appear in the link paths collected in the previous period, and the device that appears most frequently is selected as the most probable fault device. However, fault location based on the common path point algorithm usually yields several suspected degraded devices, and their degradation counts often cannot be clearly separated. To solve this problem, an intelligent fault point refinement algorithm is introduced. For each device identified as a suspected fault point, it makes a comprehensive judgment across multiple index dimensions, such as PING delay, packet loss rate, degradation count, degradation ratio, and other quality-related indices from the network management system. Machine learning is used to train the index weights of the different dimensions, forming an intelligent fault point refinement model based on multi-dimensional index evaluation. This refines the suspected fault points and eliminates non-fault points, further improving fault location accuracy.
Specifically, in the prior art the weights are set without any basis. Here, a scoring model of device health is established in which each index corresponds to one weight (the feature weight below). Historical data is tracked and manually labeled to determine which indices show problems under fault conditions (characterization); the labeled data is then learned by machine to determine the index weights, so that the probability of a fault can be predicted from changes in the indices.
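The application does not name the learning algorithm. As one plausible reading, the sketch below fits the weights by logistic regression with batch gradient descent, which matches the Sigmoid scoring form used later. The choice of algorithm, the learning rate, the epoch count, the absence of a bias term, and the sample data are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(theta, x):
    """Fault probability y = Sigmoid(theta . x), no bias term (assumed)."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def train_weights(samples, labels, lr=0.5, epochs=2000):
    """Fit theta by batch gradient descent on the logistic loss.

    samples: normalized feature vectors in [0, 1];
    labels: 1 for manually labeled faults, 0 otherwise.
    """
    n = len(samples[0])
    theta = [0.0] * n
    for _ in range(epochs):
        grad = [0.0] * n
        for x, y in zip(samples, labels):
            err = predict(theta, x) - y
            for j in range(n):
                grad[j] += err * x[j]
        theta = [t - lr * g / len(samples) for t, g in zip(theta, grad)]
    return theta


# Hypothetical labeled history: two faulty devices, two healthy ones.
samples = [[0.9, 0.8, 0.7], [0.8, 0.9, 0.6], [0.1, 0.0, 0.1], [0.0, 0.1, 0.2]]
labels = [1, 1, 0, 0]
theta = train_weights(samples, labels)
```

After training, a device with high packet loss features scores higher than one with low features, which is the property the refinement step relies on.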
The fault point intelligent refining firstly traverses public suspected fault points, analyzes a suspected fault equipment list output by the process, calculates the quality difference probability for each equipment, respectively calculates each characteristic value xi of the equipment, forms a characteristic vector X (X1, X2, x3... xn) of the equipment (wherein n is the number of the characteristics or indexes), and standardizes all the characteristic values (the uniform value range is between 0 and 1). Then, the learned feature weight vector theta (theta 1, theta 2, theta 3.. theta n) is taken out, and the value of y ═ Sigmoid (theta 1 ^ x1+ theta 2 ^ x2+ theta 3 ^ x3+. theta n ^ xn) is calculated. The magnitude of the y value reflects the magnitude of the device quality difference probability, and the range is ensured to be (0, 1). Then, sorting the y value of each device from high to low, wherein the higher the ranking is, the higher the quality difference probability of the device is, and returning by taking the y value topN, wherein N is generally 3, but not limited thereto.
In embodiments of the present application, the device may be characterized by the following three features (i.e., n = 3):
1) The average packet loss rate over all links containing the device in the current period
2) The maximum packet loss rate over all links containing the device in the current period
The formula describes: max(ping packet loss rate in the current period)
3) The proportion of poor-quality links among the links containing the device in the current period
The formula describes: number of poor-quality links containing the device / total number of links containing the device.
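The scoring and ranking described above can be sketched as follows, assuming packet loss rates are already expressed as fractions in [0, 1]. The weight values and device names are hypothetical placeholders; in the described method the weights are learned from manually labeled historical data.

```python
import math
from typing import Dict, List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Logistic function; maps any real z into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def device_features(loss_rates: Sequence[float],
                    bad_links: int, total_links: int) -> List[float]:
    """Build [mean loss rate, max loss rate, poor-quality link ratio]."""
    return [
        sum(loss_rates) / len(loss_rates),
        max(loss_rates),
        bad_links / total_links,
    ]


def rank_devices(features: Dict[str, Sequence[float]],
                 theta: Sequence[float],
                 top_n: int = 3) -> List[Tuple[str, float]]:
    """Score each device with y = Sigmoid(theta . x) and keep the top N."""
    scores = {
        device: sigmoid(sum(t * x for t, x in zip(theta, xs)))
        for device, xs in features.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_n]


# Hypothetical devices and weights, for illustration only.
features = {
    "router-1": device_features([0.0, 0.5], bad_links=1, total_links=2),
    "router-2": device_features([0.0, 0.0], bad_links=0, total_links=4),
}
theta = [1.0, 1.0, 1.0]
ranked = rank_devices(features, theta)
```

Because the Sigmoid function is monotonic, the ranking is the same as ranking by the weighted sum itself; the Sigmoid only maps the score into (0, 1) so that it can be read as a probability.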
Fig. 4 shows a flowchart 400 of an exemplary internet access quality monitoring analysis method according to an embodiment of the application. The internet access quality monitoring analysis method shown in fig. 4 may be performed by the electronic device shown in fig. 1 and the internet access quality monitoring analysis system shown in fig. 2.
As shown in fig. 4, at step S410, detection of the monitored nodes is initiated by deploying a large number of distributed probes on the internet; the detection includes a ping test and a tracert from each probe to the monitored node. The ping test returns the on-off (reachability) and delay data of the path, and the n flows that cannot be pinged in three consecutive periods are recorded; the detection period of the ping test is on the order of minutes. The tracert returns the addresses of the nodes traversed along the path, and its detection period is on the order of hours.
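Selecting the flows that cannot be pinged in three consecutive periods can be sketched as follows. The data layout is an assumption made for illustration: each flow is a hypothetical (probe, monitored node) pair with a list of per-period ping results, oldest first, where `True` means the ping succeeded, and "three consecutive periods" is read as the three most recent periods.

```python
from typing import Dict, List, Tuple

Flow = Tuple[str, str]  # (probe, monitored node); hypothetical identifiers


def unreachable_flows(history: Dict[Flow, List[bool]],
                      periods: int = 3) -> List[Flow]:
    """Return flows whose most recent `periods` ping results all failed."""
    return [
        flow
        for flow, results in history.items()
        if len(results) >= periods and not any(results[-periods:])
    ]


history = {
    ("probe-1", "node-1"): [True, False, False, False],
    ("probe-1", "node-2"): [False, False, True],
    ("probe-2", "node-1"): [False, False],  # too little history to decide
}
failed = unreachable_flows(history)
```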
At step S420, the missing node address in the path returned by the tracert is complemented by a path complementing algorithm. Wherein the path completion algorithm includes an adjacent device path information completion algorithm based on a 30-bit mask and an alternate device path information completion algorithm based on device information as described above.
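The 30-bit-mask completion relies on the fact that a /30 point-to-point link has exactly two usable host addresses, whose last two bits are 01 and 10, so swapping those two bits of a known port IP gives the port IP at the other end of the link. A minimal sketch follows, assuming a tracert path is a list of hop port IPs with `None` for hops that did not answer, and modeling the configuration management database as a hypothetical `port_owner` dictionary from port IP to device name.

```python
import ipaddress
from typing import Dict, List, Optional


def swap_last_two_bits(port_ip: str) -> str:
    """Swap bits 31 and 32 of an IPv4 address and return dotted-decimal form."""
    value = int(ipaddress.IPv4Address(port_ip))
    low = value & 0b11
    swapped = ((low & 0b01) << 1) | (low >> 1)
    return str(ipaddress.IPv4Address((value & ~0b11) | swapped))


def complete_unknown_hops(path: List[Optional[str]],
                          port_owner: Dict[str, str]) -> List[Optional[str]]:
    """Map each hop to a device name, filling unknown hops from the next hop."""
    devices: List[Optional[str]] = []
    for index, hop_ip in enumerate(path):
        if hop_ip is not None:
            devices.append(port_owner.get(hop_ip))
            continue
        next_ip = path[index + 1] if index + 1 < len(path) else None
        if next_ip is None:
            # No next hop, or the next hop is also unknown: cannot complete.
            devices.append(None)
        else:
            devices.append(port_owner.get(swap_last_two_bits(next_ip)))
    return devices


# Hypothetical CMDB contents and tracert result.
port_owner = {"10.0.0.1": "R1", "10.0.1.1": "R2", "10.0.1.2": "R3"}
path = ["10.0.0.1", None, "10.0.1.2", None, None]
completed = complete_unknown_hops(path, port_owner)
```

Hops that remain `None` after this pass are the cases the device-information-based completion is meant to handle.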
At step S430, data including the on-off and delay data of the path and the addresses of the nodes traversed along the path is loaded into the data model through ETL, and arithmetic averaging or exponential smoothing is applied to the loaded data to obtain baseline data for the path. Optionally, where the network is layered, obtaining the baseline data comprises computing baseline data for the path layer by layer.
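The two baseline options can be sketched as follows for a series of delay samples on one path. The smoothing factor `alpha` and the `tolerance` multiplier used to flag a sample against the baseline are hypothetical parameters; the text does not give their values.

```python
from typing import Sequence


def mean_baseline(samples: Sequence[float]) -> float:
    """Arithmetic mean of the historical samples."""
    return sum(samples) / len(samples)


def exponential_baseline(samples: Sequence[float], alpha: float = 0.3) -> float:
    """Simple exponential smoothing; newer samples carry more weight."""
    baseline = samples[0]
    for value in samples[1:]:
        baseline = alpha * value + (1.0 - alpha) * baseline
    return baseline


def exceeds_baseline(delay_ms: float, baseline_ms: float,
                     tolerance: float = 1.5) -> bool:
    """Flag a sample that is more than `tolerance` times the baseline."""
    return delay_ms > baseline_ms * tolerance


delays_ms = [10.0, 20.0, 30.0]
mean_ms = mean_baseline(delays_ms)
smoothed_ms = exponential_baseline([10.0, 20.0], alpha=0.5)
```

The arithmetic mean weights all periods equally, while exponential smoothing follows gradual drift in a path's normal delay more closely.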
At step S440, the TOP N suspected fault nodes are derived by the network poor-quality common path algorithm, where N ≥ 3. According to the embodiment of the application, the algorithm is as follows: periodically record and update the IP addresses of the intermediate nodes traversed from any source address {a1, a2, ..., an} to any destination address {b1, b2, ..., bm}; from these records, find the paths corresponding to the n flows that cannot be pinged; convert the set of node IP addresses into a set of devices using the configuration management database, which records the IP, board card, and port data of each monitored device; and count the probability that each device appears in the paths from source address to destination address.
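The counting step can be sketched as follows. As a simplifying assumption, the "probability" of a device is taken here to be the share of failed paths that contain it, and the configuration management database is modeled as a hypothetical `ip_to_device` dictionary; the weighting actually used by the method is not specified in the text.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def suspect_devices(failed_paths: Sequence[Sequence[str]],
                    ip_to_device: Dict[str, str],
                    top_n: int = 3) -> List[Tuple[str, float]]:
    """Rank devices by the share of failed paths they appear in."""
    counts: Counter = Counter()
    for path in failed_paths:
        # Convert node IPs to devices; count each device once per path.
        devices = {ip_to_device[ip] for ip in path if ip in ip_to_device}
        counts.update(devices)
    total = len(failed_paths)
    return [(device, hits / total) for device, hits in counts.most_common(top_n)]


# Hypothetical CMDB mapping and failed paths, for illustration only.
ip_to_device = {"10.0.0.1": "R1", "10.0.1.1": "R2", "10.0.2.1": "R3"}
failed_paths = [
    ["10.0.0.1", "10.0.1.1"],
    ["10.0.0.1", "10.0.2.1"],
    ["10.0.0.1", "10.0.1.1"],
]
suspects = suspect_devices(failed_paths, ip_to_device, top_n=2)
```

Devices with equal counts come back in no particular order, which is one reason the subsequent refining step with multi-dimensional indexes is needed to separate them.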
At step S450, the relevant index weights $\Theta = (\theta_1, \theta_2, \theta_3, \dots, \theta_n)$ of each fault node are determined by a machine learning algorithm, each feature value of each fault node is calculated to form the feature vector $X = (x_1, x_2, x_3, \dots, x_n)$, the feature values are processed by the Sigmoid function $y = \mathrm{Sigmoid}(\theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3 + \dots + \theta_n x_n)$, and the TOP 3 nodes by $y$ value are taken as the fault nodes. It should be understood that taking the TOP 3 is only an example; in practice the TOP N may be taken for any value of N.
At step S460, monitoring alarms and fault Root Cause Analysis (RCA) reports are issued for the determined TOP 3 fault nodes.
The method builds on the now-mature pingmesh technology, exploits the well-defined topology of operator networks such as IDC networks, metropolitan area networks, and backbone networks, and introduces analysis algorithms such as a common poor-quality point algorithm optimized from algorithms used by internet companies. It offers the following advantages: 1) Segmented poor-quality location: the internet access path of the data center is divided into several segments, and probe dial testing and quality and path data collection are carried out for each segment; 2) Massive probe deployment: network probe monitoring is built to path-level coverage, and standardized data is collected and processed; 3) Path completion: a path completion algorithm fills in missing path information in the network path data; 4) Baseline: a quality baseline is established using the common poor-quality point model and a traversal algorithm; 5) Evidence chain: segmented quality monitoring and poor-quality location output evidence of network quality fluctuation, and evidence from multiple paths is combined into an evidence chain for the data center's internet access quality.
The function has been in production use in the IDC network management application of China Telecom Shanghai since April 2019. It covers more than 90% of the problems that customers report in the IDC network, with monitoring accuracy above 95%, achieving the goal of finding problems before users do.
The present disclosure may be implemented as any combination of apparatus, systems, integrated circuits, and computer programs on non-transitory computer readable media. One or more controllers may be implemented as an Integrated Circuit (IC), an Application Specific Integrated Circuit (ASIC), or a large scale integrated circuit (LSI), a system LSI, a super LSI, or an ultra LSI package that performs some or all of the functions described in this disclosure.
The present disclosure includes the use of software, applications, computer programs or algorithms. Software, applications, computer programs, or algorithms may be stored on a non-transitory computer readable medium to cause a computer, such as one or more processors, to perform the steps described above and depicted in the figures. For example, the one or more memories store software or algorithms in executable instructions and the one or more processors may associate a set of instructions to execute the software or algorithms to provide network configuration information management functionality for a network access device according to embodiments described in the present disclosure.
Software and computer programs (which may also be referred to as programs, software applications, components, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural, object-oriented, functional, logical, or assembly language, or machine language. The term "computer-readable medium" refers to any computer program product, apparatus or device, such as magnetic disks, optical disks, solid state storage devices, memories, and Programmable Logic Devices (PLDs), used to provide machine instructions or data to a programmable data processor, including a computer-readable medium that receives machine instructions as a computer-readable signal.
By way of example, computer-readable media can comprise Dynamic Random Access Memory (DRAM), Random Access Memory (RAM), Read Only Memory (ROM), electrically erasable read only memory (EEPROM), compact disk read only memory (CD-ROM) or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to carry or store desired computer-readable program code in the form of instructions or data structures and which can be accessed by a general-purpose or special-purpose computer or a general-purpose or special-purpose processor. Disk or disc, as used herein, includes Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk and blu-ray disc where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above are also included within the scope of computer-readable media.
Additionally, the above description provides examples, and does not limit the scope, applicability, or configuration set forth in the claims. Changes may be made in the function and arrangement of elements discussed without departing from the spirit and scope of the disclosure. Various embodiments may omit, substitute, or add various procedures or components as appropriate. For example, features described with respect to certain embodiments may be combined in other embodiments.
Claims (11)
1. An internet access quality detection analysis method, comprising:
initiating detection of a monitored node by deploying a large number of distributed probes on the internet, wherein the detection comprises a ping test from a probe to the monitored node and a tracert from the probe to the monitored node, the ping test returns on-off and delay data of a path, n flows that cannot be pinged in three consecutive periods are recorded, and the tracert returns the addresses of the nodes traversed along the path;
completing missing node addresses in the path returned by the tracert through a path completion algorithm;
loading data comprising the on-off and delay data of the path and the addresses of the nodes traversed along the path into a data model through ETL, and applying arithmetic averaging or exponential smoothing to the loaded data to obtain baseline data of the path;
deriving TOP N fault nodes through a network poor-quality common path algorithm, wherein N is greater than or equal to 3; and
determining relevant index weights $\Theta = (\theta_1, \theta_2, \theta_3, \dots, \theta_n)$ of each fault node through a machine learning algorithm, calculating each feature value of each fault node to form a feature vector $X = (x_1, x_2, x_3, \dots, x_n)$, processing the feature values through a Sigmoid function $y = \mathrm{Sigmoid}(\theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3 + \dots + \theta_n x_n)$, and taking the TOP 3 nodes by $y$ value as the fault nodes.
2. The internet access quality detection analysis method of claim 1, further comprising issuing monitoring alarms and fault Root Cause Analysis (RCA) reports for the determined TOP 3 fault nodes.
3. The internet access quality detection analysis method of claim 1, wherein the network poor-quality common path algorithm comprises:
periodically recording and updating the IP addresses of the intermediate nodes traversed from any source address {a1, a2, ..., an} to any destination address {b1, b2, ..., bm};
finding, from the records, the paths corresponding to the n flows that cannot be pinged;
converting the set of node IP addresses into a set of devices according to the configuration management database, wherein the configuration management database records the IP, board card, and port data of each monitored device; and
counting the probability that each device appears in the paths from source address to destination address.
4. The internet access quality detection analysis method of claim 3, wherein the path completion algorithm comprises a 30-bit mask-based neighboring device path information completion algorithm comprising:
traversing all unknown devices and, for each, taking its next hop;
if the next hop is still an unknown device, completion cannot be performed; if the next hop is the last hop in the path, completion is not needed; if the next hop is a device with a known port IP, the port IP of the next hop is converted into a binary representation, bits 31 and 32 of the binary representation are swapped, and the result is converted back into dotted-decimal notation to serve as the port IP of the unknown device; and
querying the configuration management database with the port IP to obtain the corresponding device IP.
5. The internet access quality detection analysis method of claim 4, wherein the path completion algorithm further comprises an alternate device path information completion algorithm based on device information, comprising:
traversing the previous hop, and the hop before the previous hop, of each device completed by the 30-bit-mask-based neighboring device path information completion algorithm;
if the previous hop is an unknown device and the hop before the previous hop is a known device, then:
querying all port IPs of the completed device in the configuration management database, taking the adjacent port IP corresponding to each port IP, and looking up the device IP of each adjacent port IP, thereby obtaining a set of devices adjacent to the completed device;
querying all port IPs of the device at the hop before the previous hop in the configuration management database, taking the adjacent port IP corresponding to each port IP, and looking up the device IP of each adjacent port IP, thereby obtaining a set of devices adjacent to that device; and
taking the intersection of the two sets as the missing node device in the path.
6. The internet access quality detection analysis method of claim 1, wherein obtaining baseline data of the path further comprises: where the network is layered, obtaining baseline data of the path computed layer by layer.
7. The internet access quality detection analysis method of claim 1, wherein the ping detection period is of the order of minutes and the tracert detection period is of the order of hours.
8. An electronic device, comprising:
one or more processors, and
a memory coupled with the one or more processors, the memory storing computer-readable program instructions that, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-7.
9. An internet access quality detection analysis system comprising means for performing the steps of the method of any one of claims 1-7.
10. A non-transitory computer readable medium having instructions stored thereon for execution by a processor to perform the steps of the method of any of claims 1-7.
11. A computer program product comprising a computer program which, when executed by a processor, performs the steps of the method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011555719.3A CN114676157B (en) | 2020-12-24 | 2020-12-24 | Internet access quality monitoring and analyzing method, system, medium and program |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114676157A (en) | 2022-06-28 |
CN114676157B (en) | 2024-08-13 |
Family
ID=82071159
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011555719.3A Active CN114676157B (en) | 2020-12-24 | 2020-12-24 | Internet access quality monitoring and analyzing method, system, medium and program |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114676157B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030204619A1 (en) * | 2002-04-26 | 2003-10-30 | Bays Robert James | Methods, apparatuses and systems facilitating determination of network path metrics |
US20070064715A1 (en) * | 2002-07-25 | 2007-03-22 | Avaya, Inc. | Method and apparatus for the assessment and optimization of network traffic |
CN101635656A (en) * | 2008-07-26 | 2010-01-27 | 华为技术有限公司 | Fault detection method in layered ordered address packet network, system and equipment |
US20100061272A1 (en) * | 2008-09-04 | 2010-03-11 | Trilliant Networks, Inc. | System and method for implementing mesh network communications using a mesh network protocol |
CN111224842A (en) * | 2019-12-31 | 2020-06-02 | 大唐软件技术股份有限公司 | Internet service quality monitoring method and device |
Non-Patent Citations (2)
Title |
---|
Xiaojiang Du et al.: "Improving Onboard Internet Services for High-Speed Vehicles by Multipath Transmission in Heterogeneous Wireless Networks", IEEE Transactions on Vehicular Technology, vol. 65, no. 12, 16 June 2016 (2016-06-16), page 9493, XP011636813, DOI: 10.1109/TVT.2016.2581020 * |
Huang Shan: "Software Attack Diagnosis Based on Dynamic Binary Program Slicing", China Excellent Master's Theses Full-text Database, Information Science and Technology Series, 15 July 2012 (2012-07-15), pages 138 - 366 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115766396A (en) * | 2022-10-28 | 2023-03-07 | 苏州浪潮智能科技有限公司 | Control cluster fault detection method and device, electronic equipment and readable storage medium |
CN117880055A (en) * | 2024-03-12 | 2024-04-12 | 灵长智能科技(杭州)有限公司 | Network fault diagnosis method, device, equipment and medium based on transmission layer index |
CN117880055B (en) * | 2024-03-12 | 2024-05-31 | 灵长智能科技(杭州)有限公司 | Network fault diagnosis method, device, equipment and medium based on transmission layer index |
Also Published As
Publication number | Publication date |
---|---|
CN114676157B (en) | 2024-08-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112398680B (en) | Fault delimiting method and equipment | |
CN108063676A (en) | Communication network failure method for early warning and device | |
CN107483487B (en) | TOPSIS-based multi-dimensional network security measurement method | |
CN104639388A (en) | DNS server availability detection method based on user perception | |
CN108933694A (en) | Data center network Fault Node Diagnosis method and system based on testing data | |
CN113938407A (en) | Data center network fault detection method and device based on in-band network telemetry system | |
Xu et al. | Lightweight and adaptive service api performance monitoring in highly dynamic cloud environment | |
WO2023207689A1 (en) | Change risk assessment method and apparatus, and storage medium | |
CN114676157B (en) | Internet access quality monitoring and analyzing method, system, medium and program | |
Hoarau et al. | Suitability of graph representation for bgp anomaly detection | |
CN117692940B (en) | Microwave system performance detection method based on microwave link | |
CN117614833B (en) | Automatic regulating method and system for router signals | |
Alenazi et al. | Evaluation and improvement of network resilience against attacks using graph spectral metrics | |
CN104363142A (en) | Automatic data center network performance bottleneck analysis method | |
CN108494625A (en) | A kind of analysis system on network performance evaluation | |
CN117291002A (en) | Unmanned plane cluster network damage evaluation method based on entropy weight method-TOPSIS | |
CN117376084A (en) | Fault detection method, electronic equipment and medium thereof | |
Shahraeini et al. | Towards an unified dependency analysis methodology for wide area measurement systems in smart grids | |
Patil et al. | Probe station placement algorithm for probe set reduction in network fault localization | |
CN114880153A (en) | Data processing method and device, electronic equipment and computer readable storage medium | |
EP3840453A1 (en) | Method for detecting anomalies in mobile telecommunication networks | |
CN118573608B (en) | Switch reliability test method and system | |
Rozaki | Clustering optimisation techniques in mobile networks | |
CN117499817B (en) | Distributed ammeter acquisition system and acquisition method | |
CN104333491B (en) | The automated testing method and device of a kind of huge system domain network availability |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||