CN114257486A - Method for implementing network performance management measurement probe facing Internet of things - Google Patents

Publication number
CN114257486A
Authority
CN
China
Prior art keywords
network
probe
measurement
data
performance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011367259.1A
Other languages
Chinese (zh)
Other versions
CN114257486B (en
Inventor
陈剑
于士浩
高连峰
王军
汤磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Gannan Normal University
Original Assignee
Gannan Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Gannan Normal University filed Critical Gannan Normal University
Priority to CN202011367259.1A priority Critical patent/CN114257486B/en
Publication of CN114257486A publication Critical patent/CN114257486A/en
Application granted granted Critical
Publication of CN114257486B publication Critical patent/CN114257486B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/02 Standardisation; Integration
    • H04L41/0213 Standardised network management protocols, e.g. simple network management protocol [SNMP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/04 Network management architectures or arrangements
    • H04L41/046 Network management architectures or arrangements comprising network management agents or mobile agents therefor
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0677 Localisation of faults
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/12 Discovery or management of network topologies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0823 Errors, e.g. transmission errors
    • H04L43/0829 Packet loss
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0852 Delays
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0852 Delays
    • H04L43/087 Jitter
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/12 Network monitoring probes

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The network management system is an important component of network infrastructure and plays a significant role in the operation and maintenance of IP networks in China. To solve the performance-monitoring problem in network management and to make up for the deficiencies of existing network management systems in performance management, the invention implements an IP network measurement probe that conforms to the SNMP standard. The probe can work independently as a portable device to measure network performance on an end-to-end path. It can also be deployed in a measurement domain as a measurement agent that communicates with a management center through SNMP to monitor and diagnose faults in a large-scale IP network. In addition, when probes are widely distributed across the access, convergence, and core layers of the network and combined with the existing network management system, a comprehensive network management system with genuine performance management can be built, a network communication situation map can be produced, and decision support can be provided to network management personnel. The system structure of the multipurpose IP network probe is designed in detail and a probe conforming to the SNMP standard is implemented; the embedded operating system is also analyzed in depth so that the probe runs smoothly on the embedded operating system uClinux. On the basis of the probe, a performance management center is researched and developed, and a prototype analysis system is implemented.

Description

Method for implementing network performance management measurement probe facing Internet of things
Technical Field
The invention belongs to the field of IP network performance management, and particularly relates to a performance index system for IP networks, the extraction of index parameters that reflect performance from an operating network, the analysis and use of performance index data, and related problems.
Background
The network management system is an important component of network infrastructure and plays a significant role in the operation and maintenance of IP networks in China. On the one hand, existing IP network measurement equipment has not been packaged as an instrument; deploying and using it requires considerable professional knowledge, which hinders wide adoption. On the other hand, among the five functional domains of network management, the comprehensive network management systems currently used in China are weak in the performance management domain. They lack end-to-end monitoring of the nodes and links of the whole network and performance monitoring of hosts, so a network communication situation map of the whole network cannot be formed. As a result, network management personnel cannot grasp current and future communication conditions, cannot fully understand how the network is running, and cannot troubleshoot, maintain, and upgrade the network correctly and in time. The invention aims to solve the performance-monitoring problem in network management and to make up for the deficiencies of existing network management systems in performance management.
The invention implements an IP network measurement probe that conforms to the SNMP standard. The probe can work alone as a portable device to measure network performance on an end-to-end path. It can also be deployed in a measurement domain as a measurement agent that communicates with a management center through SNMP to monitor and diagnose faults in a large-scale IP network. In addition, when probes are widely distributed across the access, convergence, and core layers of the network and combined with the existing network management system, a comprehensive network management system with genuine performance management can be built, a network communication situation map can be produced, and decision support can be provided to network management personnel.
Disclosure of Invention
The invention provides a measurement probe that can reside in a Linux operating system or an embedded operating system, conforms to the SNMP standard, and is used for IP network performance management. Measurement probes are deployed on relevant network nodes and can be controlled remotely through the SNMP protocol, so that performance parameters from the measurement point to other network nodes can be obtained. These parameters include: the round-trip time (RTT), loss rate (LOSS), and round-trip delay jitter (Jitter) of the ICMP packets of the Ping task, and the RTT, loss rate, and round-trip delay jitter of the UDP packets of the TraceRoute task. Compared with measuring from a single network endpoint, this method yields richer and more accurate network performance parameters, allows network faults to be found and located in time, and improves network availability.
The network measurement probe is developed on the open-source software NET-SNMP 5.1. Structurally, it comprises the NET-SNMP 5.1 snmpd master agent and a subagent (shown in figure 2). Data and control information are exchanged between snmpd and the subagent through the AgentX protocol. When snmpd receives an SNMP message that belongs to the subagent, it forwards the message to the subagent through AgentX. After the subagent processes the request, it returns the result to snmpd through AgentX, and snmpd finally encapsulates the result in an SNMP response message and returns it to the management center.
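The forwarding step described above can be pictured with a small sketch. This is not NET-SNMP code and does not speak AgentX; it only illustrates, under assumptions, how a master agent might pick the subagent that registered the longest matching OID subtree. The class `MasterAgent`, its methods, and the OID prefix used in the usage note are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, Optional, Tuple

Oid = Tuple[int, ...]
Handler = Callable[[Oid], str]


class MasterAgent:
    """Toy stand-in for snmpd: routes a request to the subagent that owns the subtree."""

    def __init__(self) -> None:
        # Maps a registered OID subtree to the (hypothetical) subagent handler.
        self._subtrees: Dict[Oid, Handler] = {}

    def register(self, subtree: Oid, handler: Handler) -> None:
        """Called when a subagent registers a subtree with the master agent."""
        self._subtrees[subtree] = handler

    def handle_get(self, oid: Oid) -> Optional[str]:
        """Forward the request to the longest matching subtree, or return None."""
        best: Optional[Oid] = None
        for subtree in self._subtrees:
            matches = oid[:len(subtree)] == subtree
            if matches and (best is None or len(subtree) > len(best)):
                best = subtree
        if best is None:
            return None  # no subagent owns this OID
        # In the real system this hop would be an AgentX exchange.
        return self._subtrees[best](oid)
```

For example, after `agent.register((1, 3, 6, 1, 2, 1, 80), handler)`, a request for an OID below that prefix reaches `handler`, while any other OID returns `None` and would be answered by the master agent itself.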
The MIB in the subagent stores all measurement control parameters and results and is the core of the whole measurement agent. The main tasks of the subagent are: receiving and processing AgentX protocol messages from snmpd; scheduling and managing the two measurement programs, Ping and TraceRoute; checking measurement results and storing them in the corresponding MIB of the subagent; sending SYSLOG alarm information when alarm conditions are met; and sending TRAP messages according to the configured conditions.
The subagent uses a multi-process, multi-task structure, and the measurement processes exchange data with the subagent process through shared memory. The main control module of the subagent process periodically scans the CtlAdminStatus field of every newly created row in the control table. If the field changes from disabled to enabled, the other fields in the row are used as measurement parameters and the corresponding measurement tool is scheduled to perform the measurement. If CtlRowStatus changes from ACTIVE to DESTROY, the corresponding measurement is terminated and all related data rows are deleted from the MIB.
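A minimal sketch of this periodic scan follows. It assumes the control table can be modeled as a dictionary of rows and that the previous scan's field values are remembered between scans; `CtlRow`, `scan_control_table`, and the `target` field are hypothetical names, and starting or stopping a measurement is reduced to returning an action tuple.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class CtlRow:
    target: str                      # hypothetical measurement parameter
    admin_status: str = "disabled"   # models CtlAdminStatus
    row_status: str = "ACTIVE"       # models CtlRowStatus


def scan_control_table(
    table: Dict[str, CtlRow],
    seen: Dict[str, Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Compare each row with its state at the previous scan and return actions."""
    actions: List[Tuple[str, str]] = []
    for name, row in list(table.items()):
        prev_admin, prev_row = seen.get(name, ("disabled", "ACTIVE"))
        if prev_row == "ACTIVE" and row.row_status == "DESTROY":
            # Terminate the measurement and delete the related rows.
            actions.append(("stop", name))
            del table[name]
            seen.pop(name, None)
            continue
        if prev_admin == "disabled" and row.admin_status == "enabled":
            # The remaining fields of the row become the measurement parameters.
            actions.append(("start", name))
        seen[name] = (row.admin_status, row.row_status)
    return actions
```

Calling the function once per polling period yields a `("start", name)` action only on the scan where the field flips to enabled, so a task that is already running is not scheduled twice.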
The main control module is responsible for initializing the subagent, reading and parsing the configuration file, establishing the AgentX connection to snmpd, starting to listen for and exchange messages, registering CtlTable, ResultTable, and HistoryTable with the snmpd master agent, and installing a signal processor that receives the various signals, including timing signals and signals triggered by SNMP SET messages. When a new measurement task is issued, the signal processor triggers the corresponding operation, creates a new measurement process, allocates resources such as shared memory and semaphores, and finally registers all resource usage information in a process control table for use by the data and state detection module.
The data and state detection module is the most complex and central module in the subagent. Its main function is to query the working state of the measurement processes periodically. If a measurement process has exited, the module reclaims all resources occupied by that process, including shared memory, semaphores, and the various data table entries. Otherwise, it reads the measurement results from the shared memory region identified by the shared memory id and semaphore id in the process control table and converts them into the result table, history table, and hops table of the subagent. The module involves mutually exclusive access, resource reclamation, process management, shared memory management, and more, and is the key difficulty in the design and implementation of the whole agent.
The Syslog module uses standard Syslog technology and sends the error and alarm information generated while the processes run to a standard Syslog alarm processing center.
The TRAP module can send, according to the configured requirements, the 6 kinds of notifications defined under pingNotifications and traceRouteNotifications in RFC 2925. RFC 2925 does not specify how alarms are to be implemented, so to improve applicability two standard alarm mechanisms (Syslog and TRAP) are implemented in the subagent. This design gives the user a flexible choice and integrates well with the alarm processing center of an existing network management platform.
The parameter detection module checks whether the parameters passed from the main control module are correct and complete; for example, if the destination address is not set, the measurement task cannot be started. After the parameter check succeeds, the module calls the measurement tool and starts the measurement task.
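The check described above might look like the following sketch. The source names only the destination address as a required parameter; the keys `target_address` and `probe_count` and the rule on the probe count are assumptions made for illustration.

```python
import ipaddress
from typing import Dict, List


def check_params(params: Dict[str, object]) -> List[str]:
    """Return a list of problems; an empty list means the task may be started."""
    problems: List[str] = []

    target = params.get("target_address")  # hypothetical key name
    if not target:
        problems.append("target_address is missing")
    else:
        try:
            ipaddress.ip_address(str(target))
        except ValueError:
            problems.append(f"target_address {target!r} is not a valid IP address")

    # Assumed extra rule: the number of probes must be a positive integer.
    count = params.get("probe_count", 1)
    if not isinstance(count, int) or count < 1:
        problems.append("probe_count must be a positive integer")

    return problems
```

A caller would start the measurement tool only when the returned list is empty and otherwise report the problems, for instance through the Syslog module.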
The signal processor is the engine that drives the subagent. The subagent handles three main signals: a periodic polling signal, a signal triggered by an SNMP SET message, and a child-process exit signal. The signal processor is installed in the main control module and, on receiving a signal, calls the corresponding processing module.
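The dispatch role of the signal processor can be sketched as a table from signal kind to handler. This sketch deliberately avoids real operating system signals so that it runs anywhere; which actual signals the probe uses is not stated in the source. `SubagentSignal`, `SignalProcessor`, and `build_processor` are hypothetical names, and the handlers only append a description to a log.

```python
from enum import Enum, auto
from typing import Callable, Dict, List


class SubagentSignal(Enum):
    POLL_TIMER = auto()   # periodic polling signal
    SNMP_SET = auto()     # signal triggered by an SNMP SET message
    CHILD_EXIT = auto()   # child-process exit signal


class SignalProcessor:
    """Maps each signal kind to the processing module that handles it."""

    def __init__(self) -> None:
        self._handlers: Dict[SubagentSignal, Callable[[], None]] = {}

    def install(self, sig: SubagentSignal, handler: Callable[[], None]) -> None:
        self._handlers[sig] = handler

    def deliver(self, sig: SubagentSignal) -> bool:
        """Run the installed handler; return False if none is installed."""
        handler = self._handlers.get(sig)
        if handler is None:
            return False
        handler()
        return True


def build_processor(log: List[str]) -> SignalProcessor:
    """Wire the three signals to placeholder handlers that record what they would do."""
    proc = SignalProcessor()
    proc.install(SubagentSignal.POLL_TIMER, lambda: log.append("poll measurement state"))
    proc.install(SubagentSignal.SNMP_SET, lambda: log.append("scan control table"))
    proc.install(SubagentSignal.CHILD_EXIT, lambda: log.append("reclaim resources"))
    return proc
```

Keeping the handlers in a table means a new signal kind can be supported by installing one more entry, without touching the delivery logic.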
The Ping measurement module is provided as a dynamic library. When the agent executes an acquisition task, the acquisition runs in a child process, the results are placed in a designated shared memory area, and the data and state detection module then maps them into the MIB. The parameters are provided by the corresponding control table, and the Ping measurement module obtains its results by pinging the target router as specified in RFC 2925.
The Ping program mainly collects three performance indicators: delay, packet loss, and delay jitter. The Ping measurement process and the subagent process exchange data through shared memory, in the format specified in RFC 2925 for pingResultsTable and pingProbeHistoryTable. The subagent process invokes the Ping measurement process as a child process. The measurement process obtains the shared memory address, sends ICMP messages according to the control table parameters, calls a message receiving module to receive the responses, calculates the measurement results from the responses, and updates the result table and history table in the shared memory. When the measurement finishes or the subagent process terminates it, the Ping measurement process updates the measurement data in the shared memory and then exits.
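As an illustration of how the three indicators could be derived from the probe responses, the sketch below summarizes a list of round-trip times in which a lost probe is recorded as `None`. The source does not define its jitter formula; the mean absolute difference between consecutive received RTTs used here is an assumption, and `summarize_probes` is a hypothetical name.

```python
from statistics import mean
from typing import Dict, List, Optional


def summarize_probes(rtts_ms: List[Optional[float]]) -> Dict[str, Optional[float]]:
    """Summarize one Ping test; a None entry stands for a probe with no reply."""
    sent = len(rtts_ms)
    replies = [r for r in rtts_ms if r is not None]

    loss_rate = (sent - len(replies)) / sent if sent else None
    rtt_avg = mean(replies) if replies else None

    # Assumed jitter definition: mean absolute change between consecutive replies.
    diffs = [abs(b - a) for a, b in zip(replies, replies[1:])]
    jitter = mean(diffs) if diffs else None

    return {"loss_rate": loss_rate, "rtt_avg_ms": rtt_avg, "jitter_ms": jitter}
```

For the input `[10.0, 12.0, None, 11.0]` the function reports a loss rate of 0.25, an average RTT of 11.0 ms, and a jitter of 1.5 ms.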
After receiving an acquisition command from the performance management center, the measurement probe starts the acquisition task and places the results in the MIB, where they wait to be read by the performance management center. The execution parameters of the acquisition task are provided by the corresponding control table and by the local configuration file of the measurement probe. The acquisition task has two parts: pinging the target router and tracing the route to the target router, both as specified in RFC 2925, and recording the resulting data. The measurement probe mainly collects seven performance indicators: delay, reachability, availability, round-trip delay jitter, packet loss rate, route, and route jitter. These indicators are measured by sending specific IP packets.
Because development time and manpower were limited, the first version of the measurement probe can monitor only the end-to-end performance of the network layer. The next stage of research will study application-layer performance in more depth, explore the standardization of application-layer measurement, and incorporate it into the measurement probe.
The performance management center serves, on the one hand, as the support system of the measurement probes, providing aggregation, processing, analysis, visualization, and storage of the measured data. On the other hand, by deploying multiple measurement probes under the overall coordination of the performance management center, network supervision tasks in a large-scale network environment can be completed, namely monitoring network running conditions, network resources, and network performance, and alarming on and troubleshooting faults and abnormal conditions. The performance management center is a multi-function entity and the user interface of the system. Specifically, it consists of the following functional modules:
(1) A probe management module. It manages probe availability by interacting with the probes deployed in the probe infrastructure and collecting their running state information, including the utilization of resources such as processor and memory and the working state of each probe. In this module a probe is treated as an object, its availability information and similar data are the object's attributes, and the interfaces for measurement task management, measurement data access, and so on described below are implemented as the object's methods. The performance management center and the probe interact through LDAP (Lightweight Directory Access Protocol).
(2) A measurement task management module. It handles the formulation, loading, execution tracking, and unloading of measurement tasks. Through the user interface it collects the task loading and execution state of the target probe, gives the user a reference for measurement work, and helps the user formulate a measurement task and load it remotely. During task execution, measurement data is collected from the probe with the help of the measurement data processing module, according to the strategy of the measurement task running on the probe. Depending on the task category, the measurement task terminates automatically or on user intervention, the task process is unloaded, and the related probe resources are released.
(3) A measurement data processing module. According to the task strategy, it collects measurement data from the probe through the SNMP Get operation, then preprocesses and stores the data.
(4) A measurement data analysis module. It is a collection of data analysis methods. Depending on the measurement application, one or more of these methods are used to derive performance indicators of the network and of network applications from the measured data, helping the user evaluate the network operating state and the service quality of network applications.
(5) A measurement-based application module. It is a collection of application functions built on network measurement. Measurement data is the basis of these applications, which are essentially analyses of that data. Planning and carrying out measurement tasks from the standpoint of an application makes them more purposeful and more useful. Besides common measurement-based applications such as network availability and network response capability assessment, pathological route warning, network fault point location and assisted troubleshooting, and network attack warning, a novel application is comprehensive analysis of the network operation situation: based on network topology discovery and on measurement data collected from multiple monitoring points in different time periods, it generates a comprehensive situation map of the network under test.
(6) A user interface module. The interaction between the performance management center and the probes is implicit and is driven by management tasks and measurement tasks. The interaction between the performance management center and the user is explicit, and it is this interaction that lets the user control the probes remotely and start and complete measurement tasks and measurement-based applications. Throughout the measurement infrastructure, the performance management center is the only interface provided to the user.
In the current implementation, the performance management center is a multi-functional whole that, at the macro level, integrates directory management, task scheduling, data processing and analysis, and measurement-based applications. This centralized approach works effectively in a measurement environment of moderate scale but is limited when faced with measurement in a large-scale network environment and with measurement-based network management and optimization tasks. The design therefore uses modules with consistent descriptions, consistent interfaces, and consistent calling methods, so that the modules can work tightly coupled in a single operating environment or loosely coupled in a distributed environment.
The functions of the modules that make up the performance management center are clearly divided, and each module can be implemented independently. Through each module's XML-based configuration file, whether cooperating modules call each other locally or remotely is transparent. The XML file describes the module's calling interface and interface parameters as well as its location, so a caller can locate the called module or the cooperating peer module from this description and carry out the associated operations between modules. The design borrows the idea of Web Services, and its greatest advantage is high extensibility. On the one hand, any module implemented against the consistent calling interface can be included in the system. On the other hand, the deployment can be adjusted simply by updating a module's configuration file to obtain better system performance. The system can also be maintained and upgraded more effectively.
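To make the role of the configuration file concrete, the sketch below parses a made-up module description and extracts the calling interface, its parameters, and the location. The patent does not publish its XML schema, so every element name, attribute name, and value in `EXAMPLE_CONFIG` is hypothetical, as is the function `load_module_config`.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict

# Hypothetical module description; the real schema is not given in the source.
EXAMPLE_CONFIG = """
<module name="data_analysis">
  <location mode="remote" host="192.0.2.10" port="8080"/>
  <interface call="analyze">
    <param name="task_id" type="string"/>
    <param name="time_scale" type="int"/>
  </interface>
</module>
"""


def load_module_config(xml_text: str) -> Dict[str, Any]:
    """Extract what a caller needs in order to locate and invoke a module."""
    root = ET.fromstring(xml_text.strip())
    location = root.find("location")
    interface = root.find("interface")
    remote = location.get("mode") == "remote"
    return {
        "name": root.get("name"),
        "remote": remote,
        # A local module needs no endpoint; a remote one is reached by host and port.
        "endpoint": (location.get("host"), int(location.get("port"))) if remote else None,
        "call": interface.get("call"),
        "params": [p.get("name") for p in interface.findall("param")],
    }
```

Under this scheme, moving a module to another host would mean editing only the `location` element, which matches the deployment flexibility the design aims for.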
At present, the interface between the performance management center and the user is local, and the center cannot work in B/S (browser/server) mode. This restricts where work can be done: a user who needs to manage or optimize the network must do so inside the management center, whereas in practice network management personnel often need to monitor the network, troubleshoot, and optimize performance remotely. The system design takes this into account. On the one hand, the modular design makes it convenient to implement both C/S and B/S modes. On the other hand, data, and analysis results in particular, are described in XML, so the display requirements of different working modes can be met by style sheet transformation.
The performance indicators that the probes of the system can measure are mainly those obtainable by active measurement: delay, delay jitter, packet loss rate, reachability, availability, route jitter, and so on. These indicators vary over time, so the analysis functions of the system are mainly statistical methods that compute the mean, variance, extreme values, statistical distribution characteristics, self-similarity, and so on at different time scales.
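A minimal sketch of statistics at different time scales follows, assuming samples arrive at a uniform interval so that a time scale can be expressed as a number of samples per window. It covers only the mean, variance, and extreme values; distribution and self-similarity analysis are left out. The function name `aggregate` and the use of population variance are assumptions.

```python
from statistics import mean, pvariance
from typing import Dict, List


def aggregate(samples: List[float], window: int) -> List[Dict[str, float]]:
    """Split uniformly spaced samples into windows and summarize each window."""
    if window < 1:
        raise ValueError("window must be at least 1")
    summaries: List[Dict[str, float]] = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        summaries.append({
            "mean": mean(chunk),
            "variance": pvariance(chunk),  # population variance of the window
            "min": min(chunk),
            "max": max(chunk),
        })
    return summaries
```

Running the same delay series with `window=2` and then `window=4` gives the fine-grained and coarse-grained views of the same data; for `[1.0, 3.0, 2.0, 6.0]` the coarse view has mean 3.0 and variance 3.5.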
Based on the above analysis function, the system realizes the following applications:
(1) Network availability evaluation. Based on monitoring the network under test over a period of time, the measured delay, delay jitter, packet loss rate, and similar data are analyzed to derive indicators such as the network's unreachable rate, non-busy rate, and unpredictability. If the monitoring is periodic over a long time, the distribution of busy and idle periods of the target network can be inferred, helping network personnel issue guidance on the allocation of network service periods.
(2) Network response capability evaluation. Statistical analysis of the delay and throughput indicators of the network under test yields its capacity to carry service traffic, which helps network management personnel evaluate the responsiveness of the network under test.
(3) Pathological route warning. When a measured reference route, or the actual topology and routing of the network, is known, the routes of the network are checked periodically. A pathological route warning is issued when the path hop count changes substantially or the route jitter frequency exceeds a threshold, prompting network management personnel to investigate further.
(4) Network fault point location and assisted troubleshooting. When measurement data such as reachability, packet loss rate, or delay jitter of the measured network exceed a threshold, a fault warning is issued and fault-location measurement is started automatically or manually according to policy. The measurement is performed segment by segment with the cooperation of multiple probes, so that a performance bottleneck or fault point is found quickly and network management personnel are helped with further troubleshooting.
(5) Network attack warning. When the performance of the monitored network, such as delay, packet loss rate, and reachability, deteriorates rapidly, an alarm is issued to remind network management personnel that the network may be under attack. They can then monitor the network more closely so that attacks are prevented in time and handled as early as possible.
(6) Comprehensive network operation situation map. Based on measurement data collected from multiple monitoring points in different time periods, a comprehensive situation map of the network under test is generated. The map can display the attributes of different layers in real time and can raise anomaly and fault alarms through color marking, sound prompts, and so on, providing an early-warning means against large-scale network attacks. In addition, comprehensive analysis can provide QoS indicator references, giving network management personnel a first-hand basis for traffic engineering and network upgrades.
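The pathological route warning in application (3) lends itself to a short sketch. It assumes each periodic TraceRoute result is available as a list of hop identifiers and that the reference route is known. The thresholds `max_hop_delta` and `max_changes`, their default values, and the function name `route_warnings` are illustrative assumptions; the source does not give concrete thresholds.

```python
from typing import List, Sequence


def route_warnings(
    reference: Sequence[str],
    observed: List[Sequence[str]],
    max_hop_delta: int = 2,
    max_changes: int = 3,
) -> List[str]:
    """Flag large hop-count deviations and frequent route changes."""
    warnings: List[str] = []
    changes = 0
    previous = list(reference)
    for index, path in enumerate(observed):
        # Rule 1: the hop count differs too much from the reference route.
        if abs(len(path) - len(reference)) > max_hop_delta:
            warnings.append(
                f"sample {index}: hop count {len(path)} deviates from reference {len(reference)}"
            )
        # Rule 2: count how often the route differs from the previous sample.
        if list(path) != previous:
            changes += 1
        previous = list(path)
    if changes > max_changes:
        warnings.append(f"route changed {changes} times, above threshold {max_changes}")
    return warnings
```

An empty result means neither rule fired during the observation window; any returned message would be passed to the alarm path for the network management personnel to check.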
The key technical research on multipurpose IP network performance management highlights the following 5 aspects: research and analysis of the open source platform, high-speed data channel design, resource manager design, probe deployment optimization, and the proposal and design of the network operation situation map. Research and analysis of the open source platform is the technical basis of the whole work; without a solid technical foundation and understanding of the underlying layers, the research would be greatly restricted. The high-speed data channel solves the performance and efficiency problems of the probe during data acquisition and transmission. The resource manager applies design pattern concepts to give upper-layer applications a uniform resource management interface, through which applications can use the managed low-level resources efficiently and safely. Probe deployment optimization is an NP problem for which an optimal solution can be obtained only under limited conditions; it is very difficult to study and has high practical value. The idea of the network operation situation map is also proposed and put into practice, giving network management personnel a powerful tool for understanding and managing the network state in real time.
(1) Open source platform research and analysis
NET-SNMP is a very comprehensive development platform that covers all SNMP software development needs, from agents and management systems to testing, and is therefore very complex. Because the probe is not simply developed on top of the platform but must also meet performance and extensibility goals, the NET-SNMP core had to be understood in depth. Given the probe's large data volume and strong real-time requirements, the NET-SNMP core was modified accordingly; given the probe's complex logic, the interface layers above the core were extended and some bugs in the existing interface layers were fixed. So far, a new NET-SNMP development library has been developed, and modification suggestions and a problem list have been submitted to the NET-SNMP maintainers.
(2) Design of high speed data channel
The measurement module of the probe generates a large amount of measurement data, and mapping this data into the MIB of the probe subagent in a timely manner is a major problem in the prior art. Given these characteristics, shared memory is chosen as the basis, and a high-speed data channel is encapsulated on top of it. Shared memory is the fastest and most efficient of the interprocess communication mechanisms in current operating systems. Following the producer-consumer model, the measurement module acts as the producer and continuously generates measurement data, while the agent acts as the consumer and continuously maps the data into the MIB. A corresponding communication protocol and data format are established between the two, so that large amounts of measurement data and complex control information can be transmitted quickly and accurately. With shared memory, the processing speed of the probe is greatly improved: on an ordinary Pentium III PC the probe can sustain an SNMP GET load of 100,000 packets/second.
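The producer-consumer channel described above can be illustrated with a minimal Python sketch. It is not the probe's actual implementation: the record layout (task id, timestamp, RTT in microseconds), the ring-buffer header and the names `Channel`, `drain` and `demo` are assumptions made for illustration, and a plain dict stands in for the MIB. A real channel shared between processes would also need the semaphore protection that the description mentions.

```python
import struct
from multiprocessing import shared_memory

INDEX = struct.Struct("<I")        # one 32-bit ring index
RECORD = struct.Struct("<IdI")     # hypothetical record: task id, timestamp, RTT (us)
HEADER_SIZE = 2 * INDEX.size       # write index at offset 0, read index at offset 4


class Channel:
    """Single-producer, single-consumer ring buffer over a writable buffer."""

    def __init__(self, buf, slots):
        self.buf = buf
        self.slots = slots

    @staticmethod
    def size(slots):
        """Number of bytes needed for the header plus all record slots."""
        return HEADER_SIZE + slots * RECORD.size

    def _write_index(self):
        return INDEX.unpack_from(self.buf, 0)[0]

    def _read_index(self):
        return INDEX.unpack_from(self.buf, INDEX.size)[0]

    def put(self, task_id, timestamp, rtt_us):
        """Producer side: append one record, or return False when full."""
        w, r = self._write_index(), self._read_index()
        if (w + 1) % self.slots == r:
            return False
        RECORD.pack_into(self.buf, HEADER_SIZE + w * RECORD.size,
                         task_id, timestamp, rtt_us)
        INDEX.pack_into(self.buf, 0, (w + 1) % self.slots)  # publish the record
        return True

    def get(self):
        """Consumer side: pop the oldest record, or return None when empty."""
        w, r = self._write_index(), self._read_index()
        if w == r:
            return None
        record = RECORD.unpack_from(self.buf, HEADER_SIZE + r * RECORD.size)
        INDEX.pack_into(self.buf, INDEX.size, (r + 1) % self.slots)
        return record


def drain(channel, mib):
    """Map every pending record into a dict standing in for the MIB."""
    count = 0
    while (record := channel.get()) is not None:
        task_id, _timestamp, rtt_us = record
        mib.setdefault(task_id, []).append(rtt_us)
        count += 1
    return count


def demo():
    """Exercise the channel over a real shared memory segment."""
    slots = 4
    shm = shared_memory.SharedMemory(create=True, size=Channel.size(slots))
    try:
        shm.buf[:HEADER_SIZE] = bytes(HEADER_SIZE)  # start with both indices at 0
        channel = Channel(shm.buf, slots)
        channel.put(1, 0.0, 1500)
        return channel.get()
    finally:
        shm.close()
        shm.unlink()
```

One slot is deliberately left empty so that a full buffer can be told apart from an empty one; the producer only ever writes the write index and the consumer only the read index, which keeps the two sides from overwriting each other's progress.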
(3) Design of resource manager
The probe involves a relatively large number of system resources, and managing and using them is both a key point and a difficulty of the probe design. To solve this problem, we design and implement a resource manager that uses the session facade pattern to provide a uniform interface to the other subsystems and modules. The core of the resource manager is a process table whose structure follows the MIB structure, so the usage of the probe's internal resources can be viewed in real time from the performance management center. Because the MIB structure is used, the process table can be handled and called through the same interfaces as other MIB table entries, which unifies structure and interfaces. The process table records in detail the process ID, the shared memory ID, the semaphore index, the current running state of the measurement process, and pointers to the control table entry and result table entry associated with the measurement process. All resources are bound to a specific process. When the process exits, the corresponding entry is found by process ID, the shared memory, the semaphore and the related MIB memory are reclaimed in turn using the registered information, and the corresponding CtlTable, ResultTable, HistoryTable and traceRouteHopsTable entries are deleted. With the resource manager, the probe solves the problem of complex resource management, keeps a clean system structure, lays a solid foundation for probe development and maintenance, improves stability, and avoids resource leaks.
Figure BDA0002805387760000071
Figure BDA0002805387760000072
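The process-table bookkeeping described in this section can be sketched as follows. This is an illustrative Python model under stated assumptions, not the probe's code: `ProcessRecord`, `ResourceManager`, the `released` log and the dict-based tables are hypothetical stand-ins for the MIB-structured process table and for the real shared memory and semaphore handles.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessRecord:
    """One process-table row; the field names are illustrative."""
    pid: int
    shm_id: str
    sem_index: int
    state: str = "running"
    table_rows: list = field(default_factory=list)  # (table name, row key) pairs


class ResourceManager:
    """Single entry point that binds every resource to its owning process."""

    def __init__(self, tables):
        self.tables = tables        # e.g. {"CtlTable": {...}, "ResultTable": {...}}
        self.process_table = {}     # pid -> ProcessRecord
        self.released = []          # log of reclaimed resources, in order

    def register(self, record):
        self.process_table[record.pid] = record

    def on_exit(self, pid):
        """Reclaim everything registered for `pid`; False if the pid is unknown."""
        record = self.process_table.pop(pid, None)
        if record is None:
            return False
        self.released.append(("shm", record.shm_id))     # shared memory first
        self.released.append(("sem", record.sem_index))  # then the semaphore
        for table, key in record.table_rows:             # then the table entries
            self.tables.get(table, {}).pop(key, None)
        return True
```

Because every resource is reachable from the process record, cleanup needs only the process ID, which is the property the description relies on to avoid resource leaks.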
(4) Design of probe deployment optimization algorithm
Probe deployment is a complex problem constrained by many factors, such as:
1) Since the probe measures the end-to-end performance of the network actively, probes must be placed so as to minimize intrusion on the network.
2) Probe placement depends on the location of the links to be monitored; network bottleneck points are the preferred locations for probes.
3) The number of probes involves a cost-benefit trade-off: deploying more probes increases measurement accuracy, but cost and intrusion on the network also rise.
In the measurement infrastructure of the present invention, the deployed probes are the executors of the measurement actions and the source of the measurement data. The first goal is therefore to place the probes so that they can measure all end-to-end paths; the second is to optimize the deployment on that basis and reduce the number of probes deployed.
The basic idea of the algorithm is as follows. On the premise that the end-to-end path performance of an end system depends mainly on the end-to-end performance of the backbone network, a network probe is first placed at an edge router of a domain (located at the exit of an IP address cluster), and the performance of each hop-by-hop link (HOP) is evaluated. If the link is not a bottleneck link, the probe is moved to the next router; if the link is a bottleneck link, the probe stays where it is. When two or more network probes meet, only one is kept.
The algorithm is divided into two stages: generating the probe topology graph, and generating the mapping table between probe addresses and subnet addresses. Preconditions of the node convergence algorithm:
1) The topology of the measured network is known as an undirected connected graph G, whose endpoints and edges are labeled R, C and E, respectively.
2) The performance state of each hop link is evaluated, and each edge is assigned a weight of 0 or 1 accordingly.
Optimization goals of the node convergence algorithm:
1) By placing a minimum number of network probes, all end-to-end paths of the network can be measured.
2) By shortening the distance between network probes, the path length intruded on by active measurement is shortened.
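The convergence idea can be sketched in Python as follows. The sketch rests on assumptions the text does not state: weight 1 is taken to mark a bottleneck link and weight 0 a non-bottleneck link, and the "next router" toward the backbone is supplied as a hypothetical `next_hop` mapping. The function name `converge_probes` is likewise illustrative.

```python
def converge_probes(next_hop, weight, edge_routers):
    """Return the routers that still hold a probe after convergence.

    next_hop: router -> next router toward the backbone (None at the core).
    weight:   (router, next router) -> 0 for a normal link, 1 for a bottleneck.
    """
    positions = set()
    for router in edge_routers:
        node = router
        # Move inward while a next hop exists and the link to it is not a bottleneck.
        while next_hop.get(node) is not None and weight[(node, next_hop[node])] == 0:
            node = next_hop[node]
        positions.add(node)  # the set keeps one probe when several meet on a router
    return positions


topology = {"e1": "r1", "e2": "r1", "e3": "r2", "r1": "core", "r2": "core", "core": None}
links = {("e1", "r1"): 0, ("e2", "r1"): 0, ("e3", "r2"): 1,
         ("r1", "core"): 1, ("r2", "core"): 0}
placement = converge_probes(topology, links, ["e1", "e2", "e3"])
```

In this example the probes starting at e1 and e2 both advance to r1 and merge there, because the link from r1 to the core is a bottleneck, while the probe at e3 stays in place because its first link is already a bottleneck.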
Let $T_{ij}$ denote the communication trust of adjacent network probes, which may be described as
Figure BDA0002805387760000081
Since data transmission is sometimes not timely or accurate, a credibility coefficient $\lambda_i$ can be introduced into the communication trust, where k denotes the number of nodes.
Figure BDA0002805387760000082
Energy trust is described as the energy trust value between two adjacent probes. For any energy value between two adjacent probes:
$CS_{ij} = \lambda S_{ij} + (1-\lambda) S_{ij}^{2}$
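As a worked illustration of the energy trust expression above, the following Python sketch evaluates it numerically. The ranges are assumptions: the text does not say that the energy value or the coefficient λ is normalized, so the check that λ lies in [0, 1] and the function name `combined_energy_trust` are illustrative only.

```python
def combined_energy_trust(s_ij: float, lam: float) -> float:
    """Evaluate CS_ij = lam * S_ij + (1 - lam) * S_ij ** 2."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam is assumed to lie in [0, 1]")
    return lam * s_ij + (1.0 - lam) * s_ij ** 2
```

For example, with an energy value of 0.5 and λ = 0.5 the result is 0.5 × 0.5 + 0.5 × 0.25 = 0.375; with λ = 1 the expression reduces to the energy value itself.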
Data factor trust is described as
Figure BDA0002805387760000083
The consistency trust of the data collected by a node N adjacent to this node is $T_j$:
Figure BDA0002805387760000091
(5) Proposal and design of the network operation situation map
Beyond implementing some common measurement-based applications, the invention also addresses the question of how to create a measurement-based application of national practical value. To this end, we propose a new measurement-based application, the network operation situation map. The network operation situation map is a comprehensive situation map of the measured network, generated from measurement data collected by multiple monitoring points over different time periods. The map can intuitively play back attributes of different layers in real time, can raise anomaly and fault alarms through color marking, sound prompts and the like, and provides an early warning means against large-scale network attacks. In addition, comprehensive analysis can provide QoS index references, giving network managers a first-hand basis for traffic engineering and network upgrades. The basis for building the comprehensive network operation situation map is to obtain the topology graph of the measured network, generate a reasonable probe deployment scheme from it, and carry out the deployment. Through cooperative measurement by the probes deployed on the network, the end-to-end performance of the paths between all probes and the route tracing information between the boundary probes of the measured network are obtained. By integrating the measurement data, indexes such as delay, delay jitter, packet loss rate and throughput are displayed on the graph as attributes of each path segment and updated in real time; when a route changes, the parts of the current route that do not match the reference route are shown dynamically on the topology graph in different colors.
In this way, network managers can grasp the overall state of the measured network, detect anomalies and faults at any time, and quickly locate fault points and anomaly points so that they can be corrected in time. Because the measurement data vary over time, the network situation in a past time period can be replayed from the stored measurement data, which helps to uncover possible periodic problems and provides a strong basis for network upgrade or reconstruction.
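The route comparison behind the situation map can be illustrated with a short Python sketch. It assumes that a route is represented as an ordered list of hop addresses and that hops are compared position by position against the reference route; the color names and the functions `route_mismatches` and `color_hops` are hypothetical.

```python
def route_mismatches(reference, current):
    """Return the indices of hops in `current` that differ from `reference`."""
    mismatched = []
    for i, hop in enumerate(current):
        if i >= len(reference) or reference[i] != hop:
            mismatched.append(i)
    return mismatched


def color_hops(reference, current, normal="green", changed="red"):
    """Pair each current hop with a display color for the topology graph."""
    changed_idx = set(route_mismatches(reference, current))
    return [(hop, changed if i in changed_idx else normal)
            for i, hop in enumerate(current)]


reference_route = ["10.0.0.1", "10.0.1.1", "10.0.2.1"]
current_route = ["10.0.0.1", "10.0.9.1", "10.0.2.1"]
colored = color_hops(reference_route, current_route)
```

Here only the second hop differs from the reference route, so it is the only one marked with the "changed" color; a display layer could redraw just that segment of the topology graph.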
Drawings
FIG. 1: a multi-purpose IP network measurement probe architecture.
FIG. 2: network probe module frame diagram.
FIG. 3: ping measurement module structure diagram.
FIG. 4: Working states of the measurement probe.
FIG. 5: Composition of the performance management center.
FIG. 6: Delay jitter, real host versus hardware probe.
FIG. 7: Influence of the number of selected probes on network measurement accuracy.
FIG. 8: End-to-end delay test measurement results.
Detailed Description
Fig. 1 shows the architecture of the multi-purpose IP network measurement probe. The network measurement probe receives collection tasks from the performance management center, executes the corresponding Ping and TraceRoute actions, collects the corresponding network quality data, and waits for the performance management center to actively read the measurement results. The performance management center can be any management system or management software that conforms to the SNMP industry standard, can load standard MIB files, and can send and receive standard SNMP protocol messages. With this manager-agent structure, a larger number of inexpensive network measurement probes can be deployed across the whole network while the number of management centers is reduced, which greatly lowers the construction cost and staff maintenance cost of the performance management system.
In Fig. 2, all modules, data tables and the shared memory other than the Ping measurement module and the TraceRoute measurement module reside in the subagent process. ProcessTable and CtlTable are control tables: ProcessTable is used by the agent process to control the measurement processes, while CtlTable, which can be pingCtlTable or traceRouteCtlTable, is used to control the corresponding measurement operation of a measurement process. ResultTable and HistoryTable store measurement result data and measurement history data. The subagent process periodically scans ProcessTable and CtlTable to check whether a new measurement task has been issued; if so, it creates a process and schedules the corresponding measurement program. CtlTable entries are set and deleted by the management center through SNMP SET operations, and ProcessTable holds the process control entries that the subagent generates from CtlTable. In addition, the subagent periodically calls the state and data detection module to process the measurement results in the shared memory area specified in each ProcessTable entry and maps them into the corresponding ResultTable and HistoryTable in the subagent.
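The periodic scan described above can be sketched in Python. The tables are modeled as plain dicts keyed by task name, and `spawn` is a hypothetical launcher that starts a measurement process and returns its process ID and shared memory ID; none of these names come from the patent or from NET-SNMP.

```python
from dataclasses import dataclass


@dataclass
class ProcessEntry:
    """Process control entry generated from one CtlTable row (illustrative)."""
    pid: int
    shm_id: str
    ctl_key: str


def scan_once(ctl_table, process_table, spawn):
    """Start a measurement process for every CtlTable row that has none yet."""
    started = []
    for key, row in ctl_table.items():
        if key in process_table:
            continue                   # task already has a measurement process
        pid, shm_id = spawn(row)       # hypothetical launcher
        process_table[key] = ProcessEntry(pid, shm_id, key)
        started.append(key)
    return started
```

Calling `scan_once` on a timer reproduces the behavior in the text: rows written by SNMP SET show up in the control table, and the next scan turns each new row into a process control entry.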
Fig. 3 shows the structure of the Ping measurement module. The TraceRoute measurement module follows the same measurement flow as the Ping measurement module, so its module structure is the same as that in Fig. 3. The TraceRoute measurement module exists as a dynamic library. When executed, it carries out the agent's collection task in a child process, places the collection result in the designated shared memory area, and waits for the data and state detection module to map the result into the MIB. The TraceRoute measurement module traces the route to the target router following the standard TraceRoute defined in RFC 2925 and takes the data obtained as the collection result. The TraceRoute program mainly collects three performance indicators: per-hop delay, per-hop packet loss, and the route table. These indicators are obtained by sending UDP packets with a specified TTL value to a specific port and receiving the ICMP messages returned in response: Time Exceeded from intermediate routers and Port Unreachable from the target. The interaction mode and conventions between the TraceRoute measurement process and the subagent process are the same as those of the Ping measurement process, and the data exchange format follows the specifications of traceRouteResultsTable, traceRouteProbeHistoryTable and traceRouteHopsTable in RFC 2925.
Fig. 4 shows the operating state machine of the measurement probe. SET I denotes a command sent by the management center as an SNMP SET to drive the measurement probe to execute a collection task; SET III denotes a command sent by the management center as an SNMP SET to terminate a collection task on the measurement probe; GET I denotes the management center actively using an SNMP GET to read the results currently collected for a collection task held locally on the measurement probe; GET II denotes the management center actively using an SNMP GET to read the local state parameters of the measurement probe. Any alarm raised in the process is sent immediately via Syslog to a configured network alarm processing center that complies with the Syslog standard.
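The four commands can be modeled with a small Python state machine. Since Fig. 4 itself is not reproduced here, the state names "idle" and "measuring", the return values and the class `ProbeStateMachine` are assumptions; only the command labels SET I, SET III, GET I and GET II come from the text.

```python
class ProbeStateMachine:
    """Toy model of how a probe reacts to the management center's commands."""

    def __init__(self):
        self.state = "idle"       # assumed initial state
        self.results = []         # collected results of the current task

    def handle(self, command):
        if command == "SET I":    # start a collection task
            self.state = "measuring"
            return "started"
        if command == "SET III":  # terminate the collection task
            self.state = "idle"
            return "stopped"
        if command == "GET I":    # read the results collected so far
            return list(self.results)
        if command == "GET II":   # read the probe's local state parameters
            return {"state": self.state}
        raise ValueError("unknown command: " + command)
```

The two GET commands never change the state, which matches the description of the management center reading results and state parameters on its own initiative while the probe keeps measuring.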
Fig. 5 shows the cooperation between the modules of the performance management center. There are three kinds of interaction between the management center and the probe. The first is interaction for the purpose of managing measurement infrastructure resources, implemented with the LDAP protocol. The second is interaction for the purpose of managing measurement tasks: the task management module of the management center sends task-related instructions to the probe through SNMP SET operations and obtains the probe's task execution state through GET operations. The third is interaction for the purpose of collecting measurement data: under the scheduling of the task management module, the measurement data processing module of the management center reads measurement data from the probe through SNMP GET operations. One further possible operation is not shown in Fig. 5. If the management center acts as a service provider for data processing and analysis, a probe working standalone can process and analyze data by remotely calling the data analysis service provided by the management center, which yields some lightweight probe-side measurement applications. Because this is done through remote calls, its flexibility is limited; it suits lightweight applications that are fixed on the probe and can be extended only when the probe is upgraded. The performance management center serves as the sole user interface: the user can manage probes intuitively as resources, conveniently supervise probe tasks and, most importantly, use measurement-based applications in a visual environment to achieve the goals of network management and network optimization.
Fig. 6 compares the delay jitter measured by a real host with that measured by a hardware probe. The network probe reflects network performance mainly by measuring network parameters, so the accuracy of the parameters it measures is a major concern. Taking delay jitter, one of the parameters measured by the probe, as an example, a news website was tested, with a measurement taken every half hour between 9:00 and 18:00. As Fig. 6 shows, the delay jitter measured by the hardware probe is small when that of the real host is small, and large when that of the real host is large. The measurement data of the probe are thus basically consistent with those of the real host, and the operating condition of the network can be assessed from the measurement data of the network probe.
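The text does not give the formula used for delay jitter, so the following Python sketch adopts one common definition as an assumption: the mean absolute difference between consecutive delay samples. The half-hourly sampling times are generated on the further assumption that both 9:00 and 18:00 are included. The function names are illustrative.

```python
def delay_jitter(delays_ms):
    """Mean absolute difference between consecutive delay samples (assumed definition)."""
    if len(delays_ms) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    return sum(diffs) / len(diffs)


def sample_times(start_hour=9, end_hour=18, step_min=30):
    """List the measurement times as HH:MM strings, both ends included."""
    times = []
    minute = start_hour * 60
    while minute <= end_hour * 60:
        times.append(f"{minute // 60:02d}:{minute % 60:02d}")
        minute += step_min
    return times
```

Under these assumptions the test window yields 19 sampling points, and the same `delay_jitter` function can be applied to the host's samples and the probe's samples to compare the two series.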
Fig. 7 shows the influence of the number of selected probes on network measurement accuracy. The network probe plays an extremely important role in measuring network performance; in practice it can measure the performance parameters of a network and can also be used to detect network faults. The selection of network probes affects the accuracy of the measured parameters: the more appropriately the probes are selected, the higher the accuracy of the measured parameters and the more accurate and rapid the detection of network faults. To verify the actual relationship between the number of network probes and the optimization rate of the network performance parameters, a local area network was tested and evaluated according to the availability of the test parameters to obtain the conformity of the network parameters. In Fig. 7 the abscissa is the number of selected network probes and the ordinate is the conformity of the measured network parameters. The figure shows that the conformity rises as the number of selected probes increases and, once the number reaches a certain value, tends to a fixed value. There is therefore an optimal number of network probes for a given network environment; for the local area network tested, the optimal number is 15.
Fig. 8 shows the end-to-end delay test measurement results. The network probe performs end-to-end data monitoring and is used here to monitor end-to-end link and host performance. The test conditions are as follows: 500 specific IP packets are sent with an initial length of 50 B at intervals of 200 ms, the packet length is increased by 3 B after every 50 packets, and UDP is used. The measurement is made from a source host to a target host. The monitoring result is shown in Fig. 8: the link delay variation tends to be stable, and the host performance is relatively good.
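The test conditions can be turned into an explicit sending schedule. The Python sketch below assumes that 50 B is the length of the first 50 packets and that the length grows by 3 B for each further group of 50; the function names are hypothetical, and only the schedule is computed, no packets are sent.

```python
def packet_length(index, base=50, step=3, group=50):
    """Length in bytes of the packet with the given 0-based index."""
    return base + step * (index // group)


def schedule(count=500, interval_s=0.2):
    """Return (send time in seconds, packet length in bytes) for every packet."""
    return [(round(i * interval_s, 3), packet_length(i)) for i in range(count)]


plan = schedule()
```

Under this reading the packets grow from 50 B to 77 B over ten groups, the last packet is sent 99.8 s after the first, and the whole run carries 31,750 B of test packets.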

Claims (2)

1. A method for implementing a network performance management measurement probe for the Internet of Things, characterized in that an architecture model of a multipurpose IP network probe is designed. The system consists of a master control model, an algorithm model and an observation model, and conforms to the SNMP standard. The comprehensive situation map of the measured network intuitively provides real-time playback of attributes of different layers. The main control module is divided into three parts: a subagent, a monitoring model and a processing model. The subagent initialization model reads and parses the configuration file, establishes an AgentX protocol connection to snmpd, and starts monitoring and sending and receiving messages. In a wireless network the monitoring model is expressed as
Figure FDA0002805387750000017
The semaphore is 0 before monitoring starts and 1 once monitoring has started. The signal confidence level is expressed as a signal processing threshold, and the processor confidence level is expressed as a processor communication threshold.
The signal communication degree between two adjacent probes in a wireless sensor network is calculated as
Figure FDA0002805387750000018
where x denotes two adjacent signal points. The signal communication trust between adjacent probes $N_1$ and $N_2$ is expressed as
Figure FDA0002805387750000011
According to the dynamic characteristics of the network, CtlTable, ResultTable and HistoryTable are registered with the snmpd master agent, and the maximum value of the signal is set as
Figure FDA0002805387750000012
Processor communication is calculated as follows: the processor identification processing threshold between adjacent probes is denoted f(x), N is the signal processed by the adjacent signal processor, and μ is the number of signals processed. T denotes the direct processing confidence of adjacent processors. The processor begins to receive various signals, including timing signals and signals triggered by SNMP SET messages; if a new measurement task is issued,
Figure FDA0002805387750000013
the processor processing of the data change is represented as
Figure FDA0002805387750000014
Figure FDA0002805387750000015
The neighboring probe processor has a confidence of
Figure FDA0002805387750000016
The signal processor then triggers the corresponding operation, creates a new measurement process, allocates resources such as shared memory and semaphores, and finally registers all resource usage information in the process control table for use by the data and state detection module.
The data and state detection module is the most complex and central module in the subagent. The detection of data and state determines the accuracy evaluation function of the measurement probe, which is expressed as G(s):
$G(s) = \oint (S^{2} + 2S)\,ds + F_i + S(x)_i$
The constraint on the evaluation function is probe utility detection, expressed as
Figure FDA0002805387750000021
Is the current situation.
Its main function is to periodically query the working state of each measurement process; if a measurement process has exited, all resources it occupied are reclaimed.
Traditional memory access is slow and inefficient, so shared memory is introduced to improve memory efficiency. The shared memory is denoted F and is expressed as
Figure FDA0002805387750000022
The shared memory is accessed through semaphores. The semaphore is expressed as
Figure FDA0002805387750000023
And various data entry resources pnx, and the like.
According to the shared memory ID and semaphore ID specified in the process control table, the measurement result information is read from the specified shared memory area and converted into entries of the subagent's result table, history table and hops table.
Shared memory is chosen as the basis, and a high-speed data channel is encapsulated on top of it; shared memory is the fastest and most efficient of the interprocess communication mechanisms in current operating systems. Shared memory efficiency is denoted $I_t$:
Figure FDA0002805387750000024
When shared memory is used, $\varepsilon_{it}$ and the efficiency are increased.
Combining the shared memory, the semaphore and the shared memory efficiency yields a sharing factor W, expressed as
$W = aF_i + bS(x) + cI_t$
Large amounts of measurement data and complex control information can thus be transmitted quickly and accurately. With shared memory, the processing speed of the probe is greatly improved, and an SNMP GET load of 100,000 packets/second can be sustained on an ordinary Pentium III PC.
2. The method as claimed in claim 1, wherein a resource manager R is designed and implemented for more rational resource management. With the resource manager, the probe solves the problem of complex resource management, keeps a clean system structure, lays a solid foundation for probe development and maintenance, improves stability, and avoids resource leaks.
Figure FDA0002805387750000031
The resource problem can be expressed as
Figure FDA0002805387750000032
The session facade pattern is used to provide a uniform interface to the other subsystems and modules. The core of the resource manager is a process table whose structure follows the MIB structure, so the usage of the probe's internal resources $C_i$ can be viewed in real time from the performance management center:
Figure FDA0002805387750000033
With the resource manager, the probe solves the problem of complex resource management, keeps a clean system structure, lays a solid foundation for probe development and maintenance, improves stability, and avoids resource leaks.
The method for implementing the internet-of-things-oriented network performance management measurement probe as recited in claim 1, wherein a probe deployment optimization algorithm is designed, X and Y represent data acquired by different probe deployments, and j is an acquisition number.
Figure FDA0002805387750000034
Figure FDA0002805387750000035
Figure FDA0002805387750000036
The probe deployment optimization algorithm is divided into two stages: generating the probe topology graph, and generating the mapping table between probe addresses and subnet addresses. Preconditions of the node convergence algorithm:
1) the topology of the measured network is known as an undirected connected graph G, whose endpoints and edges are labeled R, C and E, respectively;
2) the performance state of each hop link is evaluated, and each edge is assigned a weight of 0 or 1 accordingly;
Optimization goals of the node convergence algorithm:
1) all end-to-end paths of the network can be measured by placing a minimum of network probes;
2) by shortening the distance between network probes, the path length invaded by active measurement can be shortened.
The trust types on which the game among network probes is based include communication trust and energy trust.
Let $T_{ij}$ denote the communication trust of adjacent network probes, which may be described as
Figure FDA0002805387750000041
Since data transmission is sometimes not timely or accurate, a credibility coefficient $\lambda_i$ can be introduced into the communication trust, where k denotes the number of nodes.
Figure FDA0002805387750000042
Energy trust is described as the energy trust value between two adjacent probes. For any energy value between two adjacent probes:
$CS_{ij} = \lambda S_{ij} + (1-\lambda) S_{ij}^{2}$
Data factor trust is described as
Figure FDA0002805387750000043
The consistency trust of the data collected by a node N adjacent to this node is $T_j$:
Figure FDA0002805387750000044
And the number of times of data acquisition consistency of the probe n is equal to the number of times of data acquisition inconsistency of the probe n.
CN202011367259.1A 2020-11-29 2020-11-29 Implementation method of network performance management measurement probe for Internet of things Active CN114257486B (en)


Publications (2)

Publication Number Publication Date
CN114257486A true CN114257486A (en) 2022-03-29
CN114257486B CN114257486B (en) 2024-06-04

Family

ID=80789529





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant