CN112532433B

CN112532433B - Universal network equipment fault analysis method based on ping and current characteristics

Info

Publication number: CN112532433B
Application number: CN202011302531.8A
Authority: CN
Inventors: 傅如毅; 宣仲飞
Original assignee: ZHEJIANG YUANWANG COMMUNICATION TECHNOLOGY CO LTD
Current assignee: ZHEJIANG YUANWANG COMMUNICATION TECHNOLOGY CO LTD
Priority date: 2020-11-19
Filing date: 2020-11-19
Publication date: 2023-04-07
Anticipated expiration: 2040-11-19
Also published as: CN112532433A

Abstract

The invention provides a universal network equipment fault analysis method based on ping and current characteristics, which comprises the steps of carrying out periodic current collection, calibration and analysis on an IO power supply port through a front-end management host, carrying out fault detection and analysis on network terminal equipment according to an equipment current empirical value, a network packet loss rate and network delay, and carrying out equipment fault alarm reporting and automatic restart processing according to user configuration. The method can realize the monitoring of the faults of the general network equipment, and can greatly improve the operation and maintenance efficiency of the network equipment through the functions of remote acquisition, alarming, control, restart recovery and the like, avoid manual inspection and save the operation and maintenance cost.

Description

Universal network equipment fault analysis method based on ping and current characteristics

[ technical field ]

The invention relates to the technical field of network equipment monitoring, in particular to a universal network equipment fault analysis method based on ping and current characteristics.

[ background of the invention ]

In recent years, to meet increasingly complex public-security requirements, governments in many regions have vigorously promoted the construction of video monitoring systems. Video monitoring points are set up in locations such as areas with complex public-security conditions, sections where cases occur frequently, main streets, key sites, important intersections and checkpoints, gradually forming a video monitoring network that covers entire cities, villages and towns, so that the social benefit of the monitoring system is effectively realized. First, the video private-network monitoring system greatly improves the ability to discover and handle illegal and criminal acts in a timely manner and provides first-hand on-site data for solving such cases; second, it substantially strengthens traffic management; third, it effectively deters illegal and criminal activities and improves public safety. By browsing live video, playing back recordings, downloading data and applying monitoring video, public security organs gain an intuitive grasp of regional public-security dynamics and comprehensively improve their ability to respond to and monitor emergencies, mass incidents and major security events.

In the Xueliang ("Sharp Eyes") project and the Skynet project, in addition to video monitoring equipment, other network equipment such as network fill-light equipment and Wi-Fi probes is deployed in many places. Because different network devices have different message formats, communication protocols and service characteristics, performing message-feature analysis and service-feature analysis for every type of network device is neither practical nor necessary. However, a front-end management host dedicated to the operation and maintenance of video monitoring front-end terminal equipment needs to perform hierarchical fault detection according to service priority. Core service equipment receives both dedicated service analysis and universal feature analysis; the universal equipment analysis method also applies to core service equipment. Therefore, to improve the operation and maintenance efficiency of the video monitoring front-end management equipment, fault detection and analysis of universal network equipment is also important, and a universal network equipment fault analysis method based on ping and current characteristics is provided.

[ summary of the invention ]

The invention aims to solve the problems in the prior art, and provides a universal network equipment fault analysis method based on ping and current characteristics, which can monitor the fault of the universal network equipment, improve the operation and maintenance efficiency of the network equipment and save the operation and maintenance cost.

In order to achieve the purpose, the invention provides a universal network equipment fault analysis method based on ping and current characteristics, which sequentially comprises the following steps:

S1, configuring access equipment parameter information: parameters are configured through the central platform for the access equipment connected to an IO power supply port on the front-end management host;

S2, starting automatic current collection and automatic empirical-value learning: the automatic current collection mode and the automatic empirical-value learning function of the access equipment are started through the central platform, the front-end management host periodically collects the current effective value of the network equipment, and the central platform analyzes the data to obtain a current empirical value range;

S3, issuing the fault detection parameters, the fault reporting parameters and the current empirical values to the front-end management host through the central platform;

S4, power supply fault detection: the front-end management host compares the periodically collected current effective value of the network equipment against the access equipment parameters, the current empirical value range and the fault detection parameters issued by the user; if the power supply is normal, the method proceeds to step S5 to start network fault detection; otherwise, whether the equipment needs to be restarted is determined according to user configuration;

S5, network fault detection: detection is carried out uniformly by the central platform, which first stores the configuration information of the network equipment in a network equipment fault detection data table, then starts a group of network fault detection threads that periodically send ICMP messages to the network equipment; if the network packet loss rate and the network delay of the network equipment exceed the thresholds configured by the user, a suspected network abnormality alarm is reported, and whether the equipment needs to be restarted is determined according to user configuration.

Preferably, the parameter configuration in step S1 includes: the device type, the device name, the brand model, the rated power, the error range, the power switch, the switching time, the ip address, the login user name and the password (optional) and the like.

Preferably, in step S2, the front-end management host collects the current effective value of the network equipment every 200 ms over the RS-485 bus (internally, the IO power supply detection collects one current effective value every 20 ms) and generates a maximum value, a minimum value and an average value; it then reports the data to the central platform every 10 minutes for data analysis to obtain the current empirical value range (maximum value and minimum value) for that brand, model and specification.
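The acquisition schedule of step S2 can be illustrated with a minimal Python sketch. It assumes the host keeps the 200 ms readings of one 10-minute reporting window in a list; the names `CurrentReport`, `summarize` and `merge_reports` are hypothetical and do not come from the patent, and taking the overall minimum and maximum across reports is only one plausible reading of the "data analysis" performed by the central platform.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

SAMPLE_PERIOD_MS = 200            # one effective value read every 200 ms (from the text)
REPORT_PERIOD_MS = 10 * 60_000    # statistics reported every 10 minutes (from the text)
SAMPLES_PER_REPORT = REPORT_PERIOD_MS // SAMPLE_PERIOD_MS  # readings per report window


@dataclass
class CurrentReport:
    """Hypothetical record the host sends to the central platform."""
    maximum: float
    minimum: float
    average: float


def summarize(samples: List[float]) -> CurrentReport:
    """Reduce one reporting window of effective current values to max/min/avg."""
    if not samples:
        raise ValueError("no samples in reporting window")
    return CurrentReport(max(samples), min(samples), sum(samples) / len(samples))


def merge_reports(reports: Iterable[CurrentReport]) -> Tuple[float, float]:
    """Assumed platform-side analysis: overall (minimum, maximum) across reports."""
    reports = list(reports)
    if not reports:
        raise ValueError("no reports to merge")
    return (min(r.minimum for r in reports), max(r.maximum for r in reports))
```

With a 200 ms read period and a 10-minute report period, each report window covers 3000 readings.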

Preferably, in step S3, the fault detection parameters include: a detection enable switch, a network detection period, a network packet loss rate, a network delay and the like.

Preferably, in step S3, the fault reporting parameters include alarm reporting information and whether to restart on fault, and the alarm reporting information includes an alarm type, an alarm name, a severity level, whether to report, a remark description and the like.

Preferably, in step S3, the empirical current value includes a maximum current value and a minimum current value.

Preferably, in step S4, if the collected current effective value does not fall within the empirical value range, the power supply of the network equipment is considered abnormal and a counter is incremented by one; if the power supply of the network equipment is abnormal 3 times in a row, a network equipment power supply abnormality is reported, and whether the equipment needs to be restarted is determined according to user configuration.

Preferably, in step S5, the main fields of the network device fault detection data table include a host number, a slot number, a port number, a device type, a device name, an IP address, a network packet loss rate, a network delay, a user name, and a password (which may be selected according to actual requirements).

Preferably, in step S5, the network failure detection thread uses a thread pool, and the number of threads can be automatically increased or decreased.

The invention performs periodic current collection, calibration and analysis on the IO power supply port through the front-end management host, performs fault detection and analysis on network terminal equipment according to the equipment current empirical value, the network packet loss rate and the network delay, and performs equipment fault alarm reporting and automatic restart according to user configuration. For universal network equipment, fault detection is carried out separately from the network side and the power supply side, and alarms are suppressed according to service priority. The network fault detection parameters and the power fault detection parameters therefore need to be configured separately. Because the power supply abnormality alarm of the network equipment has a higher priority than the network abnormality alarm, a suppression relation exists when alarms are reported.
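The suppression relation between the two alarm types can be sketched as follows. This is an illustrative Python fragment only: the enum `Alarm` and the function `alarm_to_report` are hypothetical names, and the sketch assumes that suppression simply means a network alarm is not reported while a power supply alarm is active for the same device.

```python
from enum import IntEnum
from typing import Optional


class Alarm(IntEnum):
    """Hypothetical alarm types; a larger value means a higher priority."""
    NETWORK_ABNORMAL = 1
    POWER_ABNORMAL = 2


def alarm_to_report(power_abnormal: bool, network_abnormal: bool) -> Optional[Alarm]:
    """Return the single alarm to report for one device, or None if it is healthy.

    Assumption: a power supply alarm suppresses a simultaneous network alarm,
    because an unpowered device cannot answer ping anyway.
    """
    if power_abnormal:
        return Alarm.POWER_ABNORMAL
    if network_abnormal:
        return Alarm.NETWORK_ABNORMAL
    return None
```

This ordering also matches the flow of steps S4 and S5, in which network detection only starts once the power supply has been found normal.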

The invention has the beneficial effects that:

1. Errors caused by environmental factors and by the current collection components can be calibrated through automatic learning and configuration of the current empirical value, which prevents false alarms.

2. The real-time current value of the network equipment is compared with the working empirical value to judge whether the power supply of the network equipment is normal.

3. ICMP messages are periodically sent to the network equipment, and the network packet loss rate and the network delay are measured to judge whether the network communication of the network equipment is normal.

4. Functions such as remote collection, alarming, control and restart recovery greatly improve the operation and maintenance efficiency of the network equipment, avoid manual inspection and save operation and maintenance costs.

The features and advantages of the present invention will be described in detail by embodiments in conjunction with the accompanying drawings.

[ description of the drawings ]

FIG. 1 is a system layout framework diagram of the present invention;

FIG. 2 is a flow chart of parameter configuration of the present invention;

FIG. 3 is a flow chart of the current collection and empirical value auto-learning of the present invention;

FIG. 4 is a flow chart of the device power anomaly detection of the present invention;

FIG. 5 is a flow chart of the device network anomaly detection of the present invention.

[ detailed description ]

1. Environment building and configuration

First, connect the RJ45 port (WAN port) of front-end management host 1 to the optical modem (as shown in FIG. 1). Second, connect an IO expansion power supply board card (such as IO_P8_AC220VB) to the 485 bus of front-end management host 1. Next, plug the power supplies of network device 1 and network device 2 into power supply port 1 and power supply port 2 of the IO_P8_AC220VB board card, respectively. Then connect the network ports of network device 1 and network device 2 to the LAN1 port and the LAN2 port of front-end management host 1, respectively. Finally, configure the host and the central platform, including the IP address, mask and gateway of the host, the IP address of the central platform, the logical board card for IO_P8_AC220VB, and the basic configuration of the network equipment attached to the IO board card.

2. Testing whether network equipment is normal

Switch the network equipment to its normal working mode, for example by turning on the fill light or starting video recording on the camera. Then measure the current and voltage of network device 1 and network device 2 with a multimeter; if the measured values are consistent with the device specification, the power supply is normal.

Ping network device 1 and network device 2 from both the central platform server and front-end management host 1. If both pings succeed and the services work normally, the network is basically normal.

3. Acquisition and automatic learning of network equipment current empirical value

Due to the influence of the actual working environment and the error of the current detection component, the actual standby current and the actual working current have certain deviation from the specification of the equipment. Calibration by current empirical values is therefore required. Since the network device has been verified to be normal in step two, current value sampling and empirical value auto-learning and configuration follows.

Switch the network equipment to working mode and run it for 24 to 48 hours to obtain a set of current sampling values for that brand and model in working mode. Compare and process these samples against the current reference value and deviation range measured in step two to obtain the current empirical values for working mode. Note: (1) The input voltage should pass through a voltage stabilizer, and the mains supply should be basically normal. (2) The system should run under the nominal environmental conditions of the front-end management host, such as magnetic field, air pressure, temperature and humidity.
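The text does not spell out how the samples are "compared and processed" against the reference value. One possible interpretation, shown here purely as a hedged Python sketch, is to discard samples that lie outside the measured reference value plus or minus the deviation range and take the minimum and maximum of the rest. The function name `learn_empirical_range` and the use of a relative deviation are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def learn_empirical_range(
    samples: Sequence[float],
    reference: float,
    deviation: float,
) -> Tuple[float, float]:
    """Derive a (minimum, maximum) empirical current range from learning samples.

    samples   -- effective current values collected over the 24 to 48 hour run
    reference -- current measured with the multimeter in step two
    deviation -- assumed relative deviation range, e.g. 0.10 for +/-10 %
    """
    low = reference * (1.0 - deviation)
    high = reference * (1.0 + deviation)
    # Assumption: samples outside the deviation band are treated as outliers.
    kept = [s for s in samples if low <= s <= high]
    if not kept:
        raise ValueError("no sample lies within the reference deviation band")
    return (min(kept), max(kept))
```

For example, with a reference of 0.45 A and a 10 % deviation, a stray reading of 0.90 A would be ignored and the learned range would be taken from the remaining samples.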

4. Empirical value configuration and fault detection parameter configuration

Issue the current empirical values learned for network device 1 and network device 2 in step three to front-end management host 1. Then configure the fault parameters on front-end management host 1, such as the fault detection enable switch, detection period, debounce count, alarm name, alarm type, alarm level, whether to report the alarm and whether to handle the fault automatically. Finally, store the network detection parameters, such as host number, slot number, port number, device type, IP address, network packet loss rate and network delay, in a data table.
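A row of the network detection data table might be modeled as below. This is a sketch under stated assumptions: the field names, the class `NetDetectRow`, the in-memory dictionary and the choice of (host number, slot number, port number) as the key are all hypothetical; the patent only lists which parameters are stored.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class NetDetectRow:
    """One hypothetical row of the network equipment fault detection data table."""
    host_no: int
    slot_no: int
    port_no: int
    device_type: str
    ip_address: str
    max_loss_rate: float   # packet loss threshold, 0.0 to 1.0
    max_delay_ms: float    # network delay threshold in milliseconds


Key = Tuple[int, int, int]


def upsert(table: Dict[Key, NetDetectRow], row: NetDetectRow) -> None:
    """Insert or replace a row, keyed by (host, slot, port) as an assumed identity."""
    table[(row.host_no, row.slot_no, row.port_no)] = row
```

Keying by host, slot and port reflects the physical wiring described in the environment setup, where each device is attached to one power port of one board card on one host.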

5. Fault detection

The IO_P8_AC220VB board card provides 8 AC220V power ports, each of which supports output voltage and current detection. When the system starts fault detection for the network equipment, the IO power supply board card samples 64 points of the current waveform every 20 ms and calculates one effective (RMS) current value from those 64 points. The front-end management host reads a group of current values from the IO_P8_AC220VB board card every 200 ms, so each read yields 10 effective current values per network device. Each value is compared with the current empirical range; if it falls outside the range, a fault counter is incremented by one. When the fault counter reaches the configured debounce count, a suspected network equipment power supply abnormality alarm is reported and self-repair processing is carried out according to user configuration.
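The power-side detection can be sketched in Python as an RMS calculation followed by a debounce counter. The names `rms` and `PowerFaultCounter` are hypothetical, and resetting the counter on a normal reading is an assumption based on the "3 times continuously" wording of step S4; the default debounce count of 3 is taken from that step as well.

```python
import math
from typing import Sequence

SAMPLES_PER_WINDOW = 64   # waveform points per 20 ms window (from the text)
WINDOW_MS = 20
READ_PERIOD_MS = 200
VALUES_PER_READ = READ_PERIOD_MS // WINDOW_MS   # effective values per host read


def rms(samples: Sequence[float]) -> float:
    """Effective (root-mean-square) value of one window of waveform samples."""
    if not samples:
        raise ValueError("empty sample window")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class PowerFaultCounter:
    """Debounced range check for effective current values (illustrative only)."""

    def __init__(self, i_min: float, i_max: float, debounce: int = 3) -> None:
        self.i_min = i_min
        self.i_max = i_max
        self.debounce = debounce
        self.count = 0

    def update(self, value: float) -> bool:
        """Feed one effective value; return True when an alarm should be raised."""
        if self.i_min <= value <= self.i_max:
            self.count = 0          # assumption: a normal reading resets the counter
            return False
        self.count += 1
        return self.count >= self.debounce
```

For a pure sine wave sampled over one full period, `rms` returns the peak value divided by the square root of two, which is the usual definition of the effective value of an AC current.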

If the power supply is normal, the central platform automatically starts a group of threads according to the network equipment fault detection data table and adjusts the number of threads in the thread pool as the data table changes. Each thread reads N IP addresses at a time (N is configurable and defaults to 50). For each IP address it sends 3 consecutive ICMP requests, one per second, each with a 1-second timeout, so one polling period lasts 150 seconds. If the network packet loss rate and the network delay of the network equipment exceed the thresholds configured by the user in 3 consecutive periods, a suspected network abnormality alarm is reported and the network equipment is restarted according to user configuration.
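The per-period statistics and the three-period rule can be sketched as follows. To keep the example runnable without raw-socket privileges, the ICMP echo itself is abstracted behind a caller-supplied `probe` function that returns a round-trip time in milliseconds or `None` on timeout; that function, `probe_stats` and `NetworkFaultDetector` are all hypothetical names. The sketch reads "packet loss rate and delay exceed the threshold" literally as both conditions, and treats a period with no replies as exceeding the delay threshold; both choices are assumptions.

```python
from typing import Callable, Optional, Tuple

ADDRESSES_PER_THREAD = 50   # default N from the text
REQUESTS_PER_ADDRESS = 3
TIMEOUT_S = 1
# 50 addresses x 3 requests x 1 s each gives the 150 s polling period in the text.
POLL_PERIOD_S = ADDRESSES_PER_THREAD * REQUESTS_PER_ADDRESS * TIMEOUT_S

Probe = Callable[[str], Optional[float]]   # RTT in ms, or None on timeout


def probe_stats(ip: str, probe: Probe,
                requests: int = REQUESTS_PER_ADDRESS) -> Tuple[float, Optional[float]]:
    """Return (loss rate, average delay in ms or None) for one polling period."""
    rtts = [probe(ip) for _ in range(requests)]
    answered = [r for r in rtts if r is not None]
    loss = 1.0 - len(answered) / requests
    avg = sum(answered) / len(answered) if answered else None
    return loss, avg


class NetworkFaultDetector:
    """Raise an alarm after `periods` consecutive abnormal polling periods."""

    def __init__(self, max_loss: float, max_delay_ms: float, periods: int = 3) -> None:
        self.max_loss = max_loss
        self.max_delay_ms = max_delay_ms
        self.periods = periods
        self.bad_periods = 0

    def update(self, loss: float, avg_delay_ms: Optional[float]) -> bool:
        delay_bad = avg_delay_ms is None or avg_delay_ms > self.max_delay_ms
        if loss > self.max_loss and delay_bad:
            self.bad_periods += 1
        else:
            self.bad_periods = 0
        return self.bad_periods >= self.periods
```

In a real deployment the `probe` callable would wrap an ICMP echo request with a 1-second timeout, and one detector instance would be kept per row of the fault detection data table.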

The above embodiments are illustrative of the present invention, and are not intended to limit the present invention, and any simple modifications of the present invention are within the scope of the present invention.

Claims

1. A universal network equipment fault analysis method based on ping and current characteristics is characterized in that: the method sequentially comprises the following steps:

S1, configuring access equipment parameter information: parameters are configured through a central platform for access equipment connected to an IO power supply port on a front-end management host;

S2, starting automatic current collection and automatic empirical-value learning: an automatic current collection mode and an automatic empirical-value learning function of the access equipment are started through the central platform, the front-end management host periodically collects a current effective value of the network equipment, and the central platform analyzes the data to obtain a current empirical value range; the front-end management host collects the current effective value of the network equipment every 200 ms over 485 and generates a maximum value, a minimum value and an average value, then reports the data to the central platform every 10 minutes for data analysis to obtain the current empirical value range for the brand, model and specification of the network equipment;

S3, issuing fault detection parameters, fault reporting parameters and current empirical values to the front-end management host through the central platform, wherein the fault reporting parameters comprise alarm reporting information and whether to restart on fault, the alarm reporting information comprises an alarm type, an alarm name, a severity level, whether to report and a remark description, and the current empirical values comprise a current maximum value and a current minimum value;

S4, power supply fault detection: the front-end management host compares the periodically collected current effective value of the network equipment against the access equipment parameters, the current empirical value range and the fault detection parameters issued by a user; if the power supply is normal, the method proceeds to step S5 to start network fault detection, and if the power supply is not normal, whether the equipment needs to be restarted is determined according to user configuration; if the collected current effective value does not fall within the empirical value range, the power supply of the network equipment is abnormal and a counter is incremented by one; if the power supply of the network equipment is abnormal 3 times in a row, a network equipment power supply abnormality is reported, and whether the equipment needs to be restarted is determined according to user configuration;

2. The method of claim 1, wherein in step S1 the parameter configuration includes: the device type, the device name, the brand model, the rated power, the error range, the power switch, the switching time, the IP address, the login user name and the password.

3. The method of claim 1, wherein in step S3 the fault detection parameters include: a detection enable switch, a network detection period, a network packet loss rate and a network delay.

4. The method of claim 1, wherein in step S5 the main fields of the network equipment fault detection data table include a host number, a slot number, a port number, a device type, a device name, an IP address, a network packet loss rate, a network delay, a user name and a password.

5. The method of claim 1, wherein in step S5 the network fault detection threads use a thread pool.