CN118118319B - Intelligent diagnosis method and system for network equipment based on big data

Info

Publication number: CN118118319B
Application number: CN202410533824.9A
Authority: CN
Inventors: 蒋浩; 唐彬; 叶勇围; 董斌; 杨明秋
Original assignee: Nanjing Fengchuan Yunju Information Technology Co ltd
Current assignee: Nanjing Fengchuan Yunju Information Technology Co ltd
Filing date: 2024-04-30
Publication date: 2024-06-28
Anticipated expiration: 2044-04-30

Abstract

The invention belongs to the technical field of network equipment monitoring, and particularly relates to a network equipment intelligent diagnosis method and system based on big data. According to the invention, through analyzing the fault information of the network equipment, other abnormal reasons corresponding to the historical fault information can be output, and the abnormal reasons are specifically classified into the induction parameters with the induction relation with the historical fault information and the trigger parameters with the trigger relation with the historical fault information, so that other potential risks when the network equipment is abnormal can be positioned, and a clear maintenance direction is provided for operation and maintenance personnel of the network equipment, so that the relevant personnel are reminded of timely treatment, the occurrence or expansion of the fault is prevented, and the stability and reliability of the network equipment are effectively improved.

Description

Intelligent diagnosis method and system for network equipment based on big data

Technical Field

The invention belongs to the technical field of network equipment monitoring, and particularly relates to a network equipment intelligent diagnosis method and system based on big data.

Background

With the rapid development of internet technology, network devices such as servers, routers, and switches play an increasingly important role in daily life and industrial production, and their stable operation is crucial to the continuity and reliability of network services. However, owing to the complexity of network environments and the rapid growth in the number of devices, conventional fault detection and maintenance methods can no longer meet the requirements of modern network systems.

Traditional network equipment diagnosis generally depends on manual experience and periodic inspection. It is inefficient and ill-suited to real-time fault detection and handling in a large-scale network environment. Moreover, after an equipment fault occurs, maintenance staff repair and investigate the fault point itself, but the fault is not necessarily caused by the fault point alone: other inducing factors may exist. Similarly, once the fault point fails, it may cause anomalies in other parameters. Even after the fault point has been repaired, these remaining parameter anomalies pose a potential risk to the operation of the equipment. On this basis, an intelligent diagnosis method for network equipment based on big data is proposed to solve the above problems.

Disclosure of Invention

The invention aims to provide a network equipment intelligent diagnosis method and system based on big data, which can analyze the fault information of network equipment, locate the fault cause of the network equipment and other anomalies caused by the fault of the network equipment, and provide a maintenance direction for operation and maintenance personnel of the network equipment.

The technical scheme adopted by the invention is as follows:

a network equipment intelligent diagnosis method based on big data comprises the following steps:

acquiring the operated parameters of the network equipment, and classifying the operated parameters according to the parameter characteristics to obtain a plurality of primary characteristic parameters, wherein each primary characteristic parameter corresponds to a recording subset;

Acquiring historical fault information of the network equipment, outputting a fault starting point according to the historical fault information, and calibrating primary characteristic parameters below the fault starting point as reference parameters;

Performing bidirectional offset on the fault starting point, constructing a sampling period according to an offset result, and calibrating the running parameters of network equipment in the sampling period into secondary characteristic parameters;

Acquiring the operation fluctuation quantity of the secondary characteristic parameter in the sampling period, outputting the correlation between the secondary characteristic parameter and the historical fault information according to the operation fluctuation quantity, and determining the fault condition parameter according to the correlation between the secondary characteristic parameter and the historical fault information, wherein the correlation comprises an induction relation and a triggering relation;

and acquiring real-time operation parameters of the network equipment, and diagnosing the operation state of the network equipment according to the fault condition parameters, wherein the operation state comprises a normal state and a risk state, and synchronously sending out early warning signals under the risk state.

In a preferred embodiment, the step of obtaining the running parameters of the network device and classifying the running parameters according to the parameter characteristics to obtain a plurality of primary feature parameters includes:

acquiring the running parameters of the network equipment;

constructing a plurality of recording subsets, and adding classification identifiers to each recording subset according to the parameter characteristics;

and acquiring the parameter characteristics of the operated parameters, matching the parameter characteristics to corresponding recording subsets one by one, and recording the operated parameters in each recording subset as primary characteristic parameters.

In a preferred embodiment, the step of obtaining the historical fault information of the network device and outputting the fault starting point according to the historical fault information includes:

acquiring the historical fault information, wherein the historical fault information comprises a fault occurrence time node, a fault type and a fault description;

Taking the fault occurrence time node as a reference point, reversely sampling the primary characteristic parameters, and arranging the primary characteristic parameters according to an occurrence time sequence;

performing difference processing on the adjacent first-level characteristic parameters, and calibrating a difference result of the adjacent first-level characteristic parameters as a reference fluctuation parameter;

acquiring an algorithm function, inputting the reference fluctuation parameters into the algorithm function, and calibrating an output result as a fault hiding period;

and based on the fault hiding period, performing equivalent offset on the fault occurrence time node, and calibrating an offset result as a fault starting point.

In a preferred embodiment, the step of bi-directionally shifting the fault starting point and constructing a sampling period according to the shift result includes:

Acquiring the fault starting point, and judging whether equipment faults exist between the fault starting point and a fault occurrence time node;

if yes, recording the fault starting point as an invalid node, and not performing bidirectional offset operation on the invalid node;

if not, acquiring a required offset amount, wherein the required offset amount comprises a positive offset amount and a negative offset amount;

taking the fault starting point as a reference, and positively offsetting the fault starting point by the positive offset amount to obtain a positive offset point;

Taking the fault starting point as a reference, and carrying out negative offset on the fault starting point according to the negative offset to obtain a negative offset point;

and constructing a sampling period according to the positive offset point and the negative offset point, wherein the sampling period comprises a positive sampling period and a negative sampling period, the positive sampling period is a period from a fault starting point to the positive offset point, and the negative sampling period is a period from the negative offset point to the fault starting point.

In a preferred embodiment, the step of collecting the operation fluctuation amount of the secondary characteristic parameter in the sampling period and outputting the correlation between the secondary characteristic parameter and the historical fault information according to the operation fluctuation amount includes:

Acquiring secondary characteristic parameters in the positive sampling period and the negative sampling period, sequencing the secondary characteristic parameters according to a generation time sequence, and performing difference processing on adjacent secondary characteristic parameters to obtain operation fluctuation quantity;

Respectively acquiring allowable fluctuation threshold values of the secondary characteristic parameters, and comparing the allowable fluctuation threshold values with the running fluctuation quantity;

If the running fluctuation amount is larger than the allowable fluctuation threshold, counting time nodes with the running fluctuation amount larger than the allowable fluctuation threshold, calibrating the time nodes as nodes to be evaluated, calibrating the correlation between the corresponding secondary characteristic parameters and the historical fault information as a triggering relationship when the nodes to be evaluated belong to a positive sampling period, and calibrating the correlation between the corresponding secondary characteristic parameters and the historical fault information as an inducing relationship when the nodes to be evaluated belong to a negative sampling period;

And if the running fluctuation quantity is always smaller than the allowable fluctuation threshold value, indicating that no correlation exists between the corresponding secondary characteristic parameter and the historical fault information.

In a preferred embodiment, under the induction relation, the historical fault information is caused by abnormal fluctuation of the secondary characteristic parameter;

Under the triggering relationship, the historical fault information causes abnormal fluctuation of the secondary characteristic parameters;

and after the secondary characteristic parameters under the induction relation and the triggering relation are output, calibrating the secondary characteristic parameters as the induction parameters and the triggering parameters respectively.

In a preferred embodiment, the step of determining the fault condition parameter according to the correlation between the secondary characteristic parameter and the historical fault information includes:

counting the occurrence frequency of each induction parameter under the same type of historical fault information, and calibrating the occurrence frequency as the parameter to be checked;

acquiring a check threshold, and comparing the check threshold with the parameter to be checked;

If the parameter to be checked is larger than or equal to the check threshold, indicating that the corresponding induction parameter is a main induction factor of the historical fault information, and recording the fault condition parameter as a conventional fault parameter;

if the parameter to be checked is smaller than the check threshold, indicating that the corresponding induction parameter has contingency, and recording the fault condition parameter as an occasional fault parameter;

wherein, the conventional fault parameters are higher than the occasional fault parameters in the troubleshooting priority.
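The frequency check above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name, the record layout, and the use of a raw occurrence count as the "frequency" are assumptions made for the example.

```python
from collections import Counter


def classify_induction_parameters(fault_records, check_threshold):
    """Label induction parameters per fault type as conventional or occasional.

    fault_records: list of (fault_type, [induction parameter names]) pairs,
                   one pair per historical fault (hypothetical layout).
    check_threshold: minimum occurrence count for a "conventional" label
                     (assumed to be a plain count, not a ratio).
    """
    counts = {}
    for fault_type, params in fault_records:
        # Count each parameter at most once per historical fault.
        counts.setdefault(fault_type, Counter()).update(set(params))

    result = {}
    for fault_type, counter in counts.items():
        result[fault_type] = {
            name: "conventional" if freq >= check_threshold else "occasional"
            for name, freq in counter.items()
        }
    return result


records = [
    ("link_down", ["cpu_temp", "fan_speed"]),
    ("link_down", ["cpu_temp"]),
    ("link_down", ["cpu_temp"]),
]
labels = classify_induction_parameters(records, check_threshold=2)
```

Parameters labelled "conventional" would be checked first during troubleshooting, in line with the priority rule stated above.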

In a preferred embodiment, the step of acquiring the real-time operation parameter of the network device and diagnosing the operation state of the network device according to the fault condition parameter includes:

acquiring real-time operation parameters of the network equipment, and comparing and analyzing the real-time operation parameters with fault condition parameters;

If the real-time operation parameters are matched with the fault condition parameters, the corresponding network equipment is indicated to be in a risk state, and a diagnosis result is output according to the type of the fault condition parameters, wherein when the diagnosis result is an occasional fault parameter, a secondary early warning signal is sent out, and when the diagnosis result is a conventional fault parameter, a primary early warning signal is sent out;

And if the real-time operation parameter is not matched with the fault condition parameter, indicating that the operation state of the corresponding network equipment is a normal state.
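A minimal sketch of this diagnosis step is given below. The text does not define what "matched" means, so the example assumes that a match occurs when a parameter currently showing abnormal fluctuation is one of the recorded fault condition parameters. It also assumes that the primary warning takes precedence when both kinds of parameter match. All names are hypothetical.

```python
def diagnose(abnormal_params, fault_condition_params):
    """Return the operating state and warning level for one device.

    abnormal_params: names of parameters currently fluctuating abnormally
                     (how this is detected is outside this sketch).
    fault_condition_params: dict mapping parameter name to "conventional"
                            or "occasional".
    """
    matched = {
        name: fault_condition_params[name]
        for name in abnormal_params
        if name in fault_condition_params
    }
    if not matched:
        return {"state": "normal", "warning": None, "matched": {}}

    # Assumption: a conventional fault parameter outranks an occasional one.
    level = "primary" if "conventional" in matched.values() else "secondary"
    return {"state": "risk", "warning": level, "matched": matched}


conditions = {"cpu_temp": "conventional", "fan_speed": "occasional"}
report = diagnose(["fan_speed", "port_errors"], conditions)
```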

The invention also provides a network equipment intelligent diagnosis system based on big data, which is applied to the above network equipment intelligent diagnosis method based on big data and comprises the following modules:

The data classification module is used for acquiring the operated parameters of the network equipment, classifying the operated parameters according to the parameter characteristics, and obtaining a plurality of primary characteristic parameters, wherein each primary characteristic parameter corresponds to one recording subset;

The fault extraction module is used for acquiring historical fault information of the network equipment, outputting a fault starting point according to the historical fault information, and calibrating primary characteristic parameters below the fault starting point as reference parameters;

the feature extraction module is used for carrying out bidirectional offset on the fault starting point, constructing a sampling period according to an offset result, and calibrating the running parameters of network equipment in the sampling period into secondary feature parameters;

The correlation analysis module is used for collecting the operation fluctuation quantity of the secondary characteristic parameters in the sampling period, outputting the correlation between the secondary characteristic parameters and the historical fault information according to the operation fluctuation quantity, and determining the fault condition parameters according to the correlation between the secondary characteristic parameters and the historical fault information, wherein the correlation comprises an induction relation and a triggering relation;

the state monitoring module is used for acquiring real-time operation parameters of the network equipment and diagnosing the operation state of the network equipment according to the fault condition parameters, wherein the operation state comprises a normal state and a risk state, and early warning signals are synchronously sent out in the risk state.

And an electronic device, the electronic device comprising:

At least one processor;

And a memory communicatively coupled to the at least one processor;

The memory stores a computer program executable by the at least one processor, and the computer program is executed by the at least one processor, so that the at least one processor can execute the intelligent diagnosis method of the network device based on big data.

The invention has the technical effects that:

The invention can output other abnormal reasons corresponding to the historical fault information by analyzing the fault information of the network equipment, and specifically classifies the abnormal reasons into the induction parameters with the induction relation with the historical fault information and the trigger parameters with the trigger relation with the historical fault information, so that other potential risks when the network equipment is abnormal can be positioned, and a clear maintenance direction is provided for operation and maintenance personnel of the network equipment, thereby reminding related personnel to process in time, preventing the occurrence or expansion of the fault and effectively improving the stability and reliability of the network equipment.

Drawings

FIG. 1 is a flow chart of the method of the present invention;

FIG. 2 is a system block diagram of the present invention;

Fig. 3 is a structural diagram of an electronic device of the present invention.

Detailed Description

In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings.

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways other than those described herein, and persons skilled in the art will readily appreciate that the present invention is not limited to the specific embodiments disclosed below.

Further, reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic can be included in at least one implementation of the invention. The appearances of the phrase "in one preferred embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.

Referring to fig. 1, the invention provides a network device intelligent diagnosis method based on big data, comprising the following steps:

S1, acquiring running parameters of network equipment, and classifying the running parameters according to parameter characteristics to obtain a plurality of primary characteristic parameters, wherein each primary characteristic parameter corresponds to a recording subset;

S2, acquiring historical fault information of the network equipment, outputting a fault starting point according to the historical fault information, and calibrating primary characteristic parameters below the fault starting point as reference parameters;

S3, performing bidirectional offset on the fault starting point, constructing a sampling period according to an offset result, and calibrating the running parameters of the network equipment in the sampling period as secondary characteristic parameters;

S4, acquiring the operation fluctuation quantity of the secondary characteristic parameter in a sampling period, outputting the correlation between the secondary characteristic parameter and the historical fault information according to the operation fluctuation quantity, and determining the fault condition parameter according to the correlation between the secondary characteristic parameter and the historical fault information, wherein the correlation comprises an induction relation and a triggering relation;

S5, acquiring real-time operation parameters of the network equipment, and diagnosing the operation state of the network equipment according to the fault condition parameters, wherein the operation state comprises a normal state and a risk state, and synchronously sending out early warning signals under the risk state.

As described in steps S1-S5 above, with the rapid development of information technology, network devices play an increasingly important role in daily life and industrial production. The stability and efficiency of the network directly affect enterprise operations and individual user experience, so the maintenance and fault diagnosis of network devices is particularly important, and big data technology is increasingly widely applied in this field.

In this embodiment, the operated parameters of the network device are first acquired; they are an important basis for evaluating the operating state of the device. The operated parameters are then classified into a plurality of primary characteristic parameters according to their parameter characteristics, where the parameter characteristic is the parameter type, and each primary characteristic parameter is recorded into a recording subset to facilitate subsequent retrieval.

Next, the historical fault information of the network device is acquired, and a fault starting point is determined from it. In general, the operating parameters of a device fluctuate abnormally before the device fails; on this basis, this embodiment outputs the fault starting point, that is, the node at which the abnormal fluctuation leading to the fault first appears. The primary characteristic parameters below the fault starting point are calibrated as reference parameters. The fault starting point is then offset in both directions to construct a sampling period, which is set so that changes in the primary characteristic parameters before or during the fault can be captured. The operating parameters of the device within the sampling period are calibrated as secondary characteristic parameters, which serve as refined indicators of the operating state of the device and help in analyzing the cause of the fault.

The operation fluctuation quantity of the secondary characteristic parameters within the sampling period is then determined, and the correlation between each secondary characteristic parameter and the historical fault information is output according to it. The correlation comprises an induction relation and a triggering relation, which helps reveal the root cause of the fault. Finally, the fault condition parameters are determined, and the operating state of the network device is diagnosed against them so that preventive measures can be taken. Specifically, the real-time operation parameters of the network device are acquired and compared with the fault condition parameters. The operating state is classified as a normal state or a risk state, where the risk state means that the device has a potential fault risk; in the risk state an early warning signal is sent synchronously to remind the relevant personnel to act in time, preventing the fault from occurring or spreading and improving the stability and reliability of the network device.

In a preferred embodiment, the steps of acquiring the running parameters of the network device, classifying the running parameters according to the parameter characteristics, and obtaining a plurality of primary characteristic parameters include:

S101, acquiring the running parameters of network equipment;

S102, constructing a plurality of recording subsets, and adding classification identifiers to each recording subset according to parameter characteristics;

S103, acquiring parameter characteristics of the operated parameters, matching the parameter characteristics to corresponding recording subsets one by one, and recording the operated parameters in each recording subset as first-level characteristic parameters.

As described in steps S101-S103 above, acquiring and processing the operated parameters is a prerequisite task in the management and maintenance of network devices. First, the operated parameters of the network device are acquired; they include, but are not limited to, configuration information, operating temperature, and performance data of the device, and can be collected through corresponding sensors or third-party management software. Once acquired, the operated parameters should be preprocessed, for example by cleaning, filling in missing values, and removing outliers. A plurality of recording subsets is then constructed. The recording subsets may be divided according to the characteristics of the operated parameters; for example, the parameters of the network device may be divided into categories such as hardware parameters, software parameters, and performance parameters. After the recording subsets are constructed, a classification identifier is added to each one so that different types of parameters can be identified and distinguished more quickly. Each operated parameter is then matched to its corresponding recording subset according to its parameter characteristic, and the operated parameters in each recording subset are recorded as primary characteristic parameters.
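The classification in steps S101-S103 can be sketched as a simple grouping operation. The sample layout, the category names, and the `feature_of` lookup table are assumptions made for illustration; the text does not prescribe a data format.

```python
from collections import defaultdict


def build_recording_subsets(samples, feature_of):
    """Group operated parameters into recording subsets keyed by category.

    samples: list of dicts with "name", "time" and "value" keys (hypothetical).
    feature_of: dict mapping a parameter name to its parameter characteristic,
                used here as the classification identifier of the subset.
    """
    subsets = defaultdict(list)
    for sample in samples:
        # Parameters with no known characteristic go to a catch-all subset.
        category = feature_of.get(sample["name"], "unclassified")
        subsets[category].append(sample)
    return dict(subsets)


feature_of = {"cpu_temp": "hardware", "firmware_version": "software", "throughput": "performance"}
samples = [
    {"name": "cpu_temp", "time": 0, "value": 55.0},
    {"name": "throughput", "time": 0, "value": 940.0},
    {"name": "cpu_temp", "time": 10, "value": 57.5},
]
subsets = build_recording_subsets(samples, feature_of)
```

Each list in `subsets` then holds the primary characteristic parameters for one classification identifier.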

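In a preferred embodiment, the step of acquiring the historical fault information of the network device and outputting the fault starting point according to the historical fault information includes:

S201, acquiring historical fault information, wherein the historical fault information comprises a fault occurrence time node, a fault type and a fault description;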

S202, taking a fault occurrence time node as a reference point, reversely sampling the primary characteristic parameters, and arranging the primary characteristic parameters according to an occurrence time sequence;

S203, performing difference processing on adjacent first-level characteristic parameters, and calibrating difference results of the adjacent first-level characteristic parameters as reference fluctuation parameters;

S204, acquiring an algorithm function, inputting reference fluctuation parameters into the algorithm function, and calibrating an output result as a fault hiding period;

S205, based on the fault hiding period, performing equivalent offset on the fault occurrence time node, and calibrating an offset result as a fault starting point.

As described in steps S201-S205 above, once the historical fault information has been extracted, the fault occurrence time node, the fault type, and the fault description it contains are collected, so that the fault condition of the device is known in detail. On this basis, the primary characteristic parameters are processed: taking the fault occurrence time node as a reference point, the primary characteristic parameters are reverse-sampled and then arranged in order of occurrence time. The purpose of this step is to find the characteristic changes before and after the fault occurs, which provides a basis for subsequent fault analysis. Difference processing is then performed on adjacent primary characteristic parameters, and the difference results are calibrated as reference fluctuation parameters. The reference fluctuation parameters reflect the fluctuation of the device during operation and help identify potential causes of the fault. The reference fluctuation parameters are input into a measurement function, and the output is calibrated as the fault hiding period. The expression of the measurement function is: [expression not reproduced in this text], where the symbols denote, respectively, the fault hiding period, the fault parameter, the standard operating parameter, the reference fluctuation parameter, and the length of time over which the primary characteristic parameters are reverse-sampled. Based on the fault hiding period, the fault occurrence time node is offset by an equal amount, and the offset result is calibrated as the fault starting point. This helps locate the onset of the fault accurately and provides a reference for subsequent fault detection and handling.
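The flow of steps S201-S205 can be sketched as below. Because the expression of the measurement function is not available in this text, the sketch takes it as a caller-supplied callable; the `placeholder_hiding_period` function is a purely hypothetical stand-in and is not the patented formula. The sketch also assumes numeric timestamps and that the offset is applied backward in time from the fault occurrence time node.

```python
def fault_starting_point(samples, fault_time, window, hiding_period_fn):
    """Estimate the fault starting point for one primary characteristic parameter.

    samples: list of (timestamp, value) pairs with numeric timestamps.
    fault_time: fault occurrence time node.
    window: length of the reverse-sampling interval before fault_time.
    hiding_period_fn: callable(diffs, window) -> fault hiding period.
    """
    # Reverse sampling: keep samples in [fault_time - window, fault_time],
    # then arrange them in order of occurrence time.
    sampled = sorted((t, v) for t, v in samples if fault_time - window <= t <= fault_time)
    # Difference of adjacent samples gives the reference fluctuation parameters.
    diffs = [later[1] - earlier[1] for earlier, later in zip(sampled, sampled[1:])]
    hiding_period = hiding_period_fn(diffs, window)
    # Assumption: the starting point lies before the fault occurrence time.
    return fault_time - hiding_period, diffs


def placeholder_hiding_period(diffs, window, limit=1.0):
    """Hypothetical stand-in: share of large fluctuations, scaled by the window."""
    if not diffs:
        return 0.0
    abnormal = sum(1 for d in diffs if abs(d) > limit)
    return window * abnormal / len(diffs)


samples = [(0, 10), (10, 10), (20, 10), (30, 14), (40, 19)]
start, diffs = fault_starting_point(samples, fault_time=40, window=40,
                                    hiding_period_fn=placeholder_hiding_period)
```

Keeping the measurement function as a parameter means the real expression can be substituted without changing the surrounding steps.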

In a preferred embodiment, the step of bi-directionally shifting the fault origin and constructing the sampling period based on the shift result comprises:

S301, acquiring a fault starting point, and judging whether equipment faults exist between the fault starting point and a fault occurrence time node;

S302, taking the fault starting point as a reference, positively offsetting the fault starting point by the positive offset amount to obtain a positive offset point;

S303, carrying out negative offset on the fault starting point according to the negative offset by taking the fault starting point as a reference to obtain a negative offset point;

S304, constructing a sampling period according to the positive offset point and the negative offset point, wherein the sampling period comprises a positive sampling period and a negative sampling period, the positive sampling period is a period from the fault starting point to the positive offset point, and the negative sampling period is a period from the negative offset point to the fault starting point.

As described in steps S301-S304 above, in the fault analysis process the fault starting point is first acquired, and it is determined whether any equipment fault exists between the fault starting point and the fault occurrence time node. If such a fault exists, the fault starting point is recorded as an invalid node and no bidirectional offset is performed on it, because an equipment fault near the fault starting point may interfere with the offset result and make the analysis inaccurate. If no equipment fault exists between the fault starting point and the fault occurrence time node, the next operation is performed: the required offset amounts are acquired, comprising a positive offset amount and a negative offset amount, which are used for the positive and negative offsets of the fault starting point respectively. Taking the fault starting point as a reference, the fault starting point is positively offset by the positive offset amount, which yields a new position along the positive direction, namely the positive offset point; likewise, the fault starting point is negatively offset by the negative offset amount to obtain the negative offset point. Finally, the sampling period is constructed from the positive offset point and the negative offset point. It is divided into a positive sampling period, which runs from the fault starting point to the positive offset point, and a negative sampling period, which runs from the negative offset point to the fault starting point, thereby providing accurate data support for subsequent fault analysis.
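The construction of the sampling period can be sketched as follows. The names and the numeric time representation are assumptions, as is the choice to treat a fault that occurs exactly at the fault starting point as lying "between" the two nodes.

```python
from typing import NamedTuple, Optional, Tuple


class SamplingPeriod(NamedTuple):
    negative: Tuple[float, float]  # (negative offset point, fault starting point)
    positive: Tuple[float, float]  # (fault starting point, positive offset point)


def build_sampling_period(start, fault_time, other_fault_times,
                          positive_offset, negative_offset) -> Optional[SamplingPeriod]:
    """Offset the fault starting point in both directions.

    Returns None when another equipment fault lies between the fault starting
    point and the fault occurrence time node (the starting point is then an
    invalid node and no offset is performed).
    """
    if any(start <= t < fault_time for t in other_fault_times):
        return None
    negative_point = start - negative_offset
    positive_point = start + positive_offset
    return SamplingPeriod(negative=(negative_point, start),
                          positive=(start, positive_point))


period = build_sampling_period(start=20, fault_time=40, other_fault_times=[5, 100],
                               positive_offset=30, negative_offset=15)
invalid = build_sampling_period(start=20, fault_time=40, other_fault_times=[25],
                                positive_offset=30, negative_offset=15)
```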

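In a preferred embodiment, the step of collecting the operation fluctuation amount of the secondary characteristic parameter in the sampling period and outputting the correlation between the secondary characteristic parameter and the historical fault information according to the operation fluctuation amount includes:

S401, acquiring secondary characteristic parameters in a positive sampling period and a negative sampling period, sequencing the secondary characteristic parameters according to an occurrence time sequence, and performing difference processing on adjacent secondary characteristic parameters to obtain operation fluctuation quantity;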

S402, respectively acquiring allowable fluctuation threshold values of the secondary characteristic parameters, and comparing the allowable fluctuation threshold values with the running fluctuation quantity;

If the running fluctuation amount is greater than the allowable fluctuation threshold, counting the time nodes at which the running fluctuation amount is greater than the allowable fluctuation threshold, calibrating these time nodes as nodes to be evaluated, calibrating the correlation between the corresponding secondary characteristic parameter and the historical fault information as a triggering relationship when the node to be evaluated falls in the positive sampling period, and calibrating the correlation as an induction relationship when the node to be evaluated falls in the negative sampling period;

In the process of collecting and analyzing the secondary characteristic parameters described in steps S401-S402 above, attention is paid to the operation fluctuation amount of each secondary characteristic parameter within the sampling period, and its correlation with the historical fault information is analyzed. First, the secondary characteristic parameters in the positive sampling period and the negative sampling period are obtained and sorted by occurrence time to facilitate subsequent analysis, and adjacent secondary characteristic parameters are differenced to obtain the operation fluctuation amount. Second, the allowable fluctuation threshold of each secondary characteristic parameter is determined, and the actual operation fluctuation amount is compared with it to judge whether an abnormal condition exists. If the operation fluctuation amount is greater than the allowable fluctuation threshold, the time nodes at which the threshold is exceeded are counted and marked as nodes to be evaluated. If a node to be evaluated lies in the positive sampling period, the correlation between the corresponding secondary characteristic parameter and the historical fault information is marked as a triggering relationship; if it lies in the negative sampling period, the correlation is marked as an induction relationship. In both cases, a correlation is judged to exist only where the operation fluctuation amount exceeds the allowable fluctuation threshold.
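Steps S401-S402 can be illustrated with a short sketch. It rests on several assumptions that the text does not state: samples are `(timestamp, value)` pairs, the difference is taken as an absolute value, each fluctuation is attributed to the later of the two adjacent samples, and a node falling exactly on the fault starting point is counted in the negative sampling period. The function names are hypothetical.

```python
from typing import List, Set, Tuple

Sample = Tuple[float, float]  # (timestamp, value); an assumed representation


def nodes_to_evaluate(samples: List[Sample], allowed: float) -> List[float]:
    """Timestamps whose adjacent-sample difference exceeds the threshold."""
    ordered = sorted(samples)  # sort by occurrence time
    flagged = []
    for (_, prev), (t, cur) in zip(ordered, ordered[1:]):
        # Assumption: the fluctuation is the absolute difference of neighbours
        # and is attributed to the later sample's timestamp.
        if abs(cur - prev) > allowed:
            flagged.append(t)
    return flagged


def classify_relations(
    samples: List[Sample],
    allowed: float,
    negative_start: float,
    fault_start: float,
    positive_end: float,
) -> Set[str]:
    """Relations between one secondary parameter and the historical fault."""
    relations = set()
    for t in nodes_to_evaluate(samples, allowed):
        if negative_start <= t <= fault_start:
            relations.add("induction")  # node in the negative sampling period
        elif fault_start < t <= positive_end:
            relations.add("trigger")  # node in the positive sampling period
    return relations
```

A parameter that fluctuates abnormally only before the fault starting point would come out as `{"induction"}`, one that fluctuates only afterwards as `{"trigger"}`, and one with no node to be evaluated as an empty set.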

In a preferred embodiment, under the induction relationship, the historical fault information is caused by abnormal fluctuation of the secondary characteristic parameter;

under the triggering relationship, the historical fault information causes abnormal fluctuation of the secondary characteristic parameter;

after the secondary characteristic parameters under the induction relationship and the triggering relationship are output, they are calibrated as induction parameters and trigger parameters respectively.

In this embodiment, under the induction relationship, the historical fault information is produced by abnormal fluctuation of the secondary characteristic parameter. Such fluctuation may have various causes, such as equipment aging, abnormal temperature or changes in the external environment, and it acts as the inducing factor that gives rise to the historical fault information. Under the triggering relationship, the causality is reversed: the historical fault information is the cause of the abnormal fluctuation of the secondary characteristic parameter. This can be understood as follows: once the equipment fails, the failure directly or indirectly drives the secondary characteristic parameter toward abnormal values. Therefore, when troubleshooting the equipment, the equipment features corresponding to the secondary characteristic parameters under the triggering relationship should also be checked and handled, so as to avoid successive faults.

In a preferred embodiment, the step of determining the fault condition parameter based on a correlation between the secondary characteristic parameter and the historical fault information comprises:

S403, counting the occurrence frequency of the induction parameters under historical fault information of the same type, and calibrating the occurrence frequency as the parameter to be checked;

S404, acquiring a check threshold, and comparing the check threshold with the parameter to be checked;

if the parameter to be checked is greater than or equal to the check threshold, the corresponding induction parameter is a main induction factor of the historical fault information, and the fault condition parameter is recorded as a conventional fault parameter;

if the parameter to be checked is smaller than the check threshold, the corresponding induction parameter is occasional in nature, and the fault condition parameter is recorded as an occasional fault parameter;

wherein the troubleshooting priority of conventional fault parameters is higher than that of occasional fault parameters.

In the process of determining the fault condition parameters described in steps S403-S404 above, the judgment is made according to the correlation between the secondary characteristic parameters and the historical fault information. First, historical fault information of the same type is analyzed statistically, with particular attention to the occurrence frequency of the induction parameters, so as to find the secondary characteristic parameters most strongly correlated with the fault; this occurrence frequency is calibrated as the parameter to be checked. The corresponding check threshold is then retrieved. The check threshold is set to distinguish main induction factors from secondary induction factors, the latter corresponding to occasional fault parameters. The check threshold is then compared with the parameter to be checked. If the parameter to be checked is greater than or equal to the check threshold, the corresponding induction parameter is a main induction factor of the historical fault information, and the fault condition parameter is recorded as a conventional fault parameter; conventional fault parameters have a higher troubleshooting priority because they are more likely to cause equipment faults. Otherwise, if the parameter to be checked is smaller than the check threshold, the corresponding induction parameter is occasional in nature, and the fault condition parameter is recorded as an occasional fault parameter, whose troubleshooting priority is relatively lower.
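The frequency check of steps S403-S404 might look like the sketch below. It assumes that "occurrence frequency" means the number of same-type historical faults in which an induction parameter appears (a ratio would work equally well), and the parameter names and the helper names `grade_induction_parameters` and `troubleshooting_order` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def grade_induction_parameters(
    fault_records: Iterable[List[str]], check_threshold: int
) -> Dict[str, str]:
    """Grade each induction parameter as 'conventional' or 'occasional'.

    fault_records holds, for each historical fault of the same type, the
    names of the induction parameters observed for that fault.
    """
    # Parameter to be checked: number of faults in which the parameter occurs.
    counts = Counter(name for record in fault_records for name in set(record))
    return {
        name: "conventional" if n >= check_threshold else "occasional"
        for name, n in counts.items()
    }


def troubleshooting_order(grades: Dict[str, str]) -> List[str]:
    """Conventional fault parameters first, then occasional ones."""
    return sorted(grades, key=lambda name: (grades[name] != "conventional", name))
```

With three recorded faults in which `cpu_temp` appears three times and `fan_speed` and `port_errors` once each, a check threshold of 2 grades `cpu_temp` as conventional and the other two as occasional, so `cpu_temp` is examined first.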

In a preferred embodiment, the step of acquiring real-time operation parameters of the network device and diagnosing the operation state of the network device according to the fault condition parameters includes:

S501, acquiring real-time operation parameters of the network equipment, and comparing the real-time operation parameters with the fault condition parameters;

S502, if the real-time operation parameters match the fault condition parameters, the corresponding network equipment is in a risk state, and a diagnosis result is output according to the type of the fault condition parameters, wherein a secondary early warning signal is sent when the diagnosis result is an occasional fault parameter, and a primary early warning signal is sent when the diagnosis result is a conventional fault parameter;

S503, if the real-time operation parameter is not matched with the fault condition parameter, indicating that the operation state of the corresponding network equipment is a normal state.

To ensure stable operation of the network equipment, its real-time operation parameters are collected continuously and its operation state is diagnosed according to the fault condition parameters. Comparing the real-time operation parameters with the preset fault condition parameters makes it possible to discover potential fault risks in time and to judge quickly whether the equipment is in a risk state, which provides a basis for subsequent fault detection and handling. When the real-time operation parameters match the fault condition parameters, the corresponding network equipment is in a risk state, and a diagnosis result is output according to the type of the fault condition parameter. The diagnosis results fall into two categories: occasional fault parameters and conventional fault parameters. For network equipment whose diagnosis result is an occasional fault parameter, a secondary early warning signal is sent. This means that the equipment carries a certain fault risk but has not yet seriously affected the network as a whole; operation and maintenance personnel should monitor the equipment closely, for example by increasing the sampling frequency of its parameters, and if the risk persists, the secondary early warning signal is upgraded to a primary early warning signal. For network equipment whose diagnosis result is a conventional fault parameter, a primary early warning signal is sent. This means that the equipment has a potential abnormality that may significantly affect network performance and stability; maintenance measures should be taken immediately to repair or replace the faulty equipment and ensure normal operation of the network. Finally, if the real-time operation parameters do not match the fault condition parameters, the operation state of the corresponding network equipment is normal; the equipment is considered to be running well and requires no special handling. Monitoring of the equipment nevertheless continues, so that potential problems can be discovered and resolved in time.
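Steps S501-S503 can be sketched as a small decision function. The text does not define what it means for a real-time parameter to "match" a fault condition parameter, so this sketch assumes a match occurs when the real-time fluctuation of a graded parameter exceeds its allowable fluctuation threshold; that rule, the `Diagnosis` enum and the `diagnose` function are illustrative assumptions. The upgrade of a persistent secondary warning to a primary warning is not modelled here.

```python
from enum import Enum
from typing import Dict


class Diagnosis(Enum):
    NORMAL = "normal state"
    SECONDARY_WARNING = "risk state, secondary early warning"
    PRIMARY_WARNING = "risk state, primary early warning"


def diagnose(
    fluctuations: Dict[str, float],
    thresholds: Dict[str, float],
    grades: Dict[str, str],
) -> Diagnosis:
    """Diagnose one device from its real-time parameter fluctuations.

    grades maps each fault condition parameter to 'conventional' or
    'occasional'.
    """
    # Assumed matching rule: the real-time fluctuation of a fault condition
    # parameter exceeds that parameter's allowable fluctuation threshold.
    matched = [
        name
        for name in grades
        if fluctuations.get(name, 0.0) > thresholds.get(name, float("inf"))
    ]
    if any(grades[name] == "conventional" for name in matched):
        return Diagnosis.PRIMARY_WARNING
    if matched:
        return Diagnosis.SECONDARY_WARNING
    return Diagnosis.NORMAL
```

When both a conventional and an occasional fault parameter match at once, the sketch reports the primary early warning, which follows from the higher troubleshooting priority of conventional fault parameters.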

Referring to fig. 2, an intelligent diagnosis system for a network device based on big data is applied to the above intelligent diagnosis method for a network device based on big data, and includes:

The data classification module is used for acquiring the operation parameters of the network equipment and classifying them according to the parameter characteristics to obtain a plurality of primary characteristic parameters, wherein each primary characteristic parameter corresponds to one recording subset;

The feature extraction module is used for carrying out bidirectional offset on the fault starting point, constructing a sampling period according to an offset result, and calibrating the running parameters of the network equipment in the sampling period into secondary feature parameters;

The correlation analysis module is used for collecting the operation fluctuation quantity of the secondary characteristic parameters in the sampling period, outputting the correlation between the secondary characteristic parameters and the historical fault information according to the operation fluctuation quantity, and determining fault condition parameters according to the correlation between the secondary characteristic parameters and the historical fault information, wherein the correlation comprises an induction relation and a triggering relation;

The state monitoring module is used for acquiring real-time operation parameters of the network equipment and diagnosing the operation state of the network equipment according to the fault condition parameters, wherein the operation state comprises a normal state and a risk state, and in the risk state, early warning signals are synchronously sent out.

The system comprises a data classification module, a fault extraction module, a feature extraction module, a correlation analysis module and a state monitoring module. The data classification module collects the operation parameters of the network equipment, classifies them by analyzing their characteristics, and generates a plurality of primary characteristic parameters from the classification result, providing the basic data for subsequent fault diagnosis. The fault extraction module collects the historical fault information of the network equipment, determines the fault starting point by analyzing the historical fault data, and sets the next-level characteristic parameter of the fault starting point as a reference parameter; this reveals the past fault conditions of the equipment and provides a reference for diagnosing current faults. The feature extraction module performs bidirectional offset on the fault starting point, constructs the sampling period according to the offset result, collects and analyzes the operation parameters of the network equipment within the sampling period, and calibrates them as secondary characteristic parameters; the aim is to analyze the operating condition of the equipment from multiple dimensions and provide richer data support for fault diagnosis. The correlation analysis module collects the operation fluctuation amount of the secondary characteristic parameters, analyzes the correlation between the secondary characteristic parameters and the historical fault information, and determines the fault condition parameters accordingly; the correlation includes an induction relationship and a triggering relationship, which helps to explain the cause and the course of a fault. The state monitoring module acquires the operation parameters of the network equipment in real time and diagnoses the operation state of the equipment according to the fault condition parameters, the operation state being either a normal state or a risk state. In the risk state, the state monitoring module synchronously sends an early warning signal to remind the relevant personnel to take timely measures, which prevents the fault from spreading and keeps the operation of the network equipment stable.

Referring to fig. 3, an electronic device includes:

At least one processor;

And a memory communicatively coupled to the at least one processor;

The memory stores a computer program executable by the at least one processor, so that the at least one processor can execute the intelligent diagnosis method of the network equipment based on big data.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, apparatus, article, or method that comprises the element.

The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present invention, and such modifications and adaptations are intended to fall within the scope of the present invention. Structures, devices and methods of operation not specifically described and illustrated herein are, unless otherwise indicated and limited, implemented according to conventional means in the art.

Claims

1. A network equipment intelligent diagnosis method based on big data is characterized in that: comprising the following steps:

2. The intelligent diagnosis method for network equipment based on big data according to claim 1, characterized in that the step of acquiring the operation parameters of the network equipment and classifying the operation parameters according to the parameter characteristics to obtain a plurality of primary characteristic parameters comprises:

acquiring the running parameters of the network equipment;

3. The intelligent diagnosis method for network equipment based on big data according to claim 1, characterized in that the step of acquiring the historical fault information of the network equipment and outputting a fault starting point according to the historical fault information comprises:

4. The intelligent diagnosis method for network equipment based on big data according to claim 3, characterized in that the step of performing bidirectional offset on the fault starting point and constructing a sampling period according to the offset result comprises:

5. The intelligent diagnosis method for network equipment based on big data according to claim 4, characterized in that the step of acquiring the operation fluctuation amount of the secondary characteristic parameter in the sampling period and outputting the correlation between the secondary characteristic parameter and the historical fault information according to the operation fluctuation amount comprises:

6. The intelligent diagnosis method for network equipment based on big data according to claim 1, characterized in that, under the induction relationship, the historical fault information is caused by abnormal fluctuation of the secondary characteristic parameter;

7. The intelligent diagnosis method for network equipment based on big data according to claim 6, characterized in that the step of determining the fault condition parameter according to the correlation between the secondary characteristic parameter and the historical fault information comprises:

8. The intelligent diagnosis method for network equipment based on big data according to claim 1, characterized in that the step of acquiring the real-time operation parameters of the network equipment and diagnosing the operation state of the network equipment according to the fault condition parameters comprises:

9. An intelligent diagnosis system for network equipment based on big data, applied to the intelligent diagnosis method for network equipment based on big data according to any one of claims 1 to 8, characterized by comprising:

10. An electronic device, characterized in that: the electronic device includes:

At least one processor;

And a memory communicatively coupled to the at least one processor;

Wherein the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the big data based network device intelligent diagnostic method of any of claims 1 to 8.