CN115580526A

CN115580526A - Communication network fault diagnosis method, system, electronic equipment and storage medium

Info

Publication number: CN115580526A
Application number: CN202211209903.1A
Authority: CN
Inventors: 尹文龙; 郭宝锋; 崔佩璋; 李召瑞; 孙慧贤; 李晓辉; 周永学; 郄龙; 王文娟; 陶杰
Original assignee: Army Engineering University of PLA
Current assignee: Army Engineering University of PLA
Priority date: 2022-09-30
Filing date: 2022-09-30
Publication date: 2023-01-06
Anticipated expiration: 2042-09-30
Also published as: CN115580526B

Abstract

The invention relates to a communication network fault diagnosis method, a system, electronic equipment and a storage medium, which relate to the field of electronic equipment communication, and the method comprises the following steps: acquiring historical electronic equipment communication network state data; clustering historical electronic equipment communication network state data by using an improved K-means clustering algorithm to obtain a clustering result; determining the structure of the Elman neural network by using a fitting error method, and training the Elman neural network by using a clustering result as a training set to obtain a fault prediction model; and taking the real-time monitoring electronic equipment communication network state data as a test set, and predicting by using a fault prediction model to obtain a fault diagnosis model. The invention solves the problems of difficult fault location and difficult prediction of the electronic equipment communication network in the guarantee process.

Description

Communication network fault diagnosis method, system, electronic equipment and storage medium

Technical Field

The present invention relates to the field of electronic equipment communications, and in particular, to a method, a system, an electronic device, and a storage medium for diagnosing a communication network failure.

Background

An electronic equipment communication network has a complex composition, and the information coupling among its components is tight. Fault diagnosis and prediction for such a network involve multiple technologies, including computer technology, information processing technology, and communication technology.

Disclosure of Invention

The invention aims to provide a communication network fault diagnosis method, a communication network fault diagnosis system, electronic equipment, and a storage medium, so as to solve the problems that faults in an electronic equipment communication network are difficult to locate and difficult to predict during the support (guarantee) process.

In order to achieve the purpose, the invention provides the following scheme:

a method of communication network fault diagnosis, comprising:

acquiring historical electronic equipment communication network state data;

clustering the historical electronic equipment communication network state data by using an improved K-means clustering algorithm to obtain a clustering result; the clustering criterion function of the improved K-means clustering algorithm is a sum of squared errors (SSE) criterion function; the improved K-means clustering algorithm determines the initial clustering centers by using a density partitioning algorithm; the improved K-means clustering algorithm determines the number of clusters by using a k value-SSE line graph algorithm;

determining the structure of the Elman neural network by using a fitting error method, and training the Elman neural network by using the clustering result as a training set to obtain a fault prediction model;

and taking the real-time monitoring electronic equipment communication network state data as a test set, and predicting by using the fault prediction model to obtain a fault diagnosis model.

Optionally, the determining, by the improved K-means clustering algorithm, an initial clustering center by using a density partition algorithm specifically includes:

calculating density threshold values under different clustering number values by using the density threshold value coefficient and the Euclidean distance;

determining an initial clustering center according to the historical electronic equipment communication network state data and the density threshold.

Optionally, the improved K-means clustering algorithm determines the number of clusters by using a k value-SSE line graph algorithm, which specifically includes:

calculating the sum of squares of errors under different clustering numbers according to the historical electronic equipment communication network state data;

determining a k value-SSE line graph according to the clustering number and the sum of squared errors;

and determining the clustering number according to the inflection point of the k value-SSE line graph.

Optionally, the activation function of the Elman neural network is a hyperbolic tangent sigmoid function, and the expression of the hyperbolic tangent sigmoid function is as follows:

$$f(\sigma) = \frac{e^{\sigma} - e^{-\sigma}}{e^{\sigma} + e^{-\sigma}}$$

wherein $f(\sigma)$ is the hyperbolic tangent sigmoid function, and $\sigma$ is the output of the input layer of the Elman neural network.

The invention also provides a communication network fault diagnosis system, comprising:

the acquisition module is used for acquiring historical electronic equipment communication network state data;

the clustering module is used for clustering the historical electronic equipment communication network state data by utilizing an improved K-means clustering algorithm to obtain a clustering result; the clustering criterion function of the improved K-means clustering algorithm is a sum of squared errors (SSE) criterion function; the improved K-means clustering algorithm determines the initial clustering centers by using a density partitioning algorithm; the improved K-means clustering algorithm determines the number of clusters by using a k value-SSE line graph algorithm;

the training module is used for determining the structure of the Elman neural network by utilizing a fitting error method, and training the Elman neural network by taking the clustering result as a training set to obtain a fault prediction model;

and the prediction module is used for predicting the real-time monitoring electronic equipment communication network state data as a test set by using the fault prediction model to obtain a fault diagnosis model.

Optionally, the sub-module for determining the initial clustering centers in the clustering module specifically includes:

the density threshold value determining unit is used for calculating density threshold values under different clustering number values by using a density threshold value coefficient and Euclidean distance;

and the initial clustering center determining unit is used for determining an initial clustering center according to the historical electronic equipment communication network state data and the density threshold.

Optionally, the sub-module for determining the number of clusters in the clustering module specifically includes:

the error square sum determining unit is used for calculating the error square sum under different clustering numbers according to the historical electronic equipment communication network state data;

the k value-SSE line graph determining unit is used for determining a k value-SSE line graph according to the clustering number and the sum of squares of errors;

and the clustering number determining unit is used for determining the clustering number according to the inflection point of the k value-SSE line graph.

The present invention also provides an electronic device comprising:

one or more processors;

a storage device having one or more programs stored thereon;

the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method as in any one of the above.

The invention also provides a computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements a method as set forth in any one of the above.

According to the specific embodiment provided by the invention, the invention discloses the following technical effects:

The method obtains historical electronic equipment communication network state data; clusters the historical electronic equipment communication network state data by using an improved K-means clustering algorithm to obtain a clustering result, wherein the clustering criterion function of the improved K-means clustering algorithm is a sum of squared errors (SSE) criterion function, the initial clustering centers are determined by a density partitioning algorithm, and the number of clusters is determined by a k value-SSE line graph algorithm; determines the structure of the Elman neural network by using a fitting error method and trains the Elman neural network by using the clustering result as a training set to obtain a fault prediction model; and takes the real-time monitored electronic equipment communication network state data as a test set and predicts with the fault prediction model to obtain a fault diagnosis model. The invention analyzes the communication network state data with a clustering algorithm, optimizing the traditional K-means algorithm into the improved K-means clustering algorithm to complete fault diagnosis of the electronic equipment communication network. It uses the Elman neural network algorithm to predict faults of the electronic equipment communication network and, because the structure of an Elman neural network is difficult to select, determines the structure of the hidden layer by a fitting error analysis method. This solves the problems that faults in the electronic equipment communication network are difficult to locate and difficult to predict during the support (guarantee) process.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a flowchart of a communication network fault diagnosis method provided by the present invention;

FIG. 2 is a schematic diagram of the operation of the electronic equipment;

FIG. 3 is a schematic structural diagram of a conventional Elman neural network;

FIG. 4 is a flow chart of Elman neural network prediction;

FIG. 5 is a schematic diagram of an experimental environment;

FIG. 6 is a k-value-SSE line graph;

FIG. 7 is a schematic diagram of clustering results;

FIG. 8 is a comparison graph of clustering results;

FIG. 9 is a diagram illustrating predicted results;

FIG. 10 is a comparison graph of prediction errors;

FIG. 11 is a schematic of a sample region;

FIG. 12 is a graph illustrating threshold ranges when k is 1;

FIG. 13 is a graph illustrating the threshold range when k is 2;

FIG. 14 is a graph illustrating the threshold range when k is 3;

FIG. 15 is a graph illustrating threshold ranges when k is 4;

FIG. 16 is a diagram illustrating the threshold range when k is 5.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.

As shown in fig. 1, a method for diagnosing a fault in a communication network according to the present invention includes:

step 101: historical electronic equipment communication network status data is obtained.

Through research on the operation mechanism of the electronic equipment communication network, the basic structure of the electronic equipment communication network is shown in fig. 2, and each electronic equipment node comprises three subsystems, namely an information processing subsystem, a network control subsystem and a communication subsystem. The information processing subsystem comprises a network switch, a server and a client. The network control subsystem comprises network control equipment and network safety protection equipment. The communication subsystem comprises various wireless communication equipment, wired communication equipment, a junction box and the like. The method collects the state data of the data link layer of the communication network in the network control subsystem and the communication subsystem, analyzes and clusters the state data generated by the communication network by using a clustering algorithm so as to achieve the purposes of finding faults and quickly positioning the faults, and predicts the equipment faults by using an Elman neural network algorithm so as to reduce the adverse effect of sudden equipment faults on equipment application.

Step 102: clustering the historical electronic equipment communication network state data by using an improved K-means clustering algorithm to obtain a clustering result; the clustering criterion function of the improved K-means clustering algorithm is a sum of squared errors (SSE) criterion function; the improved K-means clustering algorithm determines the initial clustering centers by using a density partitioning algorithm; the improved K-means clustering algorithm determines the number of clusters by using a k value-SSE line graph algorithm.

The improved K-means clustering algorithm determines the initial clustering centers by using a density partitioning algorithm, which specifically comprises:

calculating density thresholds for different cluster numbers by using the density threshold coefficient and the Euclidean distance.

The improved K-means clustering algorithm determines the number of clusters by using a k value-sum of squared errors (SSE) line graph algorithm, which specifically comprises:

calculating the sum of squared errors for different cluster numbers according to the historical electronic equipment communication network state data; and

determining a k value-SSE line graph according to the cluster numbers and the sums of squared errors.

Clustering is the process of assigning the objects in a data set to different clusters. Data objects in the same cluster are similar to one another, while data objects in different clusters are dissimilar. The degree of similarity between two data objects is determined from the values of their descriptive attributes and is usually expressed as the distance between them: a large distance indicates that the data objects differ, and a small distance indicates that they are similar. The basic guiding idea of clustering is to maximize the similarity among data objects within a class and to minimize the similarity among data objects in different classes.

In the traditional K-means algorithm, before iteration begins, K data objects are randomly selected from the data set as the K initial cluster centers. Each remaining data object is then assigned to the class of its nearest cluster center. After all data objects have been assigned, the mean of each cluster is computed as the new cluster center, and the process is iterated until the cluster centers no longer change, that is, until the value of the clustering criterion function converges or successive values of the clustering criterion function differ by less than a given threshold.

In the traditional K-means algorithm, the initial clustering centers are selected at random, so the clustering result may fall into a locally optimal solution and is strongly influenced by the initial clustering centers. To address these problems, the invention improves the traditional K-means algorithm, in light of the characteristics of electronic equipment communication network state data, through normalization of the state data, selection of a similarity measurement formula and a clustering criterion function, determination of the initial clustering centers based on a density partitioning algorithm, and the like.

(1) Normalization process

Each object contains 3 attributes, and the value range of each attribute is different, so the data are normalized before cluster analysis. Normalization has two advantages: 1) it speeds up the search for the optimal solution by gradient descent during cluster analysis; 2) it can improve the accuracy of the cluster analysis.

The equipment running state data are represented by an m × n matrix. The number of rows is m, representing the m attributes of each sample; according to the characteristics of the equipment running state data, m = 3. The number of columns is n, representing a total of n samples. Normalization is therefore applied to the data within the same row. The normalization algorithm adopted by the invention is as follows:

$$y = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

wherein $x$ is the number to be normalized in the matrix, $y$ is the normalized number, $x_{\max}$ is the maximum value of the row in which $x$ is located, and $x_{\min}$ is the minimum value of the row in which $x$ is located.
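The row-wise normalization described here can be sketched as follows. This is a minimal illustration that assumes the min-max form, in which each value is shifted by its row minimum and divided by the row range. The function name `normalize_rows`, the sample values, and the handling of a constant row are hypothetical and not taken from the invention.

```python
def normalize_rows(matrix):
    """Min-max normalize each row (one attribute per row) of an m x n matrix to [0, 1]."""
    normalized = []
    for row in matrix:
        x_min, x_max = min(row), max(row)
        span = x_max - x_min
        # Assumption: a constant row has zero range, so it is mapped to 0.0
        # instead of dividing by zero.
        normalized.append([(x - x_min) / span if span else 0.0 for x in row])
    return normalized


# Hypothetical state data: m = 3 attributes (rows), n = 4 samples (columns).
state_data = [
    [10.0, 20.0, 30.0, 40.0],
    [0.5, 0.1, 0.3, 0.9],
    [100.0, 100.0, 100.0, 100.0],
]
normalized = normalize_rows(state_data)
```

After normalization, every attribute lies in the same range, so no single attribute dominates the Euclidean distance used later.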

(2) Similarity measurement formula selection

Suppose the data set $X = \{x_1, x_2, \ldots, x_n\}$ contains two data objects $x_i$ and $x_j$, described as $x_i = \{x_{i1}, x_{i2}, \ldots, x_{ir}\}$ and $x_j = \{x_{j1}, x_{j2}, \ldots, x_{jr}\}$. Before clustering, the similarity of the two data objects needs to be calculated. The distance between the objects, $d(x_i, x_j)$, is generally used to determine the degree of similarity: in general, the larger $d(x_i, x_j)$ is, the greater the difference between data objects $x_i$ and $x_j$, and vice versa. In clustering algorithms, the distance function used to calculate the distance between data objects typically satisfies the following requirements: non-negativity, symmetry, and the triangle inequality. Non-negativity means that the distance between data objects satisfies $d(x_i, x_j) \ge 0$. Symmetry means that $d(x_i, x_j) = d(x_j, x_i)$. Finally, the triangle inequality means that the sum of two sides is not less than the third side, expressed as $d(x_i, x_j) \le d(x_i, x_k) + d(x_k, x_j)$. Based on the characteristics of the electronic equipment communication network state data, the invention mainly uses the Euclidean distance as the similarity measurement formula for the state data. The Euclidean distance is calculated as follows:

$$d(x_i, x_j) = \left( |x_{i1} - x_{j1}|^2 + |x_{i2} - x_{j2}|^2 + \cdots + |x_{ir} - x_{jr}|^2 \right)^{1/2}$$
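As a small illustration of the Euclidean distance and of the three distance properties listed above, the following sketch may help. The function name `euclidean_distance` and the three sample points are hypothetical.

```python
import math


def euclidean_distance(x_i, x_j):
    """Euclidean distance between two data objects with the same r attributes."""
    if len(x_i) != len(x_j):
        raise ValueError("data objects must have the same number of attributes")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))


# Hypothetical data objects with r = 3 attributes each.
a = (0.0, 0.0, 0.0)
b = (1.0, 2.0, 2.0)
c = (4.0, 6.0, 2.0)

d_ab = euclidean_distance(a, b)
d_bc = euclidean_distance(b, c)
d_ac = euclidean_distance(a, c)
```

For these points, `d_ab` and `d_bc` are 3.0 and 5.0, and `d_ac` does not exceed their sum, which is the triangle inequality.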

(3) Clustering criteria function selection

The clustering algorithm is the key research content of cluster analysis, and similarity measurement is the basis of the clustering algorithm: the similarity measurement function determines the degree of dissimilarity between objects in a data set. Similarity measurement alone is not sufficient, however. A criterion function is also needed to evaluate the quality of a clustering result. The choice of clustering criterion function affects the quality of the clustering result to a certain extent, and a good clustering criterion function often yields a more accurate clustering result. Commonly used clustering criterion functions currently include the sum of squared errors criterion function, the weighted average squared distance sum criterion function, the inter-class distance sum criterion function, and the weighted inter-class distance sum criterion function. Based on the characteristics of the electronic equipment communication network state data, the invention mainly adopts the sum of squared errors criterion function as the clustering criterion function for cluster analysis of the state data. It is calculated as follows:

Suppose a data set $X = \{x_1, x_2, \ldots, x_n\}$ containing n data objects is clustered into k clusters, denoted $W_1, W_2, \ldots, W_k$, where the numbers of data objects in the clusters are $n_1, n_2, \ldots, n_k$, that is, $n_1 + n_2 + \cdots + n_k = n$. Let $m_j$ denote the mean of all objects in the j-th cluster $W_j$. Like each equipment running state data object, $m_j$ is composed of r attributes, and the h-th attribute of $m_j$ is calculated as follows:

$$m_{jh} = \frac{1}{n_j} \sum_{x_i \in W_j} x_{ih}$$

The sum of squared errors criterion function is defined as:

$$J_c = \sum_{j=1}^{k} \sum_{x_i \in W_j} \left\| x_i - m_j \right\|^2$$

According to its mathematical definition, the value of the sum of squared errors criterion function $J_c$ is the sum of the squared errors between every data object and the center of the cluster to which it belongs, taken over all clusters. If a clustering result is more accurate, the data objects within each cluster are more similar to one another and to their cluster center, so the value of the criterion function, that is, the sum of squared errors, is smaller. Conversely, if a clustering result is less accurate, the data objects within each cluster are less similar to one another and to their cluster center, so the sum of squared errors is larger. It follows that the objective of clustering is to assign the objects in a data set to different classes, but, to some extent, it is also to find the clustering result that minimizes the value of the clustering criterion function.
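The relationship between clustering quality and the criterion value can be illustrated with a short sketch. It assumes that the criterion is the sum, over all clusters, of squared Euclidean distances from each object to its cluster mean. The helper names `cluster_mean` and `sum_squared_errors` and the two sample partitions are hypothetical.

```python
def cluster_mean(cluster):
    """Attribute-wise mean m_j of all data objects in one cluster."""
    n_j = len(cluster)
    r = len(cluster[0])
    return [sum(x[h] for x in cluster) / n_j for h in range(r)]


def sum_squared_errors(clusters):
    """J_c: squared distance of every object to its own cluster mean, summed over clusters."""
    total = 0.0
    for cluster in clusters:
        m_j = cluster_mean(cluster)
        for x in cluster:
            total += sum((x[h] - m_j[h]) ** 2 for h in range(len(m_j)))
    return total


# Two hypothetical partitions of the same four points.
tight = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
loose = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 2.0), (10.0, 2.0)]]

j_tight = sum_squared_errors(tight)
j_loose = sum_squared_errors(loose)
```

The partition that groups nearby points gives the smaller criterion value (4.0 against 100.0 here), which matches the reasoning above.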

(4) Determining initial clustering centers based on density partition algorithm

Inspired by the idea of density-based clustering algorithms, the selection of the initial clustering centers of the K-means clustering algorithm is optimized so that the selected result is as close to the optimal solution as possible. When selecting the initial clustering centers, a density threshold is used as the distance metric in place of the traditional Euclidean distance metric. When each sample point is evaluated as a candidate center, the total number of sample points lying within a spherical neighborhood of that point, whose extent is given by the defined density threshold, is counted and recorded as the density of the sample point. The sample point with the maximum density is selected as the first initial clustering center, and the point with the maximum Euclidean distance from the first is selected as the second initial clustering center; the purpose of this step is to prevent the second initial clustering center from being similar to or too close to the first.

Suppose the data set $X = \{x_1, x_2, \ldots, x_n\}$ contains two data objects $x_i$ and $x_j$, described as $x_i = \{x_{i1}, x_{i2}, \ldots, x_{ir}\}$ and $x_j = \{x_{j1}, x_{j2}, \ldots, x_{jr}\}$. The density threshold formula is:

$$y_z = \frac{1}{\alpha k} \max_{i,j} d(x_i, x_j)$$

wherein $\alpha$ is the density threshold coefficient and k is the number of clusters. The coefficient is chosen so that, when a sample point is evaluated as a candidate clustering center, the corresponding density threshold captures the density of the core area centered on that point, which makes the density a more useful reference. The density threshold $y_z$ is defined as the maximum Euclidean distance between any two points in the sample point set multiplied by the coefficient $\frac{1}{\alpha k}$.
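A minimal sketch of the density threshold and the resulting point densities follows. It assumes that the threshold is the maximum pairwise distance divided by the product of α and k, a form that is consistent with the α values listed in Table 1, and that a point's density counts every sample (including itself) lying within distance $y_z$. The function names and sample points are hypothetical.

```python
import math
from itertools import combinations


def density_threshold(points, alpha, k):
    """y_z: maximum pairwise Euclidean distance scaled by 1 / (alpha * k) (assumed form)."""
    d_max = max(math.dist(p, q) for p, q in combinations(points, 2))
    return d_max / (alpha * k)


def point_densities(points, y_z):
    """Number of samples within distance y_z of each point (the point itself included)."""
    return [sum(1 for q in points if math.dist(p, q) <= y_z) for p in points]


# Hypothetical normalized samples: a tight group of three and one outlier.
samples = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0)]
y_z = density_threshold(samples, alpha=2.9295, k=2)
densities = point_densities(samples, y_z)
```

Points in the tight group receive a density of 3, while the isolated point receives a density of 1, so the first initial clustering center would be drawn from the dense group.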

The choice of the density threshold coefficient has a decisive influence on the classification performance of the algorithm. Choosing a value of α suited to the data distribution of the experiment is therefore important for bringing the clustering result close to the optimal solution. According to the characteristics of the experimental data, α is selected as follows: following the concept of the normal distribution, the range of one standard deviation is taken as the "core region" of the data, the distribution of the corresponding core regions is determined for each number of clusters (k value), and the density threshold coefficient α is obtained by calculation.

According to the experimental preprocessing result, the classification number k of the experimental data is less than 5, so that the density threshold coefficient alpha of k = 1-5 is derived.

The distance between any two points of the data sample is calculated, and the maximum distance is taken as the diameter to obtain a circular area which can substantially cover all the sample points, as shown in fig. 11.

Assume k = 1. A circle A is drawn with the maximum distance as its diameter, and this circle is taken as the range of the entire data distribution. Following the normal distribution, the one-standard-deviation range of 68.27% is used as the core area of the experimental data for k = 1, circle B is taken as the range of the core area, and the radius ratio of circle A to circle B is R : r = 1 : 0.6827, as shown in FIG. 12.

In FIG. 12, the geometric meaning of the density threshold $y_z$ is the radius of circle B (the core data range circle of the single class). According to the density threshold formula and the geometric relationship, the value of the density threshold coefficient is α = 2.9295 when k = 1.

Assume k = 2. Within the data distribution range A, the distribution ranges of the two classes of data are represented by circles B and C. The one-standard-deviation range of 68.27% is still used as the core area of each class, so the core areas of circles B and C are circles D and E, as shown in FIG. 13. According to the density threshold formula and the geometric relationship, the value of the density threshold coefficient is α = 2.9295 when k = 2.

Similarly, when k = 3, the classification distribution is as shown in FIG. 14, and the calculation gives a density threshold coefficient of α = 2.1041.

When k = 4, the classification distribution is as shown in FIG. 15, and the calculation gives a density threshold coefficient of α = 1.7682.

When k = 5, the classification distribution is as shown in FIG. 16, and the calculation gives a density threshold coefficient of α = 1.582.

In summary, the values of the experimental data α when k =1 to 5 are shown in table 1.

TABLE 1 Density threshold coefficient values

k	α
1	2.9295
2	2.9295
3	2.1041
4	1.7682
5	1.582
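The tabulated values can be approximately reproduced from a simple geometric model, sketched below. The model is an assumption, not something stated explicitly in the text: k equal class circles are arranged in a ring inside the data circle A, each core circle has 0.6827 times the radius of its class circle, and the density threshold equals the core radius and also equals the maximum distance divided by α·k. The names `packing_ratio` and `alpha_for_k` are hypothetical.

```python
import math

CORE_FRACTION = 0.6827  # one-standard-deviation share quoted in the text


def packing_ratio(k):
    """Radius of each of k equal circles in a ring inside a unit circle (assumed layout)."""
    if k == 1:
        return 1.0
    s = math.sin(math.pi / k)
    return s / (1.0 + s)


def alpha_for_k(k):
    """Solve y_z = d_max / (alpha * k) for alpha, with y_z equal to the core radius.

    Core radius = CORE_FRACTION * packing_ratio(k) * d_max / 2, so
    alpha = 2 / (k * CORE_FRACTION * packing_ratio(k)).
    """
    return 2.0 / (k * CORE_FRACTION * packing_ratio(k))


TABLE_1 = {1: 2.9295, 2: 2.9295, 3: 2.1041, 4: 1.7682, 5: 1.582}
for k, tabulated in TABLE_1.items():
    print(k, round(alpha_for_k(k), 4), tabulated)
```

Under these assumptions the computed coefficients agree with Table 1 to within about 0.001, the small differences being attributable to rounding of the 68.27% figure.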

In the K-means calculation, different k values yield density thresholds $y_z$ of different sizes. For each point, the density of the core region within the range given by the density threshold $y_z$ is calculated, and the point with the maximum density is selected as the first initial clustering center. The Euclidean distances between this point and the other points are then calculated, and the point farthest from it is selected as the second initial clustering center. The third initial clustering center is the point with the maximum sum of Euclidean distances to the first and second initial clustering centers. The fourth and fifth points are selected similarly, and so on, until k initial clustering centers satisfying the conditions are obtained. With this improvement, the accuracy of the clustering result is observably improved.
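The selection procedure just described can be sketched as follows. This is an illustrative reading rather than a definitive implementation: density is taken as the number of samples within distance `y_z` of a point, ties in density are broken by taking the first such point, and each later center maximizes the sum of its distances to the centers already chosen. The function name `select_initial_centers` and the sample data are hypothetical.

```python
import math


def select_initial_centers(points, y_z, k):
    """Pick k initial centers: densest point first, then farthest-by-distance-sum points."""
    density = [sum(1 for q in points if math.dist(p, q) <= y_z) for p in points]
    # max() returns the first index with the maximum density (tie-break assumption).
    chosen = [max(range(len(points)), key=lambda i: density[i])]
    while len(chosen) < k:
        candidates = [i for i in range(len(points)) if i not in chosen]
        # Next center: the largest sum of distances to all centers chosen so far.
        best = max(
            candidates,
            key=lambda i: sum(math.dist(points[i], points[c]) for c in chosen),
        )
        chosen.append(best)
    return [points[i] for i in chosen]


# Hypothetical samples forming three loose groups.
samples = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (0.0, 5.0)]
centers = select_initial_centers(samples, y_z=0.2, k=3)
```

For this data the densest point (0.0, 0.0) is chosen first, followed by the point farthest from it and then the point with the largest combined distance to both.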

(5) Determination of the number of clusters k

To address another defect of the traditional K-means algorithm, namely that the number of clusters k cannot be determined in advance, the k value is determined by a k value-SSE (sum of squared errors) line graph method. The specific method is as follows: first, the sum of squared errors $J_c$ is calculated for different k values and a k value-SSE line graph is drawn; then a suitable value of k is determined by finding the inflection point in the graph. The SSE inevitably decreases as k increases, and the k value at the elbow of the graph (the turning point at which the curve changes from a rapid decrease to a gentle decrease) achieves the best balance between SSE and k.
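The text locates the inflection point by inspecting the line graph. As a hedged illustration of how that inspection might be automated, the sketch below picks the k at which the decrease in SSE slows down the most (the largest discrete second difference). This heuristic, the function name `elbow_k`, and the SSE values are assumptions introduced for illustration.

```python
def elbow_k(k_values, sse_values):
    """Return the k where the SSE curve bends most sharply (largest second difference)."""
    best_k, best_bend = None, float("-inf")
    for i in range(1, len(k_values) - 1):
        drop_before = sse_values[i - 1] - sse_values[i]
        drop_after = sse_values[i] - sse_values[i + 1]
        bend = drop_before - drop_after
        if bend > best_bend:
            best_k, best_bend = k_values[i], bend
    return best_k


# Hypothetical SSE values for k = 2..7; the curve flattens after k = 3.
ks = [2, 3, 4, 5, 6, 7]
sses = [120.0, 45.0, 38.0, 33.0, 30.0, 28.0]
chosen_k = elbow_k(ks, sses)
```

With these values the function returns 3. The end points of the range can never be selected by this heuristic, which is one reason visual inspection of the graph remains useful.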

(6) Improved algorithm process

According to the characteristics of the electronic equipment communication network state data, the selection range of the cluster number k is set to 2 to 7. For each value of k, the initial clustering centers are determined by the density method, clustering is then carried out based on the traditional k-means iteration process, and the sum of squared errors $J_c$ for that k value is calculated. Finally, a k value-SSE line graph is drawn, and a suitable k value is determined by finding the inflection point.

For each value of k, the specific iteration steps are as follows:

1. Sequentially calculate the density of all sample points in the electronic equipment communication network state data set, and select the sample point with the maximum density as the first initial clustering center, denoted $p_1$.

2. Calculate the Euclidean distance between each of the other sample points and $p_1$ in turn, and select the sample point with the largest distance as the second initial clustering center $p_2$.

3. Calculate in turn, for each of the other sample points, the Euclidean distance $d(x_n, x_1)$ to $p_1$ and the Euclidean distance $d(x_n, x_2)$ to $p_2$, and select the sample point with the largest $d(x_n, x_1) + d(x_n, x_2)$ as the third initial clustering center, denoted $p_3$.

4. By analogy, obtain the corresponding k initial clustering centers $p_k$ through calculation.

5. Search for the clustering result by using the iterative process of the traditional k-means clustering algorithm, assigning the data objects one by one to the cluster represented by the nearest clustering center according to the nearest-neighbor principle.

6. Calculate the mean of all data objects in each cluster as the new center of that cluster, and compare the updated cluster centers with the original cluster centers. If any cluster center has changed, return to step 5. If the cluster centers no longer change position, the selection of cluster centers is considered complete, and the minimum sum of squared errors $J_c$ of the criterion function is calculated from the cluster centers obtained.
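Steps 5 and 6 can be sketched as a conventional k-means iteration that starts from centers supplied by the caller, for example centers chosen by the density method of steps 1 to 4. The function name `kmeans_from_centers`, the iteration cap, and the handling of an empty cluster are assumptions added for illustration.

```python
import math


def kmeans_from_centers(points, initial_centers, max_iter=100):
    """Run the assign/update loop from given initial centers; return centers, clusters, J_c."""
    centers = [tuple(c) for c in initial_centers]
    clusters = []
    for _ in range(max_iter):
        # Step 5: assign every data object to its nearest cluster center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        # Step 6: recompute each center as the mean of its cluster.
        new_centers = []
        for j, cluster in enumerate(clusters):
            if cluster:
                new_centers.append(tuple(sum(col) / len(cluster) for col in zip(*cluster)))
            else:
                # Assumption: an empty cluster keeps its previous center.
                new_centers.append(centers[j])
        if new_centers == centers:
            break  # centers no longer change
        centers = new_centers
    j_c = sum(
        math.dist(p, centers[j]) ** 2 for j, cluster in enumerate(clusters) for p in cluster
    )
    return centers, clusters, j_c


# Hypothetical samples and initial centers.
samples = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
final_centers, final_clusters, j_c = kmeans_from_centers(samples, [(0.0, 0.0), (10.0, 2.0)])
```

Repeating this call for each k in the selected range and recording `j_c` gives the points of the k value-SSE line graph.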

Step 103: and determining the structure of the Elman neural network by using a fitting error method, and training the Elman neural network by using the clustering result as a training set to obtain a fault prediction model.

The activation function of the Elman neural network is a hyperbolic tangent sigmoid function (Tan-Sigmoid function), and the expression of the hyperbolic tangent sigmoid function is as follows:

$$f(\sigma) = \frac{2}{1 + e^{-2\sigma}} - 1$$

Step 104: take the real-time monitored electronic equipment communication network state data as a test set, and predict with the fault prediction model to obtain a fault diagnosis model.

Artificial neural networks are capable of associative memory and large-scale parallel processing and, according to the direction in which information flows through the network, can be divided into feedforward neural networks and feedback neural networks. The Elman neural network is a relatively common feedback neural network. It adds a context layer (also translated as receiving layer) to the hidden layer of a feedforward network and belongs to the class of internal delay networks. The Elman neural network can process dynamic information and reflect the dynamic process of a system.

The network topology of the Elman neural network consists of four layers. The first layer is the input layer, which comprises a number of input nodes that transmit signals. The second layer is the hidden layer, which comprises a number of hidden layer neurons that receive input from the input layer nodes and feedback input from the context layer nodes. The third layer is the context layer, whose units each memorize and store the output value of the corresponding hidden layer neuron at the previous moment and feed it back to the hidden layer neurons after a delay. The fourth layer is the output layer, whose neurons perform linear weighting. The delayed feedback of the context layer neurons makes the Elman neural network highly sensitive to historical data and gives it a dynamic memory function. The structure of the Elman neural network is shown in FIG. 3.
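The four-layer topology can be illustrated with a short Python sketch of the forward pass. It is a minimal example under stated assumptions, not the network used in the invention: the class name `ElmanNetwork`, the weight names and the random initialization are hypothetical. The attribute `context` plays the role of the context (receiving) layer, holding the hidden layer output of the previous time step.

```python
import numpy as np

class ElmanNetwork:
    """Minimal Elman network: input -> hidden (+ context feedback) -> linear output."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        # ASSUMPTION: small random initial weights; the patent does not specify them.
        self.W_in = rng.normal(0.0, 0.5, (n_hidden, n_in))       # input -> hidden
        self.W_ctx = rng.normal(0.0, 0.5, (n_hidden, n_hidden))  # context -> hidden
        self.b_h = np.zeros(n_hidden)
        self.W_out = rng.normal(0.0, 0.5, (n_out, n_hidden))     # hidden -> output
        self.b_out = np.zeros(n_out)
        self.context = np.zeros(n_hidden)  # hidden output at the previous moment

    def reset(self):
        """Clear the memory held by the context layer."""
        self.context = np.zeros_like(self.context)

    def step(self, x):
        """Process one time step and return the output vector."""
        # The hidden layer sees the current input and the delayed hidden output.
        h = np.tanh(self.W_in @ x + self.W_ctx @ self.context + self.b_h)
        self.context = h.copy()            # stored for the next time step
        return self.W_out @ h + self.b_out # linear weighting in the output layer
```

Feeding the same input twice generally produces two different outputs, because the context layer has changed in between; this is the dynamic memory effect described above.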

The structure of the Elman neural network is determined by the fitting error analysis method. The idea is to carry out learning and training on the training set data to capture the dynamic characteristics between the input and output parameters. An error-correction learning rule is adopted, the structural parameters of each layer are adjusted dynamically, and stable network parameters are finally obtained. As shown in FIG. 4, the specific process is as follows:

(1) Learning and training are carried out on the training set data: the input layer receives the data, the hidden layer processes it, and the output layer outputs the result, which constitutes the forward propagation of the input signal.

(2) The error between the actual output and the expected output of the output layer is calculated. If the error is too large and exceeds the acceptable range, the error back-propagation stage is entered: the error signal is propagated backwards layer by layer and distributed to the neurons of each layer, and the weights and threshold matrix of each neuron are updated and corrected according to the error signal.

(3) The purpose of network learning is to find the weight matrix that minimizes the objective function. Error-correction learning is thereby converted into a typical optimization problem, for which a learning algorithm based on gradient descent is commonly used.
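The training loop in steps (1) to (3) and the comparison of fitting errors across candidate structures can be sketched as follows. This is a simplified illustration, not the procedure of FIG. 4: it uses per-sample gradient descent and treats the context layer as an ordinary input during back propagation, a common simplification for Elman networks. The names `train_elman` and `compare_hidden_sizes`, the learning rate, the tolerance `tol` and the initial weight scale are all assumptions.

```python
import numpy as np

def train_elman(X, Y, n_hidden, epochs=200, lr=0.05, tol=1e-4, seed=0):
    """Train a small Elman network; return (weights, mean squared error of the last epoch)."""
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], Y.shape[1]
    W_in = rng.normal(0.0, 0.3, (n_hidden, n_in))
    W_ctx = rng.normal(0.0, 0.3, (n_hidden, n_hidden))
    b_h = np.zeros(n_hidden)
    W_out = rng.normal(0.0, 0.3, (n_out, n_hidden))
    b_out = np.zeros(n_out)
    mse = float("nan")
    for _ in range(epochs):
        context = np.zeros(n_hidden)
        sq_err = 0.0
        for x, y in zip(X, Y):
            # (1) Forward propagation through hidden and output layers.
            h = np.tanh(W_in @ x + W_ctx @ context + b_h)
            out = W_out @ h + b_out
            # (2) Error between actual and expected output.
            err = out - y
            sq_err += float(err @ err)
            # Back propagation; the context vector is treated as a constant input.
            grad_h = (W_out.T @ err) * (1.0 - h ** 2)
            # (3) Gradient descent update of weights and thresholds.
            W_out -= lr * np.outer(err, h)
            b_out -= lr * err
            W_in -= lr * np.outer(grad_h, x)
            W_ctx -= lr * np.outer(grad_h, context)
            b_h -= lr * grad_h
            context = h
        mse = sq_err / len(X)
        if mse < tol:   # error within the acceptable range: stop training
            break
    return (W_in, W_ctx, b_h, W_out, b_out), mse

def compare_hidden_sizes(X, Y, candidates):
    """Fitting error for each candidate number of hidden neurons."""
    return {n: train_elman(X, Y, n)[1] for n in candidates}
```

The dictionary returned by `compare_hidden_sizes` maps each candidate hidden layer size to its fitting error, so the structure with the smallest error can be selected. In practice the comparison would normally use held-out data as well, since a larger hidden layer can always fit the training set more closely.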

In implementing the fitting error analysis method, an activation function must be specified for the hidden layer of the Elman neural network. The invention selects a continuous nonlinear function, because a continuous nonlinear activation function is differentiable and the problem can therefore be solved by an optimization method. The specific function selected is the hyperbolic tangent sigmoid function (Tansig), whose expression is as follows:

$$f(\sigma) = \frac{2}{1 + e^{-2\sigma}} - 1$$
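The hyperbolic tangent sigmoid is commonly written as 2 / (1 + e^(-2σ)) - 1, which is algebraically identical to tanh(σ). The short Python sketch below, with the hypothetical helper names `tansig` and `tansig_derivative`, shows the function together with its derivative, which is what makes gradient-based optimization possible.

```python
import numpy as np

def tansig(x):
    """Hyperbolic tangent sigmoid: 2 / (1 + exp(-2x)) - 1, equal to tanh(x)."""
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

def tansig_derivative(x):
    """Derivative of tansig, expressed through the function value: 1 - f(x)^2."""
    y = tansig(x)
    return 1.0 - y ** 2
```

The output lies in the open interval (-1, 1), and the derivative can be computed from the function value alone, which keeps the back-propagation step cheap.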

The input data of the Elman neural network is likewise the state data of the communication network. Data clustering mainly accomplishes fault diagnosis: when a communication fault exists, the fault is located through clustering. Prediction means that no fault has occurred yet, but the next communication network state is predicted from the trend of the current state data, so that the fault is predicted in advance.

As shown in FIG. 5, the experimental environment consists of a server, a client, a switching unit, a network control device, a wireless communication device, a software radio platform, a load generator and a main control unit. The server, the client, the switching unit, the network control device and the wireless communication device form the equipment link. The software radio platform and the main control unit simulate communication interference and communication faults on various wireless communication links, and the load generator simulates various data services. The invention collects the state data of the electronic equipment communication network for fault diagnosis and prediction, and uses three indicators of the network, namely the utilization rate, the number of sent data packets and the number of received data packets, as the basis for big data analysis.

The k value is iterated with the density-based partition clustering algorithm, and the corresponding minimum sum of squared errors J_c is calculated for each k value from 2 to 7. The resulting k value-SSE line graph is shown in FIG. 6.

As can be seen from FIG. 6, the inflection point read from the graph gives a k value of 3. Running the density-based partition clustering algorithm with k = 3 yields the clustering result shown in FIG. 7.

The 387 monitoring data points are divided by clustering into three types of states, where a five-pointed star represents the normal state, a square represents the high-load running state, and an inverted triangle represents the fault state. With this processing, when equipment suffers a sudden fault, the fault type can be judged conveniently and quickly, the fault position can be located, and the fault can be resolved in the shortest time.
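The inflection point of a k value-SSE line graph is read visually in the invention. As a rough numerical stand-in, one can pick the k at which the curve bends most sharply, measured by the second difference of the SSE values. The sketch below is an assumption-laden illustration: the function name `elbow_k` is hypothetical, and the SSE values in the usage example are made up for demonstration and are not the experimental values of FIG. 6.

```python
def elbow_k(ks, sse):
    """Return the k whose SSE curve has the largest second difference (sharpest bend)."""
    ks = list(ks)
    sse = [float(v) for v in sse]
    if len(ks) != len(sse) or len(ks) < 3:
        raise ValueError("need at least three (k, SSE) pairs of equal length")
    best_k, best_bend = None, float("-inf")
    # The end points have no second difference, so only interior k values are candidates.
    for i in range(1, len(ks) - 1):
        bend = sse[i - 1] - 2.0 * sse[i] + sse[i + 1]
        if bend > best_bend:
            best_k, best_bend = ks[i], bend
    return best_k

# Illustrative values only: a steep drop from k = 2 to k = 3, then a slow decline.
example_sse = [900, 300, 250, 210, 180, 160]
print(elbow_k(range(2, 8), example_sse))  # prints 3
```

A second-difference rule is only one of several elbow heuristics and can be misled by noisy curves, so the plotted graph remains the reference.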

The results are compared below with those of traditional partition-based cluster analysis. FIG. 8(a) shows the clustering result obtained with the clustering algorithm of the invention, and FIG. 8(b) shows the clustering result obtained with the conventional partition-based clustering algorithm. The improved algorithm of the invention classifies and distributes the sample points more uniformly, and its result is more reasonable and reliable. Because the initial points of traditional K-means are selected randomly, the result can deviate from run to run and can fall into a local optimal solution, whereas the result obtained by the density-based partition clustering algorithm is stable.

The Elman neural network is trained on the known sample points of the monitoring data and is then used to predict the sample indicators at the next time point.
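Predicting the next time point requires turning the monitored series into (input, target) training pairs. The sketch below shows one common way to do this with a sliding window. The function name `make_windows` and the window length are assumptions, since the text does not state how the samples are arranged; with `window=1` each input is simply the current sample, leaving all memory of earlier samples to the network's context layer.

```python
import numpy as np

def make_windows(series, window):
    """Build one-step-ahead pairs: `window` consecutive samples -> the next sample."""
    series = np.asarray(series, dtype=float)
    if series.ndim == 1:
        series = series[:, None]          # treat a 1-D series as a single indicator
    n = len(series) - window
    if window < 1 or n <= 0:
        raise ValueError("window must be >= 1 and shorter than the series")
    # Each input row concatenates `window` consecutive samples of all indicators.
    inputs = np.stack([series[i:i + window].ravel() for i in range(n)])
    # Each target row is the sample immediately after its window.
    targets = series[window:]
    return inputs, targets
```

For a series with three indicators per time point (for example utilization rate, sent packets and received packets) and `window=2`, each input row has six values and each target row has three.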

After training, the prediction result of the operation is shown in fig. 9, where the dotted line is the prediction result and the solid line is the actual result. A comparison graph of prediction errors resulting from different numbers of hidden layer neurons is shown in FIG. 10.

With the Elman neural network algorithm, equipment parameters at a future time point can be predicted, which can greatly reduce the impact of sudden failures of battlefield equipment on equipment use. In addition, the more monitoring data is input, the more thoroughly the neural network is trained and the more accurate the prediction of equipment parameters at future time points becomes.

The invention mainly researches a K-means clustering algorithm and a neural network algorithm. The K-means clustering algorithm is one of the earliest and most widely applied clustering analysis algorithms, and has the advantages of simple structure, easiness in implementation, strong local searching capability, suitability for processing a large data set and the like. The work of the present invention mainly includes the following three aspects:

(1) An improved K-means algorithm, namely a density-based partition clustering algorithm, is proposed. The algorithm effectively addresses the dependence of partition-based clustering on the choice of initial points. A new measurement formula, the density threshold formula, is also proposed.

(2) The criterion function J_c is computed under different k values, and a suitable k value is selected by the graphical method, which solves the problem that the choice of k in traditional K-means is uncertain.

(3) The Elman neural network learns from the monitored data and predicts the data at future time points, so that faults are discovered in advance.

The acquisition module is used for acquiring historical electronic equipment communication network state data.

The clustering module is used for clustering the historical electronic equipment communication network state data by utilizing an improved K-means clustering algorithm to obtain a clustering result; the clustering criterion function of the improved K-means clustering algorithm is a sum of squared errors criterion function; the improved K-means clustering algorithm determines the initial clustering centers by using a density partitioning algorithm; the improved K-means clustering algorithm determines the clustering number by using a k value-SSE line graph algorithm.

The training module is used for determining the structure of the Elman neural network by using a fitting error method, and training the Elman neural network by using the clustering result as a training set to obtain a fault prediction model.

In practical application, the initial clustering center determining submodule in the clustering module specifically includes:

A density threshold determining unit, used for calculating the density thresholds under different clustering number values by using the density threshold coefficient and the Euclidean distance.

In practical application, the clustering number determining submodule in the clustering module specifically includes:

An error sum of squares determining unit, used for calculating the sum of squared errors under different clustering numbers according to the historical electronic equipment communication network state data.

A k value-SSE line graph determining unit, used for determining the k value-SSE line graph according to the clustering number and the sum of squared errors.

In practical application, the activation function of the Elman neural network is a hyperbolic tangent sigmoid function, and the expression of the hyperbolic tangent sigmoid function is as follows:

$$f(\sigma) = \frac{2}{1 + e^{-2\sigma}} - 1$$

The present invention also provides an electronic device comprising:

one or more processors.

A storage device having one or more programs stored thereon.

The one or more programs, when executed by the one or more processors, cause the one or more processors to implement the methods as described above.

The invention also provides a computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method as described above.

To address the difficulty of locating and predicting faults in electronic equipment communication networks, the invention provides a method based on big data analysis for diagnosing and predicting such faults. A clustering algorithm is used to analyze the state data of the communication network. Because the traditional K-means algorithm selects the initial clustering centers randomly, its clustering result can fall into a local optimal solution and is strongly influenced by the initial clustering centers; a density-based partition improvement algorithm is proposed to solve these problems. The Elman neural network algorithm is used to predict faults of the electronic equipment communication network and, because the structure of the Elman neural network is difficult to select, the structure of its hidden layer is determined by the fitting error analysis method.

The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the description of the method part.

The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims

1. A method for diagnosing a fault in a communication network, comprising:

acquiring historical electronic equipment communication network state data;

clustering the historical electronic equipment communication network state data by using an improved K-means clustering algorithm to obtain a clustering result; the clustering criterion function of the improved K-means clustering algorithm is a sum of squared errors criterion function; the improved K-means clustering algorithm determines an initial clustering center by using a density partitioning algorithm; the improved K-means clustering algorithm determines the clustering number by using a k value-SSE (sum of squared errors) line graph algorithm;

2. The method for diagnosing faults in a communication network according to claim 1, wherein the improved K-means clustering algorithm determines an initial clustering center by using a density partition algorithm, and specifically comprises:

3. The method for diagnosing faults in a communication network according to claim 1, wherein the improved K-means clustering algorithm determines the number of clusters by using a k value-SSE line graph algorithm, and specifically comprises:

determining a k value-SSE line graph according to the clustering number and the sum of squares of errors;

4. The communication network fault diagnosis method according to claim 1, wherein the activation function of the Elman neural network is a hyperbolic tangent sigmoid function, and an expression of the hyperbolic tangent sigmoid function is as follows:

$$f(\sigma) = \frac{2}{1 + e^{-2\sigma}} - 1$$

wherein f(σ) is the hyperbolic tangent sigmoid function, and σ is the output of the input layer of the Elman neural network.

5. A communication network fault diagnosis system, comprising:

the clustering module is used for clustering the historical electronic equipment communication network state data by utilizing an improved K-means clustering algorithm to obtain a clustering result; the clustering criterion function of the improved K-means clustering algorithm is a sum of squared errors criterion function; the improved K-means clustering algorithm determines an initial clustering center by using a density partitioning algorithm; the improved K-means clustering algorithm determines the clustering number by using a k value-SSE (sum of squared errors) line graph algorithm;

and the prediction module is used for predicting the state data of the real-time monitoring electronic equipment communication network as a test set by using the fault prediction model to obtain a fault diagnosis model.

6. The system according to claim 5, wherein the initial cluster center determining submodule in the clustering module specifically includes:

7. The system according to claim 5, wherein the cluster number determining submodule in the clustering module specifically includes:

8. The communication network fault diagnosis system according to claim 5, wherein the activation function of the Elman neural network is a hyperbolic tangent sigmoid function, and the expression of the hyperbolic tangent sigmoid function is as follows:

$$f(\sigma) = \frac{2}{1 + e^{-2\sigma}} - 1$$

9. An electronic device, comprising:

one or more processors;

a storage device having one or more programs stored thereon;

the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-4.

10. A computer storage medium, having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method of any of claims 1 to 4.