CN114268981A

CN114268981A - Network fault detection and diagnosis method and system

Info

Publication number: CN114268981A
Application number: CN202111061785.XA
Authority: CN
Inventors: 朱晓荣; 何明坤; 张佩佩; 吴知航
Original assignee: Nanjing Xinghang Communication Technology Co ltd
Current assignee: Nanjing Xinghang Communication Technology Co ltd
Priority date: 2021-09-10
Filing date: 2021-09-10
Publication date: 2022-04-01

Abstract

The application relates to the field of communication networks and discloses a network fault detection and diagnosis method and system. The method comprises the following steps: analyzing the key performance indicators that measure network performance and the common network faults, and associating them; preprocessing network fault data in an original database and screening out samples whose sample weight is larger than a preset threshold, so as to filter out virtual samples inconsistent with the distribution of the original data set; generating a network fault detection and diagnosis model based on a generative adversarial network by using the sample data obtained after screening; training and testing the network fault detection and diagnosis model; and detecting and diagnosing network faults by using the trained and tested model, and outputting a detection and diagnosis result. The method and system achieve higher accuracy and better stability, effectively improve the efficiency of network fault detection and diagnosis, and safeguard the service experience of users.

Description

Network fault detection and diagnosis method and system

Technical Field

The present application relates to the field of communication networks, and in particular, to a network fault detection and diagnosis technique.

Background

In recent years, the demands of communication network users have become increasingly diversified. At the same time, communication network technology has developed rapidly, and 5G networks have been put into use.

Given these diversified user demands, future network scenarios will be very complex, and managing a complex network environment while keeping the network running normally is a major challenge. Network fault detection and diagnosis means continuously monitoring changes in the network and its nodes, discovering in time whether links or nodes are operating normally, and feeding the acquired information back to a management system in time. Its main purpose is to use network parameters to provide flexible fault early warning and to guarantee the Quality of Service (QoS) of network services.

Therefore, how to provide more efficient network fault detection and improve the service experience of the user is an urgent problem to be solved.

Disclosure of Invention

The application aims to provide a network fault detection and diagnosis method and system, which are higher in accuracy and better in stability, effectively improve the efficiency of network fault detection and diagnosis, and ensure the service experience of users.

The application discloses a network fault detection and diagnosis method, which comprises the following steps:

analyzing the key performance indicators that measure network performance and the common network faults, and associating the key performance indicators with the common network faults;

preprocessing network fault data in an original database by using the association between the key performance indicators and the common network faults, screening out samples whose sample weight is larger than a preset threshold, and filtering out virtual samples inconsistent with the distribution of the original data set;

generating a network fault detection and diagnosis model based on a generative adversarial network by using the sample data obtained after screening;

training and testing the network fault detection and diagnosis model;

and detecting and diagnosing network faults by using the trained and tested network fault detection and diagnosis model, and outputting a detection and diagnosis result.

In a preferred embodiment, the step of preprocessing the network fault data in the original database includes data normalization. Given k indicators $X_1, X_2, \dots, X_k$, where k is an integer greater than 1 and $X_i = \{x_1, x_2, \dots, x_n\}$, each key performance indicator is normalized to obtain the values $Y_1, Y_2, \dots, Y_k$, wherein:

$X_i$ denotes the data set of the i-th indicator, and $X_{ij}$ denotes the j-th datum of the i-th indicator; $Y_i$ denotes the normalized data of the i-th indicator, and $Y_{ij}$ denotes the j-th datum of the normalized i-th indicator.

In a preferred embodiment, in the step of generating the network fault detection and diagnosis model based on the generative adversarial network by using the sample data obtained after screening, normalization is carried out separately for each key performance indicator based on its maximum value:

$$\overline{KPI_i} = \frac{KPI_i}{\max(KPI_i)}$$

where $\overline{KPI_i}$ denotes the normalized i-th key performance indicator and $\max(KPI_i)$ is the maximum value of the i-th key performance indicator in the collected data, which is used to convert the dynamic range of the specific indicator $KPI_i$. Only those $KPI_i$ whose values fall outside the range [0, 1] are considered, so as to ensure that all variables lie within the desired interval.

In a preferred example, in the step of generating the network fault detection and diagnosis model based on the generative adversarial network by using the sample data obtained after screening, the generative adversarial network framework is based on the zero-sum game in game theory. The framework has two networks that compete with each other and optimize their objectives at the same time: the first network is a generator G, the second network is a discriminator D, and the optimized objective function is:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_r}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]$$

where $p_r$ is the distribution of the normalized real data $x$ collected from the heterogeneous wireless network; $p_z$ is the distribution that the input noise $z$ obeys; $G(z)$ is a mapping to the data space; G is a differentiable function represented by a multi-layer perceptron; and $D(x)$ is a scalar that represents the probability that $x$ comes from the real data distribution.

In a preferred embodiment, in the step of generating the network fault detection and diagnosis model based on the generative adversarial network by using the sample data obtained after screening, an additional parameter factor is added to the loss functions of the generator and the discriminator.

A term is added to the loss function, wherein $\theta_i$ denotes the parameter at time i, and $\theta_i$ is updated during network training.

In the loss function of the discriminator D: $p_z$ denotes the distribution of the data generated by G; $p_r$ denotes the distribution of the original existing data; the interpolated sample is drawn from the data set composed of real data and generated data, using $\epsilon \sim \mathrm{Uniform}[0, 1]$ for random interpolation on the line connecting a real sample and a generated sample; the gradient term is a penalty item, and the closer the gradient norm of D at the interpolated sample is to 1, the smaller the penalty; u is the penalty parameter; and w is the parameter of the discriminator D.

In the loss function of the generator G: $\theta$ is the generator parameter; $\theta_i$ denotes the parameter at time i and is updated during network training; z denotes noise that follows the real data distribution rule; G is the generator; and D is the discriminator.
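The loss description above (interpolated samples, a penalty that shrinks as a gradient norm approaches 1, penalty parameter u) has the shape of a Wasserstein critic loss with a gradient penalty. The following is a minimal sketch under that assumption. It omits the additional parameter term, whose formula is not given in this text, and the linear critic, the finite-difference gradient, the toy samples, and all function names are hypothetical choices made only for illustration.

```python
import math
import random

def grad_norm(critic, x, h=1e-5):
    """Central-difference estimate of the L2 norm of dD/dx at point x."""
    total = 0.0
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        total += ((critic(up) - critic(down)) / (2 * h)) ** 2
    return math.sqrt(total)

def discriminator_loss(critic, real, fake, u=10.0, rng=random):
    """E[D(fake)] - E[D(real)] + u * E[(||grad D(x_hat)|| - 1)^2].

    Assumes real and fake hold the same number of samples.
    """
    mean_fake = sum(critic(x) for x in fake) / len(fake)
    mean_real = sum(critic(x) for x in real) / len(real)
    penalty = 0.0
    for x, x_tilde in zip(real, fake):
        eps = rng.uniform(0.0, 1.0)
        # Random point on the line between a real and a generated sample.
        x_hat = [eps * a + (1.0 - eps) * b for a, b in zip(x, x_tilde)]
        penalty += (grad_norm(critic, x_hat) - 1.0) ** 2
    return mean_fake - mean_real + u * penalty / len(real)

def generator_loss(critic, fake):
    """The generator tries to raise the critic score of its samples."""
    return -sum(critic(x) for x in fake) / len(fake)

# Hypothetical linear critic D(x) = w . x and toy two-KPI samples.
w = [0.6, 0.8]  # the gradient norm of this critic is exactly 1
critic = lambda x: sum(wi * xi for wi, xi in zip(w, x))
real = [[1.0, 1.0], [2.0, 2.0]]
fake = [[0.0, 0.0], [1.0, 0.0]]
d_loss = discriminator_loss(critic, real, fake, rng=random.Random(0))
g_loss = generator_loss(critic, fake)
```

With the unit-norm critic the penalty vanishes and only the difference of mean critic scores remains; a steeper critic is penalized in proportion to u.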

In a preferred embodiment, the key performance indicators include one of the following or any combination thereof: reference signal received power, reference signal received quality, uplink packet loss rate, downlink packet loss rate, uplink signal-to-noise ratio, downlink signal-to-noise ratio, radio resource control connection establishment success rate, evolved radio access bearer establishment success rate, dropped call rate, handover success rate, uplink average throughput, downlink average throughput, node outgoing average throughput, node incoming average throughput, handover delay, and link error rate.

In a preferred embodiment, the network faults include one of the following or any combination thereof: interference, coverage failure, hardware failure, link failure, and configuration parameter failure.

In a preferred example, in the step of preprocessing the network fault data in the original database and screening out the samples whose sample weight is larger than the preset threshold, the key performance indicators most relevant to the network state are selected by using the feature importance ranking function of the XGBoost framework.

In a preferred embodiment, in the step of preprocessing the network fault data in the original database and screening out the samples whose sample weight is larger than the preset threshold, a model is built with the XGBoost algorithm, and the importance of each key performance indicator is computed and sorted from high to low. Features are then added one at a time in that order, and the classification accuracy of each resulting feature combination is tested; the feature combination with the highest classification accuracy is the optimal one.

In a preferred embodiment, in the step of training and testing the network fault detection and diagnosis model, the sample data obtained after screening is divided into two parts. One part is fitted by the generative adversarial network separately for each network state to obtain labeled virtual data for the different network states, and this virtual data is used as training data for model training. The other part is used as test data for testing the network fault detection and diagnosis model.

The application also discloses a network fault detection and diagnosis system, including:

the correlation module is used for analyzing and measuring key performance indexes of network performance and common network faults and correlating the key performance indexes and the common network faults;

the preprocessing module is used for preprocessing the network fault data in the original database by utilizing the key performance indexes and the correlation relationship of common network faults, screening out samples with sample weight larger than a preset threshold value, and filtering out virtual samples inconsistent with the distribution rule of the original data set;

the generation module is used for generating a network fault detection and diagnosis model based on a generative adversarial network by using the sample data obtained after screening;

the training and testing module is used for training and testing the network fault detection and diagnosis model;

and the detection and diagnosis module is used for detecting and diagnosing network faults by using the trained and tested network fault detection and diagnosis model and outputting a detection and diagnosis result.

The application also discloses a network fault detection and diagnosis system, including:

a memory for storing computer-executable instructions; and

a processor for implementing the steps of any of the above methods when executing the computer-executable instructions.

The present application also discloses a computer-readable storage medium having stored therein computer-executable instructions that, when executed by a processor, implement the steps of any of the methods described above.

In the embodiments of the application, first, the real samples are preprocessed to filter out samples that are easily generated but inconsistent with the distribution of the original data set, which effectively improves the classification accuracy of the classifier; by setting a feature weight threshold, only samples whose weight is larger than the set value are learned. Second, in building the network fault detection and diagnosis model, virtual data are used as training data for model training and real data are used as test data for model testing. Finally, in order to prevent the occurrence of Nash equilibrium when the generative adversarial network architecture generates the virtual data, additional parameter factors are added to the loss functions of the generative model G and the discriminative model D, so that the loss functions converge more stably. As a result, the efficiency of network fault detection and diagnosis is effectively improved, and the service experience of users is safeguarded.

The present invention is not limited to the embodiments described above, which may be implemented in a variety of forms and combinations. Accordingly, the technical features disclosed in the above summary of the invention, the technical features disclosed in the following embodiments and examples, and the technical features disclosed in the drawings may be freely combined with each other to constitute various new technical solutions (which are considered to have been described in the present specification), unless such a combination of technical features is technically infeasible. For example, suppose that one example discloses the features A + B + C and another example discloses the features A + B + D + E, that the features C and D are equivalent technical means performing the same function, of which only one can be used at a time, and that the feature E can technically be combined with the feature C. Then the solution A + B + C + D should not be regarded as described, because it is technically infeasible, whereas the solution A + B + C + E should be regarded as described.

Drawings

FIG. 1 is a diagram of a dense heterogeneous wireless network scenario;

FIG. 2 is a schematic diagram of a network fault detection and diagnosis method according to a first embodiment of the present application;

FIG. 3 is a schematic diagram of a data preprocessing flow of a network fault detection and diagnosis method according to a first embodiment of the present application;

FIG. 4 is a schematic flow chart diagram of a network fault detection and diagnosis method according to a first embodiment of the present application;

FIG. 5 is a schematic structural diagram of a network fault detection and diagnosis system according to a first embodiment of the present application.

Detailed Description

In the following description, numerous technical details are set forth in order to provide a better understanding of the present application. However, it will be understood by those skilled in the art that the technical solutions claimed in the present application may be implemented without these technical details and with various changes and modifications based on the following embodiments.

The following outlines some of the innovative points of the present application:

After extensive research and analysis, the inventors of the application found that, compared with manual network fault detection and diagnosis, network fault diagnosis using an intelligent algorithm has the following advantages. (1) The diagnosis cost is reduced. The structure of a mobile communication network is very complex and network faults occur randomly, so detecting and troubleshooting faults by manual effort alone consumes a large amount of manpower and material resources. Introducing an intelligent algorithm frees up these resources and saves cost. (2) The fault diagnosis efficiency is improved. Detecting and troubleshooting network faults by manual effort alone is inefficient, whereas an intelligent algorithm exploits the computing power of a computer system to make network fault detection and diagnosis more efficient. (3) The fault diagnosis accuracy is improved.

As mobile communication network architectures grow more complex, it is difficult for people to find and locate network faults in time. Manual diagnosis has mainly relied on past engineering experience, which may contain errors. In contrast, an intelligent algorithm can learn, from the collected historical parameters, the patterns that the data exhibit when the network fails. It can also update its parameters automatically, which improves the fault diagnosis accuracy.

With the introduction and application of 5G and 6G networks, the mobile communication network of the future will be a very complicated network environment. In such an environment, the interference faced by the network will become increasingly complex, the devices used in the network will be more diverse, and when the network breaks down or develops a fault, the cause will be harder to identify. Future network fault detection and diagnosis therefore still faces problems that cannot be easily overcome.

Based on these considerations, the application provides a new network fault detection and diagnosis method. Real samples are preprocessed to filter out samples that are easily generated but inconsistent with the distribution of the original data set, which effectively improves the classification accuracy of the classifier. By setting a feature weight threshold, only samples whose weight is larger than the set value are learned. In addition, in order to prevent the occurrence of Nash equilibrium when the generative adversarial network architecture generates virtual data, additional parameter factors are added to the loss functions of the generator G and the discriminator D so that the loss functions converge more stably.

To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.

Application scenarios:

preferably, the embodiment takes the dense heterogeneous wireless network scenario shown in fig. 1 as an example.

In this scenario, the network has diversity, the system is more complex, and network management is more difficult.

A first embodiment of the present application relates to a method for detecting and diagnosing network faults based on the combination of enhanced network convergence and a Wasserstein generative adversarial network (WGAN), the flow of which is shown in FIGS. 3-4. The method comprises the following steps:

step 110: data association

In this step, the key performance indicators (KPIs) that measure network performance and the common network faults are analyzed and associated. This is the preparatory work for constructing a network fault diagnosis model.

A small amount of KPI data under different network states is collected from a heterogeneous wireless network environment, and the different network states are associated with the KPI data.

The advantages are a lower fault false alarm rate, easier troubleshooting, and the ability to extract the more relevant indicators for fault analysis.

Key performance indexes are as follows:

TABLE 1

Table 1 exemplarily lists key performance indicators in the heterogeneous network selected by the present patent, and details are as follows:

(1) Reference signal received power (RSRP) describes the strength of the pilot signal received in the downlink. It is defined as the average downlink received power over the resource elements that carry cell-specific reference signals from the serving cell within the considered bandwidth.

(2) Reference signal received quality (RSRQ) describes the quality of the pilot signal received in the downlink, in dB. It is defined as the ratio between the RSRP and the wideband received signal from all base stations over the carrier bandwidth, plus thermal noise.

(3) Packet loss rate is the ratio of the number of lost packets to the total number of packets transmitted.

(4) Signal-to-noise ratio can be used to measure network performance: the higher the signal-to-noise ratio, the less clutter in the signal and the better the network performance; conversely, the lower the ratio, the worse the network performance.

(5) Radio resource control (RRC) connection establishment success rate is the ratio of the number of successful RRC connection establishments to the total number of connection attempts. Communication services can be carried out only after the RRC connection is successfully established; if it fails, the user cannot establish a normal connection with the network, and the network service is interrupted.

(6) Evolved radio access bearer (E-RAB) establishment success rate is the ratio of the number of successful E-RAB connection establishments to the total number of connection attempts. If the connection fails, the user cannot connect to the network normally, and the quality of the network service is affected.

(7) Dropped call rate (DCR) is the probability that communication is unexpectedly interrupted during mobile communication. Call drops can be caused by weak cell coverage or by interference between networks.

(8) Handover success rate is the ratio of the number of successful handovers to the total number of handover attempts. It describes the ability of the network to let the user keep receiving service and stay connected while moving.

(9) Average throughput is the amount of data downloaded or uploaded per unit time. It is an important performance indicator for operators.

(10) Delay generally refers to the time interval between transmission and reception. When the network exhibits delay, the cause can be examined in terms of network topology, the traffic model in the network, transmission resources, and so on.

(11) Bit error rate measures the accuracy of data transmission over a specified period of time. Bit errors are caused by low transmission quality in the network.
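Several of the indicators listed above are plain ratios of event counters. As a small illustration, the sketch below derives three of them from per-cell counters. The counter names and values are hypothetical and are not taken from the patent.

```python
def rate(part, total):
    """Return part / total, or None when no attempts were recorded."""
    return part / total if total else None

# Hypothetical hourly counters for one cell.
counters = {
    "packets_sent": 1000,
    "packets_lost": 5,
    "rrc_attempts": 200,
    "rrc_successes": 196,
    "handover_attempts": 50,
    "handover_successes": 45,
}

kpis = {
    # Lost packets over all transmitted packets.
    "packet_loss_rate": rate(counters["packets_lost"], counters["packets_sent"]),
    # Successful RRC connection setups over all setup attempts.
    "rrc_setup_success_rate": rate(counters["rrc_successes"], counters["rrc_attempts"]),
    # Successful handovers over all handover attempts.
    "handover_success_rate": rate(counters["handover_successes"], counters["handover_attempts"]),
}
```

Returning `None` for an empty denominator keeps "no attempts in this period" distinct from a genuine rate of zero.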

Common network failures:

It should be noted that, for network fault analysis, the data set required for training the network fault diagnosis model is composed of different network faults. Each fault represents a problem in one cell, and the neighboring cells are also affected by that problem.

In the embodiments of the application, common network faults that arise during network operation are analyzed by way of example:

(1) Interference generally refers to signals that enter a channel or communication system and disturb the normal operation of a legitimate channel. It can cause problems such as dropped connections and connection failures in a mobile communication network and seriously affects normal network operation.

(2) Coverage failure. A common case is a coverage hole, an area in which the signal levels of the serving cell and its neighboring cells are, on average, lower than the level required to maintain service. Coverage holes are often caused by obstacles such as new buildings in urban areas and hills in rural areas.

(3) Hardware failure generally refers to the failure of a base station equipment component, for example a component inside the base station becoming insensitive. When a serious hardware fault occurs, service in the cell is interrupted and the dropped call rate rises sharply.

(4) Link failure generally means that a link is blocked during network operation, or cannot transmit data normally for other reasons, so that the network cannot operate normally.

(5) Parameter configuration failure. At the radio access end, even considering only the base station, many parameters are adjusted on the base station itself; if some important parameters are configured incorrectly or encounter problems during an update, network performance may be degraded.

Correlation of critical metrics with common network failures:

in this step, the key performance indicators and the network failures are associated according to the experience of the troubleshooting specialist, as exemplarily shown in table 2.

TABLE 2

Next, in steps 120-130, the network fault data are preprocessed.

Specifically, the network fault data in the original database are preprocessed as follows: the data are first standardized, and then a feature weight threshold is set so that only samples whose sample weight is larger than the threshold are extracted, which achieves feature screening. Preprocessing the real samples in this way filters out virtual samples inconsistent with the distribution of the original data set and effectively improves the classification accuracy of the classifier.

Wherein step 120 is primarily a normalization; step 130 is mainly feature screening, which mainly uses the XGBoost algorithm to solve the weight of each feature and sort the importance, and screens the features based on the result. The data preprocessing flow of steps 120-130 can be further described with reference to fig. 3-4. The method comprises the following specific steps:

step 120: data normalization (standardization)

In this step, each key performance index is standardized.

For example:

Preferably, k indicators $X_1, X_2, \dots, X_k$ are given, where

$X_i = \{x_1, x_2, \dots, x_n\}$

Preferably, the normalized values of the key performance indicators are $Y_1, Y_2, \dots, Y_k$.
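The standardization formula itself is not reproduced in this text. As a minimal sketch, assuming the common z-score form (subtract the mean of each indicator and divide by its standard deviation), step 120 could look like the following. The z-score choice, the function name, and the sample readings are all assumptions, not the patent's stated method.

```python
import statistics

def standardize(values):
    """Z-score standardization of one KPI series (an assumed form of step 120)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        # A constant KPI carries no information; map it to all zeros.
        return [0.0 for _ in values]
    return [(x - mean) / std for x in values]

# Hypothetical RSRP readings in dBm for one cell.
rsrp = [-95.0, -100.0, -105.0, -90.0, -110.0]
y = standardize(rsrp)
```

After this transformation each indicator has zero mean and unit variance, so indicators with very different units (dBm, ratios, Mbit/s) become comparable.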

Step 130: data screening

In this step, a model is built with the XGBoost algorithm, so that the importance of each key performance indicator can be computed and ranked from high to low. Features are then added one at a time in descending order of importance, and the classification accuracy of each resulting feature combination is tested; the combination with the highest classification accuracy is the optimal feature combination.
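The incremental search described in this step can be sketched as follows. The importance scores are assumed to be already available (for example from a trained XGBoost model) and are hypothetical here, as are the toy samples; the nearest-centroid scorer is only a stand-in so that the example runs without external libraries and is not the classifier used in the patent.

```python
def nearest_centroid_accuracy(rows, labels, features):
    """Toy stand-in classifier: training accuracy of a nearest-centroid rule."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append([row[f] for f in features])
    centroids = {
        label: [sum(col) / len(col) for col in zip(*vecs)]
        for label, vecs in groups.items()
    }

    def predict(row):
        vec = [row[f] for f in features]
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(vec, centroids[c])),
        )

    hits = sum(predict(row) == label for row, label in zip(rows, labels))
    return hits / len(rows)

def select_features(importance, score_fn):
    """Add features in descending importance; keep the best-scoring prefix."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    best_subset, best_score = [], float("-inf")
    for k in range(1, len(ranked) + 1):
        subset = ranked[:k]
        score = score_fn(subset)
        if score > best_score:  # strict comparison: ties keep the smaller subset
            best_subset, best_score = subset, score
    return best_subset, best_score

# Hypothetical normalized KPI samples and importance scores.
rows = [
    {"rsrp": 0.9, "snr": 0.8, "delay": 0.1},
    {"rsrp": 0.8, "snr": 0.9, "delay": 0.9},
    {"rsrp": 0.2, "snr": 0.3, "delay": 0.2},
    {"rsrp": 0.1, "snr": 0.2, "delay": 0.8},
]
labels = ["normal", "normal", "coverage", "coverage"]
importance = {"rsrp": 0.6, "snr": 0.3, "delay": 0.1}

best, accuracy = select_features(
    importance, lambda feats: nearest_centroid_accuracy(rows, labels, feats)
)
```

Keeping the smaller subset on ties is a design choice of this sketch: it favors the simplest feature combination that reaches the best accuracy.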

The XGBoost algorithm:

Specifically, the XGBoost framework can be used to train on the data, and the trained model is then used to predict the network state within a given time period, that is, to label the other collected data whose state is unknown.

Note that another benefit of using XGBoost is that, once the boosted trees have been built, an importance score can be obtained for each attribute. In general, the importance score measures how valuable an attribute is in building the boosted decision trees of the model. The more times an attribute is used to build decision trees in the model, the more important it is.
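The "number of times an attribute is used" notion of importance can be illustrated on a toy ensemble. In the sketch below, trees are plain nested dictionaries invented for this example (not an XGBoost data structure), and the importance of a feature is the number of split nodes that use it.

```python
from collections import Counter

def count_splits(node, counts):
    """Recursively count how often each feature is used as a split."""
    if "feature" not in node:  # leaf node
        return
    counts[node["feature"]] += 1
    count_splits(node["left"], counts)
    count_splits(node["right"], counts)

def split_count_importance(trees):
    """Rank features by the number of splits that use them, highest first."""
    counts = Counter()
    for tree in trees:
        count_splits(tree, counts)
    return counts.most_common()

# Hypothetical three-tree ensemble over three KPIs.
leaf = {"leaf": 0.0}
trees = [
    {"feature": "rsrp", "left": {"feature": "snr", "left": leaf, "right": leaf}, "right": leaf},
    {"feature": "rsrp", "left": leaf, "right": {"feature": "delay", "left": leaf, "right": leaf}},
    {"feature": "rsrp", "left": leaf, "right": leaf},
]
ranking = split_count_importance(trees)
```

Here `rsrp` is used in three splits and ranks first, while `snr` and `delay` are each used once.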

The feature importance ranking function of the XGBoost framework is used for data preprocessing to select the performance indicators that are most relevant to the network state. The algorithm balances the accuracy on the test set against the complexity of the model, thereby achieving efficient and reliable detection and diagnosis of network faults.

For ease of understanding, the XGBoost algorithm is explained further below.

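XGBoost is an improved algorithm based on the gradient boosting decision tree (GBDT), with improvements in computation speed, generalization performance, scalability, and other respects. In each training iteration, the GBDT algorithm first determines the gradient descent direction of the loss function of the previous model, then builds a new decision tree along that direction, and prunes the tree after it has been built. XGBoost adds a regularization term to the loss function in the tree construction stage:

$$F_{obj}^{(m)} = \sum_{i=1}^{n} L\left(y_i, \hat{y}_i^{(m)}\right) + \Omega(f_m)$$

where $L(y_i, \hat{y}_i)$ is the loss function, which measures the difference between the desired value $y_i$ and the predicted value $\hat{y}_i$, that is, the difference between the actual network status label and the predicted network status label.

$\Omega(f_m)$ is the regularization term, defined as

$$\Omega(f_m) = \gamma T + \frac{1}{2} \lambda \sum_{j=1}^{T} w_j^2$$

where $T$ is the number of leaf nodes, $\lambda$ is the regularization parameter, $\gamma$ is the penalty coefficient on the number of leaf nodes, and $w_j$ is the predicted value of the j-th leaf node.

Let $\hat{y}_i^{(m-1)}$ denote the optimal solution of the existing $(m-1)$ trees, so that $\hat{y}_i^{(m)} = \hat{y}_i^{(m-1)} + f_m(x_i)$. The second-order Taylor expansion of the loss function $F_{obj}^{(m)}$ at $\hat{y}_i^{(m-1)}$ is

$$F_{obj}^{(m)} \approx \sum_{i=1}^{n} \left[ L\left(y_i, \hat{y}_i^{(m-1)}\right) + g_i f_m(x_i) + \frac{1}{2} h_i f_m^2(x_i) \right] + \Omega(f_m) = \sum_{j=1}^{T} \left[ \Big( \sum_{i \in I_j} g_i \Big) w_j + \frac{1}{2} \Big( \sum_{i \in I_j} h_i + \lambda \Big) w_j^2 \right] + \gamma T + \text{const}$$

where $g_i = \partial L(y_i, \hat{y}_i^{(m-1)}) / \partial \hat{y}_i^{(m-1)}$ and $h_i = \partial^2 L(y_i, \hat{y}_i^{(m-1)}) / \partial (\hat{y}_i^{(m-1)})^2$ are the first and second derivatives of the loss, the constant collects the terms $L(y_i, \hat{y}_i^{(m-1)})$ that do not depend on the new tree, and $I_j$ is defined as the set of indices of the samples assigned to leaf node j.

Assuming that the structure of the decision tree has been determined, the predicted value at each leaf node is obtained by setting the derivative of the loss function to zero, i.e.

$$w_j^* = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}$$

Substituting these predicted values into the loss function, and dropping the constant term, gives its minimum value:

$$F_{obj}^* = -\frac{1}{2} \sum_{j=1}^{T} \frac{\left( \sum_{i \in I_j} g_i \right)^2}{\sum_{i \in I_j} h_i + \lambda} + \gamma T$$

$F_{obj}^*$ is the final loss function; the smaller its value, the closer the prediction is to the actual result and the better the tree structure.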
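As a worked illustration of the last two steps, the sketch below evaluates the closed-form leaf values and the minimized objective of the standard XGBoost formulation for a fixed tree structure: with $G_j$ and $H_j$ the sums of first and second derivatives over leaf j, $w_j^* = -G_j / (H_j + \lambda)$ and $F_{obj}^* = -\frac{1}{2} \sum_j G_j^2 / (H_j + \lambda) + \gamma T$. The squared-error loss, the sample values, and the leaf assignment are assumptions chosen only to keep the arithmetic easy to follow.

```python
def leaf_weights_and_objective(grad, hess, leaf_of, lam, gamma):
    """Closed-form leaf weights and objective for a fixed tree structure."""
    leaves = sorted(set(leaf_of))
    G = {j: 0.0 for j in leaves}  # sum of first derivatives per leaf
    H = {j: 0.0 for j in leaves}  # sum of second derivatives per leaf
    for g, h, j in zip(grad, hess, leaf_of):
        G[j] += g
        H[j] += h
    weights = {j: -G[j] / (H[j] + lam) for j in leaves}
    objective = (
        -0.5 * sum(G[j] ** 2 / (H[j] + lam) for j in leaves)
        + gamma * len(leaves)
    )
    return weights, objective

# Squared-error loss L = 0.5 * (y - y_hat)^2 gives g = y_hat - y and h = 1.
y = [1.0, 1.0, 0.0, 0.0]
y_hat_prev = [0.5, 0.5, 0.5, 0.5]       # prediction of the first (m-1) trees
grad = [p - t for p, t in zip(y_hat_prev, y)]
hess = [1.0] * len(y)
leaf_of = [0, 0, 1, 1]                   # hypothetical leaf assignment
weights, objective = leaf_weights_and_objective(
    grad, hess, leaf_of, lam=1.0, gamma=0.1
)
```

In this toy case each leaf has $G_j = \mp 1$ and $H_j = 2$, so the leaf values are $\pm 1/3$: the new tree pushes the first two samples up and the last two down, shrunk by the regularization parameter.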

Step 140: model generation

In this step, a network fault detection and diagnosis model is generated based on a generative adversarial network by using the sample data obtained after screening. The discriminator and the generator are trained with the Adam algorithm, whose main idea is to adapt the learning rate of each parameter using first-moment and second-moment estimates of the gradient when updating the parameters. The advantage of the Adam algorithm is that, after bias correction, the learning rate in each iteration stays within a fixed range, so the parameters are relatively stable.
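The Adam update mentioned here can be written out for a single scalar parameter. This is a sketch of the standard Adam rule with its usual default decay rates; the quadratic test function, the learning rate, and the function name are assumptions for illustration and do not come from the patent.

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3).
theta, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    grad = 2.0 * (theta - 3.0)
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.05)

# The very first step has magnitude close to lr, whatever the gradient scale.
first_theta, _, _ = adam_step(0.0, -6.0, 0.0, 0.0, 1, lr=0.05)
```

The first step illustrates the bounded step size: after bias correction the ratio of the two moment estimates is close to the sign of the gradient, so the parameter moves by about `lr` even though the raw gradient is -6.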

Generative adversarial network:

For ease of understanding, the generative adversarial network (GAN) is further explained below.

The GAN, as a typical artificial intelligence method, has shown a good ability to handle complex problems. A GAN comprises two independent deep networks, a generator and a discriminator. The generator receives a random variable that obeys some distribution and is used to capture the distribution of the data. The discriminator outputs 1 or 0 to distinguish a real sample from a generated sample, respectively. During training, the generator produces samples and the discriminator classifies them, so that the two improve each other's performance adversarially. Actual training has some problems, such as training difficulty and a lack of diversity in the generated samples. Here the GAN concept is combined with a typical network fault diagnosis method: starting from a small labeled data set, the GAN is used to obtain a large and reliable labeled data set for training a network fault diagnosis model.

It should be noted that different network states have different characteristics, and the network fault diagnosis model must determine symptoms corresponding to different network states in order to identify multiple faults.

In the embodiments of the present application, $S = [KPI_1, KPI_2, KPI_3, \dots, KPI_m]$ (KPI, key performance indicator) is defined as the input vector for the different network states, that is, S is a vector containing m key performance indicators; $C = \{FC_1, FC_2, FC_3, \dots, FC_n\}$ is defined to denote the network state, such as normal operation or a particular fault.

The input data vector, built from a small sample of data collected from the heterogeneous wireless network environment, consists of all relevant KPIs of the cell under study. KPIs can be collected at different time aggregation levels (e.g., hours, days, weeks, months, etc.) depending on the granularity required for the diagnostic procedure.

If a network fault $FC_i$ occurs during a certain time period $T$, the network state during that period is expressed as

where

refers to the value of the mth KPI at time t.

In the input phase, a particular $KPI_i$ is selected and normalized to ensure that the dynamic ranges of the KPIs are similar. In this system, each KPI is normalized separately based on its maximum value, i.e.

$$\overline{KPI_i} = \frac{KPI_i}{\max(KPI_i)}$$

where $\overline{KPI_i}$ denotes the normalized ith key performance indicator and $\max(KPI_i)$ is the maximum value of the ith key performance indicator in the collected data. Considering only its dynamic range, this converts the specific indicator $KPI_i$ into a normalized value in the range [0, 1]. The goal is to ensure that all variables lie within the desired interval.
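The per-KPI maximum normalization described above can be sketched in Python as follows. This is only an illustrative sketch: the function name `normalize_by_max` and the sample values are hypothetical, and the code assumes non-negative KPI values, since dividing by the maximum only yields values in [0, 1] in that case.

```python
from typing import List


def normalize_by_max(samples: List[List[float]]) -> List[List[float]]:
    """Divide each KPI column by its maximum over the collected data.

    Assumes non-negative KPI values; a column whose maximum is 0 is
    mapped to 0.0 to avoid division by zero (an illustrative choice).
    """
    if not samples:
        return []
    n_kpis = len(samples[0])
    col_max = [max(row[i] for row in samples) for i in range(n_kpis)]
    return [
        [row[i] / col_max[i] if col_max[i] != 0 else 0.0 for i in range(n_kpis)]
        for row in samples
    ]


# Hypothetical samples: each row is one observation of two KPIs.
raw = [[50.0, 0.02], [100.0, 0.04], [75.0, 0.01]]
normalized = normalize_by_max(raw)
```

Each column is scaled independently, so KPIs with very different units end up on a comparable scale.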

The normalized network state is

As a further preference, the generative adversarial network framework is based on the zero-sum game in game theory. The framework must have two competing networks whose objectives are optimized simultaneously. The first network is a generator G, which outputs simulated samples given Gaussian or uniform noise. The second network is a discriminator D, which receives either samples from the true distribution or samples generated by G and attempts to label a given sample as 0 (a sample from the generator distribution) or 1 (a sample from the true data distribution). Through iteration, this competition makes both networks perform their tasks better. In particular, the generator G becomes able to produce realistic samples that can deceive humans. The optimized objective function is

$D(x)$ is a scalar that represents the probability that $x$ comes from the true data distribution rather than being generated from $p_z$.
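For reference, the description of G and D given here matches the standard GAN minimax objective, which can be sketched as follows. The symbol $p_r$ for the real data distribution is borrowed from the loss-function discussion later in this description and is an assumption at this point; $p_z$ is taken to denote the distribution of the input noise.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_r(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator D maximizes this value by assigning high scores to real samples and low scores to generated ones, while the generator G minimizes it by making $D(G(z))$ approach 1.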

It can be understood that the generative model in a generative adversarial network avoids the problem of data that is too complex to compute with directly: it only requires noise that obeys some known distribution, some real data, and two networks capable of approximating functions. Through the continuous game between the generator and the discriminator, once the discriminator tends to be stable, the data produced by the generator for the different network states tends toward the real data distribution.

Preferably, the following Algorithm 1 may be employed:

The Adam algorithm parameters involved in Algorithm 1 are further explained as follows:

where $t$ is the number of update steps; $\theta$ is the parameter to be solved; $f_\theta$ is a stochastic objective function with parameter $\theta$, generally the loss function; $g_t$ is the gradient of the objective function $f_\theta$ with respect to $\theta$; $m_t$ is the expectation of the gradient $g_t$; $v_t$ is the expectation of $g_t^2$; $\hat{m}_t$ is the bias-corrected $m_t$; and $\hat{v}_t$ is the bias-corrected $v_t$.
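The roles of $m_t$, $v_t$ and their bias corrections can be illustrated with a short Python sketch of one Adam update step. This assumes that the optimizer described is the standard Adam algorithm; the function name `adam_step`, the hyperparameter values (`lr`, `beta1`, `beta2`, `eps`) and the toy objective $f(\theta) = \theta^2$ are illustrative assumptions and are not taken from Algorithm 1.

```python
import math
from typing import List, Tuple


def adam_step(
    theta: List[float],
    grad: List[float],
    m: List[float],
    v: List[float],
    t: int,
    lr: float = 0.001,
    beta1: float = 0.9,
    beta2: float = 0.999,
    eps: float = 1e-8,
) -> Tuple[List[float], List[float], List[float]]:
    """One Adam update; t is the 1-based step count."""
    new_theta, new_m, new_v = [], [], []
    for th, g, m_i, v_i in zip(theta, grad, m, v):
        m_i = beta1 * m_i + (1.0 - beta1) * g        # running mean of g_t
        v_i = beta2 * v_i + (1.0 - beta2) * g * g    # running mean of g_t^2
        m_hat = m_i / (1.0 - beta1 ** t)             # bias-corrected m_t
        v_hat = v_i / (1.0 - beta2 ** t)             # bias-corrected v_t
        new_theta.append(th - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_theta, new_m, new_v


# Toy objective f(theta) = theta^2, whose gradient is 2 * theta.
theta0, m0, v0 = [1.0], [0.0], [0.0]
theta_after_one, _, _ = adam_step(theta0, [2.0 * theta0[0]], m0, v0, t=1, lr=0.01)

theta, m, v = theta0, m0, v0
for t in range(1, 2001):
    theta, m, v = adam_step(theta, [2.0 * theta[0]], m, v, t, lr=0.01)
```

At the first step the bias corrections cancel the zero initialization of $m_t$ and $v_t$, so the parameter moves by approximately the learning rate regardless of the gradient scale.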

Parameter factors:

Preferably, in order to prevent the generative adversarial network architecture from reaching an undesirable Nash equilibrium when generating virtual data, an additional parameter factor is added to the loss functions of the generator and the discriminator, so that the loss functions converge more stably.

It should be noted that the Wasserstein generative adversarial network (WGAN) algorithm does not account for the possibility of reaching an undesirable Nash equilibrium state when training the generative adversarial network. A Nash equilibrium refers to a game situation in which no player can improve their own outcome by changing strategy as long as the other players keep their strategies unchanged. Nash proved that a Nash equilibrium must exist if each player can choose only from a finite number of strategies and mixed strategies are allowed.

The inventors of the present application propose combining a method that improves network convergence with the idea of the WGAN algorithm in order to solve the Nash equilibrium problem. The main idea is to add a term to the loss function:

where $\theta_i$ denotes the parameter at time $i$. The parameters are updated during network training; with this term added, the gradient does not easily settle into a stable orbit and instead keeps updating toward the equilibrium point.

At this time, the discriminator loss function is as follows:

where

$p_z$ represents the distribution of data generated by G;

$p_r$ represents the distribution of the original existing data, namely the distribution of the normalized small-sample data in different network states collected in the heterogeneous wireless network environment;

$\hat{x}$ is obtained by sampling from a data set consisting of real data and generated data: using $\epsilon \sim U[0,1]$, it is sampled by random interpolation on the line connecting a real sample and a generated sample;

$(\|\nabla_{\hat{x}} D(\hat{x})\|_2 - 1)^2$ is the penalty term, and the closer $\|\nabla_{\hat{x}} D(\hat{x})\|_2$ is to 1, the smaller the penalty; $u$ is a penalty parameter;

$w$ is the discriminator parameter.
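The random interpolation and the behavior of the penalty term can be illustrated with a small Python sketch. It assumes the penalty has the usual gradient-penalty form, in which the penalty grows as the gradient norm of the discriminator moves away from 1. To stay self-contained without an automatic differentiation library, the sketch uses a hypothetical linear discriminator $D(x) = w \cdot x$, whose gradient with respect to $x$ is simply $w$; the function names and sample values are illustrative assumptions.

```python
import math
import random
from typing import List


def interpolate(real: List[float], fake: List[float], eps: float) -> List[float]:
    """Point on the line between a real and a generated sample, eps in [0, 1]."""
    return [eps * r + (1.0 - eps) * f for r, f in zip(real, fake)]


def gradient_penalty_linear(w: List[float], u: float) -> float:
    """Penalty u * (||grad D|| - 1)^2 for the linear discriminator D(x) = w . x.

    For a linear D the gradient with respect to x equals w at every point,
    so the penalty does not depend on where x_hat was sampled.
    """
    grad_norm = math.sqrt(sum(w_i * w_i for w_i in w))
    return u * (grad_norm - 1.0) ** 2


rng = random.Random(0)
real = [0.8, 0.2, 0.5]   # hypothetical normalized real KPI sample
fake = [0.1, 0.9, 0.4]   # hypothetical generated sample
eps = rng.uniform(0.0, 1.0)
x_hat = interpolate(real, fake, eps)

penalty_unit = gradient_penalty_linear([0.6, 0.8, 0.0], u=10.0)   # norm close to 1
penalty_large = gradient_penalty_linear([3.0, 4.0, 0.0], u=10.0)  # norm equal to 5
```

With a neural-network discriminator, the gradient at `x_hat` would instead be obtained by automatic differentiation, and the penalty would be averaged over a batch of interpolated samples.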

The generator loss function is as follows:

where θ is a generator parameter.

Step 150: model training and testing

In this step, the network fault detection and diagnosis model is trained and tested.

Preferably, the sample data obtained after screening is divided into two parts. One part is used to fit the data under different network states through the generative adversarial network, yielding labeled virtual data for the different network states, and this virtual data is used as training data for model training. The other part is used as test data for testing the network fault detection and diagnosis model.

In other words, in this step, model training is performed using the virtual data as training data, and model construction and testing are performed using the real data as test data.
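The two-way division of the screened samples can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `split_per_state`, the 50/50 split ratio, the per-state (stratified) splitting and the sample data are all hypothetical. The GAN itself is not included; the sketch only shows which real samples would be handed to the GAN for fitting and which would be held back for testing.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

# (normalized KPI vector, network-state label)
Sample = Tuple[List[float], str]


def split_per_state(
    samples: List[Sample], fit_fraction: float = 0.5, seed: int = 0
) -> Tuple[List[Sample], List[Sample]]:
    """Split real samples into a GAN-fitting part and a held-out test part.

    The split is done separately for each network state so that every
    state is represented in the fitting part.
    """
    rng = random.Random(seed)
    by_state: Dict[str, List[Sample]] = defaultdict(list)
    for sample in samples:
        by_state[sample[1]].append(sample)

    fit_part: List[Sample] = []
    test_part: List[Sample] = []
    for state in sorted(by_state):
        group = by_state[state][:]
        rng.shuffle(group)
        cut = max(1, int(len(group) * fit_fraction))
        fit_part.extend(group[:cut])
        test_part.extend(group[cut:])
    return fit_part, test_part


# Hypothetical screened samples for two network states.
samples: List[Sample] = (
    [([0.1 * i, 0.5], "normal") for i in range(4)]
    + [([0.9, 0.1 * i], "interference") for i in range(4)]
)
gan_fit_set, real_test_set = split_per_state(samples)
```

Under this scheme the classifier would be trained only on the labeled virtual data produced from `gan_fit_set`, while `real_test_set` remains untouched real data for evaluation.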

Step 160: network fault detection and diagnosis

In this step, the trained and tested network fault detection and diagnosis model is used to detect and diagnose the network fault, and the detection and diagnosis result is output.

The technical effects are as follows:

According to the network fault detection and diagnosis method of this embodiment, first, the real samples are preprocessed to filter out samples that are easily generated but inconsistent with the distribution of the original data set, which effectively improves the classification accuracy of the classifier; by setting a feature weight threshold, only samples whose weights exceed the set value are learned. Second, in building the network fault detection and diagnosis model, the virtual data is used as training data for model training and the real data is used as test data for model construction and testing. Finally, in order to prevent an undesirable Nash equilibrium from occurring when the generative adversarial network architecture generates virtual data, additional parameter factors are added to the loss functions of the generative model G and the discriminant model D, so that the loss functions converge more stably.

A second embodiment of the present application relates to a network fault detection and diagnosis system, the structure of which is shown in fig. 5, and the network fault detection and diagnosis system includes:

the preprocessing module is used for preprocessing the network fault data in the original database, screening out samples with sample weights larger than a preset threshold value, and filtering out virtual samples inconsistent with the original data set distribution rule;

The first embodiment is a method embodiment corresponding to the present embodiment, and the technical details in the first embodiment may be applied to the present embodiment, and the technical details in the present embodiment may also be applied to the first embodiment.

It should be noted that, as will be understood by those skilled in the art, the implementation functions of the modules shown in the embodiment of the network fault detection and diagnosis system can be understood by referring to the related description of the network fault detection and diagnosis method. The functions of the modules shown in the embodiments of the network fault detection and diagnosis system described above may be implemented by a program (executable instructions) running on a processor, or may be implemented by specific logic circuits. The network fault detection and diagnosis system in the embodiment of the present application, if implemented in the form of a software functional module and sold or used as an independent product, may also be stored in a computer readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read Only Memory (ROM), a magnetic disk, or an optical disk. Thus, embodiments of the present application are not limited to any specific combination of hardware and software.

Accordingly, the present application also provides a computer storage medium, in which computer executable instructions are stored, and when executed by a processor, the computer executable instructions implement the method embodiments of the present application.

In addition, the embodiment of the present application further provides a network fault detection and diagnosis system, which includes a memory for storing computer executable instructions, and a processor; the processor is configured to implement the steps of the method embodiments described above when executing the computer-executable instructions in the memory. The Processor may be a Central Processing Unit (CPU), other general-purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), or the like. The aforementioned memory may be a read-only memory (ROM), a Random Access Memory (RAM), a Flash memory (Flash), a hard disk, a solid state disk, or the like. The steps of the method disclosed in the embodiments of the present invention may be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in the processor.

It is noted that, in the present patent application, relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between those entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprises a" does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element. In the present patent application, if it is mentioned that a certain action is executed according to a certain element, it means that the action is executed according to at least that element, which covers two cases: performing the action based only on the element, and performing the action based on the element and other elements. Expressions such as "a plurality of", "multiple times", and "multiple kinds" cover 2 and more than 2 items, times, or kinds.

All documents mentioned in this application are considered to be incorporated in their entirety into the disclosure of the present application, so that they may serve as a basis for amendment when necessary. Further, it is understood that those skilled in the art may make various changes or modifications to the present application after reading the above disclosure, and such equivalents also fall within the scope claimed by the present application.

Claims

1. A network fault detection and diagnosis method is characterized by comprising the following steps:

analyzing and correlating key performance indexes for measuring the network performance and common network faults;

training and testing the network fault detection and diagnosis model;

2. The method of claim 1, wherein the step of preprocessing the network fault data in the original database comprises a data normalization process, wherein $k$ indicators $X_1, X_2, \ldots, X_k$ are given, $k$ being an integer greater than 1 and $X_i = \{x_1, x_2, \ldots, x_n\}$, and each key performance indicator is normalized to obtain the values $Y_1, Y_2, \ldots, Y_k$, wherein:

wherein $X_i$ represents the data set of the ith indicator, $X_{ij}$ represents the jth data item in the ith indicator, $Y_i$ represents the normalized indicator data, and $Y_{ij}$ represents the jth data item in the normalized ith indicator.

3. The method of claim 1, wherein in the step of generating a network fault detection and diagnosis model based on a generative countermeasure network using sample data obtained after the filtering,

4. The method of claim 1, wherein in the step of generating the network fault detection and diagnosis model based on the generative adversarial network by using the sample data obtained after screening, the generative adversarial network framework is based on the zero-sum game in game theory, the framework has two mutually competing networks, namely a first network and a second network, whose objectives are optimized simultaneously, the first network is a generator G, the second network is a discriminator D, and the optimized objective function is:

The distribution of (a); $p_z$ represents the distribution obeyed by the input noise; $G(z)$ represents a mapping to the data space; $G$ represents a differentiable function represented by a multi-layer perceptron;

$D(x)$ is a scalar that represents the probability that $x$ comes from the real data distribution.

5. The method of claim 1, wherein in the step of generating a network fault detection and diagnosis model based on a generative confrontation network using the sample data obtained after screening, an additional parameter factor is added to a loss function of the generator and the discriminator.

6. The method of claim 1, wherein in the step of generating a network fault detection and diagnosis model based on a generative countermeasure network using sample data obtained after the filtering,

adding a term to the loss function:

the loss function of the discriminator D is:

wherein

$p_z$ represents the distribution of data generated by G;

$p_r$ represents the distribution of the original existing data;

$\hat{x}$ is obtained by random interpolation sampling on the line connecting a real sample and a generated sample;

$(\|\nabla_{\hat{x}} D(\hat{x})\|_2 - 1)^2$ is the penalty term, and the closer $\|\nabla_{\hat{x}} D(\hat{x})\|_2$ is to 1, the smaller the penalty; $u$ is a penalty parameter;

$w$ is a parameter of the discriminator D;

and, the loss function of generator G is:

7. The method of claim 1, wherein the key performance indicators comprise one or any combination of: reference signal received power, reference signal received quality, uplink packet loss rate, downlink packet loss rate, uplink signal-to-noise ratio, downlink signal-to-noise ratio, radio resource control connection establishment success rate, evolved radio access bearer establishment success rate, call drop rate, handover success rate, uplink average throughput, downlink average throughput, node outgoing average throughput, node incoming average throughput, handover delay, and link error rate.

8. The method of claim 1, wherein the network failure comprises one or any combination of: interference, coverage failure, hardware failure, link failure, configuration parameter failure.

9. The method as claimed in claim 1, wherein in the step of screening out samples with sample weights greater than a preset threshold value by preprocessing the network fault data in the original database, the most relevant key performance indicators affecting the state of the network are selected by using the feature importance ranking function of the XGBoost framework.

10. The method as claimed in claim 9, wherein in the step of preprocessing the network fault data in the original database and screening out samples with sample weights greater than a preset threshold value, the XGBoost algorithm is used to build a model, compute the importance of each key performance indicator, and sort the indicators from high to low importance; the number of features in the feature combination is then increased one at a time in that order, and the classification accuracy of feature combinations of different sizes is tested, wherein the feature combination with the highest classification accuracy is taken as optimal.

11. The method according to claim 1, wherein in the step of training and testing the network fault detection and diagnosis model, the sample data obtained after screening is divided into two parts, wherein one part of the sample data is used to fit the data under different network states through the generative adversarial network to obtain labeled virtual data under the different network states, and the virtual data is used as training data for model training; and the other part of the sample data is used as test data for testing the network fault detection and diagnosis model.

12. A network fault detection and diagnosis system, comprising:

the preprocessing module is used for preprocessing the network fault data in the original database by utilizing the incidence relation between the key performance indexes and the common network faults, screening out samples with sample weight larger than a preset threshold value, and filtering out virtual samples inconsistent with the distribution rule of the original data set;

13. A network fault detection and diagnosis system, comprising:

a processor for implementing the steps in the method of any one of claims 1 to 11 when executing the computer-executable instructions.

14. A computer-readable storage medium having stored thereon computer-executable instructions which, when executed by a processor, implement the steps in the method of any one of claims 1 to 11.