CN115705279A - Intelligent fault early warning method and device based on index data

Info

Publication number: CN115705279A
Application number: CN202110912497.4A
Authority: CN
Inventors: 吴杰; 陈豪; 许乐静; 管益文; 赵春阳
Original assignee: China Mobile Communications Group Co Ltd; China Mobile Group Zhejiang Co Ltd
Current assignee: China Mobile Communications Group Co Ltd; China Mobile Group Zhejiang Co Ltd
Priority date: 2021-08-10
Filing date: 2021-08-10
Publication date: 2023-02-17

Abstract

The invention discloses an intelligent fault early warning method and device based on index data, wherein the method comprises the following steps: extracting historical index data of the cloud platform, preprocessing the historical index data, and then training to obtain a service fault prediction model; training historical index data through a K nearest neighbor classifier to obtain an index fault prediction model; acquiring current index data in a cloud platform, inputting the current index data into a service fault prediction model for prediction, and outputting index data of a fault service; inputting index data of the fault service into an index fault prediction model for prediction to obtain a prediction result; and determining whether the current index data has faults according to the prediction result. According to the method, the service fault prediction model is built through historical index data, the index fault prediction model is built through the K nearest neighbor classifier, the service fault prediction and the index fault prediction are combined to achieve fault early warning on the cloud platform, and the fault prediction is more flexible.

Description

Intelligent fault early warning method and device based on index data

Technical Field

The invention relates to the technical field of data management, in particular to an intelligent fault early warning method and device based on index data.

Background

System fault early warning means using machine algorithms to predict faults quickly and accurately and to locate the cause when a fault occurs, thereby reducing risk. A conventional system fault early warning method in the prior art comprises the following steps: collecting various index data of the system, including CPU, memory, disk, and fault conditions; taking part of the data as samples and training an anomaly detection model with a machine learning algorithm; feeding the remaining data into the model for verification and tuning; and feeding real data into the fault early warning model to judge whether an abnormal index exists and, if so, raising an alarm. The index data analyzed by this traditional method is simple, and the method does not need to consider the system architecture or product characteristics, so it is widely applicable. However, it has the following problems: because the analyzed index data is too limited, prediction accuracy is low and false alarms occur, which affects the judgment of operation and maintenance personnel; and when the method is applied to a relatively complex cloud platform, training on and analyzing massive index data places high demands on computational accuracy and speed and puts an excessive load on the cloud platform.

Disclosure of Invention

In view of the above, the present invention is proposed to provide an intelligent fault pre-warning method and apparatus based on index data, which overcomes or at least partially solves the above problems.

According to one aspect of the invention, an intelligent fault early warning method based on index data is provided, which comprises the following steps:

extracting historical index data of a cloud platform, preprocessing the historical index data, and then training to obtain a service fault prediction model; training the historical index data through a K nearest neighbor classifier to obtain an index fault prediction model;

acquiring current index data in a cloud platform, inputting the current index data into the service fault prediction model for prediction, and outputting index data of a fault service;

inputting the index data of the fault service into the index fault prediction model for prediction to obtain a prediction result;

determining whether the current index data has a fault according to a prediction result;

the historical index data and the current index data respectively comprise multi-dimensional service log data.

According to another aspect of the present invention, there is provided an intelligent fault early warning device based on index data, including:

the model training module is used for extracting historical index data of the cloud platform, preprocessing the historical index data and then obtaining a service fault prediction model through training; training the historical index data through a K nearest neighbor classifier to obtain an index fault prediction model;

the prediction module is used for acquiring current index data in the cloud platform, inputting the current index data into the service fault prediction model for prediction, and outputting index data of a fault service; inputting the index data of the fault service into the index fault prediction model for prediction to obtain a prediction result;

the processing module is used for determining whether the current index data has a fault according to the prediction result.

According to yet another aspect of the present invention, there is provided a computing device comprising a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another through the communication bus;

the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the intelligent fault early warning method based on the index data.

According to still another aspect of the present invention, a computer storage medium is provided, where at least one executable instruction is stored in the storage medium, and the executable instruction causes a processor to perform operations corresponding to the above intelligent fault early warning method based on index data.

According to the intelligent fault early warning method and device based on the index data, historical index data of a cloud platform are extracted, preprocessed and trained to obtain a service fault prediction model; training historical index data through a K nearest neighbor classifier to obtain an index fault prediction model; acquiring current index data in a cloud platform, inputting the current index data into a service fault prediction model for prediction, and outputting index data of a fault service; inputting the index data of the fault service into an index fault prediction model for prediction to obtain a prediction result; and determining whether the current index data has a fault according to the prediction result. According to the method, the service fault prediction model is built through the historical index data, the index fault prediction model is built through the K nearest neighbor classifier, the service fault prediction and the index fault prediction are combined to realize fault early warning on the cloud platform, and the fault prediction is more flexible.

The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.

Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:

fig. 1 shows a flowchart of an intelligent fault early warning method based on index data according to an embodiment of the present invention;

fig. 2 shows a schematic structural diagram of an intelligent fault early warning device based on index data according to an embodiment of the present invention;

fig. 3 is a schematic structural diagram of a computing device provided by an embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.

Fig. 1 shows a flowchart of an embodiment of an intelligent fault early warning method based on index data, and as shown in fig. 1, the method includes the following steps:

step S110: extracting historical index data of the cloud platform, preprocessing the historical index data, and then training to obtain a service fault prediction model; and training the historical index data through a K nearest neighbor classifier to obtain an index fault prediction model.

In this embodiment, the cloud platform product services from which data is collected include cloud computing, storage, database, and middleware services. Historical data related to these product services is collected as the historical index data, which further includes other hardware data of the cloud platform, such as host performance data, alarm data, and fault log data (e.g., fault-handling logs of physical machines and virtual machines).

In an optional manner, step S110 further includes: carrying out normalization processing on the historical index data according to a preset rule to obtain a historical data set; performing data sampling on the historical data set, performing feature extraction on the sampled data to obtain first feature data, and inputting the first feature data into a LightGBM model to obtain second feature data; and training the first characteristic data and the second characteristic data through a logistic regression algorithm model to obtain a service fault prediction model.

Specifically, historical index data from the last month or week can be collected according to actual needs. The historical index data is preprocessed as follows: the data is inspected to decide how to handle missing values (for example, retaining or discarding the record), and data useful for constructing the fault prediction model should generally be retained; the historical index data is then normalized into a historical data set with a uniform format so that it can be fed into the model for training.
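The preprocessing described above can be sketched as follows. This is a minimal illustration only: the text does not specify the "preset rule" for normalization, so min-max scaling is assumed here, and the function names `drop_incomplete` and `min_max_normalize` as well as the sample rows are hypothetical.

```python
from typing import List, Optional

Row = List[Optional[float]]

def drop_incomplete(rows: List[Row]) -> List[List[float]]:
    """One possible missing-data policy: discard any row with a missing value."""
    return [list(row) for row in rows if all(v is not None for v in row)]

def min_max_normalize(rows: List[List[float]]) -> List[List[float]]:
    """Scale every column to [0, 1]; a constant column maps to 0.0 (assumed rule)."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

# Hypothetical raw samples: CPU usage, disk IOPS, number of system processes.
raw = [
    [0.08, 1000.0, 280.0],
    [0.40, None, 2500.0],      # missing disk IOPS, so this row is discarded
    [0.99, 10000.0, 5000.0],
    [0.40, 5.0, 2500.0],
]
history = min_max_normalize(drop_incomplete(raw))
```

Whether incomplete records are dropped or imputed is a policy decision; the text only states that data useful for model construction should generally be retained.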

Further, data sampling is performed on the historical data set, feature extraction is performed on the sampled data to obtain first feature data, and the first feature data is input into a Light Gradient Boosting Machine (LightGBM) model to obtain second feature data. Specifically, to improve learning efficiency during model construction without losing prediction accuracy, a histogram algorithm and the Gradient-based One-Side Sampling (GOSS) algorithm are used to sample the historical data set by class; that is, the historical index data is classified according to its gradient distribution, and different sampling rates and weights are set. At the same time, the Exclusive Feature Bundling (EFB) algorithm is used to reduce the dimensionality of the historical data set, and the collected service log data of each dimension is processed into one piece of first feature data. The LightGBM model uses the histogram algorithm to find the optimal split node: it discretizes the feature values of each sample by bucketing, assigning values within a certain range to the same bucket (bin), so that continuous floating-point features are discretized into k discrete values; it then constructs a histogram of width k that replaces the original data; finally, it traverses the data through the constructed histogram, accumulating the gradients, sample counts, and so on in each bin to find the optimal split node. Because the data does not need to be traversed one by one, the amount of computation is significantly reduced and the training speed is improved. The first feature data is input into the LightGBM model to obtain the second feature data, and the first feature data and the second feature data are then used to train a Logistic Regression (LR) model. The LR model maps the output of a linear function into the interval (0, 1) through the Sigmoid function, estimates the probability that a fault event occurs, and classifies the event accordingly; it can therefore be used to study the relationship between the feature data (performance index data) and whether a fault occurs.
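The bucketing step of the histogram algorithm can be illustrated with a small sketch. It assumes equal-width bins, which is a simplification (the actual LightGBM binning strategy is more elaborate), and the gradient values are made up for illustration; `to_bins` and `build_histogram` are hypothetical helper names.

```python
from typing import List, Tuple

def to_bins(values: List[float], k: int) -> List[int]:
    """Map each continuous value to one of k equal-width buckets (0 .. k-1)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0          # guard against a constant feature
    return [min(int((v - lo) / width), k - 1) for v in values]

def build_histogram(bins: List[int], gradients: List[float],
                    k: int) -> List[Tuple[float, int]]:
    """Accumulate (gradient sum, sample count) per bucket."""
    sums = [0.0] * k
    counts = [0] * k
    for b, g in zip(bins, gradients):
        sums[b] += g
        counts[b] += 1
    return list(zip(sums, counts))

cpu_usage = [0.08, 0.40, 0.55, 0.90, 0.99]
gradients = [-0.1, -0.2, 0.1, 0.6, 0.7]   # illustrative values only
bins = to_bins(cpu_usage, k=4)
hist = build_histogram(bins, gradients, k=4)
```

A split search then only needs to scan the k bucket boundaries using these per-bin statistics instead of every individual sample.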

It should be particularly noted that a one-side gradient sampling strategy is adopted when sampling the historical data set. Each historical data instance has a different gradient and, by the definition of information gain, instances with large gradients have a greater influence on the information gain. Therefore, during sampling, samples with large gradients are retained (whether a gradient is large is determined by a preset threshold or a top percentile), samples with small gradients are randomly removed, and the information gain is calculated using only the remaining samples. This better approximates the distribution of the historical data and improves the generalization ability of the service fault early warning model. The EFB algorithm is used to reduce the dimensionality of the historical data set: when feature data is collected, the features of the collected service log data of each dimension are sorted by their number of non-zero values, the conflict ratio between different features is calculated, and features that are rarely non-zero at the same time (where one is zero, the other is non-zero) are bundled together into low-dimensional dense features. This effectively avoids unnecessary computation on zero-valued features and improves computational efficiency.
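A minimal sketch of the one-side gradient sampling idea follows. The rates `top_rate` and `other_rate`, the fixed seed, and the function name `goss_sample` are assumptions chosen for illustration; the re-weighting factor (1 - top_rate) / other_rate for the randomly kept small-gradient samples follows the usual GOSS formulation and is one way to realize the "different sampling rates and weights" mentioned in the text.

```python
import random
from typing import List, Tuple

def goss_sample(gradients: List[float], top_rate: float = 0.2,
                other_rate: float = 0.1, seed: int = 0) -> List[Tuple[int, float]]:
    """Return (sample index, weight) pairs selected by one-side gradient sampling."""
    n = len(gradients)
    # Rank samples by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    top_n = int(n * top_rate)
    rand_n = int(n * other_rate)
    top, rest = order[:top_n], order[top_n:]
    # Keep every large-gradient sample; randomly keep a few small-gradient ones.
    rng = random.Random(seed)
    sampled = rng.sample(rest, min(rand_n, len(rest)))
    # Up-weight the small-gradient samples to compensate for the ones removed.
    weight = (1.0 - top_rate) / other_rate
    return [(i, 1.0) for i in top] + [(i, weight) for i in sampled]

grads = [0.05 * i for i in range(20)]     # illustrative gradients
picked = goss_sample(grads)
```

With 20 samples and the default rates, the four largest-gradient samples are always kept and two of the remaining sixteen are drawn at random.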

First, the LightGBM model is trained on the constructed first feature data in order to construct new feature data, namely the second feature data; this yields a binary classifier, and grid search is used to find the optimal parameter combination. The loss function used for the binary classification problem in this embodiment is the logarithmic loss function. When the trained LightGBM model is used to construct the second feature data, each element of the feature vector of the second feature data takes the value 0 or 1 and corresponds to a leaf node of a tree in the LightGBM model. When a sample is predicted with the trees learned by the LightGBM model, the sample finally falls on one leaf node of each tree, so the element corresponding to that leaf node in the new feature vector is 1 and the elements corresponding to the other leaf nodes of that tree are 0. The length of the feature vector of the second feature data equals the total number of leaf nodes in all trees of the GBDT model. The first feature data and the second feature data are input into the LR algorithm model to train the final classifier, and the loss function of the LR algorithm model is given by formula (1):

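$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\Bigl[\,y_i \log h_\theta(x_i) + (1 - y_i)\log\bigl(1 - h_\theta(x_i)\bigr)\Bigr] \tag{1}$$

where $y_i$ denotes the label corresponding to sample $i$, $x_i$ denotes the value of sample $i$, $h_\theta(x_i)$ denotes the predicted probability for sample $i$, and $m$ denotes the number of samples.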
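As an illustration of how such a loss is evaluated, the sketch below computes the standard binary logarithmic loss, averaged over m samples, using the symbols defined above. The standard form of the loss is assumed here, the parameter vectors are arbitrary, and the two-feature samples reuse the X1 and X2 values of the sample rows in Table 1; `hypothesis` and `log_loss` are hypothetical names.

```python
import math
from typing import List

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(theta: List[float], x: List[float]) -> float:
    """h_theta(x): theta[0] is the bias term, theta[1:] are the feature weights."""
    z = theta[0] + sum(w * v for w, v in zip(theta[1:], x))
    return sigmoid(z)

def log_loss(theta: List[float], xs: List[List[float]], ys: List[int]) -> float:
    """Average negative log-likelihood over the m samples."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        h = hypothesis(theta, x)
        total += y * math.log(h) + (1 - y) * math.log(1 - h)
    return -total / m

xs = [[0.08, 0.40], [0.40, 0.80], [0.99, 0.99]]   # X1, X2 of three samples
ys = [0, 0, 1]                                     # 0 = normal, 1 = fault
untrained = log_loss([0.0, 0.0, 0.0], xs, ys)      # every h is 0.5
better = log_loss([-4.0, 3.0, 3.0], xs, ys)        # arbitrary better-fitting theta
```

With all parameters at zero every predicted probability is 0.5, so the loss equals ln 2; a parameter vector that separates the fault sample from the normal ones gives a lower value.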

And predicting the acquired current index data through the trained service fault prediction model, and judging whether a fault occurs.

In an optional mode, the historical index data includes first historical index data of a normal service and second historical index data of a fault service, a service label of the first historical index data is a first label, and a service label of the second historical index data is a second label; step S110 further includes: training the first historical index data and the second historical index data through a K nearest neighbor classifier to obtain the distribution condition of each historical index data in a normal category and a fault category and the Euclidean distance between any two historical index data belonging to the same category; and adjusting parameters of the K nearest neighbor classifier according to the distribution condition, the Euclidean distance and the service labels corresponding to the historical index data to obtain a trained index fault prediction model.

In this embodiment, the stacking idea is used to fuse the LightGBM model and the LR algorithm model, converting the prediction problem into a classification problem, and the historical index data is classified into first historical index data of normal services and second historical index data of fault services, where the service label of the first historical index data is a first label and the service label of the second historical index data is a second label. Further, the first and second historical index data are trained through a K-nearest neighbor (KNN) classifier to obtain the distribution of each piece of historical index data across the normal category and the fault category, and the Euclidean distance between any two pieces of historical index data belonging to the same category. The basic idea of the stacking model is as follows. Assume there are 1000 training samples and 100 test samples, and the training set is divided into 5 folds (5 is typical) of 200 samples each. A model is trained on four of the folds, i.e. 800 training samples, and is then used to predict the remaining 200 training samples and also the 100 test samples. After 5 rounds of training, the training set has received exactly 200 × 5 predictions, one for each original training sample, which are combined into one column, i.e. a 1000 × 1 matrix; the test set has received 100 × 5 predictions, which are averaged over the 5 rounds into a 100 × 1 matrix. This completes the first-layer task. Other models are then tried in the same way, and the results obtained from the different models are combined column-wise: with 3 base models, a 1000 × 3 matrix and a 100 × 3 matrix are obtained. These are used as the training set and test set of the second-layer model, with the original training labels as the second-layer training labels, and the second-layer model is trained to predict the result.
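The first-layer procedure for one base model can be sketched as below, using the 1000 / 100 / 5-fold numbers from the description. The base model here is a deliberately trivial stand-in (it always predicts the mean training label) so that the example stays self-contained; `fit_mean_model`, `out_of_fold`, and the generated data are hypothetical.

```python
from typing import Callable, List, Tuple

Model = Callable[[List[float]], float]

def fit_mean_model(xs: List[List[float]], ys: List[int]) -> Model:
    """Stand-in base model: always predicts the mean label of its training folds."""
    mean = sum(ys) / len(ys)
    return lambda x: mean

def out_of_fold(train_x, train_y, test_x, n_folds: int = 5,
                fit=fit_mean_model) -> Tuple[List[float], List[float]]:
    """Return one training column (out-of-fold predictions) and one test column."""
    n = len(train_x)
    fold_size = n // n_folds
    oof = [0.0] * n
    test_sum = [0.0] * len(test_x)
    for k in range(n_folds):
        start = k * fold_size
        stop = n if k == n_folds - 1 else start + fold_size
        # Train on the other four folds (800 samples in the 1000-sample case).
        model = fit(train_x[:start] + train_x[stop:],
                    train_y[:start] + train_y[stop:])
        # Predict the held-out fold (200 samples) and the whole test set.
        for i in range(start, stop):
            oof[i] = model(train_x[i])
        for j, x in enumerate(test_x):
            test_sum[j] += model(x)
    # Average the five test predictions into a single column.
    return oof, [s / n_folds for s in test_sum]

train_x = [[i / 1000.0] for i in range(1000)]
train_y = [i % 2 for i in range(1000)]
test_x = [[i / 100.0] for i in range(100)]
train_col, test_col = out_of_fold(train_x, train_y, test_x)
```

Repeating this for each base model and placing the resulting columns side by side gives the 1000 × 3 and 100 × 3 matrices that the second-layer model consumes.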

The Euclidean distance calculation formula between any two pieces of historical index data belonging to the same category is as the following formula (2):

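$$d(x, y) = \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2} \tag{2}$$

where $x$ and $y$ represent the values of two different pieces of historical index data belonging to the same category (normal category or fault category), $x_k$ and $y_k$ are their $k$-th components, $n$ is the number of index dimensions, and $d(x, y)$ represents the distance between the two values.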
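For reference, a short sketch of the Euclidean distance between two index vectors; the two example vectors reuse the X1 to X4 values of the first two sample rows of Table 1, and the helper name `euclidean` is hypothetical.

```python
import math
from typing import Sequence

def euclidean(x: Sequence[float], y: Sequence[float]) -> float:
    """Square root of the summed squared component differences."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

sample_a = [0.08, 0.40, 0.53, 0.75]   # X1..X4 of the first sample row
sample_b = [0.40, 0.80, 0.77, 0.50]   # X1..X4 of the second sample row
distance = euclidean(sample_a, sample_b)
```

Because the raw indexes in Table 1 span very different ranges (for example usage ratios versus IOPS counts), distances are only meaningful after the normalization step described earlier.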

And adjusting parameters of the K nearest neighbor classifier according to the distribution condition, the Euclidean distance and the service labels corresponding to the historical index data to obtain a trained index fault prediction model.

Step S120: acquiring current index data in a cloud platform, inputting the current index data into a service fault prediction model for prediction, and outputting the index data of a fault service; and inputting the index data of the fault service into an index fault prediction model for prediction to obtain a prediction result.

In an optional manner, step S120 further includes: inputting the index data of the fault service into an index fault prediction model for prediction, and calculating Euclidean distances between the index data of the fault service and each historical index data; and determining a prediction result according to the service label of the historical index data with the minimum Euclidean distance.
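The nearest-record lookup described in this step can be sketched as follows. The labeled history reuses the X1 to X4 values and labels of the three sample rows in Table 1, while the query vector and the names `predict_label` and `has_fault` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Record = Tuple[List[float], int]   # (index values, service label)

def euclidean(x: Sequence[float], y: Sequence[float]) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def predict_label(query: Sequence[float], history: List[Record]) -> int:
    """Return the service label of the historical record closest to the query."""
    nearest = min(history, key=lambda rec: euclidean(query, rec[0]))
    return nearest[1]

def has_fault(label: int) -> bool:
    """Label 1 marks a fault service, label 0 a normal service."""
    return label == 1

history = [
    ([0.08, 0.40, 0.53, 0.75], 0),
    ([0.40, 0.80, 0.77, 0.50], 0),
    ([0.99, 0.99, 0.90, 0.88], 1),
]
query = [0.95, 0.90, 0.85, 0.80]   # hypothetical current index data
label = predict_label(query, history)
```

This corresponds to a K-nearest-neighbor classifier with K = 1; a larger K with majority voting would be a natural variation, but the text only describes taking the label of the single closest record.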

Step S130: and determining whether the current index data has faults according to the prediction result.

In an optional manner, step S130 further includes: if the service label in the prediction result is a label corresponding to the normal service, determining that the current index data does not have a fault; and if the service label in the prediction result is the label corresponding to the fault service, determining that the current index data has a fault.

Using the method of this embodiment, a batch of MySQL products is used for sample training through steps S110 to S130 to obtain a service fault prediction model and an index fault prediction model. Table 1 shows an example of the collected historical index data, where X1-X17 are 17 indexes of the cloud platform MySQL product service; each column is one index and each row is one sample of index values. X1-X17 are: CPU usage (%), CPU load (average), memory usage (%), disk space usage (%), disk IOPS (operations per second), disk throughput (MB/s), network rate (Mb/s), bandwidth utilization (%), number of system processes, CPU temperature (°C), fan speed (rpm), MySQL query throughput (queries per second), MySQL persistent connection utilization (%), MySQL query cache space usage (%), MySQL query cache hit rate (%), number of MySQL cached queries, and MySQL index cache hit rate (%). The data includes first historical index data of normal services and second historical index data of fault services; in the sample data of Table 1, a label of 0 represents a normal service and a label of 1 represents a fault service.

X1	X2	X3	X4	X5	X6	X7	X8	X9	X10	X11	X12	X13	X14	X15	X16	X17	label
0.08	0.4	0.53	0.75	1000	1	3	0.3	280	50	2300	5000	0.33	0.34	0.15	200000	0.5	0
0.4	0.8	0.77	0.5	5	50	4	0.5	2500	70	3500	8000	0.65	0.67	0.37	333456	0.69	0
…	…	…	…	…	…	…	…	…	…	…	…	…	…	…	…	…	…
0.99	0.99	0.9	0.88	10000	500	9	3.2	5000	80	5000	20000	0.9	0.9	0.7	555213	0.89	1

Table 1: Example historical index data samples

Training on the sample data in Table 1 with the LightGBM model proceeds as follows. The trained LightGBM model converts each sample into a sparse feature vector in which the element corresponding to the leaf node that the sample reaches in each tree is 1 and the elements corresponding to the other leaf nodes of that tree are 0. If the trained LightGBM model has n weak classifiers and m leaf nodes in total, each piece of first feature data is converted into a 1 × m sparse vector in which n elements are 1 and the remaining m − n elements are 0, giving second feature data of the form [0, 0, 0, …, 1, 0, 0, …]. The second feature data in this form is fed into the LR algorithm model, which calculates the probability that each sample corresponds to each of the labels [0, 1]; the closer the value for label 1 is to 1, the more likely the sample is a fault cause. The output is an array such as [0.04403279, 0.95596721], and the two values in the array are compared to see which is closer to 1, thereby determining whether the sample is a fault cause.
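The conversion into a sparse leaf vector can be sketched without any library, given the index of the leaf that a sample reaches in each tree. The tree sizes and leaf indices below are hypothetical; `encode_leaves` is an illustrative helper, and the probability pair is the example array quoted in the text.

```python
from typing import List

def encode_leaves(leaf_indices: List[int], leaves_per_tree: List[int]) -> List[int]:
    """One-hot encode the reached leaf of every tree and concatenate the results."""
    vector: List[int] = []
    for leaf, size in zip(leaf_indices, leaves_per_tree):
        one_hot = [0] * size
        one_hot[leaf] = 1          # the leaf this sample falls into
        vector.extend(one_hot)
    return vector

# Hypothetical model: n = 3 trees with 4, 3 and 2 leaves, so m = 9 leaves in total.
leaves_per_tree = [4, 3, 2]
reached = [2, 0, 1]                # leaf index reached in each tree
second_feature = encode_leaves(reached, leaves_per_tree)

# The LR stage outputs one probability per label; the larger one decides the label.
probabilities = [0.04403279, 0.95596721]
predicted_label = probabilities.index(max(probabilities))
```

When the LightGBM Python package is used, the per-tree leaf indices can be obtained by calling the model's `predict` method with `pred_leaf=True`; the encoding step above is then applied to that output.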

Furthermore, for the index data of each service that the service fault prediction model predicts to be faulty, the trained index fault prediction model is used to predict the faulty index by calculating Euclidean distances, and whether the current index data has a fault is determined from the prediction result, so that operation and maintenance personnel can be reminded to prioritize maintenance.

Taking the above CPU utilization (%) as an example: in the KNN training, a CPU utilization rate sample data service label obtained by training the historical fault data of the CPU utilization rate is expressed as [ Euclidean distance, 'probability of fault cause' ], and the data obtained by training is as follows:

The CPU utilization of one piece of service data predicted by the service fault prediction model is [0.9]. Substituting it into the trained index fault prediction model, the Euclidean distance between this index data and the KNN training data is calculated as follows:

[ 'cpu utilization 22',0.07];

wherein 22 refers to the 22nd record of the training data of the index fault prediction model; 0.07 is the Euclidean distance between the service data [0.9] and "cpu utilization 17" [0.92, '1'].

Inputting the index data of the fault service into an index fault prediction model for prediction to obtain the following data:

['cpu utilization 6', 0.6], ['cpu utilization 21', 0.06], ['cpu utilization 4', 0.51], ['cpu utilization 19', 0.04], ['cpu utilization 20', 0.05], ['cpu utilization 17', 0.02], ['cpu utilization 18', 0.03], ['cpu utilization 15', 0.0], ['cpu utilization 13', 0.02], ['cpu utilization 9', 0.2], ['cpu utilization 1', 0.45], ['cpu utilization 5', 0.4], ['cpu utilization 23', 0.08], ['cpu utilization 8', 0.3], ['cpu utilization 7', 0.7], ['cpu utilization 12', 0.73], ['cpu utilization 16', 0.01], …

Applying the index data [0.9] of the fault service to the trained index fault prediction model, the 15th record is found to have the shortest Euclidean distance, 0, and a fault is predicted. The index data corresponding to CPU utilization is therefore closest to the fault category (label 1), so the current index data [0.9] has a fault, and operation and maintenance personnel can be notified to handle it with priority.
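The selection of the closest record in this worked example can be reproduced with the distance values listed above. Only the legible entries of the list are included, and the fault label of record 15 is taken from the conclusion stated in the text rather than from the (unavailable) training data.

```python
# (record name, Euclidean distance to the current CPU utilization [0.9])
distances = [
    ("cpu utilization 6", 0.6), ("cpu utilization 21", 0.06),
    ("cpu utilization 4", 0.51), ("cpu utilization 19", 0.04),
    ("cpu utilization 20", 0.05), ("cpu utilization 17", 0.02),
    ("cpu utilization 18", 0.03), ("cpu utilization 15", 0.0),
    ("cpu utilization 13", 0.02), ("cpu utilization 9", 0.2),
    ("cpu utilization 1", 0.45), ("cpu utilization 5", 0.4),
    ("cpu utilization 23", 0.08), ("cpu utilization 8", 0.3),
    ("cpu utilization 7", 0.7), ("cpu utilization 12", 0.73),
    ("cpu utilization 16", 0.01),
]

# Pick the record with the smallest distance.
nearest_name, nearest_distance = min(distances, key=lambda pair: pair[1])

# Per the text, the nearest record belongs to the fault category (label 1).
nearest_label = 1
current_index_has_fault = nearest_label == 1
```

The minimum falls on record 15 with distance 0, which matches the conclusion drawn in the paragraph above.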

In this embodiment, historical index data of the cloud platform is extracted, preprocessed, and then used in training to obtain a service fault prediction model; the historical index data is trained through a K nearest neighbor classifier to obtain an index fault prediction model; current index data in the cloud platform is acquired and input into the service fault prediction model for prediction, and index data of a fault service is output; the index data of the fault service is input into the index fault prediction model for prediction to obtain a prediction result; and whether the current index data has a fault is determined according to the prediction result. The method uses the stacking idea to fuse the LightGBM model and the LR algorithm model for training on historical index data, converts the prediction problem into a classification problem to construct the service fault prediction model, constructs the index fault prediction model through the K nearest neighbor classifier, and combines service fault prediction with index fault prediction to realize fault early warning on the cloud platform, so that fault prediction is more flexible. Compared with mutually independent prediction models, the method has stronger nonlinear expression capability, reduces generalization error and overfitting, and improves the accuracy of model prediction and classification.

Fig. 2 shows a schematic structural diagram of an embodiment of the intelligent fault early warning device based on index data. As shown in fig. 2, the apparatus includes: a model training module 210, a prediction module 220, and a processing module 230.

The model training module 210 is configured to extract historical index data of the cloud platform, preprocess the historical index data, and obtain a service fault prediction model through training; and training the historical index data through a K nearest neighbor classifier to obtain an index fault prediction model.

In an alternative mode, the historical index data and the current index data respectively comprise multidimensional service log data.

In an alternative manner, the model training module 210 is further configured to: carrying out normalization processing on the historical index data according to a preset rule to obtain a historical data set; performing data sampling on the historical data set, performing feature extraction on the sampled data to obtain first feature data, and inputting the first feature data into a LightGBM model to obtain second feature data; and training the first characteristic data and the second characteristic data through a logistic regression algorithm model to obtain a service fault prediction model.

In an optional manner, the historical index data includes first historical index data of a normal service and second historical index data of a fault service, a service tag of the first historical index data is a first tag, and a service tag of the second historical index data is a second tag; the model training module 210 is further configured to: training the first historical index data and the second historical index data through a K nearest neighbor classifier to obtain the distribution condition of each historical index data in a normal category and a fault category and the Euclidean distance between any two historical index data belonging to the same category; and adjusting parameters of the K nearest neighbor classifier according to the distribution condition, the Euclidean distance and the service labels corresponding to the historical index data to obtain a trained index fault prediction model.

The prediction module 220 is configured to obtain current index data in the cloud platform, input the current index data into the service fault prediction model for prediction, and output index data of a fault service; and inputting the index data of the fault service into an index fault prediction model for prediction to obtain a prediction result.

In an alternative manner, the prediction module 220 is further configured to: inputting the index data of the fault service into an index fault prediction model for prediction, and calculating Euclidean distances between the index data of the fault service and each historical index data; and determining a prediction result according to the service label of the historical index data with the minimum Euclidean distance.

The processing module 230 is configured to determine whether the current index data has a fault according to the prediction result.

In an optional manner, the processing module 230 is further configured to: if the service label in the prediction result is a label corresponding to the normal service, determining that the current index data does not have a fault; and if the service label in the prediction result is the label corresponding to the fault service, determining that the current index data has a fault.

By adopting the device of the embodiment, historical index data of the cloud platform is extracted, preprocessed and trained to obtain a service fault prediction model; training historical index data through a K nearest neighbor classifier to obtain an index fault prediction model; acquiring current index data in a cloud platform, inputting the current index data into a service fault prediction model for prediction, and outputting the index data of a fault service; inputting index data of the fault service into an index fault prediction model for prediction to obtain a prediction result; and determining whether the current index data has faults according to the prediction result. The device constructs a service fault prediction model through historical index data, constructs an index fault prediction model through a K nearest neighbor classifier, and combines service fault prediction and index fault prediction to realize fault early warning on the cloud platform, so that fault prediction is more flexible.

An embodiment of the invention provides a non-volatile computer storage medium in which at least one executable instruction is stored, and the executable instruction can execute the intelligent fault early warning method based on index data in any of the above method embodiments.

The executable instructions may be specifically configured to cause the processor to perform the following operations:

extracting historical index data of the cloud platform, preprocessing the historical index data, and then training to obtain a service fault prediction model; training historical index data through a K nearest neighbor classifier to obtain an index fault prediction model;

acquiring current index data in a cloud platform, inputting the current index data into a service fault prediction model for prediction, and outputting the index data of a fault service;

inputting the index data of the fault service into an index fault prediction model for prediction to obtain a prediction result;

and determining whether the current index data has a fault according to the prediction result.

Fig. 3 is a schematic structural diagram of an embodiment of the computing device of the present invention, and the specific embodiment of the present invention does not limit the specific implementation of the computing device.

As shown in fig. 3, the computing device may include:

a processor, a communication interface, a memory, and a communication bus.

The processor, the communication interface, and the memory communicate with one another via the communication bus. The communication interface is used for communicating with network elements of other devices, such as clients or other servers. The processor is used for executing a program, and specifically can execute the relevant steps in the above embodiments of the intelligent fault early warning method based on index data.

In particular, the program may include program code comprising computer operating instructions.

The processor may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The computing device comprises one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.

The memory is used for storing the program. The memory may comprise high-speed RAM, and may also include non-volatile memory, such as at least one disk memory.

The program may specifically be adapted to cause a processor to perform the following operations:

extracting historical index data of the cloud platform, preprocessing the historical index data, and then obtaining a service fault prediction model through training; training historical index data through a K nearest neighbor classifier to obtain an index fault prediction model;

acquiring current index data in a cloud platform, inputting the current index data into a service fault prediction model for prediction, and outputting index data of a fault service;

inputting index data of the fault service into an index fault prediction model for prediction to obtain a prediction result;

and determining whether the current index data has a fault according to the prediction result.

The algorithms or displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system is apparent from the description above. In addition, embodiments of the present invention are not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.

In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.

Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the embodiments of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.

Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components in the embodiments may be combined into one module or unit or component, and furthermore, may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.

Furthermore, those skilled in the art will appreciate that while some embodiments herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.

Various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functionality of some or all of the components according to embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.

It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not indicate any ordering. These words may be interpreted as names. The steps in the above embodiments should not be construed as limited to the order of execution unless otherwise specified.

Claims

1. An intelligent fault early warning method based on index data is characterized by comprising the following steps:

determining whether the current index data has a fault according to a prediction result.

2. The method of claim 1, wherein the preprocessing the historical index data and then training to obtain a service fault prediction model further comprises:

carrying out normalization processing on the historical index data according to a preset rule to obtain a historical data set;

performing data sampling on the historical data set, performing feature extraction on the sampled data to obtain first feature data, and inputting the first feature data to a LightGBM model to obtain second feature data;

and training the first characteristic data and the second characteristic data through a logistic regression algorithm model to obtain a service fault prediction model.

3. The method according to claim 1, wherein the historical index data comprises first historical index data of a normal service and second historical index data of a fault service, the service label of the first historical index data is a first label, and the service label of the second historical index data is a second label;

the training of the historical index data through the K nearest neighbor classifier to obtain the index fault prediction model further comprises:

training the first historical index data and the second historical index data through a K nearest neighbor classifier to obtain the distribution condition of each historical index data in a normal category and a fault category and the Euclidean distance between any two historical index data belonging to the same category;

4. The method of claim 3, wherein the inputting the index data of the fault service into the index fault prediction model for prediction, and obtaining a prediction result further comprises:

inputting the index data of the fault service into the index fault prediction model for prediction, and calculating Euclidean distances between the index data of the fault service and each historical index data;

and determining a prediction result according to the service label of the historical index data with the minimum Euclidean distance.

5. The method of claim 4, wherein the determining whether the current index data has a fault according to the prediction result further comprises:

if the service label in the prediction result is a label corresponding to a normal service, determining that the current index data does not have a fault;

and if the service label in the prediction result is a label corresponding to the fault service, determining that the current index data has a fault.

6. An intelligent fault early warning device based on index data, characterized by comprising:

the prediction module is used for acquiring current index data in a cloud platform, inputting the current index data into the service fault prediction model for prediction, and outputting index data of a fault service; inputting the index data of the fault service into the index fault prediction model for prediction to obtain a prediction result;

7. The apparatus of claim 6, wherein the model training module is further configured to:

performing data sampling on the historical data set, performing feature extraction on the sampled data to obtain first feature data, and inputting the first feature data into a LightGBM model to obtain second feature data;

8. The apparatus according to claim 6, wherein the historical index data comprises first historical index data of a normal service and second historical index data of a fault service, the service label of the first historical index data is a first label, and the service label of the second historical index data is a second label;

the model training module is further configured to:

9. A computing device, comprising: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;

the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the operation corresponding to the intelligent fault early warning method based on index data in any one of claims 1-5.

10. A computer storage medium having at least one executable instruction stored therein, the executable instruction causing a processor to perform operations corresponding to the intelligent fault early warning method based on index data according to any one of claims 1-5.