CN116069606B - Software system performance fault prediction method and system - Google Patents

Software system performance fault prediction method and system

Info

Publication number
CN116069606B
Authority
CN
China
Prior art keywords
sequence
data
value
vector
cross
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310033759.9A
Other languages
Chinese (zh)
Other versions
CN116069606A (en
Inventor
史玉良
杨南飞
王新军
孔凡玉
李晖
陈志勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University filed Critical Shandong University
Priority to CN202310033759.9A priority Critical patent/CN116069606B/en
Publication of CN116069606A publication Critical patent/CN116069606A/en
Application granted granted Critical
Publication of CN116069606B publication Critical patent/CN116069606B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/34Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/3003Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides a software system performance fault prediction method and system, belonging to the technical field of big data information processing and intelligent operation and maintenance. The scheme comprises: obtaining monitoring index data of a system to be tested in operation and preprocessing it; obtaining a longitudinal sequence and a transverse sequence of each single index from the preprocessed monitoring index data; obtaining cross attention vectors of the two sequences based on a cross attention mechanism, and obtaining the predicted value of each single index based on a pre-trained LSTM model; calculating the difference between the predicted value and the actual monitored value and, if the difference for a single index exceeds a preset threshold, judging that an abnormality has occurred and outputting a fault early warning; combining the single index data sequences to obtain multi-dimensional monitoring index sequence data; and inputting the multi-dimensional monitoring index sequence data into a pre-trained Transformer model based on a similarity decomposition attention mechanism to obtain a fault type classification result.

Description

Software system performance fault prediction method and system
Technical Field
The invention belongs to the technical field of big data information processing and intelligent operation and maintenance, and particularly relates to a software system performance fault prediction method and system.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Large-scale software systems have become an important component of our daily activities, such as providing financial services, implementing healthcare operations, and the like. In order to meet the desires of users, software systems are required to perform their functions with high performance and high reliability. However, performance anomalies may occur while the system is running, thereby affecting the user experience.
To maintain the performance of a software system, operators need to detect performance anomalies and prevent failures at runtime. As software systems grow in size and complexity, manually tracking their execution state becomes costly. Researchers have therefore proposed various methods to automatically monitor software systems and detect anomalies in operation. However, the inventors have found that detecting a fault and taking corrective action only after it has occurred may violate a service level objective (e.g., long response times for user requests) and cause financial loss. Meanwhile, most existing work on fault prediction focuses only on whether a fault will occur and does not explain the cause of the abnormality, so operation and maintenance personnel must spend a great deal of time investigating it. A performance fault prediction method that can proactively find faults in advance, and thereby help operation and maintenance personnel prevent potential faults, is therefore urgently needed.
Disclosure of Invention
To solve the above problems, the invention provides a software system performance fault prediction method and system. The scheme fully considers the relationship among different dimensions of a single monitoring index and uses an LSTM combined with a cross attention mechanism for prediction. It then combines multiple monitoring indexes and optimizes the multi-head attention mechanism of the Transformer with an attention mechanism based on similarity decomposition to further predict the cause of a possible abnormality, which improves prediction efficiency and accuracy.
According to a first aspect of an embodiment of the present invention, there is provided a software system performance failure prediction method, including:
acquiring monitoring index data of a system to be tested in operation, and carrying out corresponding preprocessing;
based on the preprocessed monitoring index data, a longitudinal sequence and a transverse sequence of each single index data are obtained;
acquiring cross attention vectors of the longitudinal sequence and the transverse sequence based on a cross attention mechanism, and acquiring predicted values of all single indexes based on the cross attention vectors and a pre-trained LSTM model;
calculating the difference value between the predicted value and the actual monitoring value, if the difference value between the single index predicted value and the actual monitoring value exceeds a preset threshold value, judging that an abnormality occurs, and outputting a fault early warning;
combining the single index data sequences to obtain multi-dimensional monitoring index sequence data; and inputting the multidimensional monitoring index sequence data into a pre-trained Transformer model for fault classification to obtain a fault type classification result, wherein the Transformer model adopts an attention mechanism based on similarity decomposition.
Further, the cross attention mechanism acquires cross attention vectors of the longitudinal sequence and the transverse sequence, specifically:
respectively inputting the longitudinal sequence and the transverse sequence of the obtained single index data into a cross attention module to respectively obtain a first cross attention vector and a second cross attention vector;
and cross-multiplying the first cross attention vector with the longitudinal sequence of the single index data, cross-multiplying the second cross attention vector with the transverse sequence of the single index data, and splicing the two results to obtain the cross attention vector.
Further, the attention mechanism based on similarity decomposition specifically comprises the following processing procedures:
based on the obtained multidimensional monitoring index sequence data, a key vector, a value vector and a query vector of the multidimensional monitoring index sequence data are obtained;
calculating an inflow and an outflow based on the obtained key vector and the query vector;
calculating competition based on the outflow and the value vector; calculating convergence based on the key vector, the query vector, the inflow and the competition;
and calculating to obtain an allocation result based on the convergence and inflow.
Further, inputting the multidimensional monitoring index sequence data into the pre-trained Transformer model for fault classification to obtain the fault type classification result specifically comprises:
inputting the calculated allocation result into an Add & Norm network layer in the Transformer model to perform the residual connection calculation, inputting the output into a Feed Forward layer, and processing that output with a normalization layer to obtain the classification result.
According to a second aspect of an embodiment of the present invention, there is provided a software system performance failure prediction system, including:
the data acquisition unit is used for acquiring monitoring index data of the system to be tested in operation and carrying out corresponding preprocessing;
a cross attention prediction unit for obtaining a longitudinal sequence and a transverse sequence of each single index data based on the preprocessed monitor index data; acquiring cross attention vectors of the longitudinal sequence and the transverse sequence based on a cross attention mechanism, and acquiring predicted values of all single indexes based on the cross attention vectors and a pre-trained LSTM model;
the prediction value analysis unit is used for calculating the difference value between the prediction value and the actual monitoring value, judging that an abnormality occurs if the difference value between the single index prediction value and the actual monitoring value exceeds a preset threshold value, and outputting fault early warning;
the fault classification unit is used for combining the single index data sequences to obtain multidimensional monitoring index sequence data; and inputting the multidimensional monitoring index sequence data into a pre-trained Transformer model for fault classification to obtain a fault type classification result, wherein the Transformer model adopts an attention mechanism based on similarity decomposition.
According to a third aspect of the embodiment of the present invention, there is provided an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the software system performance failure prediction method when executing the program.
According to a fourth aspect of embodiments of the present invention, there is provided a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the software system performance failure prediction method.
Compared with the prior art, the invention has the beneficial effects that:
(1) The invention provides a software system performance fault prediction method and system. The scheme combines the longitudinal sequence and the transverse sequence of an index, introduces a cross attention mechanism to capture the dependency information between the two sequences, and uses an LSTM (long short-term memory network) to learn a representation vector, which improves the validity of the prediction result;
(2) The scheme optimizes the Transformer model to obtain an SD-Transformer model. After a fault early warning is issued, the multi-dimensional index monitoring data sequence is input into the SD-Transformer model for fault category prediction to obtain the possible cause of the abnormality. The obtained cause can guide operation and maintenance personnel, further improving the operating efficiency of the software system.
Additional aspects of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
FIG. 1 is a flowchart of a software system performance failure prediction method according to an embodiment of the present invention;
FIG. 2 is a flow chart of LSTM processing with fused cross-attention as described in an embodiment of the invention;
FIG. 3 is a flowchart of SD-Transformer processing according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a software system performance failure prediction system according to an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the drawings and examples.
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present invention. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
Embodiments of the invention and features of the embodiments may be combined with each other without conflict.
Term interpretation:
Key vector, value vector and query vector are terms from the attention mechanism, where:
the key vector and the value vector represent the input in the form of key-value pairs;
the query vector is a task-related representation;
Inflow, outflow, competition and convergence are descriptions from the network flow perspective, where:
the inflow and the outflow are the input data and the output data of a node;
competition: the distribution of weights;
convergence: the aggregation of information.
Embodiment one:
an object of the present embodiment is to provide a software system performance failure prediction method.
The performance fault prediction method predicts that the software system is about to enter a fault state by capturing the state before the fault occurs. A performance anomaly does not always cause a system fault immediately; there is a time gap between the two. The software system is considered to be in an abnormal state from the onset of performance degradation until the performance fault occurs, and predicting the fault during this stage allows precautions to be taken in advance to reduce losses.
Cross attention allows the dependency between different dimensions of a single index to be captured. These dimensions are the longitudinal sequence, which is the time series formed by same-phase data of a single index (the same time point in each period) over the past several periods, and the lateral sequence, which is the time series formed by the single index over the most recent time steps. Because LSTM has been widely applied in earlier time series prediction work, the invention integrates cross attention into the LSTM model to obtain the single index predicted value.
In fault prediction, most existing work focuses on whether a fault will occur and does not further explain the cause of the abnormality. Timeliness of prediction is also important for the fault prediction task. Therefore, after fault prediction, the invention combines the data of the various indexes and uses a Transformer for fault classification, replacing its multi-head attention mechanism with an attention mechanism based on similarity decomposition (Similarity Decomposition). The resulting SD-Transformer improves the computational efficiency of the Transformer and addresses both problems.
In order to solve the problems existing in the prior art, the present embodiment provides a software system performance fault prediction method, including:
acquiring monitoring index data of a system to be tested in operation, and carrying out corresponding preprocessing;
based on the preprocessed monitoring index data, a longitudinal sequence and a transverse sequence of each single index data are obtained;
acquiring cross attention vectors of the longitudinal sequence and the transverse sequence based on a cross attention mechanism, and acquiring predicted values of all single indexes based on the cross attention vectors and a pre-trained LSTM model;
calculating the difference value between the predicted value and the actual monitoring value, if the difference value between the single index predicted value and the actual monitoring value exceeds a preset threshold value, judging that an abnormality occurs, and outputting a fault early warning;
combining the single index data sequences to obtain multi-dimensional monitoring index sequence data; and inputting the multidimensional monitoring index sequence data into a pre-trained Transformer model for fault classification to obtain a fault type classification result, wherein the Transformer model adopts an attention mechanism based on similarity decomposition.
Further, the cross attention mechanism acquires cross attention vectors of the longitudinal sequence and the transverse sequence, specifically:
respectively inputting the longitudinal sequence and the transverse sequence of the obtained single index data into a cross attention module to respectively obtain a first cross attention vector and a second cross attention vector;
and cross-multiplying the first cross attention vector with the longitudinal sequence of the single index data, cross-multiplying the second cross attention vector with the transverse sequence of the single index data, and splicing the two results to obtain the cross attention vector.
Further, the attention mechanism based on similarity decomposition specifically comprises the following processing procedures:
based on the obtained multidimensional monitoring index sequence data, a key vector, a value vector and a query vector of the multidimensional monitoring index sequence data are obtained;
calculating an inflow and an outflow based on the obtained key vector and the query vector;
calculating competition based on the outflow and the value vector; calculating convergence based on the key vector, the query vector, the inflow and the competition;
and calculating to obtain an allocation result based on the convergence and inflow.
Further, inputting the multidimensional monitoring index sequence data into the pre-trained Transformer model for fault classification to obtain the fault type classification result specifically comprises:
inputting the calculated allocation result into an Add & Norm network layer in the Transformer model to perform the residual connection calculation, inputting the output into a Feed Forward layer, and processing that output with a normalization layer to obtain the classification result.
Further, the longitudinal sequence and the transverse sequence of the single index data are specifically as follows: the longitudinal sequence is the time series formed by same-phase data of a single index over the past several periods; the transverse sequence is the time series formed by the single index over the several time steps adjacent to the current time.
Further, the monitoring index data comprises CPU usage rate, memory usage rate, network transmission rate, network receiving rate, file writing byte per second, file reading byte per second and active thread number.
Further, the preprocessing comprises data cleaning, missing data complement, data definition and normalization processing.
In particular, for easy understanding, the following detailed description of the embodiments will be given with reference to the accompanying drawings:
as shown in fig. 1, the present embodiment provides a software system performance failure prediction method, which specifically includes the following steps:
step 1: monitoring index data acquisition
Monitoring index data of the running system are collected with a monitoring tool of the software system. The data comprise: CPU usage (CPU), memory usage (mem), network transmission rate (netTransmit), network receiving rate (netReceive), bytes written to file per second (diskWritten), bytes read from file per second (diskRead) and number of active threads (thread). The monitoring index data include normal data and abnormal data, as well as longitudinal data and transverse data. The monitoring data are preprocessed, where the preprocessing comprises data cleaning, missing data completion, data definition and normalization. The data of the various indexes are then combined and converted into a format that can be input into a neural network, yielding single index sequence data and multi-dimensional index sequence data.
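The missing data completion and normalization parts of the preprocessing can be sketched as follows. This is a minimal illustration only: the text does not say which completion or normalization method is used, so forward filling and min-max scaling are assumptions, and the function names `fill_missing` and `min_max_normalize` are hypothetical.

```python
import math

def fill_missing(values):
    """Forward-fill missing samples (None or NaN); leading gaps take the first valid value."""
    def missing(v):
        return v is None or (isinstance(v, float) and math.isnan(v))
    valid = [v for v in values if not missing(v)]
    if not valid:
        raise ValueError("series has no valid samples")
    filled, last = [], valid[0]
    for v in values:
        if not missing(v):
            last = v
        filled.append(last)
    return filled

def min_max_normalize(values):
    """Scale a series to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in values]

# Hypothetical CPU usage samples with two gaps.
cpu_raw = [20.0, None, 40.0, float("nan"), 60.0]
cpu_clean = min_max_normalize(fill_missing(cpu_raw))
print(cpu_clean)
```

The same two functions would be applied to each of the seven indexes independently before the sequences are built.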
Step 2: As shown in FIG. 2, the two sequences of a single index, the longitudinal data [formula image BDA0004048387790000081] and the lateral data [formula image BDA0004048387790000082], are taken as input, and cross attention is used to obtain the interdependence between the two sequences. The longitudinal sequence is the time series formed by same-phase data of a single index over the past several periods. For example, if the current time is 15:00 and the index data has a period of one day, the longitudinal sequence is the series of values at 15:00 on each of the past t days. The lateral sequence is the time series formed by a single index over the most recent time steps. For example, if the current time is 15:00 and the time step is one minute, the lateral sequence is the series of values from the past t minutes.
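The two sequences can be extracted from one regularly sampled series by index arithmetic. The sketch below assumes the series is stored as a list with one sample per time step and that `period` is the number of time steps in one cycle; the function names and the toy data are hypothetical.

```python
def lateral_sequence(series, now, t):
    """The t most recent samples before index `now` (one sample per time step)."""
    if now < t:
        raise ValueError("not enough history for the lateral sequence")
    return series[now - t:now]

def longitudinal_sequence(series, now, t, period):
    """Samples at the same phase as `now` in each of the past t periods, oldest first."""
    if now < t * period:
        raise ValueError("not enough history for the longitudinal sequence")
    return [series[now - k * period] for k in range(t, 0, -1)]

# Toy series whose value equals its index, with a hypothetical period of 10 steps.
series = list(range(100))
print(lateral_sequence(series, now=95, t=3))                  # [92, 93, 94]
print(longitudinal_sequence(series, now=95, t=3, period=10))  # [65, 75, 85]
```

With per-minute sampling and a daily cycle as in the 15:00 example, `period` would be 1440.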
Based on the two sequence vectors, the cross attention vectors between the sequences, [formula image BDA0004048387790000083] and [formula image BDA0004048387790000084], are calculated:
[formula image BDA0004048387790000085]
[formula image BDA0004048387790000086]
where $V_{cd}$, $W_{cd}$, $U_{cd}$, $V_{dc}$, $W_{dc}$ and $U_{dc}$ are parameter vectors.
Based on the resulting cross attention vectors, the input of the LSTM is calculated:
[formula image BDA0004048387790000087]
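The overall shape of this step (two attention vectors, each multiplied with its own sequence, with the two results spliced into the LSTM input) can be sketched as below. The scoring function is an assumption: the patent's own formulas are not reproduced in this text, so an additive tanh score followed by a softmax is used purely for illustration, and the element-wise product stands in for the multiplication described earlier. Only the parameter names follow the text.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_attention_input(c, d, params):
    """c: longitudinal sequence (t,), d: lateral sequence (t,). Returns a vector of length 2t."""
    V_cd, W_cd, U_cd, V_dc, W_dc, U_dc = params
    # Assumed additive scoring: weights over c conditioned on d, and vice versa.
    a_cd = softmax(V_cd * np.tanh(W_cd * c + U_cd * d))
    a_dc = softmax(V_dc * np.tanh(W_dc * d + U_dc * c))
    # Weight each sequence by its attention vector, then splice the two results.
    return np.concatenate([a_cd * c, a_dc * d])

t = 4
rng = np.random.default_rng(0)
params = [rng.normal(size=t) for _ in range(6)]  # hypothetical parameter vectors
c = np.array([0.61, 0.58, 0.64, 0.60])  # e.g. 15:00 on the past 4 days
d = np.array([0.55, 0.57, 0.62, 0.66])  # e.g. the past 4 minutes
x = cross_attention_input(c, d, params)
print(x.shape)  # (8,)
```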
For the input data, the input gate $i_t$ controls how much information of the candidate state $\tilde{c}_t$ at the current time needs to be saved:
$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)$
The forget gate $f_t$ controls how much information of the internal state $c_{t-1}$ at the previous time needs to be forgotten:
$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)$
The output gate $o_t$ controls how much information of the internal state $c_t$ at the current time needs to be output to the external state $h_t$:
$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)$
where $\sigma$ is the logistic function, $x_t$ is the current input, and $h_{t-1}$ is the external state at the previous time.
Thereafter, the candidate state $\tilde{c}_t$ is calculated:
$\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)$
and the memory cell $c_t$ is updated:
$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$
The internal state information is then passed to the external state $h_t$:
$h_t = o_t \odot \tanh(c_t)$
In the above formulas, W, U and b are learnable network parameters.
Following the above steps, the state vector $H = [h_1, h_2, \ldots, h_t]$ is obtained and input into a fully connected layer to obtain the single index predicted value.
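The gate, candidate state, memory cell and external state updates of Step 2 can be written as one forward step. The sketch below uses randomly initialized, untrained parameters and hypothetical sizes; flattening H before the fully connected layer is an assumption, since the text does not say how H is fed into that layer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step following the gate equations in Step 2."""
    i_t = sigmoid(p["W_i"] @ x_t + p["U_i"] @ h_prev + p["b_i"])      # input gate
    f_t = sigmoid(p["W_f"] @ x_t + p["U_f"] @ h_prev + p["b_f"])      # forget gate
    o_t = sigmoid(p["W_o"] @ x_t + p["U_o"] @ h_prev + p["b_o"])      # output gate
    c_tilde = np.tanh(p["W_c"] @ x_t + p["U_c"] @ h_prev + p["b_c"])  # candidate state
    c_t = f_t * c_prev + i_t * c_tilde                                # memory cell update
    h_t = o_t * np.tanh(c_t)                                          # external state
    return h_t, c_t

def init_params(input_size, hidden_size, seed=0):
    """Random (untrained) parameters, for illustration only."""
    rng = np.random.default_rng(seed)
    p = {}
    for g in "ifoc":
        p[f"W_{g}"] = rng.normal(0, 0.1, (hidden_size, input_size))
        p[f"U_{g}"] = rng.normal(0, 0.1, (hidden_size, hidden_size))
        p[f"b_{g}"] = np.zeros(hidden_size)
    return p

params = init_params(input_size=2, hidden_size=4)
h, c = np.zeros(4), np.zeros(4)
states = []
for x_t in np.array([[0.1, 0.2], [0.3, 0.1], [0.5, 0.4]]):
    h, c = lstm_step(x_t, h, c, params)
    states.append(h)
H = np.stack(states)                       # state vectors [h_1, ..., h_t], shape (3, 4)

# Fully connected layer on the flattened states (hypothetical weights).
w_out, b_out = np.full(H.size, 0.1), 0.0
prediction = float(H.reshape(-1) @ w_out + b_out)
print(H.shape, prediction)
```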
Step 3: based on the obtained predicted value, calculating the difference value between the predicted value and the actual monitored value, comparing the difference value with a specified threshold value, and judging that no abnormality occurs if the difference value does not exceed the threshold value; if the difference value is higher than the specified threshold value, outputting fault early warning, and inputting the multidimensional monitoring data into the classification model.
For the collected multidimensional data, the monitoring data of the i-th step (i.e., the i-th time) are combined to obtain the sequence [formula image BDA0004048387790000094]:
$\{CPU_i, mem_i, netTransmit_i, netReceive_i, diskWritten_i, diskRead_i, thread_i\}$
The sequences of the t steps are combined to obtain [formula image BDA0004048387790000095].
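Combining the single index sequences into multidimensional sequence data is a transpose from "one list per index" to "one row per time step". A minimal sketch, assuming each index is held as an equal-length list; the function name `combine` and the toy values are hypothetical, while the metric names are those listed in Step 1.

```python
METRICS = ["CPU", "mem", "netTransmit", "netReceive", "diskWritten", "diskRead", "thread"]

def combine(single_index_series):
    """Turn {metric: [v_1, ..., v_t]} into t rows, each holding one value per metric."""
    lengths = {len(single_index_series[m]) for m in METRICS}
    if len(lengths) != 1:
        raise ValueError("all single-index sequences must have the same length")
    return [list(row) for row in zip(*(single_index_series[m] for m in METRICS))]

# Toy data: metric k has the values k, k + 10, k + 20 at steps 1..3.
data = {m: [k, k + 10, k + 20] for k, m in enumerate(METRICS)}
X = combine(data)
print(len(X), len(X[0]))  # 3 7
```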
Step 4: based on the resulting multidimensional monitoring data, it is input into a modified transducer model.
As shown in FIG. 3, in the original transducer model, the attention mechanism adopts a multi-head attention mechanism, so that the calculation cost is high, and the calculation efficiency is improved by using the attention mechanism based on similarity decomposition.
For the multidimensional monitoring sequence data input to the model, [formula image BDA0004048387790000096], the key-value pairs and the query vectors are obtained from [formula image BDA0004048387790000097]:
[formula image BDA0004048387790000098]
[formula image BDA0004048387790000099]
[formula image BDA00040483877900000910]
where $W_q$, $W_k$ and $W_v$ are the linear mapping parameter matrices for the queries, keys and values, respectively, and are learnable parameters; Q, K and V are the matrices of query vectors, key vectors and value vectors, respectively.
Thereafter, the inflow I and the outflow O are calculated:
[formula image BDA0004048387790000101]
[formula image BDA0004048387790000102]
Based on the obtained I and O, the competition $\hat{V}$ and the convergence A are calculated:
[formula image BDA0004048387790000103]
[formula image BDA0004048387790000104]
Finally, the allocation result R is calculated:
$R = \mathrm{Sigmoid}(I) \odot A$
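The sequence of quantities above (Q, K, V, then inflow and outflow, then competition, convergence and allocation) can be sketched as follows. Only the last line, R = Sigmoid(I) ⊙ A, is taken from the text. The other formulas are assumptions modeled on flow-style linear attention, which uses the same vocabulary: the sigmoid feature map `phi`, the linear projections, the softmax used for competition, and the `eps` term are illustrative choices, not the patent's confirmed definitions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def flow_attention(X, W_q, W_k, W_v, eps=1e-6):
    """X: (t, d_in) multidimensional sequence. Returns the allocation result R, shape (t, d)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v            # assumed linear projections
    phi_q, phi_k = sigmoid(Q), sigmoid(K)          # assumed non-negative feature map
    I = phi_q @ phi_k.sum(axis=0) + eps            # inflow of each position, shape (t,)
    O = phi_k @ phi_q.sum(axis=0) + eps            # outflow of each position, shape (t,)
    weights = np.exp(O - O.max())
    weights /= weights.sum()                       # softmax over positions
    V_hat = weights[:, None] * V                   # competition: reweight the values
    A = (phi_q / I[:, None]) @ (phi_k.T @ V_hat)   # convergence: aggregate information
    return sigmoid(I)[:, None] * A                 # allocation: R = Sigmoid(I) * A

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 7))                        # 5 time steps, 7 monitoring indexes
W_q, W_k, W_v = (rng.normal(size=(7, 4)) for _ in range(3))
R = flow_attention(X, W_q, W_k, W_v)
print(R.shape)  # (5, 4)
```

Because `phi_k.T @ V_hat` is computed first, no t-by-t attention matrix is formed, so the cost grows linearly with the sequence length; this is the kind of saving over standard softmax attention that the step describes.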
Based on the obtained result R, R is input into the Add & Norm network layer for the residual connection calculation:
[formula image BDA0004048387790000105]
The resulting output is input into the Feed Forward layer to obtain the output:
$Y = \max(0, \mathrm{outputAdd} \cdot W_1 + b_1) \cdot W_2 + b_2$
$\mathrm{output} = \mathrm{LayerNorm}(\mathrm{outputAdd} + Y)$
In the above formulas, $W_1$, $W_2$, $b_1$ and $b_2$ are learnable network parameters.
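The Add & Norm and Feed Forward computations can be sketched with NumPy. The two Feed Forward lines follow the formulas above; the form of the residual step, outputAdd = LayerNorm(X + R), is an assumption based on the usual Add & Norm layer, and the layer normalization here omits the learnable gain and bias. All sizes and weights are hypothetical.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each row to zero mean and unit variance (no learnable gain or bias)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def encoder_tail(X, R, W1, b1, W2, b2):
    output_add = layer_norm(X + R)                        # assumed Add & Norm form
    Y = np.maximum(0, output_add @ W1 + b1) @ W2 + b2     # Y = max(0, outputAdd W1 + b1) W2 + b2
    return layer_norm(output_add + Y)                     # output = LayerNorm(outputAdd + Y)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))       # layer input, 5 time steps, model width 4
R = rng.normal(size=(5, 4))       # allocation result from the attention step
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 4)), np.zeros(4)
output = encoder_tail(X, R, W1, b1, W2, b2)
print(output.shape)  # (5, 4)
```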
Based on the obtained output, the output is input into a fully connected layer and a softmax prediction function is applied. The loss function is calculated on the output of the prediction function, and the back propagation algorithm is used to train the learnable parameters of the model, thereby completing fault classification.
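The softmax prediction and the loss computed on it can be illustrated as below. The text does not name the loss function, so cross-entropy is an assumption (it is the usual choice with softmax), and the fault type labels and logits are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_class):
    """Negative log-likelihood of the true class (assumed loss)."""
    return -math.log(probs[true_class])

FAULT_TYPES = ["cpu_overload", "memory_leak", "network_congestion"]  # hypothetical labels

logits = [2.0, 0.5, -1.0]          # hypothetical fully connected layer output
probs = softmax(logits)
predicted = FAULT_TYPES[probs.index(max(probs))]
loss = cross_entropy(probs, true_class=0)
print(predicted, round(loss, 4))
```

During training, the gradient of this loss with respect to the model parameters is what back propagation computes.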
Embodiment two:
it is an object of this embodiment to provide a software system performance failure prediction system.
A software system performance fault prediction system, comprising:
the data acquisition unit is used for acquiring monitoring index data of the system to be tested in operation and carrying out corresponding preprocessing;
a cross attention prediction unit for obtaining a longitudinal sequence and a transverse sequence of each single index data based on the preprocessed monitor index data; acquiring cross attention vectors of the longitudinal sequence and the transverse sequence based on a cross attention mechanism, and acquiring predicted values of all single indexes based on the cross attention vectors and a pre-trained LSTM model;
the prediction value analysis unit is used for calculating the difference value between the prediction value and the actual monitoring value, judging that an abnormality occurs if the difference value between the single index prediction value and the actual monitoring value exceeds a preset threshold value, and outputting fault early warning;
the fault classification unit is used for combining the single index data sequences to obtain multidimensional monitoring index sequence data; and inputting the multidimensional monitoring index sequence data into a pre-trained Transformer model for fault classification to obtain a fault type classification result, wherein the Transformer model adopts an attention mechanism based on similarity decomposition.
Further, the system in this embodiment corresponds to the method in the first embodiment, and the technical details thereof are described in the first embodiment, so that they will not be described herein.
In further embodiments, there is also provided:
an electronic device comprising a memory and a processor and computer instructions stored on the memory and running on the processor, which when executed by the processor, perform the method of embodiment one. For brevity, the description is omitted here.
It should be understood that in this embodiment the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The memory may include read only memory and random access memory and provide instructions and data to the processor, and a portion of the memory may also include non-volatile random access memory. For example, the memory may also store information of the device type.
A computer readable storage medium storing computer instructions which, when executed by a processor, perform the method of embodiment one.
The method in the first embodiment may be directly implemented as a hardware processor executing or implemented by a combination of hardware and software modules in the processor. The software modules may be located in a random access memory, flash memory, read only memory, programmable read only memory, or electrically erasable programmable memory, registers, etc. as well known in the art. The storage medium is located in a memory, and the processor reads the information in the memory and, in combination with its hardware, performs the steps of the above method. To avoid repetition, a detailed description is not provided herein.
Those of ordinary skill in the art will appreciate that the elements of the various examples described in connection with the present embodiments, i.e., the algorithm steps, can be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The software system performance fault prediction method and system provided by the above embodiments can be implemented in practice and have broad application prospects.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (8)

1. A method for predicting a performance failure of a software system, comprising:
acquiring monitoring index data of a system to be tested in operation, and carrying out corresponding preprocessing;
based on the preprocessed monitoring index data, a longitudinal sequence and a transverse sequence of each single index data are obtained;
acquiring cross attention vectors of the longitudinal sequence and the transverse sequence based on a cross attention mechanism, and acquiring predicted values of all single indexes based on the cross attention vectors and a pre-trained LSTM model;
calculating the difference value between the predicted value and the actual monitoring value, if the difference value between the single index predicted value and the actual monitoring value exceeds a preset threshold value, judging that an abnormality occurs, and outputting a fault early warning;
combining the single index data sequences to obtain multi-dimensional monitoring index sequence data; inputting the multidimensional monitoring index sequence data into a pre-trained Transformer model for fault classification to obtain a fault type classification result, wherein the Transformer model adopts an attention mechanism based on similarity decomposition;
the cross attention mechanism is used for acquiring cross attention vectors of the longitudinal sequence and the transverse sequence, and specifically comprises the following steps:
respectively inputting the longitudinal sequence and the transverse sequence of the obtained single index data into a cross attention module to respectively obtain a first cross attention vector and a second cross attention vector;
cross multiplying the first cross attention vector with the longitudinal sequence of the single index data, and splicing the result of cross multiplying the second cross attention vector with the transverse sequence of the single index data to obtain a cross attention vector;
the longitudinal sequence and the transverse sequence for obtaining the single index data specifically include: the longitudinal sequence is a time sequence formed by time phase data in the past several periods of a single index; the lateral sequence is a time sequence in which a single index is formed in a plurality of time steps adjacent to the current time.
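The sequence construction, fusion, and threshold check recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the fixed `period`, the choice of the steps immediately preceding the current time as the lateral sequence, the reading of "cross multiplying" as element-wise multiplication, and the use of an absolute difference are all assumptions made for illustration. The attention vectors are supplied by hand here; in the claimed method they would come from the cross attention module, and the fused vector would be passed to the pre-trained LSTM model.

```python
from typing import List, Sequence


def longitudinal_sequence(series: Sequence[float], t: int,
                          period: int, n_periods: int) -> List[float]:
    """Values at the same phase as index t in each of the past n_periods periods."""
    idx = [t - k * period for k in range(n_periods, 0, -1)]
    if idx[0] < 0:
        raise ValueError("not enough history for the requested periods")
    return [series[i] for i in idx]


def lateral_sequence(series: Sequence[float], t: int, n_steps: int) -> List[float]:
    """The n_steps values immediately preceding index t (assumed adjacency)."""
    if t - n_steps < 0:
        raise ValueError("not enough history for the requested steps")
    return list(series[t - n_steps:t])


def fuse(att_long: Sequence[float], longi: Sequence[float],
         att_lat: Sequence[float], lat: Sequence[float]) -> List[float]:
    """Weight each sequence element-wise by its attention vector, then concatenate."""
    weighted_long = [a * x for a, x in zip(att_long, longi)]
    weighted_lat = [a * x for a, x in zip(att_lat, lat)]
    return weighted_long + weighted_lat


def is_anomalous(predicted: float, actual: float, threshold: float) -> bool:
    """Flag an anomaly when |predicted - actual| exceeds the threshold."""
    return abs(predicted - actual) > threshold


# Toy series: 4 periods of 6 samples each, value = 10 * period_index + slot.
series = [10 * d + s for d in range(4) for s in range(6)]
t = 3 * 6 + 2  # slot 2 of the fourth period

longi = longitudinal_sequence(series, t, period=6, n_periods=3)
lat = lateral_sequence(series, t, n_steps=4)

# Hand-picked attention weights standing in for the cross attention module.
fused = fuse([0.5, 0.25, 0.25], longi, [0.25] * 4, lat)
```

In this toy series, `longi` holds the value of slot 2 in each of the three earlier periods, and `lat` holds the four samples just before `t`.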
2. The method for predicting performance failure of a software system according to claim 1, wherein the similarity decomposition-based attention mechanism comprises the following steps:
based on the obtained multidimensional monitoring index sequence data, a key vector, a value vector and a query vector of the multidimensional monitoring index sequence data are obtained;
calculating an inflow and an outflow based on the obtained key vector and the query vector;
calculating competition based on the outflow and the value vector; calculating convergence based on the key vector, the query vector, the inflow and the competition;
and calculating to obtain an allocation result based on the convergence and inflow.
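The claim does not give formulas for these quantities. The vocabulary (inflow, outflow, competition, convergence, allocation) resembles the published Flow-Attention design, so the following Python sketch assumes a simplified form of that design: a sigmoid feature map `phi`, a softmax over the flow out of the key positions for competition, an inflow-normalised aggregation for convergence, and a sigmoid gate on the inflow for allocation. Every formula and name here is an assumption, not the patent's definition.

```python
import math
from typing import List

Matrix = List[List[float]]


def phi(x: float) -> float:
    """Non-negative feature map (sigmoid); an assumed choice."""
    return 1.0 / (1.0 + math.exp(-x))


def flow_attention(Q: Matrix, K: Matrix, V: Matrix) -> Matrix:
    """Simplified flow-style attention. Q, K are n x d; V is n x d_v."""
    n, d, dv = len(Q), len(Q[0]), len(V[0])
    q = [[phi(x) for x in row] for row in Q]
    k = [[phi(x) for x in row] for row in K]

    # Column sums of the mapped queries and keys.
    q_sum = [sum(q[i][c] for i in range(n)) for c in range(d)]
    k_sum = [sum(k[j][c] for j in range(n)) for c in range(d)]

    # Inflow of each query position and outflow of each key position.
    inflow = [sum(q[i][c] * k_sum[c] for c in range(d)) for i in range(n)]
    outflow = [sum(k[j][c] * q_sum[c] for c in range(d)) for j in range(n)]

    # Competition: a softmax over the outflow re-weights the value vectors.
    m = max(outflow)
    exp_o = [math.exp(o - m) for o in outflow]
    z = sum(exp_o)
    v_hat = [[(exp_o[j] / z) * V[j][c] for c in range(dv)] for j in range(n)]

    # Convergence: (q_i / inflow_i) times (k^T v_hat), without an n x n matrix.
    kv = [[sum(k[j][a] * v_hat[j][b] for j in range(n)) for b in range(dv)]
          for a in range(d)]
    conv = [[sum(q[i][a] * kv[a][b] for a in range(d)) / inflow[i]
             for b in range(dv)] for i in range(n)]

    # Allocation: gate each aggregated row by the sigmoid of its inflow.
    return [[phi(inflow[i]) * conv[i][b] for b in range(dv)] for i in range(n)]


Q = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
K = [[0.2, 0.1], [-0.3, 0.4], [0.5, 0.0]]
V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = flow_attention(Q, K, V)
```

Because the key-value product `kv` is formed once and reused for every query, the cost grows linearly with the sequence length `n` rather than quadratically.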
3. The method for predicting performance faults of a software system according to claim 1, wherein inputting the multi-dimensional monitoring index sequence data into a pre-trained Transformer model for fault classification to obtain a fault type classification result specifically comprises the following steps:
inputting the calculated allocation result into an Add & Norm network layer in the Transformer model to perform residual connection calculation, inputting the output result into a Feed Forward layer, and processing the output result by a normalization layer to obtain a classification result.
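As an illustration of the layer order in claim 3, the sketch below applies a residual connection with layer normalization, a position-wise feed-forward layer, and a final normalization to a single position vector. It follows the standard Transformer encoder layout. The omission of learned scale and shift parameters, the ReLU activation, and the identity weights in the usage lines are assumptions chosen to keep the example small.

```python
import math
from typing import List


def add_and_norm(x: List[float], sublayer_out: List[float],
                 eps: float = 1e-5) -> List[float]:
    """Residual connection followed by layer normalization (no learned scale/shift)."""
    s = [a + b for a, b in zip(x, sublayer_out)]
    mean = sum(s) / len(s)
    var = sum((v - mean) ** 2 for v in s) / len(s)
    return [(v - mean) / math.sqrt(var + eps) for v in s]


def feed_forward(x: List[float], W1: List[List[float]], b1: List[float],
                 W2: List[List[float]], b2: List[float]) -> List[float]:
    """Position-wise feed-forward layer: ReLU(x W1 + b1) W2 + b2."""
    hidden = [max(0.0, sum(xi * W1[i][j] for i, xi in enumerate(x)) + b1[j])
              for j in range(len(b1))]
    return [sum(h * W2[j][k] for j, h in enumerate(hidden)) + b2[k]
            for k in range(len(b2))]


# x is the layer input for one position; attn_out stands in for the
# allocation result produced by the attention mechanism.
x = [1.0, 2.0, 3.0]
attn_out = [0.5, -0.5, 0.0]

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
zeros = [0.0, 0.0, 0.0]

h = add_and_norm(x, attn_out)                      # Add & Norm
y = feed_forward(h, identity, zeros, identity, zeros)  # Feed Forward
normed = add_and_norm(h, y)                        # final normalization
```

A classification head (for example a linear layer followed by softmax over fault types) would consume `normed`; it is left out because the claim does not describe it.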
4. The method of claim 1, wherein the monitor indicator data comprises CPU utilization, memory utilization, network transmission rate, network reception rate, file write bytes per second, file read bytes per second, and active thread count.
5. A method of predicting a performance failure of a software system as recited in claim 1, wherein the preprocessing includes data cleansing, missing data complementation, data definition, and normalization.
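The claim names the preprocessing steps without specifying them. The sketch below shows one plausible combination for a single monitoring index: forward-filling missing samples and min-max normalization to the range [0, 1]. Both choices, and the function names, are assumptions for illustration.

```python
from typing import List, Optional


def fill_missing(values: List[Optional[float]]) -> List[float]:
    """Forward-fill gaps; leading gaps take the first observed value."""
    first = next((v for v in values if v is not None), None)
    if first is None:
        raise ValueError("series has no observed values")
    filled, last = [], first
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale values linearly to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# CPU utilization samples in percent, with two missing readings.
cpu = [None, 20.0, None, 60.0, 100.0]
clean = min_max_normalize(fill_missing(cpu))
```

Normalizing each index separately keeps indices with large raw ranges, such as bytes written per second, from dominating indices expressed as percentages.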
6. A software system performance fault prediction system, comprising:
the data acquisition unit is used for acquiring monitoring index data of the system to be tested in operation and carrying out corresponding preprocessing;
a cross attention prediction unit for obtaining a longitudinal sequence and a transverse sequence of each single index data based on the preprocessed monitor index data; acquiring cross attention vectors of the longitudinal sequence and the transverse sequence based on a cross attention mechanism, and acquiring predicted values of all single indexes based on the cross attention vectors and a pre-trained LSTM model;
the prediction value analysis unit is used for calculating the difference value between the prediction value and the actual monitoring value, judging that an abnormality occurs if the difference value between the single index prediction value and the actual monitoring value exceeds a preset threshold value, and outputting fault early warning;
the fault classification unit is used for combining the single index data sequences to obtain multidimensional monitoring index sequence data; inputting the multidimensional monitoring index sequence data into a pre-trained Transformer model for fault classification to obtain a fault type classification result, wherein the Transformer model adopts an attention mechanism based on similarity decomposition;
the cross attention mechanism is used for acquiring cross attention vectors of the longitudinal sequence and the transverse sequence, and specifically comprises the following steps:
respectively inputting the longitudinal sequence and the transverse sequence of the obtained single index data into a cross attention module to respectively obtain a first cross attention vector and a second cross attention vector;
cross multiplying the first cross attention vector with the longitudinal sequence of the single index data, and splicing the result of cross multiplying the second cross attention vector with the transverse sequence of the single index data to obtain a cross attention vector;
the longitudinal sequence and the transverse sequence for obtaining the single index data specifically include: the longitudinal sequence is a time sequence formed by time phase data in the past several periods of a single index; the lateral sequence is a time sequence in which a single index is formed in a plurality of time steps adjacent to the current time.
7. An electronic device comprising a memory, a processor and a computer program stored for execution on the memory, wherein the processor implements a software system performance fault prediction method as claimed in any one of claims 1 to 5 when executing the computer program.
8. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor implements a software system performance fault prediction method according to any of claims 1-5.
CN202310033759.9A 2023-01-10 2023-01-10 Software system performance fault prediction method and system Active CN116069606B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310033759.9A CN116069606B (en) 2023-01-10 2023-01-10 Software system performance fault prediction method and system

Publications (2)

Publication Number Publication Date
CN116069606A (en) 2023-05-05
CN116069606B (en) 2023-07-07

Family

ID=86178076

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310033759.9A Active CN116069606B (en) 2023-01-10 2023-01-10 Software system performance fault prediction method and system

Country Status (1)

Country Link
CN (1) CN116069606B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117724935B (en) * 2024-02-06 2024-06-07 山东大学 Multi-index abnormality detection method and system for software system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109815484A (en) * 2018-12-21 2019-05-28 平安科技(深圳)有限公司 Based on the semantic similarity matching process and its coalignment for intersecting attention mechanism
CN112633317A (en) * 2020-11-02 2021-04-09 国能信控互联技术有限公司 CNN-LSTM fan fault prediction method and system based on attention mechanism
CN114254695A (en) * 2021-11-18 2022-03-29 中国空间技术研究院 Spacecraft telemetry data self-adaptive anomaly detection method and device
CN114818515A (en) * 2022-06-24 2022-07-29 中国海洋大学 Multidimensional time sequence prediction method based on self-attention mechanism and graph convolution network
CN115344414A (en) * 2022-08-15 2022-11-15 山东省计算中心(国家超级计算济南中心) Log anomaly detection method and system based on LSTM-Transformer

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Anomaly Detection Based on Graph Attention Mechanism and Transformer; Shi Yuliang; Acta Electronica Sinica (《电子学报》); Vol. 50 (No. 4); 900-908 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant