CN117785625A - Method, device, equipment and storage medium for predicting server performance - Google Patents


Publication number
CN117785625A
Authority
CN
China
Prior art keywords: log, prediction model, performance, unit, data set
Prior art date
Legal status: Pending
Application number
CN202311615482.7A
Other languages
Chinese (zh)
Inventor
梅森
Current Assignee
Jinan Inspur Data Technology Co Ltd
Original Assignee
Jinan Inspur Data Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Jinan Inspur Data Technology Co Ltd filed Critical Jinan Inspur Data Technology Co Ltd
Priority to CN202311615482.7A
Publication of CN117785625A


Classifications

    • Y: General tagging of new technological developments; general tagging of cross-sectional technologies spanning over several sections of the IPC; technical subjects covered by former USPC cross-reference art collections [XRACs] and digests
    • Y02: Technologies or applications for mitigation or adaptation against climate change
    • Y02D: Climate change mitigation technologies in information and communication technologies [ICT], i.e. information and communication technologies aiming at the reduction of their own energy use
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to the technical field of servers and discloses a method, an apparatus, a device, and a storage medium for predicting server performance. A log prediction model and a performance prediction model obtained through training are acquired; the log prediction model is used for characterizing the mapping relationship between time and the operation log of the server, and the performance prediction model is used for characterizing the mapping relationship between the operation log and the performance data of the server. A target time is input into the log prediction model to obtain a predicted value of the operation log, and the predicted value of the operation log is input into the performance prediction model to obtain a predicted value of the performance data. In this way, mapping relationships are established separately between time and the operation log of the server and between the operation log and the performance data of the server, and prediction of server performance is split into time-based prediction of the server's operation log and log-based prediction of the server's performance, which improves the prediction effect on server performance in a high-concurrency state.

Description

Method, device, equipment and storage medium for predicting server performance
Technical Field
The present invention relates to the field of server technologies, and in particular, to a method, an apparatus, a device, and a storage medium for predicting server performance.
Background
The services carried by a large-scale machine room demand high real-time performance, and when a fault occurs, maintenance can disrupt the normal operation of the original services; the future trend of server performance therefore needs to be predicted efficiently and accurately. However, in a high-concurrency state, as the number of users and the traffic volume grow, the load on the server system rises rapidly. The performance prediction methods currently adopted cannot predict server performance well under these conditions, so performance bottlenecks cannot be found in time and potential performance problems cannot be warned of in advance.
Disclosure of Invention
In view of the above, the present invention provides a method, an apparatus, a device, and a storage medium for predicting server performance, so as to solve the problem that, in a high-concurrency state, server performance is predicted poorly and performance bottlenecks cannot be found in time.
In a first aspect, the present invention provides a method for predicting server performance, the method comprising:
acquiring a log prediction model and a performance prediction model which are obtained through training; the log prediction model is used for representing the mapping relation between time and the operation log of the server; the performance prediction model is used for representing the mapping relation between the operation log and the performance data of the server;
inputting the target time into the log prediction model to obtain a predicted value of the operation log; the target time is a future period of time after the current time;
inputting the predicted value of the operation log into a performance prediction model to obtain a predicted value of performance data; the predicted value of the performance data corresponds to the target time.
In this way, mapping relationships can be established separately between time and the operation log of the server and between the operation log and the performance data of the server, and prediction of server performance is split into time-based prediction of the operation log and log-based prediction of performance, which improves the prediction effect on server performance in a high-concurrency state.
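The two-stage split described above can be sketched as two chained functions. This is a minimal illustration, not the patent's implementation: the placeholder functions `log_model` and `perf_model`, and the metric names `cpu`, `memory`, and `disk`, are hypothetical stand-ins for the trained log prediction model and performance prediction model.

```python
def log_model(target_time_hours):
    # Placeholder for the trained log prediction model: maps a future
    # time (hours from now) to a predicted log feature vector
    # (keyword occurrence frequencies).
    return [2 * target_time_hours, 3 * target_time_hours]

def perf_model(log_vector):
    # Placeholder for the trained performance prediction model: maps a
    # log feature vector to predicted performance data (a set of
    # metrics, not a single value).
    load = sum(log_vector)
    return {"cpu": min(load * 1.5, 100.0),
            "memory": min(load * 2.0, 100.0),
            "disk": min(load * 1.0, 100.0)}

def predict_performance(target_time_hours):
    # The method's core idea: split performance prediction into
    # time -> operation log, then operation log -> performance data.
    predicted_log = log_model(target_time_hours)
    return perf_model(predicted_log)

print(predict_performance(4))
```

The point of the composition is that only the first stage depends on time; under high concurrency the second stage still holds, because performance tracks load (as reflected in the log) rather than the clock.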
In an alternative embodiment, the log prediction model and the performance prediction model are trained by:
acquiring an operation log and performance data of a server in a preset time interval;
determining a log feature vector corresponding to the operation log; the log feature vector is obtained by extracting keywords from an operation log; the log feature vector is used for representing the occurrence frequency of each keyword in the operation log;
dividing a preset time interval into a preset number of unit time intervals;
Based on a unit time interval, respectively dividing the log feature vector and the performance data into a unit log vector and unit performance data; the unit time interval, the unit log vector and the unit performance data are in one-to-one correspondence;
combining the unit time interval and the unit log vector to obtain a first data set;
combining the unit log vector and the unit performance data to obtain a second data set;
inputting the first data set into a first initial prediction model to train the first initial prediction model to obtain a log prediction model;
and inputting the second data set into a second initial prediction model to train the second initial prediction model to obtain a performance prediction model.
Thus, a log prediction model and a performance prediction model can be respectively obtained by constructing a data set and training the model.
In an alternative embodiment, the first initial prediction model and the second initial prediction model are used to characterize a nonlinear mapping relationship between the tag element and the feature element; the mapping relationship is characterized by the following mathematical expression:

Y = WX + ε

where Y is the tag element, X is the feature element, W is a weight matrix, and ε is an error matrix.
In the first initial prediction model, a tag element Y is a unit log vector, and a characteristic element X is a unit time interval;
in the second initial prediction model, the tag element Y is unit performance data, and the feature element X is unit log vector.
Therefore, a nonlinear mapping relation can be established between the unit log vector and the unit time interval and between the unit performance data and the unit log vector, and the accuracy of the finally obtained log prediction model and the accuracy of the performance prediction model are ensured.
In an alternative embodiment, determining the log feature vector corresponding to the operation log includes:
extracting keywords from the operation log to obtain a keyword library;
determining the occurrence frequency of each keyword in the keyword library in an operation log;
and arranging the occurrence frequency according to the arrangement sequence of the keywords in the keyword library to obtain the log feature vector.
Therefore, the operation log can be extracted according to the occurrence frequency of the keywords in the operation log to obtain the log feature vector, so that the operation log is mapped into the multi-dimensional feature vector with mathematical representation.
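The keyword-frequency extraction above can be sketched in a few lines. This is a hedged illustration assuming whitespace-delimited log text; the keyword library and the sample log line are invented for the example, and a real implementation would use the patent's actual keyword extraction step.

```python
from collections import Counter

def log_feature_vector(log_text, keyword_library):
    # Count how often each keyword occurs in the operation log, then
    # arrange the counts in the fixed order of the keyword library so
    # that every log maps to a vector of the same dimensionality.
    counts = Counter(log_text.lower().split())
    return [counts[kw] for kw in keyword_library]

keywords = ["error", "timeout", "connect", "write"]   # illustrative library
log = "connect ok write ok connect error timeout connect"
print(log_feature_vector(log, keywords))  # [1, 1, 3, 1]
```

Fixing the keyword order is what makes the vector a consistent mathematical representation: position i always means "frequency of keyword i", regardless of the log's contents.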
In an alternative embodiment, inputting the first data set into the first initial predictive model to train the first initial predictive model to obtain the log predictive model includes:
Dividing the first data set into a first training data set and a first test data set;
and inputting the first training data set into a first initial prediction model to train the first initial prediction model to obtain a log prediction model.
In an alternative embodiment, the second data set is input into a second initial predictive model to train the second initial predictive model to obtain a performance predictive model, comprising:
dividing the second data set into a second training data set and a second test data set;
and inputting the second training data set into a second initial prediction model to train the second initial prediction model to obtain a performance prediction model.
In an alternative embodiment, the training to obtain the log prediction model and the performance prediction model further includes:
taking the log prediction model as a log model to be verified, and taking the performance prediction model as a performance model to be verified;
inputting a unit time interval in the first test data set into a log model to be verified to obtain a predicted value of a unit log vector in the first test data set;
inputting the unit log vector in the second test data set into the performance model to be verified to obtain a predicted value of the unit performance data in the second test data set;
And when the difference between the value of the unit log vector in the first test data set and the predicted value is in a first error range and the difference between the value of the unit performance data in the second test data set and the predicted value is in a second error range, determining the log model to be verified as a log prediction model, and determining the performance model to be verified as a performance prediction model.
Thus, the prediction model obtained by training can be verified after the prediction model is obtained by training.
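The verification step above reduces to checking that every prediction on the held-out test set falls within the allowed error range. A minimal sketch, with invented test-set values and an assumed scalar error bound (the patent does not specify how the error ranges are measured):

```python
def within_error(actuals, predictions, max_err):
    # Accept the model to be verified only if every predicted value
    # deviates from the actual test-set value by at most max_err.
    return all(abs(a - p) <= max_err for a, p in zip(actuals, predictions))

# Hypothetical actual vs. predicted unit log vector values from the
# first test data set:
actual_values = [10.0, 12.0, 15.0]
predicted_values = [10.5, 11.8, 14.2]
print(within_error(actual_values, predicted_values, max_err=1.0))  # True
```

In the method, this check is run twice with independent bounds: once for the log model against the first test data set (first error range) and once for the performance model against the second test data set (second error range); only when both pass are the models accepted.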
In a second aspect, the present invention provides a server performance prediction apparatus, the apparatus comprising:
the model acquisition module is used for acquiring a log prediction model and a performance prediction model which are obtained through training; the log prediction model is used for representing the mapping relation between time and the operation log of the server; the performance prediction model is used for representing the mapping relation between the operation log and the performance data of the server;
the operation log prediction module is used for inputting the target time into the log prediction model to obtain a predicted value of the operation log; the target time is a future period of time after the current time;
the performance data prediction module is used for inputting the predicted value of the operation log into the performance prediction model to obtain the predicted value of the performance data; the predicted value of the performance data corresponds to the target time.
In a third aspect, the present invention provides a computer device, comprising a memory and a processor that are communicatively connected to each other; the memory stores computer instructions, and the processor executes the computer instructions to implement the server performance prediction method of the first aspect or any one of its corresponding embodiments.
In a fourth aspect, the present invention provides a computer-readable storage medium having stored thereon computer instructions for causing a computer to execute the server performance prediction method of the first aspect or any one of its corresponding embodiments.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a server performance prediction method according to an embodiment of the present invention;
FIG. 2 is a flow chart of training a derived log prediction model and a performance prediction model according to an embodiment of the present invention;
FIG. 3 is a flow chart of determining a log feature vector according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of determining a log feature vector according to an embodiment of the present invention;
FIG. 5 is a flow diagram of server performance prediction according to an embodiment of the present invention;
FIG. 6 is a block diagram of a server performance prediction apparatus according to an embodiment of the present invention;
fig. 7 is a schematic diagram of a hardware structure of a computer device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In the related art, methods for predicting server performance, such as the least squares method, the quadratic exponential smoothing method, and recurrent neural networks, collect the curve of server performance over time and then fit the mapping relationship between performance and time with the corresponding algorithm, so as to predict server performance. However, in a high-concurrency state, as the number of users and the traffic volume grow, the load on the server system rises rapidly and server performance no longer follows the fitted relationship with time; the methods adopted in the related art therefore cannot predict server performance well, cannot find performance bottlenecks in time, and cannot warn of potential performance problems in advance.
Based on the above, an embodiment of the invention provides a server performance prediction method: a log prediction model and a performance prediction model obtained through training are acquired, where the log prediction model is a fitted mapping relationship between time and the operation log of the server, and the performance prediction model is a fitted mapping relationship between the operation log and the performance data of the server; a predicted value of the operation log is obtained from the log prediction model; and the predicted value of the operation log is input into the performance prediction model to obtain a predicted value of the performance data. In this way, mapping relationships are fitted separately between time and the operation log of the server and between the operation log and the performance data of the server, and prediction of server performance is split into time-based prediction of the operation log and log-based prediction of performance, which improves the prediction effect on server performance in a high-concurrency state.
According to an embodiment of the present invention, there is provided a server performance prediction method embodiment, it being noted that the steps shown in the flowchart of the drawings may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in an order other than that shown or described herein.
In this embodiment, a server performance prediction method is provided, which may be used in the server described above, and fig. 1 is a flowchart of a server performance prediction method according to an embodiment of the present invention, as shown in fig. 1, where the flowchart includes the following steps:
step S101, obtaining a log prediction model and a performance prediction model which are obtained through training.
In the embodiment of the invention, the log prediction model is used for representing the mapping relation between time and the operation log of the server, and the performance prediction model is used for representing the mapping relation between the operation log and the performance data of the server. And respectively constructing a first initial prediction model corresponding to the log prediction model and a second initial prediction model corresponding to the performance prediction model, and respectively training the first initial prediction model and the second initial prediction model to obtain the log prediction model and the performance prediction model.
In the embodiment of the invention, the operation log is used to represent the load condition of the server system, and the prediction of server performance is decomposed into prediction of the server load followed by prediction of server performance based on that load. Because server performance is strongly correlated with server load, the performance consumption of the server necessarily rises as its load rises; therefore, by fitting the mapping relationship between the load condition of the server system and the performance data, and predicting server performance based on the predicted load condition, a better server performance prediction effect can be obtained.
In an alternative embodiment, because the operation log is a set of textual information about the user's instructions for server-related operations and the server's responses, its information content is large and a model cannot directly process the raw text. The raw text is therefore subjected to feature extraction and mapped into a multi-dimensional feature vector with a mathematical representation, converting it into data that the model can process.
Step S102, inputting the target time into the log prediction model to obtain a predicted value of the operation log.
In the embodiment of the invention, the target time is a future period of time after the current time. Because the log prediction model is used for characterizing the mapping relationship between time and the operation log of the server, the target time can be input into the log prediction model to obtain the predicted value of the operation log for a future period of time.
Step S103, inputting the predicted value of the operation log into a performance prediction model to obtain the predicted value of the performance data.
In the embodiment of the invention, the predicted value of the operation log obtained from the log prediction model is input into the performance prediction model, the performance data of the server is predicted based on the predicted value of the operation log, the predicted value of the performance data is obtained, and the predicted value of the performance data obtained by the method corresponds to the target time, so that the prediction of the performance data of the server in the target time is realized.
In the embodiment of the present invention, the performance data of the server may include data on various aspects of performance, such as the processor utilization, memory utilization, and disk utilization of the server; that is, the output of the performance prediction model is a set of data rather than a single value.
In an alternative embodiment, when the predicted value of the performance data exceeds the risk threshold of the performance data, alarm information is sent to indicate that the server is expected to be in an overload state, so that a warning is issued as soon as the predicted performance data show an overload risk.
In an alternative embodiment, since the performance data of the server is a set of data about the performance of the server, the server may be considered in an overload state as long as a predicted value of one of the data in the set exceeds its risk threshold, at which point an alarm message may be sent.
In an alternative embodiment, the risk thresholds of the performance data corresponding to different performance may also be different, for example, the risk threshold of the processor utilization may be preset to be 70%, the risk threshold of the memory utilization is 80%, and the risk threshold of the disk utilization is 80%.
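The alarm logic above can be sketched directly. The threshold values follow the illustrative figures in the text (70% processor, 80% memory, 80% disk); the function and metric names are invented for the example.

```python
# Per-metric risk thresholds, using the illustrative values from the text.
RISK_THRESHOLDS = {"cpu": 70.0, "memory": 80.0, "disk": 80.0}

def check_overload(predicted):
    # The server counts as overloaded as soon as any single metric's
    # predicted value exceeds its own risk threshold.
    breached = [m for m, v in predicted.items() if v > RISK_THRESHOLDS[m]]
    if breached:
        return "ALARM: predicted overload on " + ", ".join(breached)
    return "OK"

print(check_overload({"cpu": 75.0, "memory": 60.0, "disk": 40.0}))
print(check_overload({"cpu": 50.0, "memory": 60.0, "disk": 40.0}))
```

Because one breached metric suffices, the check is an any-of condition over the set, matching the "as long as a predicted value of one of the data exceeds its risk threshold" rule above.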
Compared with the method that the mapping relation between the time and the performance data obtained based on fitting can not be used for well predicting the performance of the server under the condition of high concurrency in the related art, the method for predicting the performance of the server provided by the embodiment of the invention can be used for respectively establishing the mapping relation between the time and the operation log of the server and between the operation log and the performance data of the server, and splitting the prediction of the performance of the server into the prediction of the operation log of the server based on the time and the prediction of the performance of the server based on the operation log, so that the prediction effect of the performance of the server under the high concurrency state is improved.
In an alternative implementation, fig. 2 is a flowchart of training to obtain a log prediction model and a performance prediction model according to an embodiment of the present invention, where the log prediction model and the performance prediction model are trained as shown in fig. 2 by:
step S201, obtaining an operation log and performance data of the server in a preset time interval.
In the embodiment of the invention, the operation log and the performance data of the server are collected over a preset time interval while the server is in a normal running state. The preset time interval may be an interval preceding the current time; for example, with 24 hours as the set duration, the preset time interval may be the 24 hours immediately before the current time.
In an alternative embodiment, to improve the reliability of the collected data, the preset time interval may comprise a plurality of time intervals. Meanwhile, to ensure that the collected data remain valid over time, a maximum data collection interval may be set, which limits how far back in history data are collected; data beyond this interval are no longer collected. The plurality of time intervals may be distributed uniformly or randomly within the maximum data collection interval, and the durations of the intervals need not be equal. For example, data may be collected in a preset time interval three days before the current time and in a preset time interval at the current time, with the two intervals together serving as the preset time interval for collecting the server's operation log and performance data.
Step S202, determining a log feature vector corresponding to the operation log.
In the embodiment of the invention, the log feature vector is obtained by extracting keywords from the operation log, and is used for representing the occurrence frequency of each keyword in the operation log. By extracting the characteristics of the keywords from the operation log, the text information of the operation log is mapped into the multi-dimensional characteristic vector which can be represented mathematically, so that the processing of the prediction model is facilitated.
In step S203, the preset time interval is divided into a preset number of unit time intervals.
In the embodiment of the invention, the preset time interval is equally divided into a preset number of unit time intervals. Assuming that the preset time interval is M and the preset number is N, the duration of each unit time interval is M/N.
In an alternative embodiment, the partitioning of the unit time interval may be adjusted based on the accuracy requirements for the server performance prediction. When higher prediction accuracy is required, the unit time interval can be divided into time intervals of shorter duration; when the accuracy requirement is relatively low, the unit time interval can be divided into time intervals of longer duration.
In an alternative embodiment, the unit time intervals may also be partitioned according to predicted time interval requirements for server performance. For example, when it is necessary to predict the server performance at a time interval of 1 hour, the duration of the unit time interval may be set to 1 hour.
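The M/N division of steps S203 can be written out as a small helper. This is an illustrative sketch: the function name and the choice of hours as the unit are assumptions, and the 24-hour/4-unit figures are example values, not from the text.

```python
def unit_intervals(preset_hours, n_units):
    # Equally divide the preset time interval M into N unit time
    # intervals, each of duration M / N, returned as (start, end) pairs.
    unit = preset_hours / n_units
    return [(i * unit, (i + 1) * unit) for i in range(n_units)]

# A 24-hour preset interval split into 4 unit intervals of 6 hours each:
print(unit_intervals(24, 4))  # [(0.0, 6.0), (6.0, 12.0), (12.0, 18.0), (18.0, 24.0)]
```

The (start, end) pairs are then what the log feature vector and the performance data are sliced against in step S204, giving the one-to-one correspondence between unit time intervals, unit log vectors, and unit performance data.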
Step S204, based on the unit time interval, the log feature vector and the performance data are respectively divided into a unit log vector and a unit performance data.
In the embodiment of the invention, the unit time intervals, the unit log vectors, and the unit performance data are in one-to-one correspondence; that is, the log feature vector and the performance data are each divided into as many unit log vectors and unit performance data as there are unit time intervals.
In an alternative embodiment, there may be multiple data of each server performance in the unit performance data, and the data corresponding to the same server performance may be averaged and then used as the corresponding unit performance data.
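The averaging described above can be sketched as follows, assuming each metric's samples within one unit time interval are held in a list; the sample values and metric names are invented for the example.

```python
def unit_performance(samples):
    # Average the multiple samples of each server performance metric
    # collected inside one unit time interval, yielding that interval's
    # unit performance data.
    return {metric: sum(vals) / len(vals) for metric, vals in samples.items()}

# Hypothetical samples collected within a single unit time interval:
samples = {"cpu": [40.0, 50.0, 60.0], "memory": [70.0, 74.0]}
print(unit_performance(samples))  # {'cpu': 50.0, 'memory': 72.0}
```

Averaging collapses each metric to one value per unit interval, preserving the one-to-one correspondence between unit time intervals and unit performance data.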
Step S205, combining the unit time interval and the unit log vector to obtain a first data set.
In the embodiment of the invention, the unit time interval and the unit log vector are combined to obtain the first data set. In the first data set, the unit time interval is used as the feature element, namely the independent variable in the mapping relation, and the unit log vector is used as the label element, namely the dependent variable in the mapping relation.
Step S206, combining the unit log vector and the unit performance data to obtain a second data set.
In the embodiment of the invention, the unit log vector and the unit performance data are combined to obtain the second data set. In the second data set, the unit log vector is used as the feature element, i.e., the independent variable in the mapping relation, and the unit performance data is used as the tag element, i.e., the dependent variable in the mapping relation.
In step S207, the first data set is input into the first initial prediction model to train the first initial prediction model, so as to obtain a log prediction model.
In the embodiment of the invention, the first data set is input into the first initial prediction model, and the first initial prediction model is trained to obtain the log prediction model. The first initial prediction model can be constructed based on a recurrent neural network model; recurrent neural networks are highly effective on data with sequential characteristics and can fully mine the temporal and semantic information in the data. By adopting a recurrent neural network model as the first initial prediction model, the semantic information of the operation log contained in the unit log vector can be extracted well.
In an alternative embodiment, when the first data set is input into the first initial prediction model for training, the first data set may be divided into a first training data set and a first test data set, and the first training data set is then input into the first initial prediction model to train it and obtain the log prediction model. By dividing the first data set in this way, the first initial prediction model is trained on the first training data set and the log prediction model obtained through training is verified on the first test data set, so that the prediction result of the finally obtained log prediction model fits the change trend of the operation log.
In an alternative embodiment, the first data set may also be divided into a first training data set, a first test data set, and a first verification data set. The first training data set is input into the first initial prediction model to train it and obtain the log prediction model. The first verification data set is used for a preliminary evaluation of the trained log prediction model: it is input into the model, and the relevant parameters of the model are adjusted to reduce the error of its output, i.e., the predicted value, so that the model better fits the actual change of the operation log. The first test data set is used to test the log prediction model after adjustment with the first verification data set: it is input into the adjusted model to obtain predicted values, and the error of the model is determined from the predicted values and the actual values in the first test data set, so as to judge whether the obtained log prediction model meets the error requirement.
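The three-way split described above can be sketched as follows. The 70/15/15 fractions are an assumption for illustration; the text does not specify the split ratios, and the toy (unit time interval, unit log vector) pairs are invented.

```python
def split_dataset(pairs, train_frac=0.7, val_frac=0.15):
    # Split (feature, label) pairs into training, verification, and
    # test subsets; the remainder after train and verification goes
    # to the test subset.
    n = len(pairs)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = pairs[:n_train]
    val = pairs[n_train:n_train + n_val]
    test = pairs[n_train + n_val:]
    return train, val, test

data = [(t, [t * 2]) for t in range(20)]  # toy (unit interval, unit log vector) pairs
train, val, test = split_dataset(data)
print(len(train), len(val), len(test))  # 14 3 3
```

The same split can be applied unchanged to the second data set (unit log vector, unit performance data) when training the performance prediction model.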
Step S208, inputting the second data set into the second initial prediction model to train the second initial prediction model to obtain the performance prediction model.
In the embodiment of the invention, the second data set is input into the second initial prediction model to train the second initial prediction model, so that the performance prediction model is obtained. The second initial predictive model may be constructed based on a convolutional neural network model.
In an alternative embodiment, when the second data set is input into the second initial prediction model to train it, the second data set may be divided into a second training data set and a second test data set, and the second training data set is then input into the second initial prediction model to train the second initial prediction model, so as to obtain the performance prediction model. By dividing the second data set into the second training data set and the second test data set, the second initial prediction model is trained with the second training data set, and the performance prediction model obtained through training is verified with the second test data set, so that the prediction result of the finally obtained performance prediction model can fit the change trend of the operating performance.
In an alternative embodiment, the second data set may also be divided into a second training data set, a second test data set and a second verification data set. The second training data set is input into the second initial prediction model to train the second initial prediction model, so as to obtain the performance prediction model. The second verification data set is used for preliminary evaluation of the performance prediction model obtained through training: the second verification data set is input into the trained performance prediction model, and relevant parameters in the performance prediction model are adjusted so as to reduce the error of the output of the performance prediction model, namely the predicted value, and thus better fit the actual change of the operating performance. The second test data set is used for testing the performance prediction model after it has been adjusted with the second verification data set: the second test data set is input into the adjusted performance prediction model to obtain a predicted value, and the error of the performance prediction model is determined based on the predicted value and the actual value in the second test data set, so as to judge whether the obtained performance prediction model meets the error requirement.
In an alternative embodiment, the mapping relationship between time and the operation log of the server fitted by the log prediction model, and the mapping relationship between the operation log and the performance data of the server fitted by the performance prediction model, are nonlinear relationships. They are obtained by training a first initial prediction model and a second initial prediction model that respectively characterize a nonlinear mapping relationship between tag elements and feature elements. The nonlinear mapping relationship can be represented by the following formula (1):

Y = WX + ε        (1)

wherein Y = (y1, y2, …, yn)^T is the tag element, i.e., the dependent variable; X = (x1, x2, …, xm)^T is the feature element, i.e., the independent variable; W is a weight matrix used to fit the nonlinear mapping relationship between the tag element Y and the feature element X; ε is an error matrix used to account for error factors, other than the feature element X, that influence the tag element Y in the actual scene.
In the log prediction model, Y represents the log vector matrix in a unit time range, namely the unit log vector, and y_j represents a specific operation log, such as adding an alarm rule. X represents the unit time interval, and x_1 represents a specific sampling time point.

In the performance prediction model, Y represents the system performance vector matrix in a unit time range, namely the unit performance data, and y_j represents a specific performance metric, such as processor utilization, memory utilization, or disk utilization. X represents the log vector matrix in a unit time range, namely the unit log vector, and x_1 represents a specific operation log, such as adding an alarm rule.
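The weight matrix W and error matrix ε described above can be estimated from data by ordinary least squares; the sketch below is a synthetic illustration of that fit, not the patent's training procedure, and all data in it is made up.

```python
import numpy as np

# Toy fit of the mapping between feature elements X and tag elements Y.
# In the log prediction model X would hold unit time intervals and Y unit
# log vectors; here both are synthetic matrices for illustration only.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))                       # 200 samples, 4 features
W_true = rng.normal(size=(4, 3))                    # 3-dimensional tag vectors
Y = X @ W_true + 0.01 * rng.normal(size=(200, 3))   # epsilon: small error term

# Least-squares estimate of the weight matrix W from (X, Y) pairs
W_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
```

With enough samples the estimated matrix `W_hat` recovers `W_true` up to the error term, which is the sense in which training "fits" the mapping between tag and feature elements.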
In an alternative implementation, fig. 3 is a schematic flow chart of determining the log feature vector according to an embodiment of the present invention, as shown in fig. 3, the step S202 may include the following steps:
step S301, extracting keywords from the operation log to obtain a keyword library.
In the embodiment of the invention, keywords are extracted from the obtained operation log, and the extracted keywords are then arranged according to their order of first appearance in the operation log to obtain a keyword library; the keyword library is equivalent to a keyword sequence.
In the embodiment of the present invention, feature extraction may be performed on the operation log based on a bag-of-words model. Fig. 4 is a schematic diagram of a log feature vector of an operation log according to an embodiment of the present invention. As shown in fig. 4, "adding an alarm rule", "newly adding a nanotube resource", "shielding a resource A alarm" and so on are specific text information included in the operation log, and keyword extraction is performed on this text. For example, the three keywords "add", "alarm" and "rule" are obtained by extracting keywords from "adding an alarm rule". The keywords are arranged according to their order of first occurrence in the operation log; for example, the keyword "resource" first occurs in "newly adding a nanotube resource", so it is located after the keyword "nanotube" in the keyword library.
Step S302, determining the occurrence frequency of each keyword in the keyword library in the operation log.
In the embodiment of the invention, for each obtained operation log, the frequency of occurrence of each keyword in the keyword library is determined. If a keyword does not appear in the operation log, its frequency is marked as 0 in the log feature vector corresponding to that operation log; correspondingly, if a keyword appears several times in the operation log, its frequency is marked as the corresponding number of times in the log feature vector.
Step S303, according to the arrangement sequence of the keywords in the keyword library, arranging the occurrence frequency to obtain the log feature vector.
According to the embodiment of the invention, the occurrence frequencies of the keywords corresponding to each operation log are arranged according to the arrangement order of the keywords in the keyword library, so as to obtain the corresponding log feature vector. As shown in fig. 4, the three keywords "add", "alarm" and "rule" in "add alarm rule" each appear once, so the log feature vector corresponding to "add alarm rule" is [1,1,1,0,0,0,0, … …]; the three keywords "newly added", "nanotube" and "resource" in "newly added nanotube resource" each appear once, so the log feature vector corresponding to "newly added nanotube resource" is [0,0,0,1,1,1,0, … …]; the four keywords "shielding", "resource", "A" and "alarm" in "shielding resource A alarm" each appear once, so the log feature vector corresponding to "shielding resource A alarm" is [0,1,0,0,0,1,1, … …]. Thus, the operation log is mapped into a mathematically characterizable multidimensional feature vector.
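Steps S301 to S303 can be sketched as a simple bag-of-words pipeline. This is an illustrative simplification: whitespace tokenisation stands in for the patent's Chinese keyword extraction, the English log texts are stand-ins for the Fig. 4 examples, and the helper names are hypothetical.

```python
def build_vocab(logs):
    """Step S301: keyword library, ordered by first appearance across logs."""
    vocab = []
    for log in logs:
        for word in log.split():
            if word not in vocab:
                vocab.append(word)
    return vocab

def log_feature_vector(log, vocab):
    """Steps S302-S303: count how often each library keyword occurs in one
    log, arranged in the library's keyword order."""
    words = log.split()
    return [words.count(word) for word in vocab]

# Simplified English stand-ins for the operation logs shown in Fig. 4
logs = ["add alarm rule", "create managed resource", "mask resource A alarm"]
vocab = build_vocab(logs)
vectors = [log_feature_vector(log, vocab) for log in logs]
```

Each vector has one position per library keyword, with 0 for absent keywords and the occurrence count otherwise, matching the marking rule above.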
In an optional embodiment, after step S208, a step of performing test verification on the trained log prediction model and the performance prediction model through the first test data set and the second test data set may specifically include the following steps:
and a step a1, taking the log prediction model as a log model to be verified, and taking the performance prediction model as a performance model to be verified.
And a step a2, inputting the unit time interval in the first test data set into a log model to be verified, and obtaining the predicted value of the unit log vector in the first test data set.
And a step a3 of inputting the unit log vector in the second test data set into the performance model to be verified to obtain the predicted value of the unit performance data in the second test data set.
And a step a4, when the difference between the value of the unit log vector in the first test data set and the predicted value is in a first error range and the difference between the value of the unit performance data in the second test data set and the predicted value is in a second error range, determining the log model to be verified as a log prediction model, and determining the performance model to be verified as a performance prediction model.
As described in steps a1 to a4 above, the log model to be verified and the performance model to be verified are considered to meet the error requirement, and can be used as the log prediction model and the performance prediction model, if and only if the difference between the value of the unit log vector and its predicted value, and the difference between the value of the unit performance data and its predicted value, are within the respective error ranges; otherwise, the first data set and the second data set are re-acquired, and the first initial prediction model and the second initial prediction model are re-trained.
In an alternative embodiment, the error range of the unit log vector may be set to 5% -8% and the error range of the unit performance data may be set to 3% -5%.
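The acceptance check in steps a1 to a4 can be sketched as a relative-error comparison. This is a minimal sketch under the assumption that the error is measured relative to the actual value; the patent does not fix the error metric, and the numeric test values below are invented for illustration.

```python
def within_error(actual, predicted, max_rel_error):
    """Check whether every prediction falls within the given relative error
    bound of its actual value (zero actual values are skipped here)."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted) if a != 0]
    return max(errors) <= max_rel_error

# First error range (unit log vector): illustrative 8% upper bound
log_ok = within_error([10.0, 20.0], [10.5, 19.2], max_rel_error=0.08)
# Second error range (unit performance data): illustrative 5% upper bound
perf_ok = within_error([50.0, 80.0], [51.0, 78.5], max_rel_error=0.05)
# Per step a4: both checks must pass, otherwise retrain on fresh data
retrain_needed = not (log_ok and perf_ok)
```

Only when both checks pass are the models to be verified promoted to the log prediction model and performance prediction model.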
FIG. 5 is a schematic flow chart of server performance prediction according to an embodiment of the present invention. As shown in FIG. 5, operation logs and server performance data are first collected over a time sequence interval; feature extraction is then performed on the operation logs to map them into multidimensional feature vectors, i.e., log feature vectors; a time-log data set, i.e., the first data set, is constructed and used to train the log prediction model, and a log-performance data set, i.e., the second data set, is constructed and used to train the performance prediction model. When the log prediction error of the log prediction model and the performance prediction error of the performance prediction model are within the allowable error ranges, performance prediction is performed based on the trained log prediction model and performance prediction model; otherwise, data is re-acquired and model training is re-performed. Based on the log prediction model, the operation log at a future time is predicted; the predicted operation log is then input into the performance prediction model to predict the performance data of the server at that future time. Based on the predicted performance data, whether the performance data exceeds a risk threshold is judged, and if so, an early warning is sent to the performance monitoring platform to ensure the normal operation of the server.
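The chained prediction and threshold check from Fig. 5 can be sketched as follows. The two models are stand-in callables (simple linear functions) and all numbers are invented; in the patent both models are trained from collected logs and performance data.

```python
def predict_and_check(target_time, log_model, perf_model, risk_threshold):
    """Time -> predicted log vector -> predicted performance, then compare
    the predicted performance against a risk threshold (Fig. 5 flow)."""
    predicted_log = log_model(target_time)      # log prediction model
    predicted_perf = perf_model(predicted_log)  # performance prediction model
    warn = predicted_perf > risk_threshold      # early-warning condition
    return predicted_perf, warn

# Hypothetical stand-in models for illustration only
log_model = lambda t: 2.0 * t           # time -> log activity level
perf_model = lambda v: 10.0 + 3.0 * v   # log activity -> CPU utilisation %

perf, warn = predict_and_check(target_time=25.0, log_model=log_model,
                               perf_model=perf_model, risk_threshold=90.0)
```

When `warn` is true, the early warning would be sent to the performance monitoring platform.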
The embodiment also provides a server performance prediction device, which is used for implementing the foregoing embodiments and preferred embodiments, and is not described in detail. As used below, the terms "module," "unit" may be a combination of software and/or hardware that implements a predetermined function. While the means described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
The present embodiment provides a server performance prediction apparatus, as shown in fig. 6, including:
the model obtaining module 601 is configured to obtain a log prediction model and a performance prediction model obtained by training; the log prediction model is used for representing the mapping relation between time and the operation log of the server; the performance prediction model is used for representing the mapping relation between the operation log and the performance data of the server;
the operation log prediction module 602 is configured to input a target time into a log prediction model to obtain a predicted value of an operation log; the target time is a future period of time under the current time;
a performance data prediction module 603, configured to input a predicted value of the operation log into a performance prediction model, to obtain a predicted value of the performance data; the predicted value of the performance data corresponds to the target time.
In an alternative embodiment, the method further comprises a model training module; a model training module comprising:
the data acquisition unit is used for acquiring the operation log and the performance data of the server in a preset time interval;
the feature vector determining unit is used for determining a log feature vector corresponding to the operation log; the log feature vector is obtained by extracting keywords from the operation log; the log feature vector is used for representing the occurrence frequency of each keyword in the operation log;
the unit time dividing unit is used for dividing the preset time interval into preset number of unit time intervals;
the unit data dividing unit is used for dividing the log characteristic vector and the performance data into a unit log vector and unit performance data based on the unit time interval; the unit time interval, the unit log vector and the unit performance data are in one-to-one correspondence;
the first data combination unit is used for combining the unit time interval and the unit log vector to obtain a first data set;
the second data combination unit is used for combining the unit log vector and the unit performance data to obtain a second data set;
the log prediction model training unit is used for inputting the first data set into the first initial prediction model so as to train the first initial prediction model to obtain a log prediction model;
And the performance prediction model training unit is used for inputting the second data set into the second initial prediction model to train the second initial prediction model so as to obtain the performance prediction model.
In an alternative embodiment, the first initial prediction model and the second initial prediction model are used to characterize a nonlinear mapping relationship between the tag element and the feature element; the nonlinear mapping relationship is characterized by the following mathematical expression:

Y = WX + ε

wherein Y is the tag element, Y = (y1, y2, …, yn)^T; X is the feature element, X = (x1, x2, …, xm)^T; W is a weight matrix; ε is an error matrix;
in the first initial prediction model, a tag element Y is a unit log vector, and a characteristic element X is a unit time interval;
in the second initial prediction model, the tag element Y is unit performance data, and the feature element X is unit log vector.
In an alternative embodiment, the feature vector determining unit includes:
the keyword library acquisition subunit is used for extracting keywords from the operation log to obtain a keyword library;
the occurrence frequency determining subunit is used for determining the occurrence frequency of each keyword in the keyword library in the operation log;
and the feature vector determining subunit is used for arranging the occurrence frequency according to the arrangement sequence of the keywords in the keyword library to obtain the log feature vector.
In an alternative embodiment, the log prediction model training unit includes:
a first data dividing subunit for dividing the first data set into a first training data set and a first test data set;
the log prediction model training subunit is used for inputting the first training data set into the first initial prediction model to train the first initial prediction model so as to obtain the log prediction model.
In an alternative embodiment, the performance prediction model training unit includes:
a second data dividing sub-unit for dividing the second data set into a second training data set and a second test data set;
and the performance prediction model training subunit is used for inputting the second training data set into the second initial prediction model so as to train the second initial prediction model to obtain the performance prediction model.
In an alternative embodiment, the apparatus further comprises:
the to-be-verified model determining module is used for taking the log prediction model as a to-be-verified log model and taking the performance prediction model as a to-be-verified performance model;
the log vector prediction module is used for inputting the unit time interval in the first test data set into the log model to be verified to obtain a predicted value of the unit log vector in the first test data set;
The performance data prediction module is used for inputting the unit log vector in the second test data set into the performance model to be verified to obtain a predicted value of the unit performance data in the second test data set;
and the prediction model determining module is used for determining the log model to be verified as a log prediction model and the performance model to be verified as a performance prediction model when the difference between the value of the unit log vector in the first test data set and the predicted value is within a first error range and the difference between the value of the unit performance data in the second test data set and the predicted value is within a second error range.
Further functional descriptions of the above respective modules and units are the same as those of the above corresponding embodiments, and are not repeated here.
The server performance prediction apparatus in this embodiment is presented in the form of functional units, where a unit may be an ASIC (Application Specific Integrated Circuit), a processor and memory executing one or more software or firmware programs, and/or other devices that can provide the above-described functionality.
The embodiment of the invention also provides computer equipment, which is provided with the server performance prediction device shown in the figure 6.
Referring to fig. 7, fig. 7 is a schematic structural diagram of a computer device according to an alternative embodiment of the present invention. As shown in fig. 7, the computer device includes: one or more processors 10, a memory 20, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are communicatively coupled to each other using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the computer device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to the interface. In some alternative embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Also, multiple computer devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 10 is illustrated in fig. 7.
The processor 10 may be a central processor, a network processor, or a combination thereof. The processor 10 may further include a hardware chip, among others. The hardware chip may be an application specific integrated circuit, a programmable logic device, or a combination thereof. The programmable logic device may be a complex programmable logic device, a field programmable gate array, a general-purpose array logic, or any combination thereof.
Wherein the memory 20 stores instructions executable by the at least one processor 10 to cause the at least one processor 10 to perform a method for implementing the embodiments described above.
The memory 20 may include a storage program area that may store an operating system, at least one application program required for functions, and a storage data area; the storage data area may store data created according to the use of the computer device, etc. In addition, the memory 20 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some alternative embodiments, memory 20 may optionally include memory located remotely from processor 10, which may be connected to the computer device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Memory 20 may include volatile memory, such as random access memory; the memory may also include non-volatile memory, such as flash memory, hard disk, or solid state disk; the memory 20 may also comprise a combination of the above types of memories.
The computer device also includes a communication interface 30 for the computer device to communicate with other devices or communication networks.
The embodiments of the present invention also provide a computer-readable storage medium. The method according to the embodiments of the present invention described above may be implemented in hardware or firmware, or as computer code that may be recorded on a storage medium, or as computer code originally stored in a remote storage medium or a non-transitory machine-readable storage medium, downloaded over a network, and stored in a local storage medium, so that the method described herein can be carried out by software stored on a storage medium using a general-purpose computer, a special-purpose processor, or programmable or special-purpose hardware. The storage medium may be a magnetic disk, an optical disk, a read-only memory, a random access memory, a flash memory, a hard disk, a solid-state disk or the like; further, the storage medium may also comprise a combination of the above types of memory. It will be appreciated that the computer, processor, microprocessor controller or programmable hardware includes a storage element that can store or receive software or computer code that, when accessed and executed by the computer, processor or hardware, implements the methods illustrated by the above embodiments.
Although embodiments of the present invention have been described in connection with the accompanying drawings, various modifications and variations may be made by those skilled in the art without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope of the invention as defined by the appended claims.

Claims (10)

1. A method for predicting server performance, the method comprising:
acquiring a log prediction model and a performance prediction model which are obtained through training; the log prediction model is used for representing the mapping relation between time and the operation log of the server; the performance prediction model is used for representing the mapping relation between the operation log and the performance data of the server;
inputting target time into the log prediction model to obtain a predicted value of the operation log; the target time is a future period of time under the current time;
inputting the predicted value of the operation log into the performance prediction model to obtain the predicted value of the performance data; the predicted value of the performance data corresponds to the target time.
2. The method of claim 1, wherein the log prediction model and the performance prediction model are trained by:
Acquiring the operation log and the performance data of the server in a preset time interval;
determining a log feature vector corresponding to the operation log; the log feature vector is obtained by extracting keywords from the operation log; the log feature vector is used for representing the occurrence frequency of each keyword in the operation log;
dividing the preset time interval into a preset number of unit time intervals;
dividing the log feature vector and the performance data into a unit log vector and unit performance data based on the unit time interval; the unit time interval, the unit log vector and the unit performance data are in one-to-one correspondence;
combining the unit time interval and the unit log vector to obtain the first data set;
combining the unit log vector and the unit performance data to obtain the second data set;
inputting the first data set into a first initial prediction model to train the first initial prediction model to obtain the log prediction model;
and inputting the second data set into a second initial prediction model to train the second initial prediction model so as to obtain the performance prediction model.
3. The method of claim 2, wherein the first initial predictive model and the second initial predictive model are used to characterize a nonlinear mapping between tag elements and feature elements; the nonlinear mapping relationship is characterized by the following mathematical expression:

Y = WX + ε

wherein Y is the tag element, Y = (y1, y2, …, yn)^T; X is the feature element, X = (x1, x2, …, xm)^T; W is a weight matrix, and ε is an error matrix;
in the first initial prediction model, the tag element Y is the unit log vector, and the characteristic element X is the unit time interval;
in the second initial prediction model, the tag element Y is the unit performance data, and the feature element X is the unit log vector.
4. The method of claim 2, wherein the determining the log feature vector corresponding to the operation log comprises:
extracting keywords from the operation log to obtain a keyword library;
determining the occurrence frequency of each keyword in the keyword library in the operation log;
and arranging the occurrence frequency according to the arrangement sequence of the keywords in the keyword library to obtain the log feature vector.
5. The method of claim 2, wherein said inputting the first dataset into the first initial predictive model to train the first initial predictive model to obtain the log predictive model comprises:
dividing the first data set into a first training data set and a first test data set;
and inputting the first training data set into the first initial prediction model to train the first initial prediction model to obtain the log prediction model.
6. The method of claim 2, wherein said inputting the second dataset into the second initial predictive model to train the second initial predictive model to obtain the performance predictive model comprises:
dividing the second data set into a second training data set and a second test data set;
and inputting the second training data set into the second initial prediction model to train the second initial prediction model to obtain the performance prediction model.
7. The method of claim 5 or 6, wherein training the log prediction model and the performance prediction model further comprises:
Taking the log prediction model as a log model to be verified, and taking the performance prediction model as a performance model to be verified;
inputting the unit time interval in a first test data set into the log model to be verified to obtain a predicted value of the unit log vector in the first test data set;
inputting the unit log vector in a second test data set into the performance model to be verified to obtain a predicted value of the unit performance data in the second test data set;
and when the difference between the value of the unit log vector in the first test data set and the predicted value is in a first error range and the difference between the value of the unit performance data in the second test data set and the predicted value is in a second error range, determining that the log model to be verified is the log prediction model, and the performance model to be verified is the performance prediction model.
8. A server performance prediction apparatus, the apparatus comprising:
the model acquisition module is used for acquiring a log prediction model and a performance prediction model which are obtained through training; the log prediction model is used for representing the mapping relation between time and the operation log of the server; the performance prediction model is used for representing the mapping relation between the operation log and the performance data of the server;
The operation log prediction module is used for inputting target time into the log prediction model to obtain a predicted value of the operation log; the target time is a future period of time under the current time;
the performance data prediction module is used for inputting the predicted value of the operation log into the performance prediction model to obtain the predicted value of the performance data; the predicted value of the performance data corresponds to the target time.
9. A computer device, comprising:
a memory and a processor in communication with each other, the memory having stored therein computer instructions which, upon execution, cause the processor to perform the method of any of claims 1 to 7.
10. A computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1 to 7.
CN202311615482.7A 2023-11-28 2023-11-28 Method, device, equipment and storage medium for predicting server performance Pending CN117785625A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311615482.7A CN117785625A (en) 2023-11-28 2023-11-28 Method, device, equipment and storage medium for predicting server performance


Publications (1)

Publication Number Publication Date
CN117785625A true CN117785625A (en) 2024-03-29

Family

ID=90393453




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination