A method and device for forecasting and predicting IT O&M indices using correlation
Technical field
The present invention relates to the field of IT operation and maintenance (O&M) management, in particular to the monitoring and management of IT O&M indices, and specifically to an intelligent index forecasting and prediction method that uses correlation.
Background technology
IT O&M management refers to the integrated management that an IT enterprise or department, using relevant methods, means, technologies, systems, processes, and documents, carries out over the IT operating environment (including the physical environment, hardware environment, and so on), the IT business systems, and the IT O&M personnel. As IT construction deepens and matures, the operation and maintenance of computer hardware and software systems has received growing attention. Because this is a new problem arising from the deepening application of computer information technology, research into how to carry out effective IT O&M management has broad prospects and great practical significance.
In brief, the content managed in IT O&M can be abstracted into indices for management and maintenance. An index is data describing a certain characteristic of an object. The management behavior of IT O&M can in essence be abstracted as changes in data. The study of IT O&M index management is therefore highly significant. The present invention proposes a method for intelligent forecasting and prediction of indices using correlation.
Intelligent forecasting and prediction is the process of raising an alarm on, or estimating, a target value without manual inspection. Examples of such intelligence are numerous: clustering algorithms from pattern recognition, applied to handwriting input methods on mobile phones or terminals, improve input efficiency; some music software automatically recommends songs, predicting from the listener's recorded history, and this heuristic approach better satisfies the listener's wishes; 360 Security Guard provides automatic forecasting for program updates and maintenance of the operating system, which can optimize the system and extend its service life.
The theory of intelligent methods is fairly mature. The intelligent theories, methods, and means in use at present mainly include: (1) adaptive theory, which is in essence a feedback theory and includes artificial neural networks that predict future data by learning from training samples; (2) pattern recognition, which achieves recognition by constructing different pattern systems; (3) optimization theory, which includes support vector machine models, ant colony algorithms, genetic algorithms, and linear and nonlinear constrained models, and which optimizes the target data through modeling; (4) modern signal processing theory and methods, including signal processing methods such as the autoregressive moving average model and filtering methods such as Wiener filtering and the Kalman filter model, which predict, smooth, or estimate future time quantities through modeling.
The present invention does not use the intelligent methods described above directly; instead, it uses correlation.
Correlation necessarily exists between some IT O&M indices. Take WLAN index monitoring as an example: the field strength and signal-to-noise ratio of the WLAN signal directly affect the network data bandwidth and even network connectivity, such as the ping packet success rate; the degree of network congestion may in turn affect the web authentication index, because when the network load is too heavy the web authentication access delay may increase. In practical scenarios, because of cost, some WLAN indices, such as field strength and signal-to-noise ratio, are not suitable for continuous monitoring, while other data can be obtained continuously through software monitoring, and a relationship exists between these two kinds of indices or among more indices. In this case, using the correlation between indices overcomes the problem that other intelligent schemes cannot predict or suffer reduced prediction accuracy, because the correlation between indices exists at all times whether or not the data are known; prediction can be achieved simply by adopting a method such as that of the present invention. In addition, when the dynamic range of an index's data is unknown, correlation can also be used to forecast whether the index exceeds its limit.
The mathematical basis of correlation is as follows.
For two vectors $x$ and $y$, the covariance between them can be expressed as

$$\operatorname{cov}(x, y) = E\big[(x - E[x])\,(y - E[y])\big] \qquad (1)$$

The cross covariances between $M$ indices $x_1, \dots, x_M$ form a matrix with $M$ rows and $M$ columns,

$$C = \big[\operatorname{cov}(x_i, x_j)\big]_{i,j = 1, \dots, M} \qquad (2)$$

The correlation coefficient is defined as

$$\rho_{xy} = \frac{\operatorname{cov}(x, y)}{\sqrt{\operatorname{cov}(x, x)\,\operatorname{cov}(y, y)}}$$

According to the properties of the correlation coefficient, if the correlation coefficient equals 0 the two vectors are uncorrelated, and its absolute value equals 1 if and only if the two vectors are linearly related. From this we infer that the closer the absolute value of the covariance is to 0, the less correlated the two indices are, and conversely the more correlated they are.
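As a minimal sketch of the definitions above, the following Python code computes the covariance, the M x M covariance matrix, and the correlation coefficient using only the standard library. The division by N (rather than N - 1) and the sample values named `bandwidth`, `delay`, and `noise` are assumptions made for illustration; they do not come from the invention.

```python
import math


def mean(v):
    """Arithmetic mean of a sequence of numbers."""
    return sum(v) / len(v)


def covariance(x, y):
    """Covariance of two equal-length vectors (assumed normalisation: divide by N)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def covariance_matrix(indices):
    """M x M matrix of cross covariances between M index vectors."""
    return [[covariance(a, b) for b in indices] for a in indices]


def correlation(x, y):
    """Correlation coefficient: covariance scaled by both standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


# Hypothetical index samples, invented for this example.
bandwidth = [10.0, 12.0, 14.0, 16.0]
delay = [8.0, 7.0, 6.0, 5.0]      # falls linearly as bandwidth rises
noise = [1.0, -1.0, -1.0, 1.0]    # unrelated to bandwidth

C = covariance_matrix([bandwidth, delay, noise])
print(correlation(bandwidth, delay))   # linearly related pair
print(correlation(bandwidth, noise))   # uncorrelated pair
```

The linearly related pair gives a coefficient of magnitude 1 and the unrelated pair gives 0, which matches the two properties stated above.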
Summary of the invention
The invention provides an intelligent forecasting and prediction method for IT O&M indices using correlation. The steps of the method are characterized as follows:
(1) Updating the data source. Training data samples and test data samples are provided, where the training data of each index are multidimensional and the test sample is one-dimensional. As time passes, test samples are merged into the historical database, so that the training sample set gradually grows.
(2) Training, which comprises two steps: data preprocessing and data calculation. After the training sample source passes through data preprocessing, spike data such as extreme maxima and minima are eliminated and a smoothing effect is achieved, providing an accurate and reasonable data source for the next step. When the preprocessed data pass through the data calculation step, a covariance matrix is obtained according to formulas (1) and (2), and the covariance fluctuation range is then calculated.
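Preferably, eigenvalue decomposition is first performed on matrix (2), denoted $C$, to obtain

$$C = U \Lambda U^{T}$$

where $U$ and $\Lambda$ are the eigenvector matrix and the diagonal matrix of eigenvalues, respectively. Then the eigenvalues with larger absolute values are retained and the small ones are rejected by setting them equal to zero, which gives $\tilde{\Lambda}$. The matrix

$$\tilde{C} = U \tilde{\Lambda} U^{T}$$

is then necessarily also a symmetric matrix, and it differs from $C$. Considering the elements of its upper triangular part, the fluctuation range is defined as follows: one bound of the fluctuation range is [expression not reproduced in this text], and the other bound is [expression not reproduced in this text]. (5)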
(3) Testing, which comprises two steps: data forecasting and data prediction. In the data forecasting step:
Preferably, the test sample is first obtained from the data source, and the fluctuation range of any two indices and the means of the $i$-th and $j$-th indices are obtained from the training module. The covariance between any two test samples is then defined by formula (6) [not reproduced in this text]. It can then be judged whether this test covariance falls within the corresponding fluctuation range, and a forecast is made accordingly.
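The forecasting check can be sketched in Python as follows. Because formula (6) is not available in this text, the sketch assumes that the covariance of a single test sample pair is the product of the two deviations from the training means; that form, the function names, and the numeric bounds in the example are all hypothetical.

```python
def single_sample_covariance(x_i, x_j, mean_i, mean_j):
    """Assumed one-sample test covariance: product of deviations from the training means."""
    return (x_i - mean_i) * (x_j - mean_j)


def within_range(value, bound_a, bound_b):
    """True if value lies between the two bounds, given in either order."""
    low, high = min(bound_a, bound_b), max(bound_a, bound_b)
    return low <= value <= high


def forecast_alarm(x_i, x_j, mean_i, mean_j, bound_a, bound_b):
    """Return True (alarm) when the test covariance leaves the fluctuation range."""
    c_test = single_sample_covariance(x_i, x_j, mean_i, mean_j)
    return not within_range(c_test, bound_a, bound_b)


# Hypothetical training results for one index pair: means 10 and 5,
# fluctuation range between -0.2 and 0.3.
print(forecast_alarm(10.2, 5.5, 10.0, 5.0, -0.2, 0.3))  # small deviation
print(forecast_alarm(14.0, 5.5, 10.0, 5.0, -0.2, 0.3))  # large deviation
```

The bounds are passed in either order so that the sketch does not depend on which of the two bounds in (5) is the larger.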
Preferably, if a certain index is known but it cannot be forecast whether it exceeds its limit, the forecasting idea is as follows: find the several indices most correlated with this index, as obtained in the training module, and examine them in order; as soon as one of them allows a forecast to be made, stop forecasting.
When index data cannot be measured, the index can be predicted.
Preferably, according to formula (6), the algorithm for accurate prediction is as follows: first find the index $j$ most correlated with the index $i$ to be measured, then find the index $k$ most correlated with $j$. It can then be taken that the test covariance equals the training covariance, where the left-hand side of the equation is the unknown test covariance and the right-hand side is the known training covariance. Three linear equations are thus solved simultaneously, and the solution $X$ of this system of equations is the prediction result.
The present invention also provides an intelligent forecasting and prediction device using correlation, comprising:
A data source module, which uses existing historical data as the initialization data of the training module; this data set should be chosen as large as possible. For continuously updated test data, each group of data is merged into the training database after it has been tested, so that the database is kept up to date.
Preferably, when the data volume reaches a certain scale, training is carried out in groups to improve test accuracy. See Fig. 1.
A training module, comprising a data preprocessing unit and a data calculation unit.
The data preprocessing unit:
Preferably, to eliminate spikes, for each index the obviously extreme sample values are first removed in the initial state and the remaining samples are kept. The arithmetic mean M of the several extremely large values and the arithmetic mean m of the several extremely small values are calculated. Each time the data are updated, if a new value falls outside M or m, it is regarded as a spike and rejected; the rejected values form a new set, from which M and m are updated. Proceeding in this way smooths the data as far as possible. See Fig. 5.
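The spike elimination described above can be sketched as a small Python class. The number of extreme values `k`, the class name `SpikeFilter`, and the choice to add each rejected value to the set of extremes on its own side before recomputing the bound are assumptions; the text does not fix these details.

```python
class SpikeFilter:
    """Rejects spikes using the means of the k largest and k smallest values."""

    def __init__(self, initial, k=2):
        if len(initial) <= 2 * k:
            raise ValueError("need more than 2*k initial samples")
        # Rank positions by value so that duplicates are handled correctly.
        order = sorted(range(len(initial)), key=lambda n: initial[n])
        low_idx, high_idx = set(order[:k]), set(order[-k:])
        self.low = [initial[n] for n in order[:k]]     # k smallest values
        self.high = [initial[n] for n in order[-k:]]   # k largest values
        # Remaining samples, kept in their original (time) order.
        self.kept = [v for n, v in enumerate(initial)
                     if n not in low_idx and n not in high_idx]
        self._update_bounds()

    def _update_bounds(self):
        self.m = sum(self.low) / len(self.low)     # lower bound m
        self.M = sum(self.high) / len(self.high)   # upper bound M

    def offer(self, value):
        """Return True if the value is kept, False if rejected as a spike."""
        if value > self.M:
            self.high.append(value)
        elif value < self.m:
            self.low.append(value)
        else:
            self.kept.append(value)
            return True
        self._update_bounds()
        return False


# Hypothetical samples of one index.
f = SpikeFilter([10.0, 9.9, 10.1, 3.0, 10.2, 17.0, 9.8, 10.0], k=2)
print(f.offer(10.05))   # ordinary value
print(f.offer(25.0))    # spike above M
```

Keeping the retained samples in time order matters because the covariance between two indices is computed over aligned samples.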
The data calculation unit:
Preferably, because the data preprocessing part removes spikes from each index, the data vectors of two indices may end up with different dimensions. The solution is that, each time a spike is removed from an index and a value is therefore missing, the value is replaced with the arithmetic mean of all preceding data of that index, so as to reduce the error when calculating the covariance matrix.
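The replacement of removed spikes can be illustrated with a short Python function. Marking a removed value with `None`, and counting earlier replacement values among the "preceding data", are assumptions made for this sketch.

```python
def fill_removed(series):
    """Replace each removed sample (None) with the mean of all preceding values."""
    filled = []
    for value in series:
        if value is None:
            if not filled:
                raise ValueError("no preceding data to average")
            value = sum(filled) / len(filled)
        filled.append(value)
    return filled


# Hypothetical index series in which the third sample was removed as a spike.
print(fill_removed([10.0, 12.0, None, 11.0]))
```

After this step every index vector has the same length again, so the covariance of any two indices is computed over the same number of samples.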
Preferably, the rule for rejecting small eigenvalues is as follows: sum the absolute values of all eigenvalues, then calculate the proportion contributed by each eigenvalue. If the proportion of an eigenvalue is below a threshold such as 0.05, its contribution is deemed too small and it may be rejected, that is, set equal to zero. The more eigenvalues are rejected, the larger the calculated fluctuation range. This embodiment is shown in Fig. 6.
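The rejection rule can be sketched in Python as below, together with the rebuilding of the truncated matrix from given eigenvectors. The eigenvectors and eigenvalues are assumed to be already available from some eigenvalue decomposition routine; the function names and the 2 x 2 example values are hypothetical.

```python
import math


def truncate_eigenvalues(eigenvalues, min_ratio=0.05):
    """Set to zero every eigenvalue whose share of the absolute sum is below min_ratio."""
    total = sum(abs(v) for v in eigenvalues)
    if total == 0:
        return list(eigenvalues)
    return [v if abs(v) / total >= min_ratio else 0.0 for v in eigenvalues]


def rebuild(U, eigenvalues):
    """Compute U * diag(eigenvalues) * U^T, with eigenvectors as the columns of U."""
    n = len(U)
    return [[sum(U[r][k] * eigenvalues[k] * U[c][k] for k in range(len(eigenvalues)))
             for c in range(n)] for r in range(n)]


# Hypothetical symmetric covariance matrix [[2.0, 1.9], [1.9, 2.0]]:
# its eigenvalues are 3.9 and 0.1, with the eigenvectors below as columns of U.
s = 1.0 / math.sqrt(2.0)
U = [[s, s], [s, -s]]
eigenvalues = [3.9, 0.1]

kept = truncate_eigenvalues(eigenvalues)   # 0.1 / 4.0 = 0.025 is below 0.05
C_truncated = rebuild(U, kept)
print(kept)
print(C_truncated)
```

In this example the truncated matrix remains symmetric and differs from the original in every element, which is the property that the fluctuation range relies on.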
A test module, comprising a data forecasting unit and a data prediction unit.
The data forecasting unit comprises discrimination and forecasting.
Discrimination: in practice, some indices have a reference range as soon as they are measured and therefore need no forecast, while other indices have no reference range when measured. The first step is therefore to distinguish whether an index needs to be forecast.
Preferably, the algorithmic principle of the forecasting module is as follows. First check whether the index most correlated with the index x to be measured is a known index lying within its known dynamic range; if not, continue searching until the first m indices meeting this requirement are found, where m can be at most the number of all indices with a known dynamic range. Let the first such index be i. Calculate the covariance cov(x, i) between index i and x. If it falls within the fluctuation range, forecast that index x does not exceed its limit. Otherwise, if it falls outside the fluctuation range, take the index j that is next most correlated with x: if cov(j, i) falls within the fluctuation range, forecast that x exceeds its limit; otherwise, the prediction by i is said to fail, i is replaced by j, and the procedure for i is repeated. If, after repeating this, all of the first m indices fail, forecast that x does not exceed its limit.
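The forecasting flow above can be sketched in Python as follows. The helper `in_range(a, b)` is hypothetical: it is assumed to report whether the test covariance of indices a and b falls within the fluctuation range obtained in training. The text does not say what happens when the last candidate has no successor; this sketch treats that candidate as failed.

```python
def forecast_exceeds(x, ranked, in_range):
    """Forecast whether index x exceeds its limit.

    ranked   -- the first m known, in-range indices, most correlated with x first
    in_range -- in_range(a, b) is True when the test covariance of a and b
                lies within the trained fluctuation range (assumed helper)
    """
    for pos, i in enumerate(ranked):
        if in_range(x, i):
            return False              # x is consistent with i: no alarm
        if pos + 1 < len(ranked):
            j = ranked[pos + 1]
            if in_range(j, i):
                return True           # i and j agree with each other, so x is off
        # Otherwise the prediction by i fails; continue with the next index.
    return False                      # every candidate failed: no alarm


# Hypothetical outcome table for pairs of index names.
results = {frozenset(("x", "i")): False, frozenset(("i", "j")): True}


def lookup(a, b):
    return results.get(frozenset((a, b)), False)


print(forecast_exceeds("x", ["i", "j"], lookup))
```

Passing the range check in as a function keeps the control flow separate from how the test covariance and the fluctuation range are actually computed.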
This unit is shown in detail in Fig. 7.
The data prediction unit is used to predict data that cannot be measured directly and is divided into discrimination and prediction. Preferably, prediction is carried out according to the equation-solving idea described in the method. See Fig. 8.
The flow chart of the whole device is shown in Fig. 4.
The intelligence of the IT O&M index forecasting and prediction method and device using correlation provided by the invention is embodied as follows: when it cannot be judged whether a known data source exceeds its limit, the data forecasting unit is used to raise alarms in the actual IT O&M system; when an index cannot be measured directly because of an unexpected failure or for other reasons, the remaining correlated index values and the data prediction unit can be used to predict the index fairly accurately.
The advantages and characteristics of the IT O&M index forecasting and prediction method and device using correlation provided by the invention are as follows: like traditional intelligent forecasting or prediction methods, it requires the two steps of training and testing, but its computational cost is much smaller and it can achieve higher accuracy.
Description of drawings
The present invention will be illustrated by way of example and with reference to the accompanying drawings, in which:
Fig. 1 shows the relationship between the number of data allocated to each group and the prediction success rate for a certain index.
Fig. 2 shows the relationship between the alarm probability of a certain test index and the magnitude of the index.
Fig. 3 shows how the predicted-value deviation rate of a certain test index varies with the magnitude of the index.
Fig. 4 is the flow chart of the device.
Fig. 5 is the flow chart of the data preprocessing unit of the training module.
Fig. 6 is the flow chart of the data calculation unit in the training module.
Fig. 7 is the flow chart of the data forecasting unit in the test module.
Fig. 8 is the flow chart of the data prediction unit in the test module.
Fig. 9 is a schematic diagram of the principle of the overall method and device system.
Embodiment
To enable the method and device of the invention to achieve the expected results and functions, and to express the idea of the method more clearly and intuitively, the invention is described below together with simulation results obtained in MATLAB.
In specific embodiment 1, refer to Fig. 1.
Suppose that, in a real scenario, a total of 20 index data sources are received and historical data are collected. Suppose the initial number of samples for each index is fixed at 1000, and let the training data source of the index to be measured be normally distributed with mean 10 and variance 0.1. Consider dividing the data into groups for processing by the training module. In theory, to ensure that the fluctuation range is calculated accurately, the number of data in each group should not be too small; at the same time, for the sake of smoothing, the number of groups should not be too small either, so there is a trade-off. The purpose of this example is to verify how to allocate the data to obtain good performance when the data source is fixed. It serves as the basis for embodiment 2.
As shown in Fig. 1, with the data under test known, two cases are set:
When the test data of the index to be measured equal 10, which is within range, the results show that when the 1000 data are divided into groups of 100 to 500, the prediction error is below 0.1. When the test data equal 14, which is outside the range, the results show that dividing the 1000 data into groups of 100 to 500 gives a relatively better prediction, with a minimum prediction error of about 0.4.
From embodiment 1, the allocation of the 1000 data between group size and number of groups is obtained: 100 per group, 10 groups in total, can be chosen as the basis for the next embodiment.
This example also shows that once a value is outside the range, its predicted value is very inaccurate. This indicates that several indices correlated with this index are also outside their ranges; because the conditions for prediction are not satisfied, this situation falls outside the scope of application of the invention.
In specific embodiment 2, refer to Fig. 2.
Suppose that, in a real scenario, the number of indices is 10 and the training data total 10000, divided into 100 groups of 100 data each. The training data source consists of random numbers between 0 and 1. The preset value of the index under test is increased from 0 in steps of 0.5 until it approaches 20. An output of 1 denotes an alarm (limit exceeded) and 0 denotes no alarm. In theory, the alarm output should be 1 when the data lie far outside the range 0 to 1, and 0 otherwise. Because of the robustness of the algorithm provided by the method, the forecasting performance is reflected, after smoothing, by the alarm probability.
As shown in Fig. 2, as the preset data (the test data to be forecast) move further away from 1, the alarm probability gradually rises until it approaches 1. In practice, the solution is to set a threshold: an alarm is raised when the alarm probability obtained for a test datum is above the threshold, and not otherwise.
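The threshold decision can be sketched in a few lines of Python. The sketch assumes that the alarm probability is the fraction of training groups whose individual forecast is 1, which the text does not state explicitly; the threshold value 0.5 and the function names are likewise hypothetical.

```python
def alarm_probability(group_alarms):
    """Fraction of groups that raised an alarm (each entry is 0 or 1)."""
    return sum(group_alarms) / len(group_alarms)


def should_alarm(group_alarms, threshold=0.5):
    """Raise the final alarm only when the alarm probability is above the threshold."""
    return alarm_probability(group_alarms) > threshold


# Hypothetical per-group forecasts for one test value.
votes = [1, 1, 0, 1]
print(alarm_probability(votes))
print(should_alarm(votes))
```

A higher threshold makes the final alarm more conservative, at the cost of missing values that only some groups flag.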
This embodiment verifies the validity of the data forecasting of the inventive method and provides a practical solution.
In specific embodiment 3, refer to Fig. 3.
Suppose that, in a real scenario, the number of indices is 20 and each index has 1000 training data. The data source of the index to be measured consists of random numbers with mean 10 and variance 0.1. The preset test data of the index to be measured are increased from 5 to 15 in steps of 0.5, and the prediction deviation rate is calculated.
As shown in Fig. 3, when the preset value is in the neighborhood of 10, the minimum prediction error is below 0.1; otherwise the prediction error grows larger and larger. This figure, like embodiment 1, shows that the prediction method provided by the invention has fairly high accuracy.