CN104021045A - CPU load multi-step prediction method based on mode fusion - Google Patents
- Publication number
- CN104021045A (application number CN201410183205.8A)
- Authority
- CN
- China
- Prior art keywords
- pattern
- modes
- data
- prediction
- length
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
A CPU load multi-step prediction method based on pattern fusion comprises the following steps. First, the time series data are divided into a set of data patterns, and the number of occurrences of each pattern is counted. Second, a filtering factor α is applied to the patterns and their counts to filter out infrequent patterns. Third, patterns that differ only slightly are merged into general trend patterns, and matching is performed against these general trend patterns; during matching, the Hamming distance measures the direction distance between patterns and the Euclidean distance measures the actual distance. Fourth, once approximate patterns have been found, multi-step prediction is performed from the values that follow those patterns, using either the average rule strategy or the uniform decline strategy. Finally, several pattern lengths are used to guide prediction, and the predicted values for the individual pattern lengths are fused with the AdaBoost algorithm to obtain the final result. The method offers high accuracy and high reliability.
Description
Technical field
The present invention relates to the technical field of server CPU load prediction, and in particular to a CPU load multi-step prediction method based on pattern fusion.
Background technology
In a distributed system, the available resources vary over time, and the scheduling system must adapt accordingly. Because resources are geographically distributed, monitoring and collecting resource data involves delays in practice, so the currently available resources are difficult to obtain in real time. Performance prediction can therefore facilitate resource management and scheduling.
Monitoring and predicting CPU load is a necessary condition for the successful operation of applications, since CPU load reflects the performance state of a machine. If the CPU load is too high, server performance degrades severely, and an abnormal server state with sustained high CPU load can cause the system to crash. Monitoring and predicting CPU load helps administrators take effective countermeasures, such as shutting down or restarting applications or upgrading hardware.
Earlier studies have shown that CPU load can be recorded as time series data and predicted with time series forecasting algorithms, but these models do not effectively support multi-step prediction. Multi-step prediction is more challenging than single-step prediction and also more useful: it reveals the trend of CPU load over a longer future horizon and gives administrators and the scheduling system enough time to handle abnormal events.
Existing multi-step prediction methods are based on iterated single-step prediction. This approach suits predictions with a small number of steps, but when the number of steps is large the prediction results drift further and further from the true values: if the first few predicted steps contain errors, the later predictions can hardly estimate accurate values.
In recent years, prediction based on pattern matching has also become popular. Such methods measure the similarity between data by computing the Euclidean distance between patterns. This kind of matching fits too closely and cannot capture the direction or trend of a pattern, so it introduces matching errors and cannot estimate the prediction result accurately.
Summary of the invention
The object of the present invention is to provide a CPU load multi-step prediction method based on pattern fusion. The invention provides a method for long-term prediction of server CPU load that offers high accuracy and high reliability.
To achieve this object, the present invention provides a CPU load multi-step prediction method based on pattern fusion, comprising the following steps:
Step 1: pattern extraction
The time series data are cut into a set of data patterns, and the number of occurrences of each data pattern is counted;
Step 2: pattern filtering
Using all the patterns and their counts obtained in step 1, patterns that seldom occur are filtered out: the patterns are sorted by count in descending order, and a filtering factor α is set such that the patterns remaining after filtering cover most of the patterns;
Step 3: pattern fusion and matching
Patterns that differ only slightly are merged into general trend patterns, and matching is performed against these general trend patterns; during matching, the Hamming distance measures the direction distance between patterns, and the Euclidean distance then measures the actual distance;
Step 4: pattern-weighted prediction
After approximate patterns have been found in step 3, multi-step prediction is performed from the values that follow these patterns, using the average rule strategy or the uniform decline strategy;
Step 5: fusion of prediction results
Several pattern lengths are used to guide prediction, and the predicted values for the individual pattern lengths are fused using the machine learning algorithm AdaBoost to obtain the final result.
According to a preferred embodiment of the pattern-fusion-based CPU load multi-step prediction method of the present invention, in step 1:
The time series data are a sequence of data $x_1, x_2, x_3, \dots, x_n$ whose elements are ordered;
A data pattern is defined as follows: given time series data, a subsequence $C_p = x_p, x_{p+1}, \dots, x_{p+w-1}$ is found in the time series data such that the subsequence occurs frequently in the historical data.
According to a preferred embodiment of the pattern-fusion-based CPU load multi-step prediction method of the present invention, in step 2 the filtering factor α satisfies the following condition:
[formula not reproduced in this text]
where $Q_i$ is the set of patterns of length i;
Number($Q_i$) is the number of patterns in $Q_i$;
Filter($Q_i$) is the number of patterns remaining after filtering.
According to a preferred embodiment of the pattern-fusion-based CPU load multi-step prediction method of the present invention, in step 3 pattern matching comprises the following steps:
Step 31: trend matching
The trend distance between patterns is computed with the Hamming distance and assessed with a parameter λ:
[formula not reproduced in this text]
where $\|X_i' - m_j'\|$ is the Hamming distance between the two patterns;
λ is the matching parameter;
$X_i'$ and $m_j'$ are the trend directions of the i-th and j-th patterns;
Step 32: Euclidean distance
The pattern matching of step 31 yields a number of similar patterns, for which the Euclidean distance is computed:
$\mathrm{dist}(i,j) = |X_i - m_j|$
where i and j are the sequence numbers of the patterns;
$X_i$ is the i-th pattern;
$m_j$ is the j-th pattern;
$|X_i - m_j|$ is the Euclidean distance between the two patterns;
Step 33: the Euclidean distances between patterns computed in step 32 are sorted, and the K patterns with the smallest Euclidean distance are selected as the closest patterns.
According to a preferred embodiment of the pattern-fusion-based CPU load multi-step prediction method of the present invention, in step 4 the average rule strategy is as follows: after approximate patterns have been found in the history, the direct weighted average of the values that follow these approximate patterns is taken as the predicted value. The prediction result is:
[formula not reproduced in this text]
where h is the number of prediction steps;
d is the number of approximate patterns found in the historical data;
n is the length of the known data;
$CP_i$ are the candidate approximate patterns found.
According to a preferred embodiment of the pattern-fusion-based CPU load multi-step prediction method of the present invention, in step 4 the uniform decline strategy is as follows: analysis shows that the correlation between data changes with the time elapsed, and more recent patterns deserve higher confidence for prediction; each pattern is therefore assigned a weight, and the weighted result is used as the predicted value. The prediction result is:
[formula not reproduced in this text]
where
d is the number of approximate patterns found in the historical data;
n is the length of the known data;
$CP_i$ are the candidate approximate patterns found;
h is the prediction length;
i is the sequence number of the pattern;
$L_i$ is the time span between the i-th pattern and the current pattern.
According to a preferred embodiment of the pattern-fusion-based CPU load multi-step prediction method of the present invention, the final result obtained in step 5 is:
[formula not reproduced in this text]
where h is the prediction length;
m is the number of pattern lengths selected;
$\alpha_i$ is the weight of the prediction result for the i-th pattern length;
n is the length of the known data.
The beneficial effects of the present invention are as follows. The invention targets the characteristics of server CPU load and performs multi-step prediction of it. Using the pattern fusion method, similar patterns are fused into a general pattern, and matching is performed against these general patterns. Matching combines the trend distance of the patterns with the Euclidean distance to find relevant approximate patterns, which then guide multi-step prediction. A weight-based synthesis computes the multi-step prediction result, yielding accurate CPU load prediction and improving the accuracy and reliability of server resource scheduling.
In contrast to approaches that rely on Euclidean distance matching alone, the present invention finds similar patterns in historical data by combining trend matching with Euclidean distance matching, which solves the problem of single-criterion matching of time series data. In addition, the weight-based synthesis proposed by the invention computes the multi-step prediction result and solves the inaccuracy of long-term, multi-step prediction. Compared with the prior art, the present invention therefore offers high prediction accuracy and high reliability.
Brief description of the drawings
Fig. 1 is a schematic diagram of the principle of the pattern-fusion-based CPU load multi-step prediction method of the present invention;
Fig. 2 is a schematic diagram of the principle of pattern fusion in the present invention;
Fig. 3 is a schematic diagram of the principle of pattern matching in the present invention.
Embodiment
The technical solutions in the embodiments of the present invention are described and discussed clearly and completely below with reference to the accompanying drawings. Obviously, the embodiments described here are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
To facilitate understanding of the embodiments of the present invention, specific embodiments are further explained below by way of example with reference to the accompanying drawings; the individual embodiments do not limit the embodiments of the present invention.
Referring to Fig. 1 to Fig. 3, a CPU load multi-step prediction method based on pattern fusion fuses similar patterns into a general pattern and performs matching against these general patterns. Matching combines the trend distance of the patterns with the Euclidean distance to find relevant approximate patterns, which then guide multi-step prediction, and a weight-based synthesis computes the multi-step prediction result. The method specifically comprises the following steps:
S1, pattern extraction
The time series data are cut into a set of patterns, and the number of occurrences of each pattern is counted.
Here:
The time series data are a sequence of data $x_1, x_2, x_3, \dots, x_n$ whose elements are ordered.
A data pattern is defined as follows: given time series data, a subsequence $C_p = x_p, x_{p+1}, \dots, x_{p+w-1}$ is found in the data such that the subsequence occurs frequently in the historical data.
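The extraction step can be illustrated with a minimal Python sketch. It assumes that a pattern is identified by the rise/fall directions of a window of length w (as in the later fusion step) and that a flat step counts as a fall; the function names `trend` and `extract_patterns` and the sample data are hypothetical and do not come from the patent.

```python
from collections import Counter

def trend(window):
    """Map a window of values to a tuple of directions: 1 for a rise, -1 otherwise.
    Assumption: a flat step (equal values) is treated as a fall."""
    return tuple(1 if b > a else -1 for a, b in zip(window, window[1:]))

def extract_patterns(series, w):
    """Slide a window of length w over the series and count each trend pattern."""
    counts = Counter()
    for p in range(len(series) - w + 1):
        counts[trend(series[p:p + w])] += 1
    return counts

# Hypothetical CPU load samples, for illustration only.
load = [3, 5, 8, 6, 7, 9, 8, 9, 11]
counts = extract_patterns(load, w=3)
for pattern, n in counts.most_common():
    print(pattern, n)
```

Counting direction tuples rather than raw values is a design choice of this sketch: raw CPU load windows rarely repeat exactly, whereas direction sequences do.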
S2, pattern filtering
Using all the patterns and their counts obtained in S1, patterns that seldom occur are filtered out. To filter, the patterns are sorted by count in descending order, and a filtering factor α is set such that the patterns remaining after filtering cover most of the patterns.
[formula not reproduced in this text]
where $Q_i$ is the set of patterns of length i;
Number($Q_i$) is the number of patterns in $Q_i$;
Filter($Q_i$) is the number of patterns remaining after filtering;
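One possible reading of the filtering step is sketched below in Python. Because the condition on α is not reproduced in this text, the sketch assumes that the most frequent patterns are kept until they account for at least a share α of all pattern occurrences; that interpretation, the function name `filter_patterns`, and the sample counts are assumptions, not the patent's definition.

```python
def filter_patterns(counts, alpha):
    """Keep the most frequent patterns until they cover a share alpha of all occurrences.
    Assumption: 'coverage' is measured on occurrence counts."""
    total = sum(counts.values())
    kept, covered = {}, 0
    # Sort by count in descending order, as the text describes.
    for pattern, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        if covered >= alpha * total:
            break
        kept[pattern] = n
        covered += n
    return kept

# Hypothetical pattern counts, for illustration only.
counts = {"up-up": 50, "up-down": 30, "down-up": 15, "down-down": 5}
kept = filter_patterns(counts, alpha=0.9)
print(sorted(kept))
```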
S3, pattern fusion and matching
Patterns that differ only slightly are merged into general trend patterns, and matching is performed against these general trend patterns; during matching, the Hamming distance measures the direction distance between patterns, and the Euclidean distance then measures the actual distance;
Because the direction of a pattern can be expressed as rises and falls, denoted 1 and -1, a trend can be expressed as a combination of 1 and -1. Some patterns then differ only slightly, and such patterns can be merged into general trend patterns. Consider two patterns A = [3, 5, 8, 10, 11, 14, 18, 21, 22] and B = [2, 5, 6, 9, 8, 12, 15, 17, 20]. As Fig. 2 shows, the two patterns are very similar and their trend directions are very close; they can be expressed as [↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑] and [↑ ↑ ↑ ↓ ↑ ↑ ↑ ↑], respectively. They are therefore fused into one common pattern [↑ ↑ ↑ * ↑ ↑ ↑ ↑].
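The worked example with A and B can be reproduced with a short Python sketch. The encoding of a rise as 1, a fall as -1, and a disagreement as the wildcard "*" follows the text; the function names `trend` and `fuse` are hypothetical.

```python
def trend(values):
    """Direction of each step: 1 for a rise, -1 otherwise."""
    return [1 if b > a else -1 for a, b in zip(values, values[1:])]

def fuse(t1, t2):
    """Keep positions where the directions agree; mark disagreements with '*'."""
    return [a if a == b else "*" for a, b in zip(t1, t2)]

A = [3, 5, 8, 10, 11, 14, 18, 21, 22]
B = [2, 5, 6, 9, 8, 12, 15, 17, 20]
fused = fuse(trend(A), trend(B))
print(fused)  # [1, 1, 1, '*', 1, 1, 1, 1]
```

The only disagreement is the fourth step, where A rises from 10 to 11 while B falls from 9 to 8, which matches the fused pattern given in the text.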
Pattern matching specifically comprises the following steps:
S31, trend matching: the trend distance between patterns is computed with the Hamming distance; to assess similarity better, a parameter λ is used:
[formula not reproduced in this text]
where $\|X_i' - m_j'\|$ is the Hamming distance between the two patterns;
λ is the matching parameter;
$X_i'$ and $m_j'$ are the trend directions of the i-th and j-th patterns;
S32, Euclidean distance: the pattern matching of step S31 yields a number of similar patterns, for which the Euclidean distance is then computed:
$\mathrm{dist}(i,j) = |X_i - m_j|$
where i and j are the sequence numbers of the patterns;
$X_i$ is the i-th pattern;
$m_j$ is the j-th pattern;
$|X_i - m_j|$ is the Euclidean distance between the two patterns;
S33, the K patterns with the smallest distance are selected as the closest patterns.
The Euclidean distances between patterns computed in step S32 are sorted, and the K patterns with the smallest Euclidean distance are selected as the closest patterns. K is a user-defined parameter chosen according to the data; it is generally set to the value that makes the prediction most accurate.
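The two-stage matching can be sketched in Python as follows. Since the trend-matching formula is not reproduced in this text, the sketch assumes that a candidate passes the first stage when its Hamming distance to the current trend is at most λ; that threshold rule, the function names, and the sample data are assumptions.

```python
import math

def trend(values):
    """Direction of each step: 1 for a rise, -1 otherwise."""
    return [1 if b > a else -1 for a, b in zip(values, values[1:])]

def hamming(t1, t2):
    """Number of positions at which two trend sequences differ."""
    return sum(a != b for a, b in zip(t1, t2))

def euclidean(x, m):
    """Euclidean distance between two equal-length patterns."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, m)))

def match(current, candidates, lam, k):
    """Stage 1: keep candidates whose trend is within lam of the current trend
    (assumed threshold rule). Stage 2: return the k closest by Euclidean distance."""
    cur_trend = trend(current)
    close = [c for c in candidates if hamming(cur_trend, trend(c)) <= lam]
    return sorted(close, key=lambda c: euclidean(current, c))[:k]

# Hypothetical data, for illustration only.
current = [3, 5, 8, 10]
candidates = [[2, 5, 6, 9], [4, 6, 9, 11], [9, 7, 5, 3], [3, 5, 7, 12]]
best = match(current, candidates, lam=1, k=2)
print(best)  # [[4, 6, 9, 11], [3, 5, 7, 12]]
```

Filtering by trend first discards the falling candidate [9, 7, 5, 3] before any Euclidean distance is computed, so only candidates with a compatible direction compete in the second stage.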
S4, pattern-weighted prediction
After approximate patterns have been found in step S3, multi-step prediction is performed from the values that follow these patterns.
When there are many approximate patterns, a strategy is needed to combine them, as shown in Fig. 3. The present invention performs multi-step prediction from the values that follow these patterns using the average rule strategy or the uniform decline strategy, where:
The average rule strategy is as follows: after approximate patterns have been found in the history, the direct weighted average of the values that follow these approximate patterns is taken as the predicted value. The prediction result is:
[formula not reproduced in this text]
where h is the number of prediction steps;
d is the number of approximate patterns found in the historical data;
n is the length of the known data;
$CP_i$ are the candidate approximate patterns found.
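A minimal Python sketch of the averaging idea follows. The prediction formula is not reproduced in this text, so the sketch assumes equal weights: for each of the h steps, it averages the values that follow each matched pattern. The function name `average_forecast`, the representation of matches as start indices, and the sample series are hypothetical.

```python
def average_forecast(series, starts, w, h):
    """Average, step by step, the h values that follow each matched pattern.
    starts: start indices of matched patterns of length w (assumed representation)."""
    followers = [series[s + w:s + w + h] for s in starts if s + w + h <= len(series)]
    if not followers:
        raise ValueError("no matched pattern has h successor values")
    d = len(followers)  # number of usable approximate patterns
    return [sum(f[step] for f in followers) / d for step in range(h)]

# Hypothetical series: the pattern [1, 2, 3] occurs at indices 0, 5 and 10.
series = [1, 2, 3, 4, 5, 1, 2, 3, 6, 7, 1, 2, 3]
forecast = average_forecast(series, starts=[0, 5], w=3, h=2)
print(forecast)  # [5.0, 6.0]
```

Because all h steps are read directly from history, no predicted value is fed back as input, which avoids the error accumulation of iterated single-step prediction.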
The uniform decline strategy is as follows: analysis shows that the correlation between data changes with the time elapsed; more recent patterns deserve higher confidence and are better guides for prediction. Each pattern is therefore given a weight in the predicted value, and the size of the weight varies with the time elapsed. The prediction result is:
[formula not reproduced in this text]
where
d is the number of approximate patterns found in the historical data;
n is the length of the known data;
$CP_i$ are the candidate approximate patterns found;
h is the prediction length;
i is the sequence number of the pattern;
$L_i$ is the time span between the i-th pattern and the current pattern.
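The recency weighting can be sketched in Python as below. The weighting formula is not reproduced in this text, so the default weight 1 / L_i is purely an assumption chosen to make recent patterns count more; the function name `decayed_forecast` and the sample series are also hypothetical.

```python
def decayed_forecast(series, starts, w, h, weight=lambda gap: 1.0 / gap):
    """Weighted average of the successor values of matched patterns.
    Assumption: the weight of a pattern is 1 / L_i, where L_i is the distance
    from the end of the matched pattern to the end of the known data."""
    n = len(series)
    total = 0.0
    sums = [0.0] * h
    for s in starts:
        if s + w + h > n:
            continue  # not enough successor values for this match
        gap = n - (s + w)  # L_i; at least h, so never zero
        wt = weight(gap)
        total += wt
        for step in range(h):
            sums[step] += wt * series[s + w + step]
    if total == 0:
        raise ValueError("no matched pattern has h successor values")
    return [v / total for v in sums]

# Hypothetical series: the match at index 5 is more recent than the one at 0.
series = [1, 2, 3, 4, 5, 1, 2, 3, 6, 7, 1, 2, 3]
forecast = decayed_forecast(series, starts=[0, 5], w=3, h=2)
print(forecast)
```

With these weights the forecast is pulled toward the successors of the more recent match (6 and 7) rather than sitting at the plain average of 5 and 6.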
S5, fusion of prediction results
Several pattern lengths are used to guide prediction, and the predicted values for the individual pattern lengths are fused using the machine learning algorithm AdaBoost to obtain the final result.
For given time series data, it is very difficult to find a single suitable pattern length. Several pattern lengths are therefore used to guide prediction, and the predicted values for the individual pattern lengths are fused using the machine learning algorithm AdaBoost, giving the final result as follows:
[formula not reproduced in this text]
where h is the prediction length;
m is the number of pattern lengths selected;
$\alpha_i$ is the weight of the prediction result for the i-th pattern length;
n is the length of the known data.
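The final combination can be illustrated with a small Python sketch. It takes the per-length weights as given inputs and does not implement AdaBoost itself; how the weights are learned, the normalisation by their sum, the function name `fuse_forecasts`, and the sample numbers are all assumptions.

```python
def fuse_forecasts(forecasts, weights):
    """Combine h-step forecasts from m pattern lengths with one weight per length.
    Assumption: weights are normalised by their sum; they are supplied by the
    caller (for example, learned by a boosting procedure that is not shown here)."""
    if len(forecasts) != len(weights):
        raise ValueError("one weight per forecast is required")
    total = sum(weights)
    h = len(forecasts[0])
    return [sum(wt * f[step] for wt, f in zip(weights, forecasts)) / total
            for step in range(h)]

# Hypothetical forecasts for two pattern lengths, each covering h = 2 steps.
forecasts = [[5.0, 6.0], [7.0, 8.0]]
fused = fuse_forecasts(forecasts, weights=[3.0, 1.0])
print(fused)  # [5.5, 6.5]
```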
The present invention targets the characteristics of server CPU load and performs multi-step prediction of it. Using the pattern fusion method, similar patterns are fused into a general pattern, and matching is performed against these general patterns. Matching combines the trend distance of the patterns with the Euclidean distance to find relevant approximate patterns, which then guide multi-step prediction. A weight-based synthesis computes the multi-step prediction result, yielding accurate CPU load prediction and improving the accuracy and reliability of server resource scheduling.
The above discloses only several specific embodiments of the present invention, but the present invention is not limited to them; any variation that a person skilled in the art can conceive of shall fall within the protection scope of the present invention.
Claims (7)
1. A CPU load multi-step prediction method based on pattern fusion, characterized in that it comprises the following steps:
Step 1: pattern extraction
The time series data are cut into a set of data patterns, and the number of occurrences of each data pattern is counted;
Step 2: pattern filtering
Using all the patterns and their counts obtained in step 1, patterns that seldom occur are filtered out: the patterns are sorted by count in descending order, and a filtering factor α is set such that the patterns remaining after filtering cover most of the patterns;
Step 3: pattern fusion and matching
Patterns that differ only slightly are merged into general trend patterns, and matching is performed against these general trend patterns; during matching, the Hamming distance measures the direction distance between patterns, and the Euclidean distance then measures the actual distance;
Step 4: pattern-weighted prediction
After approximate patterns have been found in step 3, multi-step prediction is performed from the values that follow these patterns, using the average rule strategy or the uniform decline strategy;
Step 5: fusion of prediction results
Several pattern lengths are used to guide prediction, and the predicted values for the individual pattern lengths are fused using the machine learning algorithm AdaBoost to obtain the final result.
2. The CPU load multi-step prediction method based on pattern fusion according to claim 1, characterized in that, in step 1:
The time series data are a sequence of data $x_1, x_2, x_3, \dots, x_n$ whose elements are ordered;
The data pattern is defined as follows: given time series data, a subsequence $C_p = x_p, x_{p+1}, \dots, x_{p+w-1}$ is found in the time series data such that the subsequence occurs frequently in the historical data.
3. The CPU load multi-step prediction method based on pattern fusion according to claim 2, characterized in that, in step 2, the filtering factor α satisfies the following condition:
[formula not reproduced in this text]
where $Q_i$ is the set of patterns of length i;
Number($Q_i$) is the number of patterns in $Q_i$;
Filter($Q_i$) is the number of patterns remaining after filtering.
4. The CPU load multi-step prediction method based on pattern fusion according to claim 2, characterized in that, in step 3, pattern matching comprises the following steps:
Step 31: trend matching
The trend distance between patterns is computed with the Hamming distance and assessed with a parameter λ:
[formula not reproduced in this text]
where $\|X_i' - m_j'\|$ is the Hamming distance between the two patterns;
λ is the matching parameter;
$X_i'$ and $m_j'$ are the trend directions of the i-th and j-th patterns;
Step 32: Euclidean distance
The pattern matching of step 31 yields a number of similar patterns, for which the Euclidean distance is computed:
$\mathrm{dist}(i,j) = |X_i - m_j|$
where i and j are the sequence numbers of the patterns;
$X_i$ is the i-th pattern;
$m_j$ is the j-th pattern;
$|X_i - m_j|$ is the Euclidean distance between the two patterns;
Step 33: the Euclidean distances between patterns computed in step 32 are sorted, and the K patterns with the smallest Euclidean distance are selected as the closest patterns.
5. The CPU load multi-step prediction method based on pattern fusion according to claim 4, characterized in that the average rule strategy in step 4 is as follows: after approximate patterns have been found in the history, the direct weighted average of the values that follow these approximate patterns is taken as the predicted value, and the prediction result is:
[formula not reproduced in this text]
where h is the number of prediction steps;
d is the number of approximate patterns found in the historical data;
n is the length of the known data;
$CP_i$ are the candidate approximate patterns found.
6. The CPU load multi-step prediction method based on pattern fusion according to claim 4, characterized in that the uniform decline strategy in step 4 is as follows: analysis shows that the correlation between data changes with the time elapsed, and more recent patterns deserve higher confidence for prediction; each pattern is assigned a weight, the weighted result is used as the predicted value, and the prediction result is:
[formula not reproduced in this text]
where
d is the number of approximate patterns found in the historical data;
n is the length of the known data;
$CP_i$ are the candidate approximate patterns found;
h is the prediction length;
i is the sequence number of the pattern;
$L_i$ is the time span between the i-th pattern and the current pattern.
7. The CPU load multi-step prediction method based on pattern fusion according to claim 5 or 6, characterized in that the final result obtained in step 5 is:
[formula not reproduced in this text]
where h is the prediction length;
m is the number of pattern lengths selected;
$\alpha_i$ is the weight of the prediction result for the i-th pattern length;
n is the length of the known data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410183205.8A CN104021045A (en) | 2014-05-04 | 2014-05-04 | CPU load multi-step prediction method based on mode fusion |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104021045A true CN104021045A (en) | 2014-09-03 |
Family
ID=51437813
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410183205.8A Pending CN104021045A (en) | 2014-05-04 | 2014-05-04 | CPU load multi-step prediction method based on mode fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104021045A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102662759A (en) * | 2012-03-20 | 2012-09-12 | 浪潮电子信息产业股份有限公司 | Energy saving method based on CPU (central processing unit) load in cloud OS (operating system) |
CN103076870A (en) * | 2013-01-08 | 2013-05-01 | 北京邮电大学 | Application fusing scheduling and resource dynamic configuring method of energy consumption drive in data center |
Non-Patent Citations (1)
Title |
---|
DINGYU YANG ETC.: "A pattern fusion model for multi-step-ahead CPU load prediction", 《THE JOURNAL OF SYSTEMS AND SOFTWARE》 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107832265A (en) * | 2017-10-17 | 2018-03-23 | 上海交通大学 | The cpu load Forecasting Methodology of desktop based on state aware |
CN109347536A (en) * | 2018-09-11 | 2019-02-15 | 中国空间技术研究院 | A kind of spatial network monitoring resource condition system based on situation knowledge |
CN109347536B (en) * | 2018-09-11 | 2021-03-26 | 中国空间技术研究院 | Spatial network resource state monitoring system based on situation knowledge |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021179572A1 (en) | Operation and maintenance system anomaly index detection model optimization method and apparatus, and storage medium | |
CN109587713B (en) | Network index prediction method and device based on ARIMA model and storage medium | |
CN111064614B (en) | Fault root cause positioning method, device, equipment and storage medium | |
CN109242135B (en) | Model operation method, device and business server | |
CN111160617B (en) | Power daily load prediction method and device | |
CN105871634B (en) | Detect the method for cluster exception and the system of application, management cluster | |
CN106598822B (en) | A kind of abnormal deviation data examination method and device for Capacity Assessment | |
CN110444011B (en) | Traffic flow peak identification method and device, electronic equipment and storage medium | |
CN108074015B (en) | Ultra-short-term prediction method and system for wind power | |
CN112925608A (en) | Intelligent capacity expansion and contraction method, device and equipment based on machine learning and storage medium | |
CN102521080A (en) | Computer data recovery method for electricity-consumption information collecting system for power consumers | |
CN112800231A (en) | Power data verification method and device, computer equipment and storage medium | |
CN113049963A (en) | Lithium battery pack consistency detection method and device based on local outlier factors | |
CN116307215A (en) | Load prediction method, device, equipment and storage medium of power system | |
CN112113581A (en) | Abnormal step counting identification method, step counting method, device, equipment and medium | |
Li et al. | Multilinear-trend fuzzy information granule-based short-term forecasting for time series | |
CN103646670A (en) | Method and device for evaluating performances of storage system | |
US20120317536A1 (en) | Predicting performance of a software project | |
CN104021045A (en) | CPU load multi-step prediction method based on mode fusion | |
CN109816165A (en) | Wind-powered electricity generation ultra-short term power forecasting method and system | |
CN109840353A (en) | Lithium ion battery dual factors inconsistency prediction technique and device | |
CN105868918A (en) | Similarity index computing method of harmonic current type monitoring sample | |
KR102059112B1 (en) | IoT STREAM DATA QUALITY MEASUREMENT INDICATORS AND PROFILING METHOD FOR INTERNET OF THINGS AND SYSTEM THEREFORE | |
CN106487570B (en) | A kind of method and apparatus for assessing network performance index variation tendency | |
CN106683001A (en) | Thermal power plant set identification data selection method based on historical operation data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20140903 |