CN107220483A - A pattern prediction method for multivariate time-series data
- Publication number
- CN107220483A (application number CN201710324105.6A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16Z—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS, NOT OTHERWISE PROVIDED FOR
- G16Z99/00—Subject matter not provided for in other main groups of this subclass
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a pattern prediction method for multivariate time-series data, comprising three stages. First, a candidate interest pattern set is found in the time series formed by each condition variable and by the decision variable, and each candidate interest pattern set is clustered separately. Second, prediction rules between the condition variables and the decision variable are generated. Finally, stage one is applied to the condition variables of the data under test, and the resulting interest patterns are matched against the prediction rules of stage two; if the antecedent of a prediction rule is satisfied, the prediction result for the decision variable is output. The method requires little computation and effectively reduces the time complexity of pattern prediction, addressing the excessive time complexity of conventional methods.
Description
Technical field
The invention belongs to the computer field, in particular to data mining, and specifically relates to a pattern prediction method for multivariate time-series data.
Background
Time-series forecasting is an important research direction in fields such as weather forecasting and stock markets. One of its most important tasks is to predict the behavior of some variables from the trends of other variables, which is called multivariate time-series forecasting. For example, when two correlated variables are considered in weather forecasting, we may want to know whether a 10% rise in temperature affects the trend of humidity.
Methods for multivariate forecasting can be broadly divided into mathematical methods and artificial-intelligence methods. Mathematical methods such as ARIMA (Autoregressive Integrated Moving Average model, which converts a non-stationary time series into a stationary one and then regresses the dependent variable only on its own lagged values and on the present and lagged values of the random error term) or exponential smoothing are unreliable when handling nonlinear, irregular real-world data. Artificial neural networks, support vector machines and k-nearest neighbors are machine learning methods that have been applied to time-series forecasting. However, because many time variables shift and stretch over time, these traditional methods fail. One solution is to consider the behavior of a sequence rather than individual variable values; for example, some methods perform pattern prediction in time-series analysis. These methods all assume a data representation and then try to find the most frequent patterns. The main problems of these solutions are as follows. First, the data representation in these methods does not reduce the dimensionality of the data, especially for high-dimensional data, and the data must additionally be processed with methods such as clustering, which raises the time complexity. Second, these methods cannot explain the output rules and relationships. Reducing the time complexity and explaining the output rules and relationships therefore still need to be solved effectively.
Summary of the invention
In view of the defects and deficiencies of the prior art, the object of the invention is to provide a pattern prediction method for multivariate time-series data that solves the high time complexity of existing data processing methods.
To achieve this object, the invention adopts the following technical solution:
A pattern prediction method for multivariate time-series data, comprising the following steps:
Stage one: find a candidate interest pattern set in the time series formed by each condition variable and by the decision variable, and cluster each candidate interest pattern set separately;
Step 1: Find the candidate interest pattern set;
Step 1.1: Find a usable initial subsequence
For a time series $S=\{s_1,\dots,s_l\}$, starting from $s_1$, search in order for two adjacent time-series values whose slope $m_1\neq 0$, and take the first such pair as the initial subsequence $S_i=\{s_i,s_{i+1}\}$, where $i=1,2,\dots,l-1$ and $l$ is the length of the time series. The slope $m_1$ is calculated as:
$$m_1=\operatorname{sgn}(s_{i+1}-s_i)=\begin{cases}1, & s_{i+1}>s_i\\ 0, & s_{i+1}=s_i\\ -1, & s_{i+1}<s_i\end{cases}\qquad(1)$$
Step 1.2: Calculate the slope of adjacent time-series values
Append the next value $s_{i+2}$ to the usable initial subsequence and calculate the slope $m_2$ between $s_{i+2}$ and $s_{i+1}$;
Step 1.3: Obtain an interest pattern
If $m_2\neq m_1$, the interest pattern $p_\alpha=\{s_i,s_{i+1},s_{i+2}\}$ is obtained;
If $m_2=m_1$, continue with step 1.2 until $m_k\neq m_1$, obtaining the interest pattern $p_\alpha=\{s_i,s_{i+1},\dots,s_{i+k}\}$, where $m_k$ is the slope between $s_{i+k}$ and $s_{i+k-1}$, $k=1,2,\dots,l-2$;
Step 1.4: Obtain the candidate interest pattern set
For the time series $S=\{s_1,\dots,s_l\}$, starting from the last time-series value of interest pattern $p_\alpha$, repeat steps 1.1 to 1.3 until all interest patterns in the whole time series have been found, forming the candidate interest pattern set $P_c=\{p_1,p_2,\dots,p_\alpha,\dots,p_\beta,\dots,p_n\}$;
Step 2: Cluster the candidate interest patterns;
Step 2.1: Using pruning rules ① and ② below, set the distance of every pattern pair that satisfies a rule condition to infinity;
Pruning rule ①: if two interest patterns $p_\alpha,p_\beta$ in the candidate interest pattern set $P_c$ do not both appear in the same region of width $w_s$, set $D_{\alpha\beta}$ in the distance matrix $D$ to infinity. Here $w_s$ is a user-specified parameter, $D$ is the distance matrix of the interest patterns, $D_{\alpha\beta}=d_{\alpha\beta}(p_\alpha,p_\beta)$, and $d_{\alpha\beta}$ is the Euclidean distance between $p_\alpha$ and $p_\beta$;
Pruning rule ②: if the slope of interest pattern $p_\alpha$ is negative and the slope of interest pattern $p_\beta$ is positive, set $D_{\alpha\beta}$ in the distance matrix $D$ to infinity;
Step 2.2: For the elements $D_{\alpha\beta}$ of the distance matrix that are not infinite, calculate the distance and assign it to the corresponding position in the distance matrix;
Step 2.3: Compare $d_{\alpha\beta}(p_\alpha,p_\beta)$ with the user-specified $d_{\min}$; if $d_{\alpha\beta}\le d_{\min}$, delete from $P_c$ whichever of $p_\alpha$ and $p_\beta$ contains fewer time-series values, finally obtaining the new interest pattern set $P$;
Here $d_{\min}$ takes a value between the minimum and the maximum Euclidean distance between two adjacent time-series values, as specified by the user;
Stage two: Generate prediction rules
Step 3: Compute association rules with the Apriori algorithm
Merge the interest pattern sets $P$ of all time variables to obtain $P_{all}$, and mine association rules over the interest patterns in $P_{all}$ with the Apriori algorithm, obtaining multiple association rules between different time variables;
Step 4: Generate prediction rules
① Multiple association rules with $m(p_{vM})=m(p')=1$ are merged to form the following prediction rule:
If $A_1\le V(p_{v1})\le B_1$, and $A_2\le V(p_{v2})\le B_2$, ..., and $A_j\le V(p_{vj})\le B_j$, ..., and $A_\lambda\le V(p_{v\lambda})\le B_\lambda$, then $C_1\le V(p')\le C_2$, with a delay of $\Delta T_1$ time units;
where $p_{vj}$ is an interest pattern formed by a condition variable, $j=1,2,\dots,\lambda$, $\lambda\ge 1$, $\lambda$ is the number of condition variables, and $p'$ is an interest pattern formed by the decision variable;
$m(p_{vM})$ is the slope between $s_L$ and $s_1$, $m(p_{vM})=\operatorname{sgn}(s_L-s_1)$, where $p_{vM}$ is the interest pattern among the condition variables that has the greatest influence on the decision variable, $s_L$ is the last time-series value in $p_{vM}$, and $s_1$ is the first time-series value in $p_{vM}$; $m(p')$ is the slope between $s'_L$ and $s'_1$, where $s'_L$ is the last time-series value in $p'$ and $s'_1$ is the first time-series value in $p'$;
$A_j$ and $B_j$ are respectively the minimum and maximum of $V(p_{vj})$ over the interest pattern association rules with $m(p_{vM})=m(p')=1$; $C_1$ and $C_2$ are respectively the minimum and maximum of $V(p')$ over the same rules; $A_j$, $B_j$, $C_1$ and $C_2$ are positive numbers;
$V(p_{vj})$ is the variation of interest pattern $p_{vj}$ in a condition variable, and $V(p')$ is the variation of interest pattern $p'$ in the decision variable:
$V(p_{vj})=(\max(p_{vj})-\min(p_{vj}))\times m(p_{vj})$
$V(p')=(\max(p')-\min(p'))\times m(p')$
$\max(p_{vj})$ and $\min(p_{vj})$ denote respectively the largest and smallest time-series values of interest pattern $p_{vj}$;
The time delay is $\Delta T_1=\max(\Delta(r_g))$, with $\Delta(r_g)=I_{p_{vM}}-I_{p'}$, where $I_{p_{vM}}$ is the start time of $p_{vM}$ and $I_{p'}$ is the start time of $p'$;
② Multiple association rules with $m(p_{vM})=m(p')=-1$ are merged to form the following prediction rule:
If $E_1\le V(p_{v1})\le F_1$, and $E_2\le V(p_{v2})\le F_2$, ..., and $E_j\le V(p_{vj})\le F_j$, ..., and $E_\eta\le V(p_{v\eta})\le F_\eta$, then $G_1\le V(p')\le G_2$, with a delay of $\Delta T_2$ time units;
where $E_j$ and $F_j$ are respectively the minimum and maximum of $V(p_{vj})$ over the interest pattern association rules with $m(p_{vM})=m(p')=-1$; $G_1$ and $G_2$ are respectively the minimum and maximum of $V(p')$ over the same rules; $j=1,2,\dots,\eta$, $\eta\ge 1$, $\eta$ is the number of condition variables; $E_j$, $F_j$, $G_1$ and $G_2$ are negative numbers, and $j$ is a natural number; $\Delta T_2=\max(\Delta(r_g))$;
Stage three: Apply stage one to the condition variables of the data under test and match the resulting interest patterns against the prediction rules of stage two; if the antecedent of a prediction rule is satisfied, output the prediction result for the decision variable.
Further, the candidate interest pattern clustering method in step 2 can be replaced with the following method:
Step 2.1: Build the minimum bounding rectangle (MBR) of each pattern in the candidate pattern set using an R-tree, forming the data structure of the patterns and obtaining an index of the patterns;
Step 2.2: For each pair of child nodes i and j in the R-tree data structure, use pruning rules ① and ② below to set the distance of every pattern pair that satisfies a rule condition to infinity.
Pruning rule ①: if two interest patterns $p_\alpha,p_\beta$ do not both appear in the same region of width $w_s$, set $D_{\alpha\beta}$ in the distance matrix to infinity. Here $w_s$ is a user-specified parameter, $D$ is the distance matrix of the interest patterns, $D_{\alpha\beta}=d_{\alpha\beta}(p_\alpha,p_\beta)$, and $d_{\alpha\beta}$ is the Euclidean distance between $p_\alpha$ and $p_\beta$;
Pruning rule ②: if the slope of interest pattern $p_\alpha$ is negative and the slope of interest pattern $p_\beta$ is positive, set $D_{\alpha\beta}$ in the distance matrix $D$ to infinity;
Step 2.3: For the elements $D_{\alpha\beta}$ of the distance matrix $D$ that are not infinite, calculate the Euclidean distance and assign it to the corresponding position in the distance matrix;
Step 2.4: Compare $d_{\alpha\beta}(p_\alpha,p_\beta)$ with the user-specified $d_{\min}$; if $d_{\alpha\beta}\le d_{\min}$, delete from $P_c$ whichever of $p_\alpha$ and $p_\beta$ contains fewer time-series values, finally obtaining the new interest pattern set $P$. Here $d_{\min}$ takes a value between the minimum and the maximum Euclidean distance between two adjacent time-series values, as specified by the user.
Compared with the prior art, the beneficial effects of the invention are as follows: the pattern prediction method for multivariate time-series data requires little computation and effectively reduces the time complexity of pattern prediction, solving the excessive time complexity of conventional methods.
Brief description of the drawings
Fig. 1 is the timing diagram of air temperature and rammed-earth temperature.
Fig. 2 is the variable relationship diagram of air temperature and rammed-earth temperature.
Fig. 3 shows the MBR between patterns.
Fig. 4 is the R-tree-based data structure proposed by the invention for retrieving candidate patterns.
Fig. 5 plots the time for distance-matrix computation under the Euclidean distance measure, with and without the pruning strategy, against the number of sequences.
Fig. 6 compares the time for Euclidean distance computation using the pruning rules alone with the time for distance-matrix computation using pruning combined with an R-tree.
Fig. 7 is the performance evaluation of the six rules generated in the embodiment.
The invention is explained in further detail below with reference to embodiments.
Detailed description of the embodiments
In the invention, a condition variable is a variable that can be used to predict other variables, and a decision variable is a variable that can be predicted from other variables.
The procedure of the invention is divided into three stages:
Stage one: find a candidate interest pattern set in the time series formed by each condition variable and by the decision variable, and cluster each candidate interest pattern set separately; Stage two: generate prediction rules; Stage three: predict the data under test according to the prediction rules generated in stage two.
Stages one and two are the training stage: prediction rules are obtained by running stages one and two on existing data. Stage three applies to the data under test: the data pass through stage one, and the resulting interest patterns are matched against the prediction rules of stage two; if the antecedent of a prediction rule is satisfied, the prediction result for the decision variable is output.
In stage one, interest patterns are found and the behavior of the data is summarized: for changes in the data, patterns with positive slope and patterns with negative slope are found. Because sequence data may contain repeated patterns, the algorithm clusters and groups these patterns. Specifically:
Step 1: Find the candidate interest pattern set
Step 1.1: Find a usable initial subsequence
For a time series $S=\{s_1,\dots,s_l\}$, starting from $s_1$, search in order for two adjacent time-series values whose slope $m_1\neq 0$, and take the first such pair as the initial subsequence $S_i=\{s_i,s_{i+1}\}$, where $i=1,2,\dots,l-1$ and $l$ is the length of the time series. The slope $m_1$ is calculated as:
$$m_1=\operatorname{sgn}(s_{i+1}-s_i)=\begin{cases}1, & s_{i+1}>s_i\\ 0, & s_{i+1}=s_i\\ -1, & s_{i+1}<s_i\end{cases}\qquad(1)$$
Step 1.2: Calculate the slope of adjacent time-series values
Append the next value $s_{i+2}$ to the usable initial subsequence and calculate the slope $m_2$ between $s_{i+2}$ and $s_{i+1}$;
Step 1.3: Obtain an interest pattern
If $m_2\neq m_1$, the interest pattern $p_\alpha=\{s_i,s_{i+1},s_{i+2}\}$ is obtained;
If $m_2=m_1$, continue with step 1.2 until $m_k\neq m_1$, obtaining the interest pattern $p_\alpha=\{s_i,s_{i+1},\dots,s_{i+k}\}$, where $m_k$ is the slope between $s_{i+k}$ and $s_{i+k-1}$, $k=1,2,\dots,l-2$;
Step 1.4: Obtain the candidate interest pattern set
For the time series $S=\{s_1,\dots,s_l\}$, starting from the last time-series value of interest pattern $p_\alpha$ (i.e. $s_{i+k}$), repeat steps 1.1 to 1.3 until the $n$ interest patterns of the whole time series have been found, forming the candidate interest pattern set $P_c=\{p_1,p_2,\dots,p_\alpha,\dots,p_\beta,\dots,p_n\}$;
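Steps 1.1 to 1.4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `find_candidate_patterns`, the 0-based start index stored with each pattern, and the handling of the end of the series (a trailing run of at least two values is still emitted as a pattern) are all assumptions. As in step 1.3, the value at which the slope sign changes is kept in the pattern, and the next search starts from that value.

```python
def sgn(x):
    """Sign function of formula (1): returns 1, 0 or -1."""
    return (x > 0) - (x < 0)


def find_candidate_patterns(series):
    """Return a list of (start_index, values) interest patterns.

    Assumptions: indices are 0-based, and a run that reaches the end of the
    series without a change of slope sign is still emitted as a pattern.
    """
    patterns = []
    n = len(series)
    start = 0
    while start < n - 1:
        # Step 1.1: look for two adjacent values with a non-zero slope sign.
        m1 = sgn(series[start + 1] - series[start])
        if m1 == 0:
            start += 1
            continue
        end = start + 1  # index of the last value in the subsequence
        # Steps 1.2-1.3: append values until the slope sign differs from m1;
        # the value that breaks the run is kept in the pattern.
        while end + 1 < n:
            end += 1
            if sgn(series[end] - series[end - 1]) != m1:
                break
        patterns.append((start, series[start:end + 1]))
        # Step 1.4: restart from the last value of the pattern just found.
        start = end
    return patterns


print(find_candidate_patterns([1, 2, 3, 2, 1]))
# [(0, [1, 2, 3, 2]), (3, [2, 1])]
```

Flat stretches (slope sign 0) never start a pattern, so a constant series yields an empty candidate set.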
Step 2: Cluster the candidate interest patterns;
Similar patterns in the candidate pattern set are grouped together. The first step in finding similar patterns is to generate a distance matrix between patterns: for every pair of patterns, one element of the distance matrix gives the distance between the two patterns. The time cost of the traditional algorithm is too high, however. To solve this problem the invention proposes two methods: one uses pruning rules, and the other combines an R-tree with the pruning rules.
The pruning-rule method is as follows:
Step 2.1: Using pruning rules ① and ② below, set the distance of every pattern pair that satisfies a rule condition to infinity;
Pruning rule ①: if two interest patterns $p_\alpha,p_\beta$ in the candidate interest pattern set $P_c$ do not both appear in the same region of width $w_s$, the distance between the two patterns is infinite, and $D_{\alpha\beta}$ in the distance matrix $D$ is set to infinity. Here $w_s$ is a user-specified parameter, $D$ is the distance matrix of the interest patterns, $D_{\alpha\beta}=d_{\alpha\beta}(p_\alpha,p_\beta)$, and $d_{\alpha\beta}$ is the Euclidean distance between $p_\alpha$ and $p_\beta$;
Pruning rule ②: prune by pattern slope. If the slope of interest pattern $p_\alpha$ is negative and the slope of interest pattern $p_\beta$ is positive, the two patterns are not considered similar, and $D_{\alpha\beta}$ in the distance matrix $D$ is set to infinity;
Step 2.2: For the elements $D_{\alpha\beta}$ of the distance matrix that are not infinite, calculate the Euclidean distance and assign it to the corresponding position in the distance matrix.
Step 2.3: Compare $d_{\alpha\beta}(p_\alpha,p_\beta)$ with the user-specified $d_{\min}$; if $d_{\alpha\beta}\le d_{\min}$, delete from $P_c$ whichever of $p_\alpha$ and $p_\beta$ contains fewer time-series values, finally obtaining the new interest pattern set $P$. Here $d_{\min}$ takes a value between the minimum and the maximum Euclidean distance between two adjacent time-series values, as specified by the user.
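The pruning-rule clustering of steps 2.1 to 2.3 might look like the sketch below. Several details are not fixed by the text and are assumptions here: a pattern is a `(start_index, values)` pair, two patterns count as being in the same region when their start indices differ by less than `w_s`, the Euclidean distance between patterns of unequal length is taken over their common leading values, rule ② is applied symmetrically to any pair with opposite slope signs, and on a tie in length the later pattern is removed. All function names are hypothetical.

```python
import math


def sgn(x):
    return (x > 0) - (x < 0)


def direction(values):
    """Slope sign of a whole pattern: sgn(last value - first value)."""
    return sgn(values[-1] - values[0])


def pattern_distance(a, b):
    """Euclidean distance over the common leading values (an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(patterns, w_s):
    """Distance matrix D; pruned pairs keep the value infinity."""
    n = len(patterns)
    D = [[math.inf] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            (ia, va), (ib, vb) = patterns[a], patterns[b]
            if abs(ia - ib) >= w_s:                   # pruning rule 1
                continue
            if direction(va) * direction(vb) < 0:     # pruning rule 2
                continue
            D[a][b] = D[b][a] = pattern_distance(va, vb)   # step 2.2
    return D


def cluster(patterns, w_s, d_min):
    """Step 2.3: drop the shorter pattern of every pair with d <= d_min."""
    D = distance_matrix(patterns, w_s)
    removed = set()
    n = len(patterns)
    for a in range(n):
        for b in range(a + 1, n):
            if a in removed or b in removed:
                continue
            if D[a][b] <= d_min:
                shorter = a if len(patterns[a][1]) < len(patterns[b][1]) else b
                removed.add(shorter)
    return [p for k, p in enumerate(patterns) if k not in removed]


candidates = [(0, [1, 2, 3]), (2, [1, 2, 3, 4]), (3, [4, 3, 2]), (50, [1, 2, 3])]
print(cluster(candidates, w_s=10, d_min=0.5))
# [(2, [1, 2, 3, 4]), (3, [4, 3, 2]), (50, [1, 2, 3])]
```

Because pruned pairs are skipped before any distance is computed, the number of Euclidean distance evaluations shrinks with the number of pairs the two rules eliminate.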
In the method that combines an R-tree with the pruning rules, the R-tree is used to index the candidate pattern set $P_c$. As shown in Fig. 4, $P_1$ to $P_9$ are candidate patterns, and each leaf node of the R-tree is the MBR of one candidate pattern. The method is as follows:
Step 2.1: Build the MBR of each pattern in the candidate pattern set using an R-tree, forming the data structure of the patterns and obtaining an index of the patterns. Each leaf node is the MBR of one pattern, and the intermediate entries of the R-tree index the patterns through neighboring MBRs. This data structure reduces the time complexity of the algorithm by reducing the number of patterns that are processed. Fig. 3 illustrates the MBR between pattern $p_1$ and pattern $p_2$.
Step 2.2: For each pair of child nodes i and j in the R-tree data structure, use pruning rules ① and ② below to set the distance of every pattern pair that satisfies a rule condition to infinity.
Pruning rule ①: if two interest patterns $p_\alpha,p_\beta$ in the candidate interest pattern set $P_c$ do not both appear in the same region of width $w_s$, the distance between the two patterns is infinite, and $D_{\alpha\beta}$ in the distance matrix $D$ is set to infinity. Here $w_s$ is a user-specified parameter, $D$ is the distance matrix of the interest patterns, $D_{\alpha\beta}=d_{\alpha\beta}(p_\alpha,p_\beta)$, and $d_{\alpha\beta}$ is the Euclidean distance between $p_\alpha$ and $p_\beta$;
Pruning rule ②: prune by pattern slope. If the slope of interest pattern $p_\alpha$ is negative and the slope of interest pattern $p_\beta$ is positive, the two patterns are not considered similar, and $D_{\alpha\beta}$ in the distance matrix $D$ is set to infinity;
Step 2.3: For the elements $D_{\alpha\beta}$ of the distance matrix that are not infinite, calculate the Euclidean distance and assign it to the corresponding position in the distance matrix.
Step 2.4: Compare $d_{\alpha\beta}(p_\alpha,p_\beta)$ with the user-specified $d_{\min}$; if $d_{\alpha\beta}\le d_{\min}$, delete from $P_c$ whichever of $p_\alpha$ and $p_\beta$ contains fewer time-series values, finally obtaining the new interest pattern set $P$. Here $d_{\min}$ takes a value between the minimum and the maximum Euclidean distance between two adjacent time-series values, as specified by the user.
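Building a full R-tree requires a spatial-index library or a custom implementation, which is outside the scope of this sketch. The fragment below only illustrates the geometric part, under stated assumptions: a pattern is a `(start_index, values)` pair, its MBR is taken in the (time index, value) plane as `(t_min, t_max, v_min, v_max)`, and all helper names are hypothetical. `union_mbr` gives the rectangle an intermediate node would store for two children, and `time_gap` is one possible way to evaluate pruning rule ① on rectangles before any distance is computed.

```python
def mbr(start, values):
    """MBR of one pattern as (t_min, t_max, v_min, v_max); layout is assumed."""
    return (start, start + len(values) - 1, min(values), max(values))


def union_mbr(a, b):
    """Smallest rectangle enclosing two MBRs, as an inner node would store."""
    return (min(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), max(a[3], b[3]))


def time_gap(a, b):
    """Gap between two MBRs along the time axis (0 if they overlap or touch)."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]))


p1 = mbr(0, [1, 2, 3, 2])
p2 = mbr(3, [2, 1])
p3 = mbr(50, [1, 2, 3])
print(p1, p2, union_mbr(p1, p2))           # (0, 3, 1, 3) (3, 4, 1, 2) (0, 4, 1, 3)
print(time_gap(p1, p2), time_gap(p1, p3))  # 0 47
```

With a user-specified width such as `w_s = 10`, the pair `p1`, `p3` could be pruned from the gap alone, without touching the pattern values.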
Stage two: Generate prediction rules
Step 3: Compute association rules with the Apriori algorithm
Merge the interest pattern sets $P$ of all time variables into $P_{all}$, and mine association rules over the interest patterns in $P_{all}$ with the Apriori algorithm, obtaining association rules $r_g$ between the interest patterns of different time variables, where $g=1,2,\dots,R$, $p_{vj}$ is an interest pattern formed by a condition variable, and $p'$ is an interest pattern formed by the decision variable;
The direction, time delay and variation of an association rule $r_g$ are calculated as follows.
Direction of the association rule: $m(p_{vM})=\operatorname{sgn}(s_L-s_1)$ is the slope between $s_L$ and $s_1$, where $p_{vM}$ is the interest pattern among the condition variables that has the greatest influence on the decision variable, $s_L$ is the last time-series value in $p_{vM}$, and $s_1$ is the first time-series value in $p_{vM}$; $m(p')$ is the slope between $s'_L$ and $s'_1$, where $s'_L$ is the last time-series value in $p'$ and $s'_1$ is the first time-series value in $p'$;
Time delay:
$\Delta T_1=\max(\Delta(r_g))$, $\Delta(r_g)=I_{p_{vM}}-I_{p'}$ (4)
where $I_{p_{vM}}$ is the start time of $p_{vM}$ and $I_{p'}$ is the start time of $p'$;
Variation of the rule:
$V(p_{vj})=(\max(p_{vj})-\min(p_{vj}))\times m(p_{vj})$
$V(p')=(\max(p')-\min(p'))\times m(p')$ (5)
where $V(p_{vj})$ is the variation of interest pattern $p_{vj}$ in a condition variable, $V(p')$ is the variation of interest pattern $p'$ in the decision variable, and $\max(p_{vj})$ and $\min(p_{vj})$ denote respectively the largest and smallest time-series values of interest pattern $p_{vj}$.
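The direction, delay and variation formulas translate directly into code. In the sketch below the function names and the representation of a pattern as a plain list of values are assumptions; the delay follows the sign convention of formula (4) exactly as printed.

```python
def sgn(x):
    return (x > 0) - (x < 0)


def direction(values):
    """m(p) = sgn(s_L - s_1): slope sign between last and first value."""
    return sgn(values[-1] - values[0])


def variation(values):
    """V(p) = (max(p) - min(p)) * m(p), as in formula (5)."""
    return (max(values) - min(values)) * direction(values)


def rule_delay(start_condition, start_decision):
    """Delta(r_g) = I_pvM - I_p', as in formula (4)."""
    return start_condition - start_decision


print(variation([10, 12, 15]))  # 5
print(variation([15, 12, 11]))  # -4
print(rule_delay(4, 7))         # -3
```

A rising pattern therefore has a positive variation and a falling pattern a negative one, which is why the bounds of the two prediction rules in step 4 are stated as positive and negative numbers respectively.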
Step 4: Generate prediction rules
① Multiple association rules with $m(p_{vM})=m(p')=1$ are merged to form the following prediction rule:
If $A_1\le V(p_{v1})\le B_1$, and $A_2\le V(p_{v2})\le B_2$, ..., and $A_j\le V(p_{vj})\le B_j$, ..., and $A_\lambda\le V(p_{v\lambda})\le B_\lambda$, then $C_1\le V(p')\le C_2$, with a delay of $\Delta T_1$ time units;
where $A_j$ and $B_j$ are respectively the minimum and maximum of $V(p_{vj})$ over the interest pattern association rules $r_g$ with $m(p_{vM})=m(p')=1$; $C_1$ and $C_2$ are respectively the minimum and maximum of $V(p')$ over the same rules; $j=1,2,\dots,\lambda$, $\lambda\ge 1$, $\lambda$ is the number of condition variables; $A_j$, $B_j$, $C_1$ and $C_2$ are positive numbers.
② Multiple association rules with $m(p_{vM})=m(p')=-1$ are merged to form the following prediction rule:
If $E_1\le V(p_{v1})\le F_1$, and $E_2\le V(p_{v2})\le F_2$, ..., and $E_j\le V(p_{vj})\le F_j$, ..., and $E_\eta\le V(p_{v\eta})\le F_\eta$, then $G_1\le V(p')\le G_2$, with a delay of $\Delta T_2$ time units;
where $E_j$ and $F_j$ are respectively the minimum and maximum of $V(p_{vj})$ over the interest pattern association rules with $m(p_{vM})=m(p')=-1$; $G_1$ and $G_2$ are respectively the minimum and maximum of $V(p')$ over the same rules; $E_j$, $F_j$, $G_1$ and $G_2$ are negative numbers; $j=1,2,\dots,\eta$, $\eta\ge 1$, $\eta$ is the number of condition variables; $\Delta T_2=\max(\Delta(r_g))$, $\Delta(r_g)=I_{p_{vM}}-I_{p'}$, where $I_{p_{vM}}$ is the start time of $p_{vM}$ and $I_{p'}$ is the start time of $p'$;
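Step 4 and the matching of stage three can be illustrated together. In this sketch each association rule of one direction group is assumed to be given as a dictionary holding the variations $V(p_{vj})$ of its condition patterns, the variation $V(p')$ of its decision pattern, and its delay $\Delta(r_g)$. The dictionary keys and function names are hypothetical, and grouping the rules by direction is assumed to have been done beforehand.

```python
def merge_rules(rules):
    """Merge association rules of one direction into one prediction rule.

    Each bound is the minimum/maximum of the corresponding variation over
    the merged rules; the delay is the maximum delay over the merged rules.
    """
    n_cond = len(rules[0]["cond"])
    cond_bounds = [
        (min(r["cond"][j] for r in rules), max(r["cond"][j] for r in rules))
        for j in range(n_cond)
    ]
    dec_bounds = (min(r["dec"] for r in rules), max(r["dec"] for r in rules))
    delay = max(r["delay"] for r in rules)
    return {"cond_bounds": cond_bounds, "dec_bounds": dec_bounds, "delay": delay}


def predict(rule, cond_variations):
    """Stage three: if every condition variation lies inside its bounds
    (the antecedent holds), return the decision bounds and the delay."""
    inside = all(
        lo <= v <= hi for v, (lo, hi) in zip(cond_variations, rule["cond_bounds"])
    )
    return (rule["dec_bounds"], rule["delay"]) if inside else None


# Two hypothetical rising rules over two condition variables.
rising = [
    {"cond": [2.0, 1.0], "dec": 0.5, "delay": 1},
    {"cond": [4.0, 3.0], "dec": 1.5, "delay": 3},
]
rule = merge_rules(rising)
print(predict(rule, [3.0, 2.0]))  # ((0.5, 1.5), 3)
print(predict(rule, [5.0, 2.0]))  # None
```

The prediction is an interval for the variation of the decision variable together with a delay, which keeps the rule readable as an if-then statement.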
The invention uses the hit rate $H$ to illustrate the performance of the algorithm. $H$ is defined in terms of $N$ and $M$, where $N$ is the number of interest patterns in the condition variables that exactly match a prediction rule, and $M$ is the total number of interest patterns in the condition variables.
Specific embodiments of the invention are given below. Note that the invention is not limited to the following specific embodiments; all equivalent transformations made on the basis of the technical solution of this application fall within the protection scope of the invention.
Embodiment 1
This embodiment uses the algorithm of the invention to predict the soil temperature at a Great Wall site. Fig. 1 and Fig. 2 are respectively the timing diagram of air temperature and rammed-earth temperature at a Ming Great Wall site and the corresponding variable relationship diagram. The time-series data are processed through the three stages described above. In stage one, this embodiment clusters the candidate interest patterns with the pruning rules only; the result is shown in Fig. 5. The naive curve represents clustering that uses neither the pruning rules nor the R-tree data structure, and the pruning-based curve represents clustering with the pruning rules. As the number of time series increases, the time for Euclidean distance computation without the pruning rules grows exponentially, whereas the time for distance computation with the pruning rules grows slowly.
Embodiment 2
This embodiment differs from Embodiment 1 in that, in stage one, the candidate interest patterns are clustered with the R-tree data structure combined with the pruning rules. This data structure reduces the time complexity of the algorithm by reducing the number of patterns that are processed, as shown in Fig. 6. The horizontal axis is the number of time series. As the number of time series increases, the time for distance-matrix computation with the pruning rules combined with the R-tree grows more slowly than the time with the pruning rules alone. In the timing diagram of air temperature, $p_1$ and $p_2$ are candidate patterns identified by the algorithm of the invention; an MBR between the two patterns can be built, and Fig. 3 shows the MBR between pattern $p_1$ and pattern $p_2$.
Table 1 lists the six specific prediction rules generated in this embodiment.
Table 1. Prediction rules
Fig. 7 shows the prediction results of this embodiment under the six prediction rules above; the horizontal axis is the monitored region and the vertical axis is the hit rate. Fig. 7 compares the hit rate $H$ of the six generated prediction rules in regions 1, 2, 3, 4 and 5; rule 3 has the highest average hit rate.
Claims (2)
1. A pattern prediction method for multivariate time-series data, characterized by comprising the following steps:
Stage one: find a candidate interest pattern set in the time series formed by each condition variable and by the decision variable, and cluster each candidate interest pattern set separately;
Step 1: Find the candidate interest pattern set;
Step 1.1: Find a usable initial subsequence
For a time series $S=\{s_1,\dots,s_l\}$, starting from $s_1$, search in order for two adjacent time-series values whose slope $m_1\neq 0$, and take the first such pair as the initial subsequence $S_i=\{s_i,s_{i+1}\}$, where $i=1,2,\dots,l-1$ and $l$ is the length of the time series; the slope $m_1$ is calculated as:
$$m_1=\operatorname{sgn}(s_{i+1}-s_i)=\begin{cases}1, & s_{i+1}>s_i\\ 0, & s_{i+1}=s_i\\ -1, & s_{i+1}<s_i\end{cases}\qquad(1)$$
Step 1.2: Calculate the slope of adjacent time-series values
Append the next value $s_{i+2}$ to the usable initial subsequence and calculate the slope $m_2$ between $s_{i+2}$ and $s_{i+1}$;
Step 1.3: Obtain an interest pattern
If $m_2\neq m_1$, the interest pattern $p_\alpha=\{s_i,s_{i+1},s_{i+2}\}$ is obtained;
If $m_2=m_1$, continue with step 1.2 until $m_k\neq m_1$, obtaining the interest pattern $p_\alpha=\{s_i,s_{i+1},\dots,s_{i+k}\}$, where $m_k$ is the slope between $s_{i+k}$ and $s_{i+k-1}$, $k=1,2,\dots,l-2$;
Step 1.4: Obtain the candidate interest pattern set
For the time series $S=\{s_1,\dots,s_l\}$, starting from the last time-series value of interest pattern $p_\alpha$, repeat steps 1.1 to 1.3 until all interest patterns in the whole time series have been found, forming the candidate interest pattern set $P_c=\{p_1,p_2,\dots,p_\alpha,\dots,p_\beta,\dots,p_n\}$;
Step 2: Cluster the candidate interest patterns;
Step 2.1: Using pruning rules ① and ② below, set the distance of every pattern pair that satisfies a rule condition to infinity;
Pruning rule ①: if two interest patterns $p_\alpha,p_\beta$ in the candidate interest pattern set $P_c$ do not both appear in the same region of width $w_s$, set $D_{\alpha\beta}$ in the distance matrix $D$ to infinity; here $w_s$ is a user-specified parameter, $D$ is the distance matrix of the interest patterns, $D_{\alpha\beta}=d_{\alpha\beta}(p_\alpha,p_\beta)$, and $d_{\alpha\beta}$ is the Euclidean distance between $p_\alpha$ and $p_\beta$;
Pruning rule ②: if the slope of interest pattern $p_\alpha$ is negative and the slope of interest pattern $p_\beta$ is positive, set $D_{\alpha\beta}$ in the distance matrix $D$ to infinity;
Step 2.2: For the elements $D_{\alpha\beta}$ of the distance matrix that are not infinite, calculate the distance and assign it to the corresponding position in the distance matrix;
Step 2.3: Compare $d_{\alpha\beta}(p_\alpha,p_\beta)$ with the user-specified $d_{\min}$; if $d_{\alpha\beta}\le d_{\min}$, delete from $P_c$ whichever of $p_\alpha$ and $p_\beta$ contains fewer time-series values, finally obtaining the new interest pattern set $P$;
Here $d_{\min}$ takes a value between the minimum and the maximum Euclidean distance between two adjacent time-series values, as specified by the user;
Stage two:Produce prediction rule
Step 3:Correlation rule is calculated with Apriori algorithm
The interest mode collection P for merging each time variable obtains Pall, using Apriori algorithm to PallIn interest mode carry out
Association rule mining, obtains multiple correlation rules between different time variable;
Step 4:Generate prediction rule
1. m (pvMMultiple correlation rules of)=m (p ')=1 merge to form following prediction rule:
A1≤V(pv1)≤B1, and A2≤V(pv2)≤B2..., and Aj≤V(pvj)≤Bj..., and Aλ≤V(pvλ)≤Bλ, then C1≤V
(p′)≤C2, and postpone Δ T1The individual unit interval;
Wherein, pvjBe conditional-variable formation interest mode, j=1,2 ..., λ, λ >=1, λ be conditional-variable number, p ' is to determine
The interest mode of plan variable formation;
m(pvM) it is sLAnd s1Between slope, m (pvM)=sgn (sL-s1), pvMIt is that maximum is influenceed on decision variable in conditional-variable
Interest mode, sLIt is pvMIn last time sequential value, s1It is pvMIn first time sequential value;M (p ') is s 'LAnd s1′
Between slope, s 'LRepresent the time sequential value of last in p ', s '1Represent first time sequential value in p ';
AjAnd BjIt is m (p respectivelyvMV (p in the interest mode correlation rule of)=m (p ')=1vj) minimum value and maximum, C1With
C2It is m (p respectivelyvMV (p ') minimum value and maximum, A in the interest mode correlation rule of)=m (p ')=1j、Bj、C1And C2
It is positive number;
V(pvj) it is interest mode p in conditional-variablevjVariable quantity, V (p ') is the variable quantity of interest mode p ' in decision variable,
V(pvj)=(max (pvj)-min(pvj))×m(pvj)
V (p ')=(max (p ')-min (p ')) × m (p ')
max(pvj) and min (pvj) interest mode p is represented respectivelyvjMaximum time sequential value and minimum time sequential value;
Time delay Δ T1=max (Δ (rg)), It is pvMStart Time value, Ip′It is p ' initial time
Value;
2. m (pvM)=m (p ')=- 1 multiple correlation rules merge to form following prediction rule:
E1≤V(pv1)≤F1, and E2≤V(pv2)≤F2..., and Ej≤V(pvj)≤Fj..., and Eη≤V(pvη)≤Fη, then G1≤V
(p′)≤G2, and postpone Δ T2The individual unit interval;
Wherein, EjAnd FjIt is m (p respectivelyvMV (p in)=m (p ')=- 1 interest mode correlation rulej) minimum value and maximum
Value, j=1,2 ..., η, η >=1, η is the number of conditional-variable, G1And G2It is m (p respectivelyvM)=m (p ')=- 1 interest mode
V (p ') minimum value and maximum in correlation rule;Ej、Fj、G1And G2It is negative;ΔT2=max (Δ (rg));
Stage three: The interest modes obtained by applying stage one to the conditional variables of the data under test are matched against the prediction rules of stage two; if the antecedent of a prediction rule is satisfied, the prediction result for the decision variable is output.
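Stage three can be sketched in the same way. The following Python fragment assumes a prediction rule is given as a list of per-variable bounds on $V$, a pair of bounds for the decision variable, and a delay; the function name `match_rule` and its argument layout are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Bounds = Tuple[float, float]


def variation(pattern: Sequence[float]) -> float:
    """V(p) = (max(p) - min(p)) * sgn(s_L - s_1)."""
    diff = pattern[-1] - pattern[0]
    sign = (diff > 0) - (diff < 0)
    return (max(pattern) - min(pattern)) * sign


def match_rule(
    cond_patterns: List[Sequence[float]],
    cond_bounds: List[Bounds],
    decision_bounds: Bounds,
    delay: int,
) -> Optional[Tuple[Bounds, int]]:
    """Return (bounds on V(p'), delay) if the antecedent holds, else None."""
    if len(cond_patterns) != len(cond_bounds):
        raise ValueError("one interest mode is needed per conditional variable")
    for pattern, (low, high) in zip(cond_patterns, cond_bounds):
        # Every conditional variable must fall inside its interval.
        if not low <= variation(pattern) <= high:
            return None
    return decision_bounds, delay


# Illustrative falling rule and test data (made-up numbers).
bounds = [(-3.0, -2.0), (-0.8, -0.5)]
test_patterns = [[10.0, 9.0, 7.5], [1.0, 0.75, 0.25]]
print(match_rule(test_patterns, bounds, (-1.5, -1.0), 5))
```

Returning `None` when any interval check fails keeps the antecedent a strict conjunction, as in the "and ... and ... then" form of the prediction rule.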
2. The mode prediction method for multivariate time series data as claimed in claim 1, characterized in that the candidate interest mode clustering method in said step 2 is replaced with the following method:
Step 2.1: Build the minimum bounding rectangle (MBR) of each mode in the candidate mode set using an R-tree, forming the data structure of the modes and obtaining an index of the modes;
Step 2.2: For each pair of child nodes i and j in the R-tree data structure, apply the following pruning rules 1 and 2, and set the distance value of every pair of modes that satisfies a rule condition to infinity.
Pruning rule 1: If two interest modes $p_\alpha$ and $p_\beta$ do not both appear in the same region of width $w_s$, set $D_{\alpha\beta}$ in the distance matrix to infinity; where $w_s$ is a user-specified parameter, $D$ is the distance matrix of the interest modes, $D_{\alpha\beta} = d_{\alpha\beta}(p_\alpha, p_\beta)$, and $d_{\alpha\beta}$ is the Euclidean distance between $p_\alpha$ and $p_\beta$;
Pruning rule 2: If the slope of interest mode $p_\alpha$ is negative and the slope of interest mode $p_\beta$ is positive, set $D_{\alpha\beta}$ in distance matrix $D$ to infinity;
Step 2.3: Compute every element $D_{\alpha\beta}$ of distance matrix $D$ that is not infinity as the Euclidean distance, and assign it to the corresponding position in the distance matrix;
Step 2.4: Compare $d_{\alpha\beta}(p_\alpha, p_\beta)$ with the user-specified $d_{min}$; if $d_{\alpha\beta} \le d_{min}$, delete from $P_c$ whichever of the two interest modes $p_\alpha$ and $p_\beta$ contains fewer time series, finally obtaining the new interest mode set $P$; where $d_{min}$ takes a value between the minimum and maximum Euclidean distance between two adjacent time series, as specified by the user.
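Steps 2.2 to 2.4 can be sketched in Python as follows. This is an illustration under several assumptions rather than the claimed implementation: the R-tree index of step 2.1 is replaced by a brute-force pairwise loop, "the same region of width $w_s$" is read as the start positions of two modes differing by less than $w_s$, all modes are assumed to have equal length so the Euclidean distance is defined, and the `Mode` class with its `start`, `values`, and `support` fields is hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Mode:
    start: int           # assumed: position where the mode begins
    values: List[float]  # assumed: equal-length value vectors
    support: int         # assumed: number of time series in the mode


def slope_sign(values: List[float]) -> int:
    diff = values[-1] - values[0]
    return (diff > 0) - (diff < 0)


def distance_matrix(modes: List[Mode], w_s: int) -> List[List[float]]:
    """Steps 2.2 and 2.3: pruned pairs stay at infinity (diagonal too)."""
    n = len(modes)
    D = [[math.inf] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            pa, pb = modes[a], modes[b]
            # Pruning rule 1 (assumed reading): not in one region of width w_s.
            if abs(pa.start - pb.start) >= w_s:
                continue
            # Pruning rule 2: one slope negative, the other positive.
            if slope_sign(pa.values) * slope_sign(pb.values) < 0:
                continue
            D[a][b] = D[b][a] = math.dist(pa.values, pb.values)
    return D


def reduce_modes(modes: List[Mode], w_s: int, d_min: float) -> List[Mode]:
    """Step 2.4: of each close pair, drop the mode with fewer time series."""
    D = distance_matrix(modes, w_s)
    removed = set()
    for a in range(len(modes)):
        for b in range(a + 1, len(modes)):
            if a in removed or b in removed:
                continue
            if D[a][b] <= d_min:
                weaker = a if modes[a].support < modes[b].support else b
                removed.add(weaker)
    return [m for i, m in enumerate(modes) if i not in removed]


# Illustrative candidate set P_c (made-up numbers).
candidates = [
    Mode(0, [1.0, 2.0, 3.0], 5),
    Mode(1, [1.0, 2.0, 3.1], 2),
    Mode(2, [3.0, 2.0, 1.0], 4),
    Mode(50, [1.0, 2.0, 3.0], 1),
]
print(reduce_modes(candidates, w_s=10, d_min=0.5))
```

Setting pruned entries to infinity means they can never satisfy $d_{\alpha\beta} \le d_{min}$, so the Euclidean distance is only computed for pairs that survive both rules. A real R-tree would additionally avoid enumerating distant pairs at all.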
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710324105.6A CN107220483B (en) | 2017-05-09 | 2017-05-09 | Earth temperature mode prediction method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107220483A (en) | 2017-09-29 |
CN107220483B (en) | 2021-01-01 |
Family
ID=59944105
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710324105.6A Active CN107220483B (en) | 2017-05-09 | 2017-05-09 | Earth temperature mode prediction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107220483B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111797301A (en) * | 2019-04-09 | 2020-10-20 | Oppo广东移动通信有限公司 | Activity prediction method, activity prediction device, storage medium and electronic equipment |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040086180A1 (en) * | 2002-11-01 | 2004-05-06 | Ajay Divakaran | Pattern discovery in video content using association rules on multiple sets of labels |
CN102637208A (en) * | 2012-03-28 | 2012-08-15 | 南京财经大学 | Method for filtering noise data based on pattern mining |
EP2772872A2 (en) * | 2013-02-28 | 2014-09-03 | Samsung Electronics Co., Ltd. | Method and apparatus for searching pattern in sequence data |
US20150012540A1 (en) * | 2013-07-02 | 2015-01-08 | Hewlett-Packard Development Company, L.P. | Deriving an interestingness measure for a cluster |
US20150269157A1 (en) * | 2014-03-21 | 2015-09-24 | International Business Machines Corporation | Knowledge discovery in data analytics |
CN105320756A (en) * | 2015-10-15 | 2016-02-10 | 江苏省邮电规划设计院有限责任公司 | Improved Apriori algorithm based method for mining database association rule |
CN106384128A (en) * | 2016-09-09 | 2017-02-08 | 西安交通大学 | Method for mining time series data state correlation |
Non-Patent Citations (2)
Title |
---|
JA-HWUNG SU ET AL: "Efficient Relevance Feedback for Content-Based Image Retrieval by Mining User Navigation Patterns", 《IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING》 * |
吴志华: "基于知识发现的时序数据挖掘算法研究", 《中国优秀博硕士学位论文全文数据库 (硕士) 信息科技辑》 * |
Also Published As
Publication number | Publication date |
---|---|
CN107220483B (en) | 2021-01-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Xuan et al. | Multi-model fusion short-term load forecasting based on random forest feature selection and hybrid neural network | |
Chen et al. | An improved artificial bee colony algorithm combined with extremal optimization and Boltzmann Selection probability | |
CN106326346A (en) | Text classification method and terminal device | |
US20230052730A1 (en) | Method for predicting operation state of power distribution network with distributed generations based on scene analysis | |
CN110851566A (en) | Improved differentiable network structure searching method | |
CN106570250A (en) | Power big data oriented microgrid short-period load prediction method | |
CN103455612B (en) | Based on two-stage policy non-overlapped with overlapping network community detection method | |
CN111861013A (en) | Power load prediction method and device | |
CN107798426A (en) | Wind power interval Forecasting Methodology based on Atomic Decomposition and interactive fuzzy satisfying method | |
CN112328578A (en) | Database query optimization method based on reinforcement learning and graph attention network | |
CN104951847A (en) | Rainfall forecast method based on kernel principal component analysis and gene expression programming | |
CN113780002A (en) | Knowledge reasoning method and device based on graph representation learning and deep reinforcement learning | |
CN114169434A (en) | Load prediction method | |
CN110428015A (en) | A kind of training method and relevant device of model | |
CN105843907A (en) | Method for establishing memory index structure-distance tree and similarity connection algorithm based on distance tree | |
CN106296434A (en) | A kind of Grain Crop Yield Prediction method based on PSO LSSVM algorithm | |
CN116662412B (en) | Data mining method for big data of power grid distribution and utilization | |
CN110765582A (en) | Self-organization center K-means microgrid scene division method based on Markov chain | |
CN107220483A (en) | A kind of mode prediction method of polynary time series data | |
Kim et al. | A daily tourism demand prediction framework based on multi-head attention CNN: The case of the foreign entrant in South Korea | |
CN110111838B (en) | Method and device for predicting RNA folding structure containing false knot based on expansion structure | |
CN104657429B (en) | Technology-driven type Product Innovation Method based on complex network | |
CN103136515A (en) | Creative inflection point identification method based on draft action sequence and system using the same | |
Mo et al. | Simulated annealing for neural architecture search | |
CN114647679A (en) | Hydrological time series motif mining method based on numerical characteristic clustering |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 20221213. Address after: 710075 Room 021, F2003, 20th Floor, Building 4-A, Xixian Financial Port, Fengdong New City Energy Jinmao District, Xixian New District, Xi'an, Shaanxi. Patentee after: Shaanxi Dahang Wujiang Information Technology Co.,Ltd. Address before: 710069 No. 229 Taibai North Road, Shaanxi, Xi'an. Patentee before: NORTHWEST University |