CN107220483A - A pattern prediction method for multivariate time series data

Info

Publication number
CN107220483A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710324105.6A
Other languages
Chinese (zh)
Other versions
CN107220483B (en)
Inventor
肖云
许震洲
王欣
王选宏
高颢函
陈晓江
房鼎益
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shaanxi Dahang Wujiang Information Technology Co ltd
Original Assignee
Northwest University

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16Z: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS, NOT OTHERWISE PROVIDED FOR
    • G16Z99/00: Subject matter not provided for in other main groups of this subclass

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a pattern prediction method for multivariate time series data comprising three stages. First, a candidate interest pattern set is found in the time series formed by each condition variable and by the decision variable, and each candidate interest pattern set is clustered separately. Second, prediction rules between the condition variables and the decision variable are generated. Finally, the interest patterns obtained by running stage one on the condition variables of the data under test are matched against the prediction rules of stage two; if the antecedent of a prediction rule is satisfied, the predicted result for the decision variable is output. The method requires little computation and effectively reduces the time complexity of pattern prediction, addressing the excessive time complexity of conventional methods.

Description

A pattern prediction method for multivariate time series data
Technical field
The invention belongs to the field of computing, in particular to data mining, and specifically relates to a pattern prediction method for multivariate time series data.
Background art
Time series forecasting is an important research direction in fields such as weather forecasting and the stock market. One of its most important tasks is predicting the behavior of some variables from the trends of other variables, which is called multivariate time series forecasting. For example, when two variables are considered to be correlated, as in weather forecasting, we may want to know whether a 10% rise in temperature affects the trend of humidity.
In multivariate forecasting, the main methods can be divided into mathematical methods and artificial-intelligence methods. Mathematical methods such as ARIMA (Autoregressive Integrated Moving Average model, which converts a non-stationary time series into a stationary one and then regresses the dependent variable only on its own lagged values and on the present and lagged values of the random error term) or exponential smoothing are unreliable when handling the nonlinear, irregular data of the real world. Artificial neural networks, support vector machines and k-nearest neighbors are machine learning methods that have been applied to time series forecasting. However, because many time variables shift and stretch over time, these traditional methods fail. One solution is to consider the behavior of a sequence rather than a single variable value. For example, some methods perform pattern prediction in time series analysis: they assume a data representation and then look for the most frequent patterns. These solutions have two main problems. First, the data representation does not reduce the dimensionality of the data, especially for high-dimensional data, and the data must additionally be processed with methods such as clustering, which raises the time complexity. Second, they cannot explain the output rules and relations. Reducing the time complexity and explaining the output rules and relations therefore remain to be solved effectively.
Summary of the invention
In view of the defects and deficiencies of the prior art, the object of the invention is to provide a pattern prediction method for multivariate time series data that solves the high time complexity of existing data processing methods.
To achieve this object, the invention adopts the following technical scheme:
A pattern prediction method for multivariate time series data, comprising the following steps:
Stage one: For the time series formed by each condition variable and by the decision variable, find the candidate interest pattern set, and cluster each candidate interest pattern set separately;
Step 1: Find the candidate interest pattern set;
Step 1.1: Find a usable initial subsequence
For a time series $S = \{s_1, \ldots, s_l\}$, starting from $s_1$, search in order for two adjacent time series values whose slope $m_1 \neq 0$, and take the first such pair as the initial subsequence $S_i = \{s_i, s_{i+1}\}$, where $i = 1, 2, \ldots, l-1$ and $l$ is the length of the time series. The slope $m_1$ is calculated as:
$$m_1 = \operatorname{sgn}(s_{i+1} - s_i) = \begin{cases} 1, & s_{i+1} > s_i \\ 0, & s_{i+1} = s_i \\ -1, & s_{i+1} < s_i \end{cases} \qquad (1)$$
Step 1.2: Calculate the slope of adjacent time series values
Append the next value $s_{i+2}$ to the usable initial subsequence and calculate the slope $m_2$ between $s_{i+2}$ and $s_{i+1}$;
Step 1.3: Obtain an interest pattern
If $m_2 \neq m_1$, the interest pattern $p_\alpha = \{s_i, s_{i+1}, s_{i+2}\}$ is obtained;
If $m_2 = m_1$, continue with Step 1.2 until $m_k \neq m_1$, obtaining the interest pattern $p_\alpha = \{s_i, s_{i+1}, \ldots, s_{i+k}\}$, where $m_k$ is the slope between $s_{i+k}$ and $s_{i+k-1}$, $k = 1, 2, \ldots, l-2$;
Step 1.4: Obtain the candidate interest pattern set
For the time series $S = \{s_1, \ldots, s_l\}$, starting from the last time series value of interest pattern $p_\alpha$, repeat Steps 1.1 to 1.3 until all interest patterns in the whole time series $S$ have been found, forming the candidate interest pattern set $P_c = \{p_1, p_2, \ldots, p_\alpha, \ldots, p_\beta, \ldots, p_n\}$;
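Steps 1.1 to 1.4 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names, the `(start_index, values)` representation of a pattern, and the handling of a trailing run that reaches the end of the series without a slope change (it is emitted as a pattern here) are assumptions that the text does not specify.

```python
def sgn(x):
    """Sign function of equation (1): 1, 0 or -1."""
    return (x > 0) - (x < 0)


def find_candidate_patterns(series):
    """Return candidate interest patterns as (start_index, values) tuples."""
    patterns = []
    n = len(series)
    start = 0
    while start < n - 1:
        # Step 1.1: skip flat pairs until two adjacent values have slope != 0.
        i = start
        while i < n - 1 and sgn(series[i + 1] - series[i]) == 0:
            i += 1
        if i >= n - 1:
            break  # only flat values remain
        m1 = sgn(series[i + 1] - series[i])

        # Steps 1.2-1.3: extend one value at a time until the slope differs
        # from m1; the value at which the slope changes is kept in the pattern.
        end = i + 1
        while end < n - 1:
            end += 1
            if sgn(series[end] - series[end - 1]) != m1:
                break
        # Assumption: a run that hits the end of the series is still emitted.
        patterns.append((i, series[i:end + 1]))

        # Step 1.4: restart from the last value of the pattern just found.
        start = end
    return patterns


print(find_candidate_patterns([1, 2, 3, 2, 1, 1, 2]))
```

Because each new search starts at the last value of the previous pattern, consecutive patterns share one boundary value.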
Step 2: Cluster the candidate interest patterns;
Step 2.1: Using pruning rules ① and ② below, set the distance of every pattern pair that satisfies a rule condition to infinity;
Pruning rule ①: If any two interest patterns $p_\alpha, p_\beta$ in the candidate interest pattern set $P_c$ do not both appear within the same region of width $w_s$, set $D_{\alpha\beta}$ in the distance matrix $D$ to infinity, where $w_s$ is a user-specified parameter, $D$ is the distance matrix of the interest patterns,
$D_{\alpha\beta} = d_{\alpha\beta}(p_\alpha, p_\beta)$, and $d_{\alpha\beta}$ is the Euclidean distance between $p_\alpha$ and $p_\beta$;
Pruning rule ②: If the slope of interest pattern $p_\alpha$ is negative and the slope of interest pattern $p_\beta$ is positive, set $D_{\alpha\beta}$ in the distance matrix $D$ to infinity;
Step 2.2: For every element $D_{\alpha\beta}$ of the distance matrix that has not been set to infinity, calculate the distance and assign it to the corresponding position in the distance matrix;
Step 2.3: Compare $d_{\alpha\beta}(p_\alpha, p_\beta)$ with the user-specified $d_{\min}$; if $d_{\alpha\beta} \le d_{\min}$, delete from $P_c$ whichever of the two interest patterns $p_\alpha$ and $p_\beta$ contains fewer time series values, finally obtaining the new interest pattern set $P$;
where $d_{\min}$ is a user-specified value between the minimum and the maximum Euclidean distance between two adjacent time series values;
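The pruning-based clustering of Step 2 can be sketched as below. Several details are assumptions made only for illustration, because the text leaves them open: patterns are `(start_index, values)` tuples, "the same region of width $w_s$" is read as both patterns fitting inside one index window of width `w_s`, the Euclidean distance between patterns of unequal length is taken over their first `min(len)` values, and on a length tie the second pattern of the pair is dropped.

```python
import math


def slope(values):
    """m(p) = sgn(s_L - s_1): sign of last value minus first value."""
    return (values[-1] > values[0]) - (values[-1] < values[0])


def euclidean(a, b):
    # Assumption: unequal-length patterns are compared over the common prefix.
    k = min(len(a), len(b))
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a[:k], b[:k])))


def distance_matrix(patterns, w_s):
    """Distance matrix D with pruned pairs left at infinity (Steps 2.1-2.2)."""
    n = len(patterns)
    D = [[math.inf] * n for _ in range(n)]  # diagonal is never used
    for a in range(n):
        for b in range(a + 1, n):
            (ia, va), (ib, vb) = patterns[a], patterns[b]
            first = min(ia, ib)
            last = max(ia + len(va), ib + len(vb)) - 1
            if last - first > w_s:           # pruning rule 1 (assumed reading)
                continue
            if slope(va) * slope(vb) < 0:    # pruning rule 2: opposite slopes
                continue
            D[a][b] = D[b][a] = euclidean(va, vb)
    return D


def cluster(patterns, w_s, d_min):
    """Step 2.3: of each pair with d <= d_min, drop the shorter pattern."""
    D = distance_matrix(patterns, w_s)
    removed = set()
    for a in range(len(patterns)):
        for b in range(a + 1, len(patterns)):
            if a in removed or b in removed:
                continue
            if D[a][b] <= d_min:
                shorter = a if len(patterns[a][1]) < len(patterns[b][1]) else b
                removed.add(shorter)
    return [p for k, p in enumerate(patterns) if k not in removed]
```

Only pairs that survive both pruning rules incur a distance computation, which is where the saving over the naive all-pairs calculation comes from.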
Stage two: Generate prediction rules
Step 3: Mine association rules with the Apriori algorithm
Merge the interest pattern sets $P$ of all time variables to obtain $P_{all}$, and apply the Apriori algorithm to the interest patterns in $P_{all}$ for association rule mining, obtaining multiple association rules between different time variables;
Step 4: Generate prediction rules
① The association rules with $m(p_{vM}) = m(p') = 1$ are merged into the following prediction rule:
If $A_1 \le V(p_{v1}) \le B_1$, and $A_2 \le V(p_{v2}) \le B_2$, …, and $A_j \le V(p_{vj}) \le B_j$, …, and $A_\lambda \le V(p_{v\lambda}) \le B_\lambda$, then $C_1 \le V(p') \le C_2$, with a delay of $\Delta T_1$ time units;
where $p_{vj}$ is an interest pattern formed by a condition variable, $j = 1, 2, \ldots, \lambda$, $\lambda \ge 1$ is the number of condition variables, and $p'$ is an interest pattern formed by the decision variable;
$m(p_{vM})$ is the slope between $s_L$ and $s_1$, $m(p_{vM}) = \operatorname{sgn}(s_L - s_1)$, where $p_{vM}$ is the condition-variable interest pattern with the greatest influence on the decision variable, $s_L$ is the last time series value in $p_{vM}$ and $s_1$ is the first time series value in $p_{vM}$; $m(p')$ is the slope between $s'_L$ and $s'_1$, where $s'_L$ is the last time series value in $p'$ and $s'_1$ is the first time series value in $p'$;
$A_j$ and $B_j$ are respectively the minimum and maximum of $V(p_{vj})$ over the interest pattern association rules with $m(p_{vM}) = m(p') = 1$; $C_1$ and $C_2$ are respectively the minimum and maximum of $V(p')$ over the same association rules; $A_j$, $B_j$, $C_1$ and $C_2$ are all positive;
$V(p_{vj})$ is the variation of interest pattern $p_{vj}$ in a condition variable, and $V(p')$ is the variation of interest pattern $p'$ in the decision variable:
$V(p_{vj}) = (\max(p_{vj}) - \min(p_{vj})) \times m(p_{vj})$
$V(p') = (\max(p') - \min(p')) \times m(p')$
where $\max(p_{vj})$ and $\min(p_{vj})$ are respectively the largest and smallest time series values of interest pattern $p_{vj}$;
The time delay is $\Delta T_1 = \max(\Delta(r_g))$, $\Delta(r_g) = I_{p_{vM}} - I_{p'}$, where $I_{p_{vM}}$ is the start time of $p_{vM}$ and $I_{p'}$ is the start time of $p'$;
② The association rules with $m(p_{vM}) = m(p') = -1$ are merged into the following prediction rule:
If $E_1 \le V(p_{v1}) \le F_1$, and $E_2 \le V(p_{v2}) \le F_2$, …, and $E_j \le V(p_{vj}) \le F_j$, …, and $E_\eta \le V(p_{v\eta}) \le F_\eta$, then $G_1 \le V(p') \le G_2$, with a delay of $\Delta T_2$ time units;
where $E_j$ and $F_j$ are respectively the minimum and maximum of $V(p_{vj})$ over the interest pattern association rules with $m(p_{vM}) = m(p') = -1$; $G_1$ and $G_2$ are respectively the minimum and maximum of $V(p')$ over the same association rules; $j = 1, 2, \ldots, \eta$, $\eta \ge 1$ is the number of condition variables; $E_j$, $F_j$, $G_1$ and $G_2$ are all negative; $\Delta T_2 = \max(\Delta(r_g))$;
Stage three: The interest patterns obtained by running stage one on the condition variables of the data under test are matched against the prediction rules of stage two; if the antecedent of a prediction rule is satisfied, the predicted result for the decision variable is output.
Further, the candidate interest pattern clustering method in Step 2 may be replaced by the following method:
Step 2.1: Build the MBR (minimum bounding rectangle) of each pattern in the candidate pattern set using an R-tree, forming the data structure of the patterns and obtaining the pattern index;
Step 2.2: For each pair of child nodes $i$ and $j$ in the R-tree data structure, use pruning rules ① and ② below to set the distance of every pattern pair that satisfies a rule condition to infinity.
Pruning rule ①: If two interest patterns $p_\alpha, p_\beta$ do not both appear within the same region of width $w_s$, set $D_{\alpha\beta}$ in the distance matrix to infinity, where $w_s$ is a user-specified parameter, $D$ is the distance matrix of the interest patterns,
$D_{\alpha\beta} = d_{\alpha\beta}(p_\alpha, p_\beta)$, and $d_{\alpha\beta}$ is the Euclidean distance between $p_\alpha$ and $p_\beta$;
Pruning rule ②: If the slope of interest pattern $p_\alpha$ is negative and the slope of interest pattern $p_\beta$ is positive, set $D_{\alpha\beta}$ in the distance matrix $D$ to infinity;
Step 2.3: Every element $D_{\alpha\beta}$ of the distance matrix $D$ that has not been set to infinity is calculated as a Euclidean distance and assigned to the corresponding position in the distance matrix;
Step 2.4: Compare $d_{\alpha\beta}(p_\alpha, p_\beta)$ with the user-specified $d_{\min}$; if $d_{\alpha\beta} \le d_{\min}$, delete from $P_c$ whichever of the two interest patterns $p_\alpha$ and $p_\beta$ contains fewer time series values, finally obtaining the new interest pattern set $P$; where $d_{\min}$ is a user-specified value between the minimum and the maximum Euclidean distance between two adjacent time series values.
Compared with the prior art, the invention has the following beneficial effect: the pattern prediction method for multivariate time series data requires little computation and effectively reduces the time complexity of pattern prediction, solving the excessive time complexity of conventional methods.
Brief description of the drawings
Fig. 1 is the timing diagram of air temperature and rammed-earth temperature.
Fig. 2 is the variable relationship diagram of air temperature and rammed-earth temperature.
Fig. 3 shows the MBR between patterns.
Fig. 4 shows the R-tree-based data structure proposed by the invention for retrieving candidate patterns.
Fig. 5 plots, against the number of sequences, the performance of distance matrix calculation under the Euclidean distance measure with and without the pruning strategy.
Fig. 6 compares the time taken by Euclidean distance calculation using the pruning rules with the time taken by distance matrix calculation using pruning combined with the R-tree.
Fig. 7 is the performance evaluation of the six rules generated in the embodiments.
The content of the invention is explained in further detail below with reference to the embodiments.
Detailed description of the embodiments
In the invention, a condition variable is a variable that can be used to predict other variables, and a decision variable is a variable that can be predicted from other variables.
The procedure of the invention is divided into three stages:
Stage one: For the time series formed by each condition variable and by the decision variable, find the candidate interest pattern set, and cluster each candidate interest pattern set separately; Stage two: Generate prediction rules; Stage three: Predict on the data under test according to the prediction rules generated in stage two.
Stages one and two form the training phase: the prediction rules are obtained by running stages one and two on existing data. Stage three applies to the data under test: the interest patterns obtained by running stage one on the data under test are matched against the prediction rules of stage two, and if the antecedent of a prediction rule is satisfied, the predicted result for the decision variable is output.
In stage one, interest patterns are found and the behavior of the data is summarized: for the changes in the data, patterns with positive slope and patterns with negative slope are found. Because the sequence data may contain repeated patterns, the algorithm clusters and groups these patterns. Specifically:
Step 1: Find the candidate interest pattern set
Step 1.1: Find a usable initial subsequence
For a time series $S = \{s_1, \ldots, s_l\}$, starting from $s_1$, search in order for two adjacent time series values whose slope $m_1 \neq 0$, and take the first such pair as the initial subsequence $S_i = \{s_i, s_{i+1}\}$, where $i = 1, 2, \ldots, l-1$ and $l$ is the length of the time series. The slope $m_1$ is calculated as:
$$m_1 = \operatorname{sgn}(s_{i+1} - s_i) = \begin{cases} 1, & s_{i+1} > s_i \\ 0, & s_{i+1} = s_i \\ -1, & s_{i+1} < s_i \end{cases} \qquad (1)$$
Step 1.2: Calculate the slope of adjacent time series values
Append the next value $s_{i+2}$ to the usable initial subsequence and calculate the slope $m_2$ between $s_{i+2}$ and $s_{i+1}$;
Step 1.3: Obtain an interest pattern
If $m_2 \neq m_1$, the interest pattern $p_\alpha = \{s_i, s_{i+1}, s_{i+2}\}$ is obtained;
If $m_2 = m_1$, continue with Step 1.2 until $m_k \neq m_1$, obtaining the interest pattern $p_\alpha = \{s_i, s_{i+1}, \ldots, s_{i+k}\}$, where $m_k$ is the slope between $s_{i+k}$ and $s_{i+k-1}$, $k = 1, 2, \ldots, l-2$;
Step 1.4: Obtain the candidate interest pattern set
For the time series $S = \{s_1, \ldots, s_l\}$, starting from the last time series value of interest pattern $p_\alpha$ (i.e. $s_{i+k}$), repeat Steps 1.1 to 1.3 until all $n$ interest patterns in the whole time series $S$ have been found, forming the candidate interest pattern set $P_c = \{p_1, p_2, \ldots, p_\alpha, \ldots, p_\beta, \ldots, p_n\}$;
Step 2: Cluster the candidate interest patterns;
Similar patterns in the candidate pattern set are grouped together. The first step in finding similar patterns is to generate a distance matrix between the patterns: for every pair of patterns, one element of the distance matrix gives the distance between the two patterns. The time cost of the traditional algorithm is too high, however. To solve this problem, the invention proposes two methods: one uses pruning rules, and the other combines an R-tree with the pruning rules.
The pruning-rule method is as follows:
Step 2.1: Using pruning rules ① and ② below, set the distance of every pattern pair that satisfies a rule condition to infinity;
Pruning rule ①: If any two interest patterns $p_\alpha, p_\beta$ in the candidate interest pattern set $P_c$ do not both appear within the same region of width $w_s$, the distance between the two patterns is infinite, and $D_{\alpha\beta}$ in the distance matrix $D$ is set to infinity, where $w_s$ is a user-specified parameter, $D$ is the distance matrix of the interest patterns,
$D_{\alpha\beta} = d_{\alpha\beta}(p_\alpha, p_\beta)$, and $d_{\alpha\beta}$ is the Euclidean distance between $p_\alpha$ and $p_\beta$;
Pruning rule ②: Prune by the slope of the patterns. If the slope of interest pattern $p_\alpha$ is negative and the slope of interest pattern $p_\beta$ is positive, the two are not considered similar, and $D_{\alpha\beta}$ in the distance matrix $D$ is set to infinity;
Step 2.2: For every element $D_{\alpha\beta}$ of the distance matrix that has not been set to infinity, calculate the Euclidean distance and assign it to the corresponding position in the distance matrix.
Step 2.3: Compare $d_{\alpha\beta}(p_\alpha, p_\beta)$ with the user-specified $d_{\min}$; if $d_{\alpha\beta} \le d_{\min}$, delete from $P_c$ whichever of the two interest patterns $p_\alpha$ and $p_\beta$ contains fewer time series values, finally obtaining the new interest pattern set $P$; where $d_{\min}$ is a user-specified value between the minimum and the maximum Euclidean distance between two adjacent time series values.
In the method combining an R-tree with the pruning rules, the R-tree is used to index the candidate pattern set $P_c$. As shown in Fig. 4, $P_1$ to $P_9$ are candidate patterns, and each leaf node of the R-tree is the MBR of one candidate pattern. The method is as follows:
Step 2.1: Build the MBR of each pattern in the candidate pattern set using an R-tree, forming the data structure of the patterns and obtaining the pattern index. Each leaf node is the MBR of one pattern, and the intermediate entries of the R-tree index the patterns through neighbouring MBRs. This data structure reduces the time complexity of the algorithm by reducing the number of patterns processed. Fig. 3 illustrates the MBR between pattern $p_1$ and pattern $p_2$.
Step 2.2: For each pair of child nodes $i$ and $j$ in the R-tree data structure, use pruning rules ① and ② below to set the distance of every pattern pair that satisfies a rule condition to infinity.
Pruning rule ①: If any two interest patterns $p_\alpha, p_\beta$ in the candidate interest pattern set $P_c$ do not both appear within the same region of width $w_s$, the distance between the two patterns is infinite, and $D_{\alpha\beta}$ in the distance matrix $D$ is set to infinity, where $w_s$ is a user-specified parameter, $D$ is the distance matrix of the interest patterns,
$D_{\alpha\beta} = d_{\alpha\beta}(p_\alpha, p_\beta)$, and $d_{\alpha\beta}$ is the Euclidean distance between $p_\alpha$ and $p_\beta$;
Pruning rule ②: Prune by the slope of the patterns. If the slope of interest pattern $p_\alpha$ is negative and the slope of interest pattern $p_\beta$ is positive, the two are not considered similar, and $D_{\alpha\beta}$ in the distance matrix $D$ is set to infinity;
Step 2.3: For every element $D_{\alpha\beta}$ of the distance matrix that has not been set to infinity, calculate the Euclidean distance and assign it to the corresponding position in the distance matrix.
Step 2.4: Compare $d_{\alpha\beta}(p_\alpha, p_\beta)$ with the user-specified $d_{\min}$; if $d_{\alpha\beta} \le d_{\min}$, delete from $P_c$ whichever of the two interest patterns $p_\alpha$ and $p_\beta$ contains fewer time series values, finally obtaining the new interest pattern set $P$; where $d_{\min}$ is a user-specified value between the minimum and the maximum Euclidean distance between two adjacent time series values.
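The role of the MBR index can be illustrated with a small Python sketch. It does not build a real R-tree: a sort on the MBR start time followed by an early-terminating scan stands in for the index query, which is an assumption made to keep the example self-contained. The `(start_index, values)` pattern representation and the reading of "same region of width $w_s$" as both MBRs fitting inside one time window of width `w_s` are likewise assumptions.

```python
def mbr(pattern):
    """Minimum bounding rectangle (t_min, t_max, v_min, v_max) of a pattern."""
    start, values = pattern
    return (start, start + len(values) - 1, min(values), max(values))


def candidate_pairs(patterns, w_s):
    """Index pairs whose MBRs fit in one time window of width w_s.

    Only these pairs need a distance computation; every other pair keeps
    an infinite distance under pruning rule 1.
    """
    boxes = [mbr(p) for p in patterns]
    # Stand-in for the R-tree lookup: visit patterns in order of start time.
    order = sorted(range(len(boxes)), key=lambda k: boxes[k][0])
    pairs = []
    for pos, a in enumerate(order):
        for b in order[pos + 1:]:
            # Later boxes start even further away, so the scan can stop.
            if boxes[b][0] - boxes[a][0] > w_s:
                break
            extent = max(boxes[a][1], boxes[b][1]) - boxes[a][0]
            if extent <= w_s:
                pairs.append((min(a, b), max(a, b)))
    return pairs
```

The value bounds of each MBR are not used by this time-window filter; they are kept because an R-tree would index both axes.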
Stage two: Generate prediction rules
Step 3: Mine association rules with the Apriori algorithm
Merge the interest pattern sets $P$ of all time variables into $P_{all}$, and apply the Apriori algorithm to the interest patterns in $P_{all}$ to mine the association rules $r_g$ between the interest patterns of different time variables,
where $g = 1, 2, \ldots, R$, $p_{vj}$ is an interest pattern formed by a condition variable, and $p'$ is an interest pattern formed by the decision variable;
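A compact Apriori sketch in Python shows the kind of mining Step 3 refers to. The text does not say how transactions are formed from $P_{all}$, so the example assumes that each transaction is the set of pattern labels observed together in one time window; the label strings, the `min_support` and `min_confidence` thresholds and the function name are all hypothetical.

```python
from itertools import combinations


def apriori_rules(transactions, decision_items, min_support, min_confidence):
    """Mine rules 'condition patterns -> one decision pattern'."""
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    items = sorted({i for t in transactions for i in t})
    frequent = [frozenset([i]) for i in items
                if support(frozenset([i])) >= min_support]
    all_frequent = list(frequent)
    k = 2
    while frequent:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in frequent for b in frequent
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        prev = set(frequent)
        frequent = [c for c in candidates
                    if all(frozenset(s) in prev for s in combinations(c, k - 1))
                    and support(c) >= min_support]
        all_frequent.extend(frequent)
        k += 1

    rules = []
    for itemset in all_frequent:
        for target in itemset & decision_items:
            antecedent = itemset - {target}
            # Keep rules whose antecedent has condition patterns only.
            if antecedent and not (antecedent & decision_items):
                confidence = support(itemset) / support(antecedent)
                if confidence >= min_confidence:
                    rules.append((antecedent, target, confidence))
    return rules


windows = [{"air:up", "soil:up"}, {"air:up", "soil:up"},
           {"air:down", "soil:down"}, {"air:up", "soil:down"}]
print(apriori_rules(windows, {"soil:up", "soil:down"}, 0.5, 0.6))
```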
The direction, time delay and variation of an association rule $r_g$ are calculated as follows.
Direction of the association rule: it is given by the slopes $m(p_{vM})$ and $m(p')$,
where $m(p_{vM})$ is the slope between $s_L$ and $s_1$, $m(p_{vM}) = \operatorname{sgn}(s_L - s_1)$, $p_{vM}$ is the condition-variable interest pattern with the greatest influence on the decision variable, $s_L$ is the last time series value in $p_{vM}$ and $s_1$ is the first time series value in $p_{vM}$; $m(p')$ is the slope between $s'_L$ and $s'_1$, where $s'_L$ is the last time series value in $p'$ and $s'_1$ is the first time series value in $p'$;
Time delay:
$\Delta T_1 = \max(\Delta(r_g)), \quad \Delta(r_g) = I_{p_{vM}} - I_{p'} \qquad (4)$
where $I_{p_{vM}}$ is the start time of $p_{vM}$ and $I_{p'}$ is the start time of $p'$;
Variation of the rule:
$V(p_{vj}) = (\max(p_{vj}) - \min(p_{vj})) \times m(p_{vj})$
$V(p') = (\max(p') - \min(p')) \times m(p') \qquad (5)$
where $V(p_{vj})$ is the variation of interest pattern $p_{vj}$ in a condition variable, $V(p')$ is the variation of interest pattern $p'$ in the decision variable, and $\max(p_{vj})$ and $\min(p_{vj})$ are respectively the largest and smallest time series values of interest pattern $p_{vj}$.
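The three quantities attached to an association rule translate directly into code. The sketch below follows the definitions of $m(p)$, $V(p)$ and $\Delta(r_g)$ literally; the function names and the use of plain lists for pattern values are illustrative assumptions.

```python
def direction(values):
    """m(p) = sgn(s_L - s_1): +1 rising, -1 falling, 0 flat."""
    return (values[-1] > values[0]) - (values[-1] < values[0])


def variation(values):
    """V(p) = (max(p) - min(p)) * m(p), a signed amount of change."""
    return (max(values) - min(values)) * direction(values)


def rule_delay(start_condition, start_decision):
    """Delta(r_g) = I_pvM - I_p', taken literally from equation (4)."""
    return start_condition - start_decision


print(variation([20.0, 21.5, 23.0]))   # rising pattern: 3.0
print(variation([23.0, 22.0, 19.0]))   # falling pattern: -4.0
```

Multiplying the range by the slope sign is what makes the bounds of a rising rule positive and those of a falling rule negative.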
Step 4: Generate prediction rules
① The association rules with $m(p_{vM}) = m(p') = 1$ are merged into the following prediction rule:
If $A_1 \le V(p_{v1}) \le B_1$, and $A_2 \le V(p_{v2}) \le B_2$, …, and $A_j \le V(p_{vj}) \le B_j$, …, and $A_\lambda \le V(p_{v\lambda}) \le B_\lambda$, then $C_1 \le V(p') \le C_2$, with a delay of $\Delta T_1$ time units;
where $A_j$ and $B_j$ are respectively the minimum and maximum of $V(p_{vj})$ over the interest pattern association rules $r_g$ with $m(p_{vM}) = m(p') = 1$; $C_1$ and $C_2$ are respectively the minimum and maximum of $V(p')$ over the same association rules; $j = 1, 2, \ldots, \lambda$, $\lambda \ge 1$ is the number of condition variables; $A_j$, $B_j$, $C_1$ and $C_2$ are all positive.
② The association rules with $m(p_{vM}) = m(p') = -1$ are merged into the following prediction rule:
If $E_1 \le V(p_{v1}) \le F_1$, and $E_2 \le V(p_{v2}) \le F_2$, …, and $E_j \le V(p_{vj}) \le F_j$, …, and $E_\eta \le V(p_{v\eta}) \le F_\eta$, then $G_1 \le V(p') \le G_2$, with a delay of $\Delta T_2$ time units;
where $E_j$ and $F_j$ are respectively the minimum and maximum of $V(p_{vj})$ over the interest pattern association rules with $m(p_{vM}) = m(p') = -1$; $G_1$ and $G_2$ are respectively the minimum and maximum of $V(p')$ over the same association rules; $E_j$, $F_j$, $G_1$ and $G_2$ are all negative; $j = 1, 2, \ldots, \eta$, $\eta \ge 1$ is the number of condition variables; $\Delta T_2 = \max(\Delta(r_g))$, $\Delta(r_g) = I_{p_{vM}} - I_{p'}$, where $I_{p_{vM}}$ is the start time of $p_{vM}$ and $I_{p'}$ is the start time of $p'$;
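Merging association rules of one direction into a single prediction rule amounts to taking minima and maxima, as this Python sketch shows. The dictionary layout of a rule (`m_cond`, `m_dec`, `v_cond`, `v_dec`, `delay`) and the sample numbers are hypothetical and chosen only for the example.

```python
def merge_rules(rules, sign):
    """Merge all association rules with m(p_vM) = m(p') = sign.

    Each rule is a dict with the slope signs 'm_cond' and 'm_dec', the list
    'v_cond' of V(p_vj) for every condition variable, the value 'v_dec' of
    V(p'), and its 'delay' Delta(r_g).
    """
    group = [r for r in rules if r["m_cond"] == sign and r["m_dec"] == sign]
    if not group:
        return None
    n_cond = len(group[0]["v_cond"])
    return {
        # (A_j, B_j) or (E_j, F_j): bounds of each condition variable.
        "cond_bounds": [(min(r["v_cond"][j] for r in group),
                         max(r["v_cond"][j] for r in group))
                        for j in range(n_cond)],
        # (C_1, C_2) or (G_1, G_2): bounds of the decision variable.
        "dec_bounds": (min(r["v_dec"] for r in group),
                       max(r["v_dec"] for r in group)),
        # Delta T = max over the merged rules of Delta(r_g).
        "delay": max(r["delay"] for r in group),
    }


mined = [
    {"m_cond": 1, "m_dec": 1, "v_cond": [2.0], "v_dec": 1.0, "delay": 2},
    {"m_cond": 1, "m_dec": 1, "v_cond": [3.5], "v_dec": 1.5, "delay": 3},
    {"m_cond": -1, "m_dec": -1, "v_cond": [-2.0], "v_dec": -0.5, "delay": 1},
]
print(merge_rules(mined, 1))
```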
The invention uses the hit rate $H$ to evaluate the performance of the algorithm. $H$ is defined as:
$$H = \frac{N}{M}$$
where $N$ is the number of interest patterns in the condition variables that exactly match a prediction rule, and $M$ is the total number of interest patterns in the condition variables.
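Stage three and the hit rate can be sketched together. The example assumes that a prediction rule is stored as a dictionary of bounds (the keys `cond_bounds`, `dec_bounds` and `delay` are hypothetical) and that $H$ is the plain ratio of matching patterns to all patterns, which is what the definitions of $N$ and $M$ suggest.

```python
def matches(rule, v_cond):
    """True when every V(p_vj) lies inside its bounds (rule antecedent)."""
    return all(low <= v <= high
               for (low, high), v in zip(rule["cond_bounds"], v_cond))


def predict(rule, v_cond):
    """Stage three: return (decision bounds, delay) if the antecedent holds."""
    if matches(rule, v_cond):
        return rule["dec_bounds"], rule["delay"]
    return None


def hit_rate(rule, observed):
    """H = N / M over a list of condition-variable variations."""
    if not observed:
        return 0.0
    hits = sum(matches(rule, v) for v in observed)
    return hits / len(observed)


rule = {"cond_bounds": [(2.0, 3.5)], "dec_bounds": (1.0, 1.5), "delay": 3}
print(predict(rule, [2.5]))
print(hit_rate(rule, [[2.5], [4.0], [3.5], [1.0]]))
```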
Specific embodiments of the invention are given below. It should be noted that the invention is not limited to the following embodiments; all equivalent changes made on the basis of the technical scheme of this application fall within the protection scope of the invention.
Embodiment 1
This embodiment uses the algorithm of the invention to predict the soil temperature of Great Wall ruins. Fig. 1 and Fig. 2 are respectively the timing diagram of air temperature and rammed-earth temperature at the Ming Great Wall ruins and their variable relationship diagram. The time series data are processed through the three stages described above; in stage one, this embodiment clusters the candidate interest patterns using the pruning rules only. The result is shown in Fig. 5, where the naive curve represents clustering that uses neither the pruning rules nor the R-tree data structure, and the pruning-based curve represents clustering with the pruning rules. As the number of time series increases, the time taken by Euclidean distance calculation without the pruning rules grows exponentially, whereas the time taken by distance calculation with the pruning rules grows slowly.
Embodiment 2
This embodiment differs from Embodiment 1 in that stage one clusters the candidate interest patterns using the R-tree data structure combined with the pruning rules. This data structure reduces the time complexity of the algorithm by reducing the number of patterns processed, as shown in Fig. 6. The horizontal axis is the number of time series. As the number of time series increases, the time taken by distance matrix calculation using the pruning rules combined with the R-tree grows more slowly than the time taken using the pruning rules alone. In the timing diagram of air temperature, $p_1$ and $p_2$ are candidate patterns identified by the algorithm of the invention; the MBR between the two patterns can be built, and Fig. 3 shows the MBR between pattern $p_1$ and pattern $p_2$.
Table 1 lists the six specific prediction rules generated in this embodiment.
Table 1: Prediction rules
Fig. 7 shows the prediction results of this embodiment under the above six prediction rules; the horizontal axis is the monitored region and the vertical axis is the hit rate. Fig. 7 compares the hit rate $H$ of the six generated prediction rules in regions 1, 2, 3, 4 and 5; rule 3 has the highest mean hit rate.

Claims (2)

1. A pattern prediction method for multivariate time series data, characterized in that it comprises the following steps:
Stage one: For the time series formed by each condition variable and by the decision variable, find the candidate interest pattern set, and cluster each candidate interest pattern set separately;
Step 1: Find the candidate interest pattern set;
Step 1.1: Find a usable initial subsequence
For a time series $S = \{s_1, \ldots, s_l\}$, starting from $s_1$, search in order for two adjacent time series values whose slope $m_1 \neq 0$, and take the first such pair as the initial subsequence $S_i = \{s_i, s_{i+1}\}$, where $i = 1, 2, \ldots, l-1$ and $l$ is the length of the time series. The slope $m_1$ is calculated as:
$$m_1 = \operatorname{sgn}(s_{i+1} - s_i) = \begin{cases} 1, & s_{i+1} > s_i \\ 0, & s_{i+1} = s_i \\ -1, & s_{i+1} < s_i \end{cases} \qquad (1)$$
Step 1.2: Calculate the slope of adjacent time series values
Append the next value $s_{i+2}$ to the usable initial subsequence and calculate the slope $m_2$ between $s_{i+2}$ and $s_{i+1}$;
Step 1.3: Obtain an interest pattern
If $m_2 \neq m_1$, the interest pattern $p_\alpha = \{s_i, s_{i+1}, s_{i+2}\}$ is obtained;
If $m_2 = m_1$, continue with Step 1.2 until $m_k \neq m_1$, obtaining the interest pattern $p_\alpha = \{s_i, s_{i+1}, \ldots, s_{i+k}\}$, where $m_k$ is the slope between $s_{i+k}$ and $s_{i+k-1}$, $k = 1, 2, \ldots, l-2$;
Step 1.4: Obtain the candidate interest pattern set
For the time series $S = \{s_1, \ldots, s_l\}$, starting from the last time series value of interest pattern $p_\alpha$, repeat Steps 1.1 to 1.3 until all interest patterns in the whole time series $S$ have been found, forming the candidate interest pattern set $P_c = \{p_1, p_2, \ldots, p_\alpha, \ldots, p_\beta, \ldots, p_n\}$;
Step 2: Cluster the candidate interest patterns;
Step 2.1: Using pruning rules ① and ② below, set the distance of every pattern pair that satisfies a rule condition to infinity;
Pruning rule ①: If any two interest patterns $p_\alpha, p_\beta$ in the candidate interest pattern set $P_c$ do not both appear within the same region of width $w_s$, set $D_{\alpha\beta}$ in the distance matrix $D$ to infinity, where $w_s$ is a user-specified parameter, $D$ is the distance matrix of the interest patterns,
$D_{\alpha\beta} = d_{\alpha\beta}(p_\alpha, p_\beta)$, and $d_{\alpha\beta}$ is the Euclidean distance between $p_\alpha$ and $p_\beta$;
Pruning rule ②: If the slope of interest pattern $p_\alpha$ is negative and the slope of interest pattern $p_\beta$ is positive, set $D_{\alpha\beta}$ in the distance matrix $D$ to infinity;
Step 2.2: For every element $D_{\alpha\beta}$ of the distance matrix that has not been set to infinity, calculate the distance and assign it to the corresponding position in the distance matrix;
Step 2.3: Compare $d_{\alpha\beta}(p_\alpha, p_\beta)$ with the user-specified $d_{\min}$; if $d_{\alpha\beta} \le d_{\min}$, delete from $P_c$ whichever of the two interest patterns $p_\alpha$ and $p_\beta$ contains fewer time series values, finally obtaining the new interest pattern set $P$;
where $d_{\min}$ is a user-specified value between the minimum and the maximum Euclidean distance between two adjacent time series values;
Stage two:Produce prediction rule
Step 3:Correlation rule is calculated with Apriori algorithm
The interest mode collection P for merging each time variable obtains Pall, using Apriori algorithm to PallIn interest mode carry out Association rule mining, obtains multiple correlation rules between different time variable;
Step 4: Generate prediction rules
(1) The multiple association rules with $m(p_{vM}) = m(p') = 1$ are merged to form the following prediction rule:
If $A_1 \le V(p_{v1}) \le B_1$, and $A_2 \le V(p_{v2}) \le B_2$, ..., and $A_j \le V(p_{vj}) \le B_j$, ..., and $A_\lambda \le V(p_{v\lambda}) \le B_\lambda$, then $C_1 \le V(p') \le C_2$, with a delay of $\Delta T_1$ time units;
Wherein $p_{vj}$ is an interest mode formed by a condition variable, $j = 1, 2, \ldots, \lambda$, $\lambda \ge 1$, $\lambda$ is the number of condition variables, and $p'$ is the interest mode formed by the decision variable;
$m(p_{vM})$ is the sign of the slope between $s_L$ and $s_1$, $m(p_{vM}) = \mathrm{sgn}(s_L - s_1)$, where $p_{vM}$ is the interest mode among the condition variables that has the greatest influence on the decision variable, $s_L$ is the last time series value in $p_{vM}$, and $s_1$ is the first time series value in $p_{vM}$; $m(p')$ is the sign of the slope between $s'_L$ and $s'_1$, where $s'_L$ is the last time series value in $p'$ and $s'_1$ is the first time series value in $p'$;
$A_j$ and $B_j$ are respectively the minimum and maximum of $V(p_{vj})$ over the interest mode association rules with $m(p_{vM}) = m(p') = 1$; $C_1$ and $C_2$ are respectively the minimum and maximum of $V(p')$ over the same association rules; $A_j$, $B_j$, $C_1$ and $C_2$ are positive numbers;
$V(p_{vj})$ is the variation of interest mode $p_{vj}$ in the condition variable, and $V(p')$ is the variation of interest mode $p'$ in the decision variable,
$V(p_{vj}) = (\max(p_{vj}) - \min(p_{vj})) \times m(p_{vj})$
$V(p') = (\max(p') - \min(p')) \times m(p')$
$\max(p_{vj})$ and $\min(p_{vj})$ denote respectively the maximum and minimum time series values of interest mode $p_{vj}$;
Time delay $\Delta T_1 = \max(\Delta(r_g))$, where $I_{p_{vM}}$ is the start time of $p_{vM}$ and $I_{p'}$ is the start time of $p'$;
(2) The multiple association rules with $m(p_{vM}) = m(p') = -1$ are merged to form the following prediction rule:
If $E_1 \le V(p_{v1}) \le F_1$, and $E_2 \le V(p_{v2}) \le F_2$, ..., and $E_j \le V(p_{vj}) \le F_j$, ..., and $E_\eta \le V(p_{v\eta}) \le F_\eta$, then $G_1 \le V(p') \le G_2$, with a delay of $\Delta T_2$ time units;
Wherein $E_j$ and $F_j$ are respectively the minimum and maximum of $V(p_{vj})$ over the interest mode association rules with $m(p_{vM}) = m(p') = -1$, $j = 1, 2, \ldots, \eta$, $\eta \ge 1$, $\eta$ is the number of condition variables; $G_1$ and $G_2$ are respectively the minimum and maximum of $V(p')$ over the same association rules; $E_j$, $F_j$, $G_1$ and $G_2$ are negative numbers; $\Delta T_2 = \max(\Delta(r_g))$;
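The merging in step 4 can be sketched as below. The functions `slope_sign` and `variation` follow the formulas for $m(\cdot)$ and $V(\cdot)$ given in the claim; everything else is an assumption made for illustration: the dictionary layout of a rule, the `main` field marking which condition mode is $p_{vM}$, the precomputed `delay` field standing in for $\Delta(r_g)$, and the requirement that all rules list the same condition variables in the same order.

```python
def sign(x):
    return (x > 0) - (x < 0)

def slope_sign(mode):
    # m(p) = sgn(s_L - s_1): last value minus first value of the mode.
    return sign(mode[-1] - mode[0])

def variation(mode):
    # V(p) = (max(p) - min(p)) * m(p)
    return (max(mode) - min(mode)) * slope_sign(mode)

def merge_rules(rules, direction):
    """Merge association rules with m(p_vM) = m(p') = direction (+1 or -1).

    Each rule is a dict (hypothetical layout):
      "conditions": list of condition-variable modes p_v1..p_vn,
      "main": index of p_vM within "conditions",
      "decision": the decision-variable mode p',
      "delay": the delay of this rule, assumed to be given.
    """
    selected = [
        r for r in rules
        if slope_sign(r["conditions"][r["main"]]) == direction
        and slope_sign(r["decision"]) == direction
    ]
    if not selected:
        return None
    bounds = []
    for j in range(len(selected[0]["conditions"])):
        values = [variation(r["conditions"][j]) for r in selected]
        bounds.append((min(values), max(values)))  # (A_j, B_j) or (E_j, F_j)
    decision_values = [variation(r["decision"]) for r in selected]
    return {
        "bounds": bounds,
        "decision": (min(decision_values), max(decision_values)),
        "delay": max(r["delay"] for r in selected),  # delta T = max over rules
    }
```

Calling `merge_rules(rules, 1)` yields the rising rule of case (1) and `merge_rules(rules, -1)` the falling rule of case (2).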
Stage three: Apply stage one to the condition variables of the data under test, and match the resulting interest modes against the prediction rules of stage two; if the antecedent of a prediction rule is satisfied, output the prediction result for the decision variable.
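The matching in stage three might look like the following sketch. The layout of a prediction rule (a list of bounds on $V(p_{vj})$, a range for $V(p')$ and a delay) and the function names are assumptions chosen for illustration; only the formula for the variation $V$ is taken from the claim.

```python
def variation(mode):
    # V(p) = (max(p) - min(p)) * sgn(last value - first value)
    diff = mode[-1] - mode[0]
    return (max(mode) - min(mode)) * ((diff > 0) - (diff < 0))

def predict(rule, condition_modes):
    """Return the predicted decision range and delay, or None if no match.

    rule: dict with "bounds" (one (low, high) pair per condition variable),
          "decision" ((low, high) range of V(p')) and "delay" (time units).
    condition_modes: interest modes extracted from the data under test,
          in the same order as the rule's bounds (assumption).
    """
    if len(condition_modes) != len(rule["bounds"]):
        return None
    values = [variation(p) for p in condition_modes]
    # The antecedent holds when every V(p_vj) lies within its bounds.
    if all(low <= v <= high for v, (low, high) in zip(values, rule["bounds"])):
        return {"decision_range": rule["decision"], "delay": rule["delay"]}
    return None
```

For a rule with bounds `[(1.0, 3.0)]`, a test mode `[2.0, 4.0]` has variation 2.0 and therefore satisfies the antecedent, while `[2.0, 9.0]` does not.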
2. The mode prediction method for multivariate time series data as claimed in claim 1, characterized in that the candidate interest mode clustering method in said step 2 is replaced with the following method:
Step 2.1: Use an R-tree to build the MBR (minimum bounding rectangle) of each mode in the candidate mode set, obtaining an index data structure over the modes;
Step 2.2: For each pair of child nodes $i$ and $j$ in the R-tree data structure, apply the following pruning rules (1) and (2), and set to infinity the mode distance values that satisfy a rule condition.
Pruning rule (1): If two interest modes $p_\alpha$, $p_\beta$ do not both appear in the same region of width $w_s$, set $D_{\alpha\beta}$ in the distance matrix to infinity; wherein $w_s$ is a user-specified parameter and $D$ is the distance matrix of the interest modes,
$D_{\alpha\beta} = d_{\alpha\beta}(p_\alpha, p_\beta)$, where $d_{\alpha\beta}$ is the Euclidean distance between $p_\alpha$ and $p_\beta$;
Pruning rule (2): If the slope of interest mode $p_\alpha$ is negative and the slope of interest mode $p_\beta$ is positive, set $D_{\alpha\beta}$ in the distance matrix $D$ to infinity;
Step 2.3: For every element $D_{\alpha\beta}$ of the distance matrix $D$ that is not infinite, calculate the Euclidean distance and assign it to the corresponding position in the distance matrix;
Step 2.4: Compare $d_{\alpha\beta}(p_\alpha, p_\beta)$ with the user-specified threshold $d_{min}$; if $d_{\alpha\beta} \le d_{min}$, delete from $P_c$ whichever of the two interest modes $p_\alpha$ and $p_\beta$ contains fewer time series values, finally obtaining the new interest mode set $P$; wherein $d_{min}$ takes a value between the minimum and the maximum Euclidean distance between two adjacent time series values, and is specified by the user.
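The two pruning rules of claim 2 can be illustrated without an R-tree by a direct pairwise check. This sketch deliberately swaps the R-tree index for a simple double loop, and it rests on assumptions not found in the claim: a mode is represented as a `(start_time, values)` pair, the time axis is divided into consecutive regions of width `w_s`, and a mode belongs to the region containing its start time.

```python
def slope_sign(values):
    diff = values[-1] - values[0]
    return (diff > 0) - (diff < 0)

def pruned_pairs(modes, w_s):
    """Return index pairs (a, b), a < b, whose distance is set to infinity.

    modes: list of (start_time, values) pairs (hypothetical representation).
    w_s: user-specified region width.
    """
    pruned = set()
    for a in range(len(modes)):
        for b in range(a + 1, len(modes)):
            start_a, values_a = modes[a]
            start_b, values_b = modes[b]
            # Pruning rule (1): the modes are not in the same region of width w_s.
            different_region = (start_a // w_s) != (start_b // w_s)
            # Pruning rule (2): one slope is negative and the other positive.
            opposite_slopes = slope_sign(values_a) * slope_sign(values_b) < 0
            if different_region or opposite_slopes:
                pruned.add((a, b))
    return pruned
```

Only the pairs that survive both rules need an actual Euclidean distance calculation in step 2.3, which is the point of pruning before clustering.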
CN201710324105.6A 2017-05-09 2017-05-09 Earth temperature mode prediction method Active CN107220483B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710324105.6A CN107220483B (en) 2017-05-09 2017-05-09 Earth temperature mode prediction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710324105.6A CN107220483B (en) 2017-05-09 2017-05-09 Earth temperature mode prediction method

Publications (2)

Publication Number Publication Date
CN107220483A true CN107220483A (en) 2017-09-29
CN107220483B CN107220483B (en) 2021-01-01

Family

ID=59944105

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710324105.6A Active CN107220483B (en) 2017-05-09 2017-05-09 Earth temperature mode prediction method

Country Status (1)

Country Link
CN (1) CN107220483B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111797301A (en) * 2019-04-09 2020-10-20 Oppo广东移动通信有限公司 Activity prediction method, activity prediction device, storage medium and electronic equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040086180A1 (en) * 2002-11-01 2004-05-06 Ajay Divakaran Pattern discovery in video content using association rules on multiple sets of labels
CN102637208A (en) * 2012-03-28 2012-08-15 南京财经大学 Method for filtering noise data based on pattern mining
EP2772872A2 (en) * 2013-02-28 2014-09-03 Samsung Electronics Co., Ltd. Method and apparatus for searching pattern in sequence data
US20150012540A1 (en) * 2013-07-02 2015-01-08 Hewlett-Packard Development Company, L.P. Deriving an interestingness measure for a cluster
US20150269157A1 (en) * 2014-03-21 2015-09-24 International Business Machines Corporation Knowledge discovery in data analytics
CN105320756A (en) * 2015-10-15 2016-02-10 江苏省邮电规划设计院有限责任公司 Improved Apriori algorithm based method for mining database association rule
CN106384128A (en) * 2016-09-09 2017-02-08 西安交通大学 Method for mining time series data state correlation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JA-HWUNG SU ET AL: "Efficient Relevance Feedback for Content-Based Image Retrieval by Mining User Navigation Patterns", 《IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING》 *
吴志华: "基于知识发现的时序数据挖掘算法研究", 《中国优秀博硕士学位论文全文数据库 (硕士) 信息科技辑》 *

Also Published As

Publication number Publication date
CN107220483B (en) 2021-01-01

Similar Documents

Publication Publication Date Title
Xuan et al. Multi-model fusion short-term load forecasting based on random forest feature selection and hybrid neural network
Chen et al. An improved artificial bee colony algorithm combined with extremal optimization and Boltzmann Selection probability
CN106326346A (en) Text classification method and terminal device
US20230052730A1 (en) Method for predicting operation state of power distribution network with distributed generations based on scene analysis
CN110851566A (en) Improved differentiable network structure searching method
CN106570250A (en) Power big data oriented microgrid short-period load prediction method
CN103455612B (en) Based on two-stage policy non-overlapped with overlapping network community detection method
CN111861013A (en) Power load prediction method and device
CN107798426A (en) Wind power interval Forecasting Methodology based on Atomic Decomposition and interactive fuzzy satisfying method
CN112328578A (en) Database query optimization method based on reinforcement learning and graph attention network
CN104951847A (en) Rainfall forecast method based on kernel principal component analysis and gene expression programming
CN113780002A (en) Knowledge reasoning method and device based on graph representation learning and deep reinforcement learning
CN114169434A (en) Load prediction method
CN110428015A (en) A kind of training method and relevant device of model
CN105843907A (en) Method for establishing memory index structure-distance tree and similarity connection algorithm based on distance tree
CN106296434A (en) A kind of Grain Crop Yield Prediction method based on PSO LSSVM algorithm
CN116662412B (en) Data mining method for big data of power grid distribution and utilization
CN110765582A (en) Self-organization center K-means microgrid scene division method based on Markov chain
CN107220483A (en) A kind of mode prediction method of polynary time series data
Kim et al. A daily tourism demand prediction framework based on multi-head attention CNN: The case of the foreign entrant in South Korea
CN110111838B (en) Method and device for predicting RNA folding structure containing false knot based on expansion structure
CN104657429B (en) Technology-driven type Product Innovation Method based on complex network
CN103136515A (en) Creative inflection point identification method based on draft action sequence and system using the same
Mo et al. Simulated annealing for neural architecture search
CN114647679A (en) Hydrological time series motif mining method based on numerical characteristic clustering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20221213

Address after: 710075 Room 021, F2003, 20th Floor, Building 4-A, Xixian Financial Port, Fengdong New City Energy Jinmao District, Xixian New District, Xi'an, Shaanxi

Patentee after: Shaanxi Dahang Wujiang Information Technology Co.,Ltd.

Address before: 710069 No. 229 Taibai North Road, Shaanxi, Xi'an

Patentee before: NORTHWEST University

TR01 Transfer of patent right