CN103279643B - A kind of computational methods of time series similarity - Google Patents
- Publication number
- CN103279643B
- Authority
- CN
- China
- Prior art keywords
- time series
- section
- time subsequence
- sequence
- subsequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Abstract
The invention discloses a method for computing time series similarity, in the field of computer information processing technology. The method comprises: dividing the two time series to be compared, S1 and S2, into time subsequences S1(i) and S2(i) in the same manner; setting a weight w_i for each pair of time subsequences S1(i) and S2(i); computing the distance between corresponding time subsequences; and computing the similarity of S1 and S2 from the distances between corresponding time subsequences and the weights of the time subsequences. The invention better reflects the degree of shape similarity between time series, has low complexity, and judges similarity quickly.
Description
Technical field
The invention belongs to the field of computer information processing technology, and in particular relates to a method for computing time series similarity.
Background art
Time series arise in large numbers in application fields such as business, finance, medicine, astronomy, meteorology, aerospace, and electric power and energy. In time series analysis, judging the similarity of time series data is a fundamental problem, with wide application in time series query, pattern matching, classification, and data mining.
Existing distance-based methods for computing time series similarity fall mainly into two kinds: methods based on Euclidean distance (or Euclidean-like distances) and methods based on dynamic time warping (DTW) distance. Euclidean-distance methods have no shape-recognition capability and cannot recognize pattern changes of a time series at different resolutions. DTW-based methods align and match the series along the minimum-cost time warping path and can therefore handle stretching and compression of the time axis, but the DTW distance does not satisfy the triangle inequality and, in particular, its time complexity is O(n²), where n is the length of the time series. The amount of computation is very large, and in many cases the method cannot be applied in practice.
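The contrast between the two existing approaches can be sketched as follows. This is a minimal illustration, not part of the patent: the function names and the absolute-difference step cost are assumptions, and the DTW shown is the textbook dynamic-programming form whose table of size (n+1) x (n+1) is what gives the O(n²) cost mentioned above.

```python
import math


def euclidean_distance(a, b):
    """Point-by-point Euclidean distance; only defined for equal lengths."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Textbook dynamic time warping with an absolute-difference step cost.

    Every cell of a (len(a)+1) x (len(b)+1) table is filled once,
    so time grows with len(a) * len(b).
    """
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # repeat an element of b
                cost[i][j - 1],      # repeat an element of a
                cost[i - 1][j - 1],  # advance both series
            )
    return cost[len(a)][len(b)]
```

DTW can match `[1, 2, 3]` against the stretched series `[1, 2, 2, 3]` at zero cost, whereas the Euclidean distance is not even defined for series of unequal length.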
To address the low accuracy and high computational cost of current time series similarity methods, the invention provides a time series similarity computation method based on an improved distance, which enables fast and accurate judgment of time series similarity.
Summary of the invention
The object of the invention is to provide a time series similarity computation method based on an improved distance, in order to overcome the deficiencies of existing time series similarity computation methods.
To achieve this object, the technical solution proposed by the invention is a method for computing time series similarity, characterized in that the method comprises:
Step 1: divide the two time series to be compared, S1 and S2, into time subsequences S1(i) and S2(i) in the same manner, where i = 1, 2, ..., n and n is the number of time subsequences;
Step 2: set the weight w_i of each time subsequence Sj(i), where j = 1, 2 and i = 1, 2, ..., n;
Step 3: compute the distance between corresponding time subsequences;
Step 4: compute the similarity of time series S1 and S2 from the distances between corresponding time subsequences and the weights of the time subsequences;
Dividing the two time series to be compared, S1 and S2, into time subsequences S1(i) and S2(i) is specifically as follows:
The time subsequences are obtained by dividing the time series equally. If the number of elements L1 of time series S1 is an integer multiple of n, S1 is divided into n segments, each segment being one time subsequence, and the number of elements of each time subsequence is $L_1/n$;
If the number of elements L1 of time series S1 is not an integer multiple of n, S1 is divided into n segments, each segment being one time subsequence. Either segments 1 to n-1 each have $\left[L_1/n\right]$ elements and segment n has $L_1-(n-1)\left[L_1/n\right]$ elements, or segments 2 to n each have $\left[L_1/n\right]$ elements and segment 1 has $L_1-(n-1)\left[L_1/n\right]$ elements, where [·] is the integer (rounding) operation;
If the number of elements L2 of time series S2 is an integer multiple of n, S2 is divided into n segments, each segment being one time subsequence, and the number of elements of each time subsequence is $L_2/n$;
If the number of elements L2 of time series S2 is not an integer multiple of n, S2 is divided into n segments, each segment being one time subsequence. Either segments 1 to n-1 each have $\left[L_2/n\right]$ elements and segment n has $L_2-(n-1)\left[L_2/n\right]$ elements, or segments 2 to n each have $\left[L_2/n\right]$ elements and segment 1 has $L_2-(n-1)\left[L_2/n\right]$ elements, where [·] is the integer (rounding) operation;
The weight w_i of each time subsequence Sj(i) is set according to the formula $w_i = L_j(i)/L_j$, where Lj(i) is the number of elements of time subsequence Sj(i), Lj is the number of elements of time series Sj, j = 1, 2, i = 1, 2, ..., n, and n is the number of subsequences.
The distance between corresponding time subsequences is computed with a formula [equation not preserved in this text], in which:
d(S1(i), S2(i)) is the distance between corresponding time subsequences S1(i) and S2(i);
[symbol not preserved] is the p-th element of time subsequence S1(i);
[symbol not preserved] is the p-th element of time subsequence S2(i);
m is the number of elements of time subsequences S1(i) and S2(i), i = 1, 2, ..., n;
σ is the standard deviation of the differences between corresponding elements of S1(i) and S2(i) [equation not preserved in this text];
μ is the arithmetic mean of the differences between corresponding elements of S1(i) and S2(i) [equation not preserved in this text].
The similarity of time series S1 and S2 is computed with a formula [equation not preserved in this text], in which:
Sim(S1, S2) is the similarity of time series S1 and S2;
d(S1(i), S2(i)) is the distance between corresponding time subsequences S1(i) and S2(i);
w_i is the weight of time subsequences S1(i) and S2(i);
n is the number of subsequences.
The invention segments the time series into equal or unequal lengths and sets a weight for each time subsequence, so it can meet practical similarity-judgment needs in different situations. In addition, compared with traditional distance computations, the time subsequence distance provided by the invention better reflects the degree of shape similarity of time series (that is, whether local trends agree or differ), which accords better with everyday human experience and visual comparison and gives a more accurate judgment. Finally, the invention computes the time subsequence distances segment by segment and then computes the similarity from the subsequence distances, so the computational complexity is low and time series similarity can be judged quickly.
Brief description of the drawings
Fig. 1 is a flow chart of the time series similarity computation method provided by the invention.
Detailed description of the invention
Preferred embodiments are described in detail below with reference to the accompanying drawing. It should be emphasized that the following description is merely exemplary and is not intended to limit the scope of the invention or its applications.
Fig. 1 is a flow chart of the time series similarity computation method provided by the invention. As shown in Fig. 1, the method comprises:
Step 1: divide the two time series to be compared, S1 and S2, into time subsequences S1(i) and S2(i) in the same manner, where i = 1, 2, ..., n and n is the number of time subsequences.
The manner of division depends on the practical requirements of the similarity judgment. If the local similarity of every period of the time series influences the global similarity to the same degree, the subsequences can be obtained by dividing the time series equally: the series is divided by length into n time subsequences of equal length (if it cannot be divided evenly, the first or the last time subsequence may differ in length from the others). The specific length of each time subsequence depends on the characteristics of the time series and is set according to the fluctuation characteristics of time series in the field concerned: the smaller the fluctuation amplitude of the data values, the larger the subsequence length; the more violently the data values fluctuate, the smaller the subsequence length. In general, the length of a time subsequence is between 5 and 15.
If the local similarity of different periods influences the global similarity to different degrees, that is, the data of one (or several) periods have a large influence on the similarity of the whole time series while the data of the other periods have a smaller influence, the time subsequences can be obtained by dividing the time series unequally. For example, the data of the one (or several) periods with the larger influence are segmented separately, the length of each such period being the length of the corresponding time subsequence, and the data of the remaining periods are divided into subsequences equally.
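Unequal division as described above can be sketched by cutting the series at caller-chosen positions. The function name `split_at` and the idea of passing explicit cut indices are assumptions made for illustration; the patent does not prescribe how the influential periods are located.

```python
def split_at(series, cut_points):
    """Split `series` into consecutive segments at the given indices.

    `cut_points` (hypothetical parameter) lists the start index of the
    second, third, ... segment; it must be strictly increasing and lie
    strictly inside the series.
    """
    cuts = [0] + list(cut_points) + [len(series)]
    if any(a >= b for a, b in zip(cuts, cuts[1:])):
        raise ValueError("cut points must be strictly increasing and inside the series")
    return [series[a:b] for a, b in zip(cuts, cuts[1:])]


# A 10-point series whose middle period (indices 3..6) is treated as its own segment.
segments = split_at(list(range(10)), [3, 7])
```

The same cut points would be applied to both series being compared, so that corresponding subsequences cover the same periods.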
Equal division into subsequences is described below. If the number of elements L1 (L2) of time series S1 (S2) is an integer multiple of n, S1 (S2) is divided into n segments. Each segment is one time subsequence, and the number of elements of each time subsequence is $L_1/n$ ($L_2/n$). For example, if the time series to be divided has 20 elements, that is, data were collected at 20 time points, and it is to be divided into 5 time subsequences, then every 4 elements (the data of 4 time points) form one segment, that is, one time subsequence, and the series is thereby divided into 5 time subsequences.
If the number of elements L1 (L2) of time series S1 (S2) is not an integer multiple of n, S1 (S2) is divided into n segments, each segment being one time subsequence. Either segments 1 to n-1 each have $\left[L_1/n\right]$ ($\left[L_2/n\right]$) elements and segment n has $L_1-(n-1)\left[L_1/n\right]$ ($L_2-(n-1)\left[L_2/n\right]$) elements, or segments 2 to n each have $\left[L_1/n\right]$ ($\left[L_2/n\right]$) elements and segment 1 has $L_1-(n-1)\left[L_1/n\right]$ ($L_2-(n-1)\left[L_2/n\right]$) elements, where [·] is the integer (rounding) operation. Taking again a time series of 20 elements, if it is to be divided into 6 time subsequences, segments 1 to 5 can each have 3 elements and segment 6 has 5 elements; or segments 2 to 6 can each have 3 elements and segment 1 has 5 elements.
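The equal-division rule and its two worked examples can be sketched as below. The function name `split_equal` and the `remainder_at` parameter are assumptions introduced for illustration, and the bracket operation is taken to be integer (floor) division, which is consistent with the 20-element example.

```python
def split_equal(series, n, remainder_at="last"):
    """Divide `series` into n consecutive segments of (nearly) equal length.

    Every segment gets len(series) // n elements; any leftover elements are
    added to the last segment (remainder_at="last") or to the first
    (remainder_at="first").
    """
    length = len(series)
    if n < 1 or n > length:
        raise ValueError("n must be between 1 and len(series)")
    base = length // n
    sizes = [base] * n
    extra = length - n * base  # 0 when length is a multiple of n
    if remainder_at == "last":
        sizes[-1] += extra
    elif remainder_at == "first":
        sizes[0] += extra
    else:
        raise ValueError("remainder_at must be 'first' or 'last'")
    segments, start = [], 0
    for size in sizes:
        segments.append(series[start:start + size])
        start += size
    return segments


data = list(range(20))
sizes_5 = [len(s) for s in split_equal(data, 5)]                # [4, 4, 4, 4, 4]
sizes_6_last = [len(s) for s in split_equal(data, 6)]           # [3, 3, 3, 3, 3, 5]
sizes_6_first = [len(s) for s in split_equal(data, 6, "first")]  # [5, 3, 3, 3, 3, 3]
```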
Step 2: set the weight w_i of each time subsequence Sj(i).
In the invention, the weight of each time subsequence is set according to its length: the weight of a time subsequence is the ratio of the length of that time subsequence to the length of the whole time series. The weight w_i of time subsequence Sj(i) is computed as follows:
$w_i = L_j(i)/L_j$    (1)
In formula (1), w_i is the weight of time subsequences S1(i) and S2(i); Lj(i) is the number of elements of time subsequence Sj(i), that is, the length of Sj(i); Lj is the number of elements of time series Sj, that is, the length of Sj; j = 1, 2, i = 1, 2, ..., n, and n is the number of subsequences.
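Formula (1) amounts to dividing each segment length by the total length. A minimal sketch, with the function name `length_weights` assumed for illustration:

```python
def length_weights(segments):
    """Weight of each segment = its length / total length, as in formula (1)."""
    total = sum(len(seg) for seg in segments)
    if total == 0:
        raise ValueError("segments must contain at least one element")
    return [len(seg) / total for seg in segments]


# 20 elements split into segments of 3, 3, 3, 3, 3 and 5 elements.
example_segments = [[0] * 3] * 5 + [[0] * 5]
weights = length_weights(example_segments)
```

Because the segment lengths add up to the series length, weights computed this way always sum to 1.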
For time subsequences obtained by dividing the time series unequally, the weight of each time subsequence can be set according to how strongly it influences the global similarity of the time series: the larger the influence, the larger the weight, and the smaller the influence, the smaller the weight. Each weight must lie in the range (0, 1), and the weights of all time subsequences obtained from the division must sum to 1, that is, they must satisfy the following constraint:
$\sum_{i=1}^{n} w_i = 1$    (2)
In formula (2), w_i is the weight of time subsequences S1(i) and S2(i), and n is the number of time subsequences.
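When weights are chosen by hand for unequal division, the two conditions stated above (each weight in the open interval (0, 1), and all weights summing to 1) can be checked before use. The function name `validate_weights` and the tolerance parameter are assumptions for illustration.

```python
import math


def validate_weights(weights, tol=1e-9):
    """Return the weights as a list if each lies in (0, 1) and they sum to 1."""
    weights = list(weights)
    if not weights:
        raise ValueError("at least one weight is required")
    if any(not (0.0 < w < 1.0) for w in weights):
        raise ValueError("every weight must lie strictly between 0 and 1")
    if not math.isclose(sum(weights), 1.0, abs_tol=tol):
        raise ValueError("weights must sum to 1")
    return weights


# Three periods, the first judged most influential (illustrative values).
manual_weights = validate_weights([0.5, 0.3, 0.2])
```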
Step 3: compute the distance between corresponding time subsequences.
The distance between corresponding time subsequences is computed with formula (3) [equation not preserved in this text].
In formula (3), d(S1(i), S2(i)) is the distance between corresponding time subsequences S1(i) and S2(i); the p-th element of S1(i) is the data value at the p-th time point of S1(i), and likewise for the p-th element of S2(i); m is the number of elements of S1(i) and S2(i); and σ is the standard deviation of the differences between corresponding elements of S1(i) and S2(i), computed with formula (4) [equation not preserved in this text].
In formula (4), μ is the arithmetic mean of the differences between corresponding elements of S1(i) and S2(i), computed with formula (5) [equation not preserved in this text].
In formulas (3) and (4), i = 1, 2, ..., n, where n is the number of time subsequences.
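The two statistics named above, the arithmetic mean μ and the standard deviation σ of the element-wise differences, can be sketched as follows. The exact formulas are not preserved in this text, so two choices here are assumptions: the differences are taken as signed values, and σ uses the population form (division by m rather than m - 1). The function name `difference_stats` is also hypothetical.

```python
import math


def difference_stats(seg1, seg2):
    """Mean and standard deviation of the element-wise differences seg1 - seg2.

    Assumptions: signed differences, population standard deviation.
    """
    if len(seg1) != len(seg2) or not seg1:
        raise ValueError("segments must be non-empty and of equal length")
    diffs = [x - y for x, y in zip(seg1, seg2)]
    m = len(diffs)
    mu = sum(diffs) / m
    sigma = math.sqrt(sum((d - mu) ** 2 for d in diffs) / m)
    return mu, sigma


same_shape = difference_stats([1, 2, 3], [4, 5, 6])      # constant offset between segments
opposite_trend = difference_stats([1, 2, 3], [3, 2, 1])  # rising vs falling segment
```

Under these assumptions, two segments with the same shape but a constant vertical offset give σ = 0, while segments with opposite trends give a larger σ, which suggests why a distance built on these statistics is sensitive to shape.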
Step 4: compute the similarity of time series S1 and S2 from the distances between corresponding time subsequences and the weights of the time subsequences.
The similarity of time series S1 and S2 is computed with formula (6) [equation not preserved in this text].
In formula (6), Sim(S1, S2) is the similarity of time series S1 and S2, d(S1(i), S2(i)) is the distance between corresponding time subsequences S1(i) and S2(i), w_i is the weight of time subsequences S1(i) and S2(i), and n is the number of time subsequences.
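The overall flow of steps 1 to 4 can be sketched with the two formulas left as caller-supplied functions, since neither the subsequence distance nor the similarity formula is preserved in this text. All names below are hypothetical, and `mean_abs_diff` and `weighted_sum` are stand-ins chosen only to make the sketch runnable; they are not the patent's formulas. The sketch also assumes both series have the same length so that corresponding segments match.

```python
def split_equal(series, n):
    """Equal division; leftover elements go to the last segment."""
    base = len(series) // n
    cuts = [k * base for k in range(n)] + [len(series)]
    return [series[a:b] for a, b in zip(cuts, cuts[1:])]


def compare_series(s1, s2, n, segment_distance, aggregate):
    """Steps 1-4: split, weight by length, per-segment distance, combine."""
    if len(s1) != len(s2):
        raise ValueError("this sketch assumes series of equal length")
    segs1, segs2 = split_equal(s1, n), split_equal(s2, n)         # step 1
    weights = [len(seg) / len(s1) for seg in segs1]               # step 2
    distances = [segment_distance(a, b) for a, b in zip(segs1, segs2)]  # step 3
    return aggregate(weights, distances)                          # step 4


def mean_abs_diff(a, b):
    """Placeholder segment distance (NOT the patent's formula)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def weighted_sum(weights, distances):
    """Placeholder aggregation (NOT the patent's formula)."""
    return sum(w * d for w, d in zip(weights, distances))


series_a = list(range(20))
series_b = [x + 1 for x in series_a]
identical = compare_series(series_a, series_a, 5, mean_abs_diff, weighted_sum)
shifted = compare_series(series_a, series_b, 5, mean_abs_diff, weighted_sum)
```

Each element is visited a constant number of times, so with per-segment distances that are linear in the segment length the whole comparison is linear in the series length.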
The invention has the following notable advantages:
(1) The method segments the time series into equal or unequal lengths according to the relationship between local and global similarity and sets the weight of each time subsequence, so as to meet the practical similarity-judgment needs of different users.
(2) Different segment lengths capture the degree of trend similarity of time series at different time scales and give similarity judgment a coarse or fine granularity suited to different needs, so the method has good temporal multi-resolution characteristics.
(3) Taking shape similarity into account, a new time subsequence distance is designed. Compared with traditional distances, it better reflects the degree of shape similarity of time series (that is, whether local trends agree or differ), accords better with everyday human experience and visual comparison, and gives more accurate judgments.
(4) The method has low computational complexity and can judge time series similarity quickly.
The above is only a preferred embodiment of the invention, but the scope of protection of the invention is not limited to it. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. The scope of protection of the invention shall therefore be determined by the scope of the claims.
Claims (2)
1. A method for computing time series similarity, characterized in that the method comprises:
Step 1: dividing the two time series to be compared, S1 and S2, into time subsequences S1(i) and S2(i) in the same manner, where i = 1, 2, ..., n and n is the number of time subsequences;
Step 2: setting the weight w_i of each time subsequence Sj(i), where j = 1, 2 and i = 1, 2, ..., n;
Step 3: computing the distance between corresponding time subsequences;
Step 4: computing the similarity of time series S1 and S2 from the distances between corresponding time subsequences and the weights of the time subsequences;
wherein dividing the two time series to be compared, S1 and S2, into time subsequences S1(i) and S2(i) is specifically:
the time subsequences are obtained by dividing the time series equally; if the number of elements L1 of time series S1 is an integer multiple of n, S1 is divided into n segments, each segment being one time subsequence, and the number of elements of each time subsequence is $L_1/n$;
if the number of elements L1 of time series S1 is not an integer multiple of n, S1 is divided into n segments, each segment being one time subsequence; either segments 1 to n-1 each have $\left[L_1/n\right]$ elements and segment n has $L_1-(n-1)\left[L_1/n\right]$ elements, or segments 2 to n each have $\left[L_1/n\right]$ elements and segment 1 has $L_1-(n-1)\left[L_1/n\right]$ elements, where [·] is the integer (rounding) operation;
if the number of elements L2 of time series S2 is an integer multiple of n, S2 is divided into n segments, each segment being one time subsequence, and the number of elements of each time subsequence is $L_2/n$;
if the number of elements L2 of time series S2 is not an integer multiple of n, S2 is divided into n segments, each segment being one time subsequence; either segments 1 to n-1 each have $\left[L_2/n\right]$ elements and segment n has $L_2-(n-1)\left[L_2/n\right]$ elements, or segments 2 to n each have $\left[L_2/n\right]$ elements and segment 1 has $L_2-(n-1)\left[L_2/n\right]$ elements, where [·] is the integer (rounding) operation;
the weight w_i of each time subsequence Sj(i) is set according to the formula $w_i = L_j(i)/L_j$, where Lj(i) is the number of elements of time subsequence Sj(i), Lj is the number of elements of time series Sj, j = 1, 2, i = 1, 2, ..., n, and n is the number of subsequences;
the distance between corresponding time subsequences is computed with a formula [equation not preserved in this text], in which:
d(S1(i), S2(i)) is the distance between corresponding time subsequences S1(i) and S2(i);
[symbol not preserved] is the p-th element of time subsequence S1(i);
[symbol not preserved] is the p-th element of time subsequence S2(i);
m is the number of elements of time subsequences S1(i) and S2(i), i = 1, 2, ..., n;
σ is the standard deviation of the differences between corresponding elements of S1(i) and S2(i) [equation not preserved in this text];
μ is the arithmetic mean of the differences between corresponding elements of S1(i) and S2(i) [equation not preserved in this text].
2. The method according to claim 1, characterized in that the similarity of time series S1 and S2 is computed with a formula [equation not preserved in this text], in which:
Sim(S1, S2) is the similarity of time series S1 and S2;
d(S1(i), S2(i)) is the distance between corresponding time subsequences S1(i) and S2(i);
w_i is the weight of time subsequences S1(i) and S2(i);
n is the number of subsequences.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310151558.5A CN103279643B (en) | 2013-04-26 | 2013-04-26 | A kind of computational methods of time series similarity |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310151558.5A CN103279643B (en) | 2013-04-26 | 2013-04-26 | A kind of computational methods of time series similarity |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103279643A CN103279643A (en) | 2013-09-04 |
CN103279643B true CN103279643B (en) | 2016-08-24 |
Family
ID=49062158
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310151558.5A Expired - Fee Related CN103279643B (en) | 2013-04-26 | 2013-04-26 | A kind of computational methods of time series similarity |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103279643B (en) |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103886195B (en) * | 2014-03-14 | 2015-08-26 | 浙江大学 | Time Series Similarity measure under shortage of data |
CN104063467B (en) * | 2014-06-26 | 2017-04-26 | 北京工商大学 | Intra-domain traffic flow pattern discovery method based on improved similarity search technology |
CN104182460B (en) * | 2014-07-18 | 2017-06-13 | 浙江大学 | Time Series Similarity querying method based on inverted index |
EP3294987B1 (en) * | 2015-05-13 | 2023-02-22 | ConocoPhillips Company | Time corrections for drilling data |
CN105093218A (en) * | 2015-06-12 | 2015-11-25 | 中国电子科技集团公司第四十一研究所 | Flying target high-low altitude determining method based on time sequence similarity |
CN107784311A (en) * | 2016-08-24 | 2018-03-09 | 中国海洋大学 | Global mesoscale eddy space-time hierarchical topology path construction technology |
CN106503725A (en) * | 2016-09-12 | 2017-03-15 | 新浪网技术(中国)有限公司 | A kind of graphic processing method and device |
CN108052628B (en) * | 2017-12-19 | 2020-07-14 | 河北省科学院应用数学研究所 | Turnout starting current detection method, system and terminal equipment |
CN108491436A (en) * | 2018-02-10 | 2018-09-04 | 大连智慧海洋软件有限公司 | A kind of steel plate thickness matching process based on self-adapting stretching dynamic time warping algorithm |
CN108573059B (en) * | 2018-04-26 | 2021-02-19 | 哈尔滨工业大学 | Time sequence classification method and device based on feature sampling |
CN110955862B (en) * | 2019-11-26 | 2023-10-13 | 新奥数能科技有限公司 | Evaluation method and device for equipment model trend similarity |
CN112365363A (en) * | 2020-10-14 | 2021-02-12 | 国网四川省电力公司电力科学研究院 | Calculation method for similarity of power load curves |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7770072B2 (en) * | 2007-01-16 | 2010-08-03 | Xerox Corporation | Method and system for analyzing time series data |
CN102364490B (en) * | 2011-10-26 | 2014-10-08 | 华北电力大学 | Automatic synchronization recognition method based on hierarchical analyzing model |
- 2013-04-26: CN application CN201310151558.5A filed (patent CN103279643B); status: not active, Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
CN103279643A (en) | 2013-09-04 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20160824 Termination date: 20200426 |