CN104915434A - Multi-dimensional time sequence classification method based on mahalanobis distance DTW - Google Patents

Multi-dimensional time sequence classification method based on mahalanobis distance DTW

Publication number
CN104915434A
Authority
CN
China
Prior art keywords
series
dtw
time
multidimensional time
mahalanobis distance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510351181.7A
Other languages
Chinese (zh)
Other versions
CN104915434B (en)
Inventor
刘大同
陈静
彭宇
彭喜元
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology filed Critical Harbin Institute of Technology
Priority to CN201510351181.7A priority Critical patent/CN104915434B/en
Publication of CN104915434A publication Critical patent/CN104915434A/en
Application granted granted Critical
Publication of CN104915434B publication Critical patent/CN104915434B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification

Abstract

The invention discloses a multidimensional time series classification method based on Mahalanobis-distance DTW. It addresses the following problems with satellite telemetry data: segmentation at fixed points gives unsatisfactory results, and because the dimensions of a multidimensional time series are correlated and the time series are slightly shifted relative to one another, the similarity measurement is inaccurate, which in turn makes the classification inaccurate. The method comprises the following steps: (1) a multidimensional time series X = {x_1, x_2, ..., x_j, ..., x_n} used for training and its class labels L = {l_1, l_2, ..., l_n} are obtained; (2) a multidimensional time series to be classified, X' = {x'_1, x'_2, ..., x'_m}, is extracted; (3) the sequence of DTW distances between X' = {x'_1, x'_2, ..., x'_m} and X = {x_1, x_2, ..., x_j, ..., x_n} is calculated; (4) X' = {x'_1, x'_2, ..., x'_m} is classified by a KNN classification method based on the Mahalanobis DTW distance, using the set number K of nearest neighbors, and the class of each sequence to be classified is determined. The method is applied in the field of multidimensional time series classification.

Description

A multidimensional time series classification method based on Mahalanobis-distance DTW
Technical field
The present invention relates to a multidimensional time series classification method based on Mahalanobis-distance DTW.
Background art
Analysis of the yaw attitude angle in satellite telemetry data, whose overall trend is shown in Fig. 1 and whose detailed variation is shown in Fig. 2, shows that satellite telemetry data is markedly periodic, and this property provides a basis for choosing the unit in which satellite telemetry data is analyzed. By analyzing each cycle of the telemetry data, one can determine whether the satellite operated normally within that cycle. However, segmenting satellite telemetry data at fixed points gives unsatisfactory results, as shown in Fig. 3: the time series obtained after segmentation are not well aligned with one another, there is a certain offset between them, and this offset becomes more pronounced as time goes on.
Classification is an important function in the data mining of satellite telemetry data; many data mining tasks, such as pattern recognition and anomaly detection, can be carried out on the basis of classification. Satellite telemetry data has its own characteristics, such as a large number of parameters, high dimensionality, and drift, and these characteristics make classical time series similarity measures, such as the Euclidean distance and the Pearson correlation coefficient, ill-suited to classifying it. Classical time series similarity measures cannot remove the influence of correlation between the dimensions of a multidimensional time series. Moreover, when time series are slightly shifted they cannot perform asynchronous measurement, so the measurement results are not accurate enough, which in turn makes the classification results for satellite telemetry data not accurate enough.
Summary of the invention
The object of the invention is to solve the problems that fixed-point segmentation of satellite telemetry data gives unsatisfactory results, and that correlation between the dimensions of a multidimensional time series together with small shifts between time series makes the measurement results, and therefore the classification results, not accurate enough. To this end, a multidimensional time series classification method based on Mahalanobis-distance DTW is proposed.
The above object is achieved through the following technical solution:
Step 1: Segment the historical satellite telemetry data Y, recorded under normal satellite operation, using the argument abrupt-change points as markers, to obtain the normal multidimensional time series X = {x_1, x_2, ..., x_j, ..., x_n}, where Y is the historical satellite telemetry data matrix with n_d rows and n_a columns, n_d is the number of dimensions of the multidimensional time series, n_a is the number of data points in all historical satellite telemetry data, x_j is a data matrix with n_d rows and n_{len} columns representing the j-th sequence of X, j = 1, 2, ..., n, n_{len} is the length of each time series, and n is the number of members of X;
Step 2: Cluster the multidimensional time series X = {x_1, x_2, ..., x_j, ..., x_n} obtained by segmentation using a hierarchical clustering method with the target number of clusters set to c, thereby obtaining the class labels L = {l_1, l_2, ..., l_n} of the multidimensional time series, where c is a positive integer greater than 1 and less than n, and l_s is the s-th element of L, whose value is determined by the hierarchical clustering result, s = 1, 2, ..., n;
Step 3: From the latest satellite telemetry data, extract the test data lying between the time points of m+1 adjacent argument abrupt-change points as the multidimensional time series to be classified, X' = {x'_1, x'_2, ..., x'_m}, where m is a positive integer;
Step 4: Compute the sequence of DTW distances between the multidimensional time series to be classified, X' = {x'_1, x'_2, ..., x'_m}, and the labeled multidimensional time series X = {x_1, x_2, ..., x_j, ..., x_n},
where d_{ij} is computed as follows:
d_{ij} = DTW_{ma}(x'_i, x_j)
x'_i denotes the i-th sequence of X', i = 1, 2, ..., m; DTW_{ma} denotes the DTW distance algorithm based on the Mahalanobis distance; d_{ij} is the Mahalanobis-distance-based DTW distance between x'_i and x_j;
Step 5: Using the KNN classification method with the Mahalanobis-distance-based DTW distance, classify the multidimensional time series to be classified, X' = {x'_1, x'_2, ..., x'_m}, according to the set number K of nearest neighbors, and determine its classes L' = {l'_1, l'_2, ..., l'_m}, where K = 1, 2, ..., n; each class l' is one of 1, 2, ..., c; KNN is the K-nearest-neighbor classification algorithm. This completes the multidimensional time series classification method based on Mahalanobis-distance DTW.
Effects of the invention
Classification is an important function in the data mining of satellite telemetry data; many data mining tasks, such as pattern recognition and anomaly detection, can be carried out on the basis of classification. Satellite telemetry data has its own characteristics, such as a large number of parameters, high dimensionality, and drift, and these characteristics make classical time series similarity measures, such as the Euclidean distance and the Pearson correlation coefficient, ill-suited to classifying it. Classical time series measures cannot remove the influence of correlation between the dimensions of a multidimensional time series. Moreover, when time series are slightly shifted they cannot perform asynchronous measurement, so the measurement results are not accurate enough, which in turn makes the classification results for satellite telemetry data not accurate enough. A more suitable time series similarity measure is therefore needed. For satellite telemetry data that is complex or whose characteristics vary, choosing a suitable time series similarity measure helps the corresponding pattern mining achieve better results. The specific effects of each part of the invention are as follows:
First, in view of the unsatisfactory results of segmenting satellite telemetry data at fixed points, shown in Fig. 3, the invention proposes segmenting the data using the argument abrupt-change points in the satellite telemetry data as markers. The result is shown in Fig. 4: segmentation marked by the argument is more compact, and the resulting segments are better aligned with one another and more reasonable.
Second, the dynamic time warping (DTW) distance based on the Mahalanobis distance is used to measure the distance between multidimensional satellite telemetry time series. This removes the influence of correlation between the dimensions of the multidimensional time series, achieves asynchronous measurement, and solves the problem that small shifts between time series make the measurement results inaccurate.
Finally, the K-nearest-neighbor (KNN) classification algorithm is combined with the historical multidimensional time series of satellite telemetry data to classify the latest multidimensional telemetry time series, which gives a better determination of the current operating state of the satellite.
Brief description of the drawings
Fig. 1 is a schematic example of the yaw attitude angle sequence discussed in the background art;
Fig. 2 is a schematic example of the detailed variation of the yaw attitude angle sequence discussed in the background art;
Fig. 3 shows the result of segmenting satellite telemetry data at fixed points, as discussed in Embodiment 1;
Fig. 4 shows the result of segmenting satellite telemetry data using the argument abrupt-change points as markers, as proposed in Embodiment 1;
Fig. 5 is a schematic example of Wafer parameter 1 used in the example;
Fig. 6 is a schematic example of Wafer parameter 2 used in the example;
Fig. 7 is a schematic example of Wafer parameter 3 used in the example;
Fig. 8 is a schematic example of Wafer parameter 4 used in the example;
Fig. 9 is a schematic example of Wafer parameter 5 used in the example;
Fig. 10 is a schematic example of Wafer parameter 6 used in the example;
Fig. 11(a) shows the class 1 data of dimension 1 of the satellite telemetry data in the example;
Fig. 11(b) shows the class 2 data of dimension 1 of the satellite telemetry data in the example;
Fig. 11(c) shows the class 3 data of dimension 1 of the satellite telemetry data in the example;
Fig. 11(d) shows the class 4 data of dimension 1 of the satellite telemetry data in the example;
Fig. 12(a) shows the class 1 data of dimension 2 of the satellite telemetry data in the example;
Fig. 12(b) shows the class 2 data of dimension 2 of the satellite telemetry data in the example;
Fig. 12(c) shows the class 3 data of dimension 2 of the satellite telemetry data in the example;
Fig. 12(d) shows the class 4 data of dimension 2 of the satellite telemetry data in the example;
Fig. 13(a) shows the class 1 data of dimension 3 of the satellite telemetry data in the example;
Fig. 13(b) shows the class 2 data of dimension 3 of the satellite telemetry data in the example;
Fig. 13(c) shows the class 3 data of dimension 3 of the satellite telemetry data in the example;
Fig. 13(d) shows the class 4 data of dimension 3 of the satellite telemetry data in the example;
Fig. 14(a) shows the classification result for class 1 of dimension 1 of the satellite telemetry data in the example;
Fig. 14(b) shows the classification result for class 2 of dimension 1 of the satellite telemetry data in the example;
Fig. 14(c) shows the classification result for class 3 of dimension 1 of the satellite telemetry data in the example;
Fig. 14(d) shows the classification result for class 4 of dimension 1 of the satellite telemetry data in the example;
Fig. 15(a) shows the classification result for class 1 of dimension 2 of the satellite telemetry data in the example;
Fig. 15(b) shows the classification result for class 2 of dimension 2 of the satellite telemetry data in the example;
Fig. 15(c) shows the classification result for class 3 of dimension 2 of the satellite telemetry data in the example;
Fig. 15(d) shows the classification result for class 4 of dimension 2 of the satellite telemetry data in the example;
Fig. 16(a) shows the classification result for class 1 of dimension 3 of the satellite telemetry data in the example;
Fig. 16(b) shows the classification result for class 2 of dimension 3 of the satellite telemetry data in the example;
Fig. 16(c) shows the classification result for class 3 of dimension 3 of the satellite telemetry data in the example;
Fig. 16(d) shows the classification result for class 4 of dimension 3 of the satellite telemetry data in the example.
Embodiments
Embodiment 1: The multidimensional time series classification method based on Mahalanobis-distance DTW of this embodiment is carried out according to the following steps:
Step 1: Segment the historical satellite telemetry data Y, recorded under normal satellite operation, using the argument abrupt-change points as markers, to obtain the normal multidimensional time series X = {x_1, x_2, ..., x_j, ..., x_n}, where Y is the historical satellite telemetry data matrix with n_d rows and n_a columns, n_d is the number of dimensions of the multidimensional time series, n_a is the number of data points in all historical satellite telemetry data, x_j is a data matrix with n_d rows and n_{len} columns representing the j-th sequence of X, j = 1, 2, ..., n, n_{len} is the length of each time series, and n is the number of members of X;
Step 2: Cluster the multidimensional time series X = {x_1, x_2, ..., x_j, ..., x_n} obtained by segmentation using a hierarchical clustering method with the target number of clusters set to c, thereby obtaining the class labels L = {l_1, l_2, ..., l_n} of the multidimensional time series, where c is a positive integer greater than 1 and less than n, and l_s is the s-th element of L, whose value is determined by the hierarchical clustering result, s = 1, 2, ..., n. The class assignment here is not restricted to a particular method: any existing method that can assign classes may be used, and hierarchical clustering is one such method;
Step 3: From the latest satellite telemetry data, extract the test data lying between the time points of m+1 adjacent argument abrupt-change points as the multidimensional time series to be classified, X' = {x'_1, x'_2, ..., x'_m}, where m is a positive integer;
Step 4: Compute the sequence of DTW distances between the multidimensional time series to be classified, X' = {x'_1, x'_2, ..., x'_m}, and the labeled multidimensional time series X = {x_1, x_2, ..., x_j, ..., x_n},
where d_{ij} is computed as follows:
d_{ij} = DTW_{ma}(x'_i, x_j)
x'_i denotes the i-th sequence of X', i = 1, 2, ..., m; DTW_{ma} denotes the DTW distance algorithm based on the Mahalanobis distance; DTW (Dynamic Time Warping) is an existing similarity measure that warps the time axis so that the shapes of two time series can be matched through a better mapping; d_{ij} is the Mahalanobis-distance-based DTW distance between x'_i and x_j;
Step 5: Using the KNN classification method with the Mahalanobis-distance-based DTW distance, classify the multidimensional time series to be classified, X' = {x'_1, x'_2, ..., x'_m}, according to the set number K of nearest neighbors, and determine its classes L' = {l'_1, l'_2, ..., l'_m}, where K = 1, 2, ..., n; each class l' is one of 1, 2, ..., c; KNN (K-Nearest Neighbor) is the K-nearest-neighbor classification algorithm. This completes the multidimensional time series classification method based on Mahalanobis-distance DTW.
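The class assignment of Step 2 can be illustrated with a small sketch. The text only says that hierarchical clustering with a target class number c is one acceptable method, so everything else here is an assumption made for illustration: the function name agglomerative_labels is hypothetical, the linkage is average linkage, and the input is a precomputed pairwise distance matrix between the n segmented sequences (which distance fills that matrix is left open).

```python
from typing import List


def agglomerative_labels(dist: List[List[float]], c: int) -> List[int]:
    """Average-linkage agglomerative clustering down to c clusters.

    dist is a symmetric n x n matrix of pairwise distances between the
    n segmented sequences. Returns one label in 1..c per sequence.
    Average linkage is an assumption; the text does not fix the linkage.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a: List[int], b: List[int]) -> float:
        # Mean pairwise distance between the members of two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > c:
        # Find the closest pair of clusters and merge it.
        _, p, q = min(
            (linkage(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]

    labels = [0] * len(dist)
    for label, members in enumerate(clusters, start=1):
        for i in members:
            labels[i] = label
    return labels
```

With n sequences this naive loop recomputes every linkage at each merge, which is acceptable for a training set of a few dozen segments but would need a proper library implementation for larger sets.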
Effects of this embodiment:
Classification is an important function in the data mining of satellite telemetry data; many data mining tasks, such as pattern recognition and anomaly detection, can be carried out on the basis of classification. Satellite telemetry data has its own characteristics, such as a large number of parameters, high dimensionality, and drift, and these characteristics make classical time series similarity measures, such as the Euclidean distance and the Pearson correlation coefficient, ill-suited to classifying it. Classical time series measures cannot remove the influence of correlation between the dimensions of a multidimensional time series. Moreover, when time series are slightly shifted they cannot perform asynchronous measurement, so the measurement results are not accurate enough, which in turn makes the classification results for satellite telemetry data not accurate enough. A more suitable time series similarity measure is therefore needed. For satellite telemetry data that is complex or whose characteristics vary, choosing a suitable time series similarity measure helps the corresponding pattern mining achieve better results. The specific effects of each part of the invention are as follows:
First, in view of the unsatisfactory results of segmenting satellite telemetry data at fixed points, shown in Fig. 3, the invention proposes segmenting the data using the argument abrupt-change points in the satellite telemetry data as markers. The result is shown in Fig. 4: segmentation marked by the argument is more compact, and the resulting segments are better aligned with one another and more reasonable.
Second, the dynamic time warping (DTW) distance based on the Mahalanobis distance is used to measure the distance between multidimensional satellite telemetry time series. This removes the influence of correlation between the dimensions of the multidimensional time series, achieves asynchronous measurement, and solves the problem that small shifts between time series make the measurement results inaccurate.
Finally, the K-nearest-neighbor (KNN) classification algorithm is combined with the historical multidimensional time series of satellite telemetry data to classify the latest multidimensional telemetry time series, which gives a better determination of the current operating state of the satellite.
Embodiment 2: This embodiment differs from Embodiment 1 in that, in Step 1, the argument is one of the test parameters of the satellite telemetry data. The argument increases steadily from 0° to 360° and is clearly periodic; the point at which the argument value changes from 360° to 0° is the argument abrupt-change point. The other steps and parameters are the same as in Embodiment 1.
Embodiment 3: This embodiment differs from Embodiment 1 or 2 in that, in Step 1, the process of segmenting the historical satellite telemetry data Y, recorded under normal satellite operation, using the argument abrupt-change points as markers to obtain the normal multidimensional time series X = {x_1, x_2, ..., x_j, ..., x_n} is as follows:
(1) After the argument reaches 360°, it returns to 0° and starts increasing again; the point at which it changes from 360° to 0° is the argument abrupt-change point;
(2) The time corresponding to each argument abrupt-change point is recorded;
(3) According to the times of the argument abrupt-change points, the test data lying between the times of two adjacent argument abrupt-change points is extracted as a time series; a multidimensional time series consists of several such time series; the test data are the yaw attitude angle, the reaction wheel speed, and the bus voltage. The other steps and parameters are the same as in Embodiment 1 or 2.
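The three sub-steps above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the sample index stands in for the recorded time, and an abrupt-change point is detected as a drop of more than an assumed threshold (180°) between consecutive argument samples, since sampled data will rarely hit exactly 360° and 0°.

```python
from typing import List, Sequence


def find_wrap_points(argument: Sequence[float], drop: float = 180.0) -> List[int]:
    """Indices at which the argument falls back from near 360 deg to near 0 deg.

    drop is an assumed threshold: a decrease larger than this between two
    consecutive samples is treated as an argument abrupt-change point.
    """
    return [t for t in range(1, len(argument))
            if argument[t - 1] - argument[t] > drop]


def segment_by_argument(argument: Sequence[float],
                        channels: Sequence[Sequence[float]],
                        drop: float = 180.0) -> List[List[List[float]]]:
    """Cut every channel between adjacent abrupt-change points.

    channels holds n_d equally long sample lists (for example yaw attitude
    angle, reaction wheel speed, bus voltage). m+1 adjacent abrupt-change
    points yield m segments; each segment is a list of n_d lists.
    Incomplete cycles before the first and after the last point are dropped.
    """
    cuts = find_wrap_points(argument, drop)
    return [[list(ch[start:end]) for ch in channels]
            for start, end in zip(cuts, cuts[1:])]
```

Segments cut this way need not have exactly the same number of samples. The text treats every sequence as having length n_{len}, so resampling to a common length, which this sketch does not do, may be needed before further processing.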
Embodiment 4: This embodiment differs from one of Embodiments 1 to 3 in that the process of computing d_{ij} in Step 4 is as follows:
(1) Compute the covariance matrix C_{cov} between the dimensions of the multidimensional time series to be classified:
C_{cov} = E\{[Y - E(Y)][Y - E(Y)]^T\}
where Y is the historical satellite telemetry data matrix with n_d rows and n_a columns, and E denotes the expected value;
(2) The Mahalanobis-distance-based DTW distance is obtained by finding the optimal warping path between the two time series x'_i and x_j, which yields the minimum Mahalanobis distance measure DTW_{ma}(x'_i, x_j). The Mahalanobis distance is used to compute d(p_k):
d(p_k) = D_M(x'_{i,i'_k}, x_{j,j'_k}) = (x'_{i,i'_k} - x_{j,j'_k})^T C_{cov}^{-1} (x'_{i,i'_k} - x_{j,j'_k})
Among all warping paths there is an optimal path that minimizes the total warping cost, that is:
DTW_{ma}(x'_i, x_j) = \min_P \sum_{k=1}^{K'} d(p_k)
where P = {p_1, p_2, ..., p_{K'}} denotes a warping path, p_k is the k-th member of P, k = 1, 2, ..., K', and represents the correspondence between the i'-th element x'_{i,i'_k} of x'_i and the j'-th element x_{j,j'_k} of x_j, i' = 1, 2, ..., n_{len}, j' = 1, 2, ..., n_{len}; d(p_k) is the warping cost between x'_{i,i'_k} and x_{j,j'_k};
(3) To solve this minimization, a cost matrix R(i', j') is constructed by dynamic programming:
R(i', j') = d(i', j') + \min\{R(i', j'-1), R(i'-1, j'-1), R(i'-1, j')\}
where R(0, 0) = 0 and R(i', 0) = R(0, j') = +\infty; R(n_{len}, n_{len}) is the minimum distance between the time series x'_i and x_j as measured by DTW, so that DTW_{ma}(x'_i, x_j) = R(n_{len}, n_{len}). The other steps and parameters are the same as in one of Embodiments 1 to 3.
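The computation in sub-steps (1) to (3) can be sketched in plain Python as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the expectation E is estimated by the sample mean over the n_a columns of Y, the inverse is computed by a simple Gauss-Jordan routine that raises an error for a singular covariance matrix, and the window-length restriction mentioned in the experiments is not included. Sequences are passed as n_d-row matrices, matching the layout of x_j in the text.

```python
import math
from typing import List

Matrix = List[List[float]]


def covariance(Y: Matrix) -> Matrix:
    """C_cov between the rows (dimensions) of an n_d x n_a matrix Y."""
    n_d, n_a = len(Y), len(Y[0])
    mean = [sum(row) / n_a for row in Y]
    return [[sum((Y[a][t] - mean[a]) * (Y[b][t] - mean[b])
                 for t in range(n_a)) / n_a
             for b in range(n_d)] for a in range(n_d)]


def inverse(A: Matrix) -> Matrix:
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(A)
    M = [[float(v) for v in row] + [1.0 if r == c else 0.0 for c in range(n)]
         for r, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]


def mahalanobis_cost(u: List[float], v: List[float], C_inv: Matrix) -> float:
    """Local cost (u - v)^T C_inv (u - v) between two n_d-dimensional samples."""
    d = [a - b for a, b in zip(u, v)]
    n = len(d)
    return sum(d[a] * C_inv[a][b] * d[b] for a in range(n) for b in range(n))


def dtw_ma(x: Matrix, y: Matrix, C_inv: Matrix) -> float:
    """DTW distance with Mahalanobis local cost; x and y are n_d-row matrices."""
    n, m = len(x[0]), len(y[0])
    R = [[math.inf] * (m + 1) for _ in range(n + 1)]
    R[0][0] = 0.0
    for i in range(1, n + 1):
        xi = [row[i - 1] for row in x]
        for j in range(1, m + 1):
            yj = [row[j - 1] for row in y]
            d = mahalanobis_cost(xi, yj, C_inv)
            R[i][j] = d + min(R[i][j - 1], R[i - 1][j - 1], R[i - 1][j])
    return R[n][m]
```

In use, C_inv = inverse(covariance(Y)) would be computed once from the historical matrix Y and reused for every pair (x'_i, x_j), since the covariance does not depend on the pair being compared.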
Embodiment 5: This embodiment differs from one of Embodiments 1 to 4 in that, in Step 5, the process of classifying the multidimensional time series to be classified, X' = {x'_1, x'_2, ..., x'_m}, by the KNN classification method with the Mahalanobis-distance-based DTW distance according to the set number K of nearest neighbors, and of determining its classes L' = {l'_1, l'_2, ..., l'_m}, is as follows:
(1) For each member of the multidimensional time series to be classified, X' = {x'_1, x'_2, ..., x'_m}, determine the K labeled multidimensional time series with the smallest Mahalanobis-distance-based DTW distance to it; that is, in the DTW distance sequence, take the K smallest values in each row and determine the labeled multidimensional time series to which these K values correspond, together with their class labels;
(2) Count the class labels in each row; the class that occurs most frequently in a row is the class of the corresponding member, which gives the classes L' = {l'_1, l'_2, ..., l'_m} of the multidimensional time series to be classified, X' = {x'_1, x'_2, ..., x'_m}. The other steps and parameters are the same as in one of Embodiments 1 to 4.
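The two sub-steps above amount to a majority vote over the K smallest entries of each row of the distance sequence. A minimal sketch follows; the function name knn_classify and the argument names are hypothetical, and the tie-breaking rule (the label seen first among the nearest neighbors wins) is an assumption, since the text does not say how ties are resolved.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def knn_classify(D: Sequence[Sequence[float]],
                 labels: Sequence[Hashable],
                 k: int) -> List[Hashable]:
    """Classify each test sequence by majority vote among its k nearest neighbors.

    D[i][j] is the distance d_ij between test sequence i and labeled
    sequence j; labels[j] is the class label l_j of labeled sequence j.
    """
    predictions = []
    for row in D:
        # Indices of the k smallest distances in this row, nearest first.
        nearest = sorted(range(len(row)), key=lambda j: row[j])[:k]
        votes = Counter(labels[j] for j in nearest)
        # most_common keeps first-seen order among equal counts, so on a
        # tie the label of the nearer neighbor is returned.
        predictions.append(votes.most_common(1)[0][0])
    return predictions
```

For example, with labels [1, 2, 1, 2] and a row of distances [0.1, 0.9, 0.2, 0.8], the three nearest neighbors carry labels 1, 1, and 2, so the sequence is assigned class 1.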
The following example is used to verify the beneficial effects of the present invention:
Example:
A KNN classification simulation experiment based on different time series similarity measures was carried out on the Wafer data set. The Wafer data set contains 6 dimensions in total; the data of each dimension are shown in Fig. 5 to Fig. 10, and the classification results are shown in Table 1.
Table 1: Classification results on the Wafer data set using different similarity measures
The experimental results show that the traditional Euclidean distance performs worst and the Mahalanobis-distance-based DTW distance performs best. With the window length limited to 5, the latter reaches its best accuracy of 98.10%, an improvement of 10.85% relative to the accuracy of the Euclidean distance.
Satellite telemetry data classification experiment:
A KNN classification experiment based on different time series similarity measures was carried out on satellite telemetry data. The number of training samples is 50. Each sample contains three dimensions: the yaw attitude angle corresponds to dimension 1, the speed of reaction wheel D corresponds to dimension 2, and the bus voltage corresponds to dimension 3. The samples are divided into 4 classes; the data of each class are shown in Fig. 11(a) to (d) through Fig. 13(a) to (d). The number of test samples is 50, and the classification results are shown in Table 2. Fig. 14(a) to (d), Fig. 15(a) to (d), and Fig. 16(a) to (d) show the detailed classification obtained with the KNN algorithm using the Mahalanobis-distance-based DTW distance.
Table 2: Classification results on satellite telemetry data using different similarity measures
The experimental results show that the traditional Euclidean distance again performs worst and the Mahalanobis-distance-based DTW distance performs best: its accuracy reaches 98.00%, an improvement of 4.35% relative to the accuracy of the Euclidean distance.
The present invention may also have various other embodiments. Without departing from the spirit and essence of the present invention, those skilled in the art may make various corresponding changes and modifications according to the present invention, and all such changes and modifications shall fall within the scope of protection of the claims appended to the present invention.

Claims (5)

1. A multidimensional time series classification method based on Mahalanobis-distance DTW, characterized in that the method is carried out according to the following steps:
Step 1: Segment the historical satellite telemetry data Y, recorded under normal satellite operation, using the argument abrupt-change points as markers, to obtain the normal multidimensional time series X = {x_1, x_2, ..., x_j, ..., x_n}, where Y is the historical satellite telemetry data matrix with n_d rows and n_a columns, n_d is the number of dimensions of the multidimensional time series, n_a is the number of data points in all historical satellite telemetry data, x_j is a data matrix with n_d rows and n_{len} columns representing the j-th sequence of X, j = 1, 2, ..., n, n_{len} is the length of each time series, and n is the number of members of X;
Step 2: Cluster the multidimensional time series X = {x_1, x_2, ..., x_j, ..., x_n} obtained by segmentation using a hierarchical clustering method with the target number of clusters set to c, thereby obtaining the class labels L = {l_1, l_2, ..., l_n} of the multidimensional time series, where c is a positive integer greater than 1 and less than n, and l_s is the s-th element of L, whose value is determined by the hierarchical clustering result, s = 1, 2, ..., n;
Step 3: From the latest satellite telemetry data, extract the test data lying between the time points of m+1 adjacent argument abrupt-change points as the multidimensional time series to be classified, X' = {x'_1, x'_2, ..., x'_m}, where m is a positive integer;
Step 4: Compute the sequence of DTW distances between the multidimensional time series to be classified, X' = {x'_1, x'_2, ..., x'_m}, and the labeled multidimensional time series X = {x_1, x_2, ..., x_j, ..., x_n},
where d_{ij} is computed as follows:
d_{ij} = DTW_{ma}(x'_i, x_j)
x'_i denotes the i-th sequence of X', i = 1, 2, ..., m; DTW_{ma} denotes the DTW distance algorithm based on the Mahalanobis distance; d_{ij} is the Mahalanobis-distance-based DTW distance between x'_i and x_j;
Step 5: Using the KNN classification method with the Mahalanobis-distance-based DTW distance, classify the multidimensional time series to be classified, X' = {x'_1, x'_2, ..., x'_m}, according to the set number K of nearest neighbors, and determine its classes L' = {l'_1, l'_2, ..., l'_m}, where K = 1, 2, ..., n; each class l' is one of 1, 2, ..., c; KNN is the K-nearest-neighbor classification algorithm. This completes the multidimensional time series classification method based on Mahalanobis-distance DTW.
2. The multidimensional time series classification method based on Mahalanobis-distance DTW according to claim 1, characterized in that: in Step 1, the argument is one of the test parameters of the satellite telemetry data; the argument increases steadily from 0° to 360° and is clearly periodic; and the point at which the argument value changes from 360° to 0° is the argument abrupt-change point.
3. The multidimensional time series classification method based on Mahalanobis-distance DTW according to claim 1, characterized in that: in Step 1, the process of segmenting the historical satellite telemetry data Y, recorded under normal satellite operation, using the argument abrupt-change points as markers to obtain the normal multidimensional time series X = {x_1, x_2, ..., x_j, ..., x_n} is as follows:
(1) After the argument reaches 360°, it returns to 0° and starts increasing again; the point at which it changes from 360° to 0° is the argument abrupt-change point;
(2) The time corresponding to each argument abrupt-change point is recorded;
(3) According to the times of the argument abrupt-change points, the test data lying between the times of two adjacent argument abrupt-change points is extracted as a time series; a multidimensional time series consists of several such time series; the test data are the yaw attitude angle, the reaction wheel speed, and the bus voltage.
4. The multidimensional time series classification method based on Mahalanobis distance DTW according to claim 1, characterized in that the detailed process of calculating $d_{ij}$ in step 4 is:
(1) the covariance matrix $C_{cov}$ between the dimensions of the multidimensional time series to be classified is calculated as:
$$C_{cov} = E\{[Y - E(Y)][Y - E(Y)]^{T}\}$$
where $Y$ is the historical satellite telemetry data matrix with $n_d$ rows and $n_a$ columns, and $E$ denotes the expected value;
(2) the Mahalanobis-distance-based DTW distance is obtained by finding the optimal warping path between two time series $x'_i$ and $x_j$, which gives the minimum Mahalanobis distance measure $DTW_{ma}(x'_i, x_j)$; $d(p_k)$ is calculated using the Mahalanobis distance as:
$$d(p_k) = D_M(x'_{i,i'_k}, x_{j,j'_k}) = (x'_{i,i'_k} - x_{j,j'_k})^{T}\, C_{cov}^{-1}\, (x'_{i,i'_k} - x_{j,j'_k})$$
Among all warping paths, there is an optimal path that minimizes the total warping cost, that is:
$$DTW_{ma}(x'_i, x_j) = \min_{P} \sum_{k=1}^{K'} d(p_k)$$
where $P = \{p_1, p_2, \ldots, p_{K'}\}$ denotes the warping path; $p_k$ denotes the $k$-th element of $P$, $k = 1, 2, \ldots, K'$, and represents the correspondence between the $i'$-th element $x'_{i,i'_k}$ of $x'_i$ and the $j'$-th element $x_{j,j'_k}$ of $x_j$, with $i' = 1, 2, \ldots, n_{len}$ and $j' = 1, 2, \ldots, n_{len}$; $d(p_k)$ denotes the warping cost between $x'_{i,i'_k}$ and $x_{j,j'_k}$;
(3) to solve this minimization, a cost matrix $R(i', j')$ is constructed by dynamic programming, that is:
$$R(i', j') = d(i', j') + \min\{R(i', j'-1),\ R(i'-1, j'-1),\ R(i'-1, j')\}$$
where $R(0, 0) = 0$ and $R(i', 0) = R(0, j') = +\infty$; $R(n_{len}, n_{len})$ is the minimum DTW distance between the time series $x'_i$ and $x_j$, that is, $DTW_{ma}(x'_i, x_j) = R(n_{len}, n_{len})$.
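The recurrence in (3), with the Mahalanobis local cost from (2), can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions rather than the patented implementation: the function names `inverse_covariance` and `mahalanobis_dtw` are hypothetical, each series is assumed to be stored with one row per time step and one column per dimension, the expectation in (1) is replaced by NumPy's sample covariance estimate, and a pseudo-inverse is used in case the covariance matrix is singular.

```python
import numpy as np

def inverse_covariance(history):
    """Estimate C_cov^{-1} from historical data (rows = samples, cols = dimensions).

    Assumption: the expectation is approximated by the sample covariance.
    """
    cov = np.atleast_2d(np.cov(np.asarray(history, dtype=float), rowvar=False))
    # The pseudo-inverse is an assumed safeguard against a singular covariance.
    return np.linalg.pinv(cov)

def mahalanobis_dtw(a, b, cov_inv):
    """DTW distance with local cost (u - v)^T C_cov^{-1} (u - v)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # Boundary conditions: R(0, 0) = 0, R(i', 0) = R(0, j') = +infinity.
    R = np.full((n + 1, m + 1), np.inf)
    R[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diff = a[i - 1] - b[j - 1]
            cost = diff @ cov_inv @ diff  # Mahalanobis local cost d(i', j')
            R[i, j] = cost + min(R[i, j - 1], R[i - 1, j - 1], R[i - 1, j])
    return R[n, m]
```

The claim takes both series to have length $n_{len}$; the sketch also accepts series of different lengths, which is a generalization not stated in the source.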
5. The multidimensional time series classification method based on Mahalanobis distance DTW according to claim 1, characterized in that, in step 5, the process of using the KNN classification method based on the Mahalanobis-distance DTW distance to classify the multidimensional time series to be classified, $X' = \{x'_1, x'_2, \ldots, x'_m\}$, according to the set number of nearest neighbors $K$, and to determine their classes $L' = \{l'_1, l'_2, \ldots, l'_m\}$, is:
(1) for each member of the multidimensional time series to be classified $X' = \{x'_1, x'_2, \ldots, x'_m\}$, determine the $K$ class-labeled multidimensional time series with the smallest Mahalanobis-distance-based DTW distance to it; that is, take the $K$ smallest values in each row of the distances $d_{ij}$, identify the class-labeled multidimensional time series corresponding to these $K$ smallest values, and record their class labels;
(2) for each row, count the recorded class labels; the class that occurs most frequently is the class of the corresponding multidimensional time series to be classified, so that the classes of $X' = \{x'_1, x'_2, \ldots, x'_m\}$ are $L' = \{l'_1, l'_2, \ldots, l'_m\}$.
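The two steps above amount to a row-wise majority vote over a precomputed distance matrix, which might look like the following sketch. The function name `knn_classify` and the matrix layout (row $i$ holds the distances from the $i$-th series to be classified to every class-labeled series) are assumptions for illustration. The source does not say how ties are resolved; here a tie goes to the label whose nearest member appears first.

```python
import numpy as np
from collections import Counter

def knn_classify(dist, train_labels, k):
    """Majority-vote KNN over a precomputed distance matrix.

    dist[i, j]   : distance between the i-th series to classify and the
                   j-th class-labeled series (assumed layout).
    train_labels : class label of each class-labeled series.
    k            : number of nearest neighbors.
    """
    dist = np.asarray(dist, dtype=float)
    train_labels = np.asarray(train_labels)
    predicted = []
    for row in dist:
        # Indices of the k smallest distances in this row, nearest first.
        nearest = np.argsort(row, kind="stable")[:k]
        votes = Counter(train_labels[nearest].tolist())
        # Most frequent label; ties resolved by first appearance (an assumption).
        predicted.append(votes.most_common(1)[0][0])
    return predicted
```

For example, with two series to classify and four labeled series, `knn_classify([[0.1, 0.5, 0.2, 0.9], [0.8, 0.3, 0.7, 0.2]], [1, 2, 1, 2], 3)` assigns class 1 to the first series and class 2 to the second.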
CN201510351181.7A 2015-06-24 2015-06-24 A kind of multidimensional time-series sorting technique based on mahalanobis distance DTW Active CN104915434B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510351181.7A CN104915434B (en) 2015-06-24 2015-06-24 A kind of multidimensional time-series sorting technique based on mahalanobis distance DTW

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510351181.7A CN104915434B (en) 2015-06-24 2015-06-24 A kind of multidimensional time-series sorting technique based on mahalanobis distance DTW

Publications (2)

Publication Number Publication Date
CN104915434A true CN104915434A (en) 2015-09-16
CN104915434B CN104915434B (en) 2018-03-27

Family

ID=54084497

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510351181.7A Active CN104915434B (en) 2015-06-24 2015-06-24 A kind of multidimensional time-series sorting technique based on mahalanobis distance DTW

Country Status (1)

Country Link
CN (1) CN104915434B (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106709509A (en) * 2016-11-30 2017-05-24 哈尔滨工业大学 Satellite telemetry data clustering method based on time series special points
CN107451231A (en) * 2017-07-24 2017-12-08 上海电力学院 Indicator card sorting algorithm based on similarity query
CN108228832A (en) * 2018-01-04 2018-06-29 南京大学 A kind of time series data complementing method based on distance matrix
CN109034179A (en) * 2018-05-30 2018-12-18 河南理工大学 A kind of rock stratum classification method based on mahalanobis distance IDTW
CN109241077A (en) * 2018-08-30 2019-01-18 东北大学 Production target variation tendency visual query system and method based on similitude
CN109362036A (en) * 2018-10-17 2019-02-19 桂林电子科技大学 A kind of multi-modal indoor orientation method combined based on image with WIFI
CN109816211A (en) * 2018-12-29 2019-05-28 北京英视睿达科技有限公司 Judge Polluted area similitude and improves the method and device of pollution administration efficiency
CN109828952A (en) * 2019-01-18 2019-05-31 上海卫星工程研究所 PCM system satellite telemetering data classification extracting method, system
CN110288003A (en) * 2019-05-29 2019-09-27 北京师范大学 Data variation recognition methods and equipment
CN110289986A (en) * 2019-05-27 2019-09-27 武汉大学 A kind of accuracy quantization method of network simulation data
CN111104438A (en) * 2019-11-21 2020-05-05 新浪网技术(中国)有限公司 Method and device for determining periodicity of time sequence and electronic equipment
WO2020220438A1 (en) * 2019-04-29 2020-11-05 东北大学 Method for predicting concurrent volume of services of different types for virtual machine
CN112380992A (en) * 2020-11-13 2021-02-19 上海交通大学 Method and device for evaluating and optimizing accuracy of monitoring data in machining process
CN116504416A (en) * 2023-06-27 2023-07-28 福建无止境光学仪器有限公司 Eye degree prediction method based on machine learning

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101561878A (en) * 2009-05-31 2009-10-21 河海大学 Unsupervised anomaly detection method and system based on improved CURE clustering algorithm
US8005771B2 (en) * 2007-10-04 2011-08-23 Siemens Corporation Segment-based change detection method in multivariate data stream
CN103646167A (en) * 2013-11-22 2014-03-19 北京空间飞行器总体设计部 Satellite abnormal condition detection system based on telemeasuring data
CN104102726A (en) * 2014-07-22 2014-10-15 南昌航空大学 Modified K-means clustering algorithm based on hierarchical clustering
CN104123368A (en) * 2014-07-24 2014-10-29 中国软件与技术服务股份有限公司 Big data attribute significance and recognition degree early warning method and system based on clustering

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8005771B2 (en) * 2007-10-04 2011-08-23 Siemens Corporation Segment-based change detection method in multivariate data stream
CN101561878A (en) * 2009-05-31 2009-10-21 河海大学 Unsupervised anomaly detection method and system based on improved CURE clustering algorithm
CN103646167A (en) * 2013-11-22 2014-03-19 北京空间飞行器总体设计部 Satellite abnormal condition detection system based on telemeasuring data
CN104102726A (en) * 2014-07-22 2014-10-15 南昌航空大学 Modified K-means clustering algorithm based on hierarchical clustering
CN104123368A (en) * 2014-07-24 2014-10-29 中国软件与技术服务股份有限公司 Big data attribute significance and recognition degree early warning method and system based on clustering

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106709509B (en) * 2016-11-30 2021-05-28 哈尔滨工业大学 Satellite telemetry data clustering method based on time series special points
CN106709509A (en) * 2016-11-30 2017-05-24 哈尔滨工业大学 Satellite telemetry data clustering method based on time series special points
CN107451231A (en) * 2017-07-24 2017-12-08 上海电力学院 Indicator card sorting algorithm based on similarity query
CN108228832A (en) * 2018-01-04 2018-06-29 南京大学 A kind of time series data complementing method based on distance matrix
CN108228832B (en) * 2018-01-04 2022-04-22 南京大学 Time series data completion method based on distance matrix
CN109034179A (en) * 2018-05-30 2018-12-18 河南理工大学 A kind of rock stratum classification method based on mahalanobis distance IDTW
CN109241077A (en) * 2018-08-30 2019-01-18 东北大学 Production target variation tendency visual query system and method based on similitude
CN109362036A (en) * 2018-10-17 2019-02-19 桂林电子科技大学 A kind of multi-modal indoor orientation method combined based on image with WIFI
CN109816211B (en) * 2018-12-29 2023-11-24 北京英视睿达科技股份有限公司 Method and device for judging similarity of pollution areas and improving pollution treatment efficiency
CN109816211A (en) * 2018-12-29 2019-05-28 北京英视睿达科技有限公司 Judge Polluted area similitude and improves the method and device of pollution administration efficiency
CN109828952B (en) * 2019-01-18 2021-05-11 上海卫星工程研究所 PCM system satellite telemetry data classification extraction method and system
CN109828952A (en) * 2019-01-18 2019-05-31 上海卫星工程研究所 PCM system satellite telemetering data classification extracting method, system
WO2020220438A1 (en) * 2019-04-29 2020-11-05 东北大学 Method for predicting concurrent volume of services of different types for virtual machine
CN110289986B (en) * 2019-05-27 2021-05-18 武汉大学 Accuracy quantification method of network simulation data
CN110289986A (en) * 2019-05-27 2019-09-27 武汉大学 A kind of accuracy quantization method of network simulation data
CN110288003A (en) * 2019-05-29 2019-09-27 北京师范大学 Data variation recognition methods and equipment
CN110288003B (en) * 2019-05-29 2022-01-18 北京师范大学 Data change identification method and equipment
CN111104438A (en) * 2019-11-21 2020-05-05 新浪网技术(中国)有限公司 Method and device for determining periodicity of time sequence and electronic equipment
CN112380992A (en) * 2020-11-13 2021-02-19 上海交通大学 Method and device for evaluating and optimizing accuracy of monitoring data in machining process
CN112380992B (en) * 2020-11-13 2022-12-20 上海交通大学 Method and device for evaluating and optimizing accuracy of monitoring data in machining process
CN116504416A (en) * 2023-06-27 2023-07-28 福建无止境光学仪器有限公司 Eye degree prediction method based on machine learning
CN116504416B (en) * 2023-06-27 2023-09-08 福建无止境光学仪器有限公司 Eye degree prediction method based on machine learning

Also Published As

Publication number Publication date
CN104915434B (en) 2018-03-27

Similar Documents

Publication Publication Date Title
CN104915434A (en) Multi-dimensional time sequence classification method based on mahalanobis distance DTW
CN109271975B (en) Power quality disturbance identification method based on big data multi-feature extraction collaborative classification
CN110336534B (en) Fault diagnosis method based on photovoltaic array electrical parameter time series feature extraction
Zheng et al. A new unsupervised data mining method based on the stacked autoencoder for chemical process fault diagnosis
CN112101220B (en) Rolling bearing service life prediction method based on unsupervised model parameter migration
Chen et al. A just-in-time-learning-aided canonical correlation analysis method for multimode process monitoring and fault detection
Zhou et al. Bearing fault recognition method based on neighbourhood component analysis and coupled hidden Markov model
CN106845717B (en) Energy efficiency evaluation method based on multi-model fusion strategy
CN103048041B (en) Fault diagnosis method of electromechanical system based on local tangent space and support vector machine
CN109297689B (en) Large-scale hydraulic machinery intelligent diagnosis method introducing weight factors
CN104931263B (en) A kind of Method for Bearing Fault Diagnosis based on symbolization probabilistic finite state machine
CN110018670A (en) A kind of industrial process unusual service condition prediction technique excavated based on dynamic association rules
CN104915568A (en) Satellite telemetry data abnormity detection method based on DTW
CN104899327A (en) Method for detecting abnormal time sequence without class label
CN105205288A (en) Mode evolution-based forecasting method for satellite long-term on-orbit running state
CN104794484A (en) Time series data nearest-neighbor classifying method based on subsection orthogonal polynomial decomposition
CN109542952A (en) A kind of detection method of time series abnormal point
Ji et al. A divisive hierarchical clustering approach to hyperspectral band selection
Parvez et al. Comparison of the Smith-Waterman and Needleman-Wunsch algorithms for online similarity analysis of industrial alarm floods
CN111783336A (en) Uncertain structure frequency response dynamic model correction method based on deep learning theory
CN109145764B (en) Method and device for identifying unaligned sections of multiple groups of detection waveforms of comprehensive detection vehicle
CN112801329A (en) Solar panel power generation system abnormity diagnosis and analysis device and method combining factor hidden Markov model and power generation amount prediction
CN116894180B (en) Product manufacturing quality prediction method based on different composition attention network
CN105205145A (en) Track modeling and searching method
CN104679844A (en) Intermittent process batch data synchronizing method based on improved DTW (Dynamic Time Wrapping) algorithm

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant