CN110019543A - Method and device for time series clustering - Google Patents
- Publication number
- CN110019543A (application number CN201710817446.7A)
- Authority
- CN
- China
- Prior art keywords
- time series
- series data
- amplitude spectrum
- clustering
- algorithm
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
Abstract
The invention discloses a method and device for time series clustering, relating to the field of intelligent information and communication technology. The method comprises: an algorithm component receives algorithm parameters, sent by a parameter component, for clustering time series data; the algorithm component reads time series data from a time series database and clusters the read data according to the received algorithm parameters, obtaining a clustering result that comprises a time series feature matrix and time series cluster labels; the algorithm component saves this clustering result in a clustering result database and displays it through a display component.
Description
Technical field
The present invention relates to the field of intelligent ICT (Information and Communication Technology), and in particular to a method and device for time series clustering.
Background
IT (Internet Technology) clusters are widely used across industries. For a telecom operator, for example, the core network, the network management center and the data center all rely on IT clusters. An IT cluster is typically very large, with many kinds and quantities of hardware and software. It is also an always-on system with strict uptime requirements: software errors and hardware faults not only sharply degrade the user experience but also incur substantial maintenance costs. Cluster management and operation and maintenance (O&M) is therefore an important and challenging task, which requires continuous monitoring of cluster performance data so that potential faults or anomalies can be detected.
With the introduction of technologies such as virtualization and SDN (Software Defined Network), traditional IT clusters are moving to the cloud. Cluster scale keeps growing, upper-layer applications and service types keep multiplying, and the number of performance indicators to be monitored reaches the millions or more. The traditional approach of monitoring against manually set thresholds can no longer meet application demand: labor costs rise while O&M efficiency and accuracy fall. Automating O&M with machine learning is important for solving this problem and has attracted wide attention in the industry. A key part of automated O&M is applying data mining methods to detect anomalies in performance data. Because cluster performance data is highly varied, and by the no-free-lunch theorem of machine learning, no single anomaly detection algorithm can handle all time series; a suitable anomaly detection algorithm has to be selected for time series with different characteristics.
Summary of the invention
The technical problem solved by the scheme provided in the embodiments of the present invention is the automatic classification of the performance data collected from IT clusters, which provides a basis for selecting a suitable anomaly detection algorithm for each class of data.
A method for time series clustering provided according to an embodiment of the present invention comprises:
An algorithm component receives algorithm parameters, sent by a parameter component, for clustering time series data;
The algorithm component reads time series data from a time series database and clusters the read time series data according to the received algorithm parameters, obtaining a clustering result comprising a time series feature matrix and time series cluster labels;
The algorithm component saves the clustering result comprising the time series feature matrix and the time series cluster labels in a clustering result database, and displays the clustering result through a display component.
A device for time series clustering provided according to an embodiment of the present invention comprises:
A receiving module, configured to receive algorithm parameters for clustering time series data;
A clustering module, configured to read time series data from a time series database and cluster the read time series data according to the received algorithm parameters, obtaining a clustering result comprising a time series feature matrix and time series cluster labels;
A saving and display module, configured to save the clustering result comprising the time series feature matrix and the time series cluster labels in a clustering result database, and to display the clustering result through a display component.
An electronic device for time series clustering provided according to an embodiment of the present invention comprises a processor and a memory, wherein the memory is used to store executable program code, and the processor reads the executable program code stored in the memory and runs the corresponding program so as to perform the following steps:
An algorithm component receives algorithm parameters, sent by a parameter component, for clustering time series data;
The algorithm component reads time series data from a time series database and clusters the read time series data according to the received algorithm parameters, obtaining a clustering result comprising a time series feature matrix and time series cluster labels;
The algorithm component saves the clustering result comprising the time series feature matrix and the time series cluster labels in a clustering result database, and displays the clustering result through a display component.
The scheme provided according to the embodiments of the present invention has the following advantages. (1) The parameter component notifies the algorithm component of the algorithm parameters, which makes the algorithm component more flexible: users can choose algorithm parameters suited to their problem and thereby obtain an algorithm service that solves it. (2) Besides providing the clustering result to the display component, the algorithm component can serve multiple application components as a service. These application components may be a front-end display interface or an anomaly detection component (which provides a suitable anomaly detection algorithm for each class of time series), which improves the reusability of the algorithm component. (3) The algorithm component uses hierarchical clustering: time series are first divided into two major classes, periodic and aperiodic, and cluster analysis is then carried out within each major class, which reduces the complexity of the cluster analysis.
Brief description of the drawings
Fig. 1 is a flowchart of a method for time series clustering provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of a device for time series clustering provided by an embodiment of the present invention;
Fig. 3 is a processing flowchart of the time series clustering system provided by an embodiment of the present invention;
Fig. 4 is a flowchart of the clustering performed by the algorithm component provided by an embodiment of the present invention;
Fig. 5 is a flowchart of the periodicity discrimination method provided by an embodiment of the present invention;
Fig. 6 shows waveform diagrams of each class of periodic time series provided by an embodiment of the present invention;
Fig. 7 shows waveform diagrams of each class of aperiodic time series provided by an embodiment of the present invention.
Detailed description
Preferred embodiments of the present invention are described in detail below with reference to the drawings. It should be understood that the preferred embodiments described below serve only to illustrate and explain the present invention and are not intended to limit it.
Fig. 1 is a flowchart of a method for time series clustering provided by an embodiment of the present invention. As shown in Fig. 1, the method comprises:
Step S101: an algorithm component receives algorithm parameters, sent by a parameter component, for clustering time series data;
Step S102: the algorithm component reads time series data from a time series database and clusters the read time series data according to the received algorithm parameters, obtaining a clustering result comprising a time series feature matrix and time series cluster labels;
Step S103: the algorithm component saves the clustering result comprising the time series feature matrix and the time series cluster labels in a clustering result database, and displays the clustering result through a display component.
The algorithm parameters include a preset period value, a time series feature set and the numbers of clusters. The time series feature set includes one or more of: seasonality index, trend index, skewness, kurtosis, autocorrelation coefficient, relative entropy, sample entropy, self-similarity and Lyapunov exponent. The numbers of clusters include the number of clusters for periodic time series and the number of clusters for aperiodic time series.
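The parameters described above can be pictured as a small message structure. The following is a minimal sketch only: the class name, the field names, the feature identifiers and the choice to express the period in samples are all assumptions made for illustration and do not come from the patent text.

```python
from dataclasses import dataclass

# Hypothetical identifiers for the nine features named in the text.
ALLOWED_FEATURES = frozenset({
    "seasonality", "trend", "skewness", "kurtosis", "autocorrelation",
    "relative_entropy", "sample_entropy", "self_similarity", "lyapunov",
})


@dataclass(frozen=True)
class AlgorithmParams:
    preset_period: int       # preset period value pt, in samples (assumption)
    feature_set: tuple       # one or more names from ALLOWED_FEATURES
    periodic_clusters: int   # number of clusters for periodic series (k1)
    aperiodic_clusters: int  # number of clusters for aperiodic series (k2)

    def __post_init__(self):
        # Reject an empty feature set or any name outside the allowed list.
        unknown = set(self.feature_set) - ALLOWED_FEATURES
        if not self.feature_set or unknown:
            raise ValueError(f"invalid feature set: {sorted(unknown)}")
        if min(self.preset_period, self.periodic_clusters,
               self.aperiodic_clusters) < 1:
            raise ValueError("period and cluster counts must be positive")


# Illustrative values only.
params = AlgorithmParams(
    preset_period=96,
    feature_set=("seasonality", "trend", "skewness"),
    periodic_clusters=4,
    aperiodic_clusters=5,
)
```

Validating the message on construction keeps a malformed feature set from reaching the clustering stage.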
The algorithm component clustering the read time series data according to the received algorithm parameters comprises: the algorithm component extracts the structural feature set of the time series data according to the time series feature set in the algorithm parameters, and determines the periodic time series data and the aperiodic time series data among the time series data according to the preset period value in the algorithm parameters; the algorithm component then clusters the periodic time series data and the aperiodic time series data separately, according to the extracted structural feature set and the numbers of clusters.
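To make the feature extraction step concrete, the sketch below computes three of the listed features (skewness, kurtosis and the lag-1 autocorrelation coefficient) and stacks them into a feature matrix with one row per time series. The patent does not give formulas for its features; the standard moment-based definitions used here, and all function names, are assumptions.

```python
def central_moment(series, order):
    """Population central moment of the given order."""
    m = sum(series) / len(series)
    return sum((x - m) ** order for x in series) / len(series)


def skewness(series):
    return central_moment(series, 3) / central_moment(series, 2) ** 1.5


def kurtosis(series):
    # Plain (non-excess) kurtosis: a normal distribution gives 3.
    return central_moment(series, 4) / central_moment(series, 2) ** 2


def autocorrelation(series, lag=1):
    m = sum(series) / len(series)
    num = sum((series[t] - m) * (series[t + lag] - m)
              for t in range(len(series) - lag))
    den = sum((x - m) ** 2 for x in series)
    return num / den


EXTRACTORS = {
    "skewness": skewness,
    "kurtosis": kurtosis,
    "autocorrelation": autocorrelation,
}


def feature_matrix(series_list, feature_set):
    """One row per series, one column per selected feature."""
    return [[EXTRACTORS[name](s) for name in feature_set]
            for s in series_list]
```

A constant series has zero variance and would cause a division by zero here, so a real implementation would need to handle that case explicitly.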
Specifically, the algorithm component determining the periodic and aperiodic time series data according to the preset period value in the algorithm parameters comprises: the algorithm component calculates the amplitude spectrum maximum, the amplitude spectrum mean and the amplitude spectrum standard deviation of the time series data, and judges whether the difference between the amplitude spectrum maximum and the amplitude spectrum mean is greater than a multiple of the amplitude spectrum standard deviation. If the difference is not greater than the multiple of the amplitude spectrum standard deviation, the algorithm component judges the time series data to be aperiodic. If the difference is greater than the multiple of the amplitude spectrum standard deviation, the algorithm component further judges whether the period corresponding to the amplitude spectrum maximum equals the preset period value: if it does, the algorithm component judges the time series data to be periodic; if it does not, the algorithm component judges the time series data to be aperiodic.
Specifically, the algorithm component clustering the periodic and aperiodic time series data separately according to the extracted structural feature set and the numbers of clusters comprises: the algorithm component clusters the periodic time series data on the basis of the extracted structural feature set, using the number of clusters for periodic time series; and the algorithm component clusters the aperiodic time series data on the basis of the extracted structural feature set, using the number of clusters for aperiodic time series.
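The two-level scheme above can be sketched as follows: the feature rows are split by the periodic/aperiodic flag and each group is clustered on its own with its own cluster count. The k-means routine is a deliberately small illustration with a deterministic initialization (the first k points); the initialization, the function names and the label format are assumptions, not details from the patent.

```python
def kmeans(points, k, max_iterations=100):
    """Tiny k-means; centroids start at the first k points (assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assign each point to its nearest centroid (squared distance).
        new_labels = [
            min(range(k), key=lambda c: sum(
                (a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels


def cluster_by_group(features, is_periodic, k1, k2):
    """Cluster periodic and aperiodic series separately.

    Returns one (group, cluster_id) label per input row.
    """
    result = [None] * len(features)
    for flag, group, k in ((True, "periodic", k1), (False, "aperiodic", k2)):
        index = [i for i, f in enumerate(is_periodic) if f == flag]
        if not index:
            continue
        labels = kmeans([features[i] for i in index], min(k, len(index)))
        for i, label in zip(index, labels):
            result[i] = (group, label)
    return result
```

Because each major class is clustered on its own, a periodic series can never share a cluster with an aperiodic one, which is the point of the hierarchical split.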
Fig. 2 is a schematic diagram of a device for time series clustering provided by an embodiment of the present invention. As shown in Fig. 2, the device comprises: a receiving module 201, configured to receive algorithm parameters for clustering time series data; a clustering module 202, configured to read time series data from a time series database and cluster the read time series data according to the received algorithm parameters, obtaining a clustering result comprising a time series feature matrix and time series cluster labels; and a saving and display module 203, configured to save the clustering result comprising the time series feature matrix and the time series cluster labels in a clustering result database and to display the clustering result through a display component.
The algorithm parameters include a preset period value, a time series feature set and the numbers of clusters. The time series feature set includes one or more of: seasonality index, trend index, skewness, kurtosis, autocorrelation coefficient, relative entropy, sample entropy, self-similarity and Lyapunov exponent. The numbers of clusters include the number of clusters for periodic time series and the number of clusters for aperiodic time series.
The clustering module 202 comprises: a processing unit, configured to extract the structural feature set of the time series data according to the time series feature set in the algorithm parameters, and to determine the periodic time series data and the aperiodic time series data among the time series data according to the preset period value in the algorithm parameters; and a clustering unit, configured to cluster the periodic time series data and the aperiodic time series data separately, according to the extracted structural feature set and the numbers of clusters.
Specifically, the processing unit comprises: a feature extraction subunit, configured to extract the structural feature set of the time series data according to the time series feature set in the algorithm parameters; and a period discrimination subunit, configured to calculate the amplitude spectrum maximum, the amplitude spectrum mean and the amplitude spectrum standard deviation of the time series data, and to judge whether the difference between the amplitude spectrum maximum and the amplitude spectrum mean is greater than a multiple of the amplitude spectrum standard deviation. If the difference is not greater than the multiple of the amplitude spectrum standard deviation, the subunit judges the time series data to be aperiodic. If the difference is greater than the multiple of the amplitude spectrum standard deviation, the subunit further judges whether the period corresponding to the amplitude spectrum maximum equals the preset period value: if it does, the time series data is judged to be periodic; if it does not, the time series data is judged to be aperiodic.
Specifically, the clustering unit comprises: a first clustering subunit, configured to cluster the periodic time series data on the basis of the extracted structural feature set, using the number of clusters for periodic time series; and a second clustering subunit, configured to cluster the aperiodic time series data on the basis of the extracted structural feature set, using the number of clusters for aperiodic time series.
Fig. 3 is a processing flowchart of the time series clustering system provided by an embodiment of the present invention. As shown in Fig. 3, the system comprises a parameter component, an algorithm component, a time series database, a clustering result database and a display component. The processing specifically includes:
Step 301: the parameter component notifies the algorithm component of the relevant algorithm parameters by message; the algorithm parameters include the preset period value pt, the selected time series feature set and the numbers of clusters.
Step 302: the algorithm component reads time series data from the time series database and completes the time series clustering according to the algorithm parameters sent by the parameter component.
Step 303: the algorithm component stores the clustering result in the clustering result database; the clustering result includes the time series feature matrix, the time series cluster labels and so on.
Step 304: the algorithm component sends the clustering result to the display component.
Step 305: the display component presents the result to the user, including the time series plots and the time series labels.
Specifically, in step 302 the algorithm component first divides the time series data into two major classes, periodic and aperiodic, and then performs cluster analysis within each major class, which reduces the complexity of the cluster analysis.
Fig. 4 is a flowchart of the clustering performed by the algorithm component provided by an embodiment of the present invention. As shown in Fig. 4, the clustering comprises:
Step 401: read time series data from the time series database and preprocess it, including filling in missing values, removing noise and so on.
Step 402: extract the structural features of the time series data.
The structural features are extracted according to the time series feature set passed in by the parameter component. The time series feature set consists of one or more of the following features: seasonality index, trend index, skewness, kurtosis, autocorrelation coefficient, relative entropy, sample entropy, self-similarity and Lyapunov exponent.
Step 403: perform periodicity discrimination on the time series data, first dividing it into two major classes: periodic and aperiodic.
As shown in Fig. 5, the periodicity discrimination method is:
1) Apply a Fourier transform (FFT) to the time series; let the length of the FFT be fft_size.
2) Take the magnitude of the FFT coefficients to obtain the corresponding amplitude spectrum.
3) Apply a 20-times logarithmic transform to the amplitude spectrum and smooth the spectrum.
4) Over the index range [1, fft_size/4] of the transformed spectrum, find the maximum value MAX, the mean m and the standard deviation std. Denote the index at which MAX occurs as Id.
5) If MAX > m + 3.2*std, go to step 6); otherwise the time series is judged to be aperiodic.
6) Calculate the period corresponding to the maximum spectral point as p = fft_size/(Id*fs) * fs = fft_size/Id, where fs is the sampling frequency, so that p is expressed in samples. If the period equals the preset value pt, the time series is judged to be periodic and its period p is returned; otherwise it is judged to be aperiodic.
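Steps 1) to 6) above can be sketched in code as follows. This is an illustration under stated assumptions, not the patented implementation: a naive discrete Fourier transform stands in for the FFT so that the block needs no third-party library, fft_size is taken to be the series length, the logarithm is assumed to be base 10 (a decibel scale), and smoothing is assumed to be a centered moving average because the text does not name a smoothing method. The function names and the rounding used for the equality check are likewise assumptions.

```python
import cmath
import math
from statistics import mean, pstdev


def amplitude_spectrum_db(series):
    """Steps 1-3 (without smoothing): DFT magnitude on a 20*log10 scale."""
    n = len(series)
    spectrum = []
    for k in range(n):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(series))
        # The small epsilon avoids log10(0) for empty bins.
        spectrum.append(20 * math.log10(abs(coeff) + 1e-12))
    return spectrum


def smooth(values, window):
    """Centered moving average; window=1 leaves the values unchanged."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def detect_period(series, preset_period, factor=3.2, smooth_window=1):
    """Return the period in samples if the series is periodic, else None."""
    fft_size = len(series)
    spectrum = smooth(amplitude_spectrum_db(series), smooth_window)
    # Step 4: statistics over the index range [1, fft_size/4].
    band = spectrum[1: fft_size // 4 + 1]
    peak = max(band)
    peak_index = 1 + band.index(peak)          # Id
    # Step 5: the peak must stand out from the rest of the band.
    if peak <= mean(band) + factor * pstdev(band):
        return None
    # Step 6: p = fft_size / Id, compared with the preset value pt.
    period = fft_size / peak_index
    return period if round(period) == preset_period else None
```

For example, a sine wave with a period of 8 samples observed over 64 samples puts its peak at index 8, so `detect_period` returns 8.0 when the preset period is 8 and None for any other preset. For long series a real FFT routine would replace the quadratic-time transform used here.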
Step 404: cluster the time series of each major class with a clustering algorithm; clustering is performed on the extracted structural features.
The numbers of clusters are determined by the values passed in by the parameter component, namely the number of clusters for periodic time series k1 and the number of clusters for aperiodic time series k2.
Step 405: according to the clustering result, output the class to which each time series belongs.
Besides sending the clustering result to the display component, the algorithm component can also send the result to other application components. For example, the clustering result can be sent to an anomaly detection component, which then applies the anomaly detection algorithm corresponding to the class of each time series.
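One way an anomaly detection component might consume the cluster labels is a lookup table from class label to detector. The sketch below is purely hypothetical: the patent does not say which detectors are used for which class, so the two detectors, the label format and the mapping are all invented for illustration.

```python
from statistics import mean, pstdev


def three_sigma(series):
    """Indices of points more than 3 standard deviations from the mean."""
    m, s = mean(series), pstdev(series)
    return [i for i, x in enumerate(series)
            if s > 0 and abs(x - m) > 3 * s]


def max_jump(series, limit=5.0):
    """Indices where the change from the previous point exceeds limit."""
    return [i for i in range(1, len(series))
            if abs(series[i] - series[i - 1]) > limit]


# Hypothetical mapping from (major class, cluster id) to a detector.
DETECTORS = {
    ("periodic", 0): max_jump,
    ("aperiodic", 0): three_sigma,
}


def detect_anomalies(series, cluster_label, default=three_sigma):
    """Run the detector registered for the series' class label."""
    return DETECTORS.get(cluster_label, default)(series)
```

Keeping the mapping in a table means a new class or a new detector can be added without touching the dispatch logic.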
Embodiment:
In this embodiment, the time series data stored in the time series database of Fig. 3 is traffic data for 407 network ports. The acquisition length is 2 weeks and the acquisition granularity is 15 minutes (i.e. 96 points are collected per day).
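The figures in this embodiment imply the following sizes, worked through here as a short calculation. The assumption that the FFT length equals the series length is ours; the text does not state the value of fft_size.

```python
MINUTES_PER_DAY = 24 * 60
GRANULARITY_MINUTES = 15
DAYS = 14

samples_per_day = MINUTES_PER_DAY // GRANULARITY_MINUTES   # 96 points per day
series_length = samples_per_day * DAYS                     # points per port

# Assumption: the FFT is taken over the whole series.
fft_size = series_length

# A daily period corresponds to pt = 96 samples, so a daily cycle
# should peak at spectrum index Id = fft_size / pt.
preset_period = samples_per_day
expected_peak_index = fft_size // preset_period

# The periodicity test searches the index range [1, fft_size/4].
search_band = (1, fft_size // 4)

print(series_length, expected_peak_index, search_band)
```

Under that assumption each series has 1344 points, a daily cycle appears at index 14, and the search band [1, 336] comfortably contains it.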
The preset period value pt that the parameter component passes to the algorithm component in Fig. 3 equals 1 day, i.e. the periodicity discrimination algorithm judges whether a time series has a daily period. The feature set passed in is: seasonality index, trend index, skewness, relative entropy, sample entropy, self-similarity and Lyapunov exponent, so the algorithm component extracts 7 structural features. The number of clusters passed in for periodic time series is k1 = 4 and for aperiodic time series k2 = 5, so periodic time series are divided into 4 classes and aperiodic time series into 5 classes.
The basic process of the algorithm component used in this embodiment comprises:
Step 1: read the time series data and preprocess it, i.e. fill in missing values by linear interpolation and remove noise.
Step 2: extract 7 structural features of the time series data: seasonality index, trend index, skewness, relative entropy, sample entropy, self-similarity and Lyapunov exponent.
Step 3: perform periodicity discrimination on the time series data, first dividing it into two major classes: periodic and aperiodic.
Step 4: cluster the time series of each major class with the k-means clustering algorithm; periodic data is grouped into 4 classes and aperiodic data into 5 classes.
Step 5: according to the clustering result, output the class to which each time series belongs.
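The preprocessing in step 1 above can be sketched as follows. Missing values are represented as None and filled by linear interpolation between the nearest known neighbors, as the text describes; copying the nearest known value at the edges is an assumption. The text does not say how noise is removed, so the 3-point median filter shown here is only one possible choice, and both function names are hypothetical.

```python
from statistics import median


def fill_missing(series):
    """Linearly interpolate None entries; edges copy the nearest value."""
    known = [i for i, x in enumerate(series) if x is not None]
    if not known:
        raise ValueError("series has no known values")
    filled = list(series)
    for i, x in enumerate(series):
        if x is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]       # leading gap
        elif right is None:
            filled[i] = series[left]        # trailing gap
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def median_filter(series, window=3):
    """Replace each point by the median of its neighborhood (assumption)."""
    half = window // 2
    return [median(series[max(0, i - half): i + half + 1])
            for i in range(len(series))]
```

For instance, `fill_missing([None, 1.0, None, None, 4.0, None])` yields values rising linearly from 1.0 to 4.0 across the interior gap, with the edge gaps copying 1.0 and 4.0.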
The periodicity discrimination method of step 3 proceeds as follows:
1) Apply a Fourier transform (FFT) to the time series; the length of the FFT is fft_size.
2) Take the magnitude of the FFT coefficients to obtain the corresponding amplitude spectrum.
3) Apply a 20-times logarithmic transform to the amplitude spectrum and smooth the spectrum.
4) Over the index range [1, fft_size/4] of the transformed spectrum, find the maximum value MAX, the mean m and the standard deviation std. Denote the index at which MAX occurs as Id.
5) If MAX > m + 3.2*std, go to step 6); otherwise the time series is judged to be aperiodic.
6) Calculate the period corresponding to the maximum spectral point as p = fft_size/(Id*fs) * fs = fft_size/Id, where fs is the sampling frequency, so that p is expressed in samples. If the period equals the preset value pt, the time series is judged to be periodic and its period p is returned; otherwise it is judged to be aperiodic.
With the periodicity discrimination algorithm, 165 of the 407 time series are classified as periodic and 242 as aperiodic. The data of each major class is then cluster-analyzed with the k-means algorithm. Fig. 6 shows typical waveforms for each class of periodic time series; Fig. 7 shows typical waveforms for each class of aperiodic time series. The number of time series in each subclass is summarized in Table 1.
Table 1: Summary of the number of time series in each subclass
After clustering is complete, the clustering result (the feature vector and class label of each time series) is saved in the clustering result database. In this example, the display component performs the front-end display function and presents the clustering result to the user, i.e. it displays the waveform diagrams of each class of time series as in Fig. 6 and Fig. 7.
In other embodiments, the technical solution in the summary of the invention can be realized with different implementation details. For example: the preset period value pt passed in by application component 1 equals 1 week; the number of periodic clusters k1 passed in equals 2 and the number of aperiodic clusters equals 4; the selected time series feature set is seasonality index, trend index, autocorrelation coefficient, relative entropy and sample entropy; and further application components are added that call the clustering result of the algorithm component.
An electronic device for time series clustering provided according to an embodiment of the present invention comprises a processor and a memory, wherein the memory is used to store executable program code, and the processor reads the executable program code stored in the memory and runs the corresponding program so as to perform the following steps:
An algorithm component receives algorithm parameters, sent by a parameter component, for clustering time series data;
The algorithm component reads time series data from a time series database and clusters the read time series data according to the received algorithm parameters, obtaining a clustering result comprising a time series feature matrix and time series cluster labels;
The algorithm component saves the clustering result comprising the time series feature matrix and the time series cluster labels in a clustering result database, and displays the clustering result through a display component.
A computer storage medium provided according to an embodiment of the present invention stores a time series clustering program which, when executed by a processor, performs the following steps:
An algorithm component receives algorithm parameters, sent by a parameter component, for clustering time series data;
The algorithm component reads time series data from a time series database and clusters the read time series data according to the received algorithm parameters, obtaining a clustering result comprising a time series feature matrix and time series cluster labels;
The algorithm component saves the clustering result comprising the time series feature matrix and the time series cluster labels in a clustering result database, and displays the clustering result through a display component.
In the scheme provided according to the embodiments of the present invention, the parameter component passes the relevant parameters to the algorithm component, which allows the time series clustering algorithm to be deployed flexibly. The algorithm component uses hierarchical clustering, which reduces the complexity of the cluster analysis. The algorithm component provides the clustering result to other application components as a service, which improves its reusability.
Although the invention has been described in detail above, it is not limited to this description, and those skilled in the art can make various modifications according to the principle of the invention. All modifications made according to the principle of the invention should therefore be understood to fall within the protection scope of the invention.
Claims (10)
1. A method for time series clustering, comprising:
An algorithm component receives algorithm parameters, sent by a parameter component, for clustering time series data;
The algorithm component reads time series data from a time series database and clusters the read time series data according to the received algorithm parameters, obtaining a clustering result comprising a time series feature matrix and time series cluster labels;
The algorithm component saves the clustering result comprising the time series feature matrix and the time series cluster labels in a clustering result database, and displays the clustering result through a display component.
2. The method according to claim 1, wherein the algorithm parameters comprise a time series period preset value, a time series feature set, and time series cluster numbers; the time series feature set comprises one or more of a seasonality index, a trend index, skewness, kurtosis, an autocorrelation coefficient, relative entropy, sample entropy, self-similarity, and a Lyapunov coefficient; and the time series cluster numbers comprise a periodic time series cluster number and an aperiodic time series cluster number.
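As an illustration of how a configured feature set could be turned into a feature matrix, the sketch below computes three of the listed features (skewness, kurtosis, and the autocorrelation coefficient) with the Python standard library. The function names, the use of population moments, the excess-kurtosis convention, and the default lag of 1 are assumptions; the claim does not fix these definitions.

```python
from statistics import fmean, pvariance


def skewness(series):
    """Population skewness: third central moment over variance ** 1.5."""
    mean, var = fmean(series), pvariance(series)
    return sum((x - mean) ** 3 for x in series) / (len(series) * var ** 1.5)


def kurtosis(series):
    """Excess kurtosis: fourth central moment over variance ** 2, minus 3."""
    mean, var = fmean(series), pvariance(series)
    return sum((x - mean) ** 4 for x in series) / (len(series) * var ** 2) - 3.0


def autocorrelation(series, lag=1):
    """Sample autocorrelation coefficient at the given lag."""
    mean = fmean(series)
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(len(series) - lag))
    return num / denom


# Hypothetical registry mapping configured feature names to functions.
FEATURES = {
    "skewness": skewness,
    "kurtosis": kurtosis,
    "autocorrelation": autocorrelation,
}


def feature_matrix(series_list, feature_names):
    """One row per series, one column per requested feature.

    Constant series have zero variance and would need special handling.
    """
    return [[FEATURES[name](s) for name in feature_names] for s in series_list]
```

Selecting features by name lets the parameter component change the feature set without touching the extraction code.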
3. The method according to claim 2, wherein the algorithm component clustering the read time series data according to the received algorithm parameters comprises:
the algorithm component extracting a structural feature set of the time series data according to the time series feature set in the algorithm parameters, and determining periodic time series data and aperiodic time series data in the time series data according to the time series period preset value in the algorithm parameters;
the algorithm component clustering the periodic time series data and the aperiodic time series data separately, according to the extracted structural feature set of the time series data and the time series cluster numbers.
4. The method according to claim 3, wherein the algorithm component determining the periodic time series data and the aperiodic time series data in the time series data according to the time series period preset value in the algorithm parameters comprises:
the algorithm component calculating the amplitude spectrum maximum, the amplitude spectrum mean, and the amplitude spectrum standard deviation of the time series data, and judging whether the difference between the amplitude spectrum maximum and the amplitude spectrum mean is greater than a multiple of the amplitude spectrum standard deviation;
if the difference between the amplitude spectrum maximum and the amplitude spectrum mean is not greater than the multiple of the amplitude spectrum standard deviation, the algorithm component determining that the time series data is aperiodic time series data;
if the difference between the amplitude spectrum maximum and the amplitude spectrum mean is greater than the multiple of the amplitude spectrum standard deviation, the algorithm component further judging whether the time series period corresponding to the amplitude spectrum maximum is equal to the time series period preset value;
if the time series period corresponding to the amplitude spectrum maximum is equal to the time series period preset value, the algorithm component determining that the time series data is periodic time series data;
if the time series period corresponding to the amplitude spectrum maximum is not equal to the time series period preset value, the algorithm component determining that the time series data is aperiodic time series data.
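The two-stage test of claim 4 can be sketched as below. Several details are assumptions because the claim leaves them open: the multiple defaults to 3.0, the amplitude spectrum is taken from a naive discrete Fourier transform over frequency bins 1 to n/2 with the mean removed, and the period of bin k is taken as n / k rounded to the nearest integer.

```python
import cmath
import math
from statistics import fmean, pstdev


def amplitude_spectrum(series):
    """Naive DFT amplitudes for bins 1..n//2 (the DC bin is skipped)."""
    n = len(series)
    mean = fmean(series)
    centered = [x - mean for x in series]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        spectrum.append(abs(coeff))
    return spectrum


def is_periodic(series, preset_period, multiple=3.0):
    """Return True only if both checks of the claim pass."""
    spectrum = amplitude_spectrum(series)
    peak = max(spectrum)
    # Stage 1: the peak must stand out from the rest of the spectrum.
    if peak - fmean(spectrum) <= multiple * pstdev(spectrum):
        return False
    # Stage 2: the period of the dominant bin must equal the preset value.
    k = spectrum.index(peak) + 1
    dominant_period = round(len(series) / k)
    return dominant_period == preset_period


n = 120
sine = [math.sin(2 * math.pi * t / 12) for t in range(n)]
print(is_periodic(sine, 12), is_periodic(sine, 24))
```

A series whose energy is spread evenly over many frequency bins fails the first stage, while a clean sinusoid with the wrong period passes the first stage and fails the second. The naive transform costs O(n²); a fast Fourier transform would be the usual choice for long series.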
5. The method according to claim 3, wherein the algorithm component clustering the periodic time series data and the aperiodic time series data separately, according to the extracted structural feature set of the time series data and the time series cluster numbers, comprises:
the algorithm component clustering the periodic time series data into the periodic time series cluster number of clusters, according to the extracted structural feature set of the time series data;
the algorithm component clustering the aperiodic time series data into the aperiodic time series cluster number of clusters, according to the extracted structural feature set of the time series data.
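A minimal sketch of clustering the two groups separately, each with its own cluster number, is given below. The description mentions hierarchical clustering but does not name a linkage, so the centroid-distance agglomerative merge used here is an assumption, as are the function names and the `P`/`N` label prefixes.

```python
import math


def agglomerative_labels(features, n_clusters):
    """Merge the two clusters with the closest centroids until n_clusters remain."""
    clusters = [[i] for i in range(len(features))]
    dim = len(features[0])

    def centroid(members):
        return [sum(features[i][d] for i in members) / len(members)
                for d in range(dim)]

    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                dist = math.dist(centroid(clusters[a]), centroid(clusters[b]))
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        clusters[a].extend(clusters.pop(b))  # b > a, so index a stays valid

    labels = [0] * len(features)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def cluster_by_group(features, periodic_flags, n_periodic, n_aperiodic):
    """Cluster periodic and aperiodic series independently of each other."""
    labels = [None] * len(features)
    for flag, k, prefix in ((True, n_periodic, "P"), (False, n_aperiodic, "N")):
        idx = [i for i, f in enumerate(periodic_flags) if f == flag]
        if not idx:
            continue
        sub = agglomerative_labels([features[i] for i in idx], min(k, len(idx)))
        for i, lab in zip(idx, sub):
            labels[i] = f"{prefix}{lab}"
    return labels


features = [[0.0], [0.1], [5.0], [5.1], [10.0], [10.2]]
flags = [True, True, True, True, False, False]
print(cluster_by_group(features, flags, n_periodic=2, n_aperiodic=1))
```

Prefixing the labels keeps periodic and aperiodic clusters distinguishable when both sets are stored in one label column.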
6. A device for time series clustering, comprising:
a receiving module, configured to receive algorithm parameters for clustering time series data;
a clustering module, configured to read time series data from a time series database, and to cluster the read time series data according to the received algorithm parameters to obtain a clustering result comprising a time series feature matrix and time series cluster labels;
a saving and display module, configured to save the clustering result comprising the time series feature matrix and the time series cluster labels to a clustering result database, and to display the clustering result through a display component.
7. The device according to claim 6, wherein the algorithm parameters comprise a time series period preset value, a time series feature set, and time series cluster numbers; the time series feature set comprises one or more of a seasonality index, a trend index, skewness, kurtosis, an autocorrelation coefficient, relative entropy, sample entropy, self-similarity, and a Lyapunov coefficient; and the time series cluster numbers comprise a periodic time series cluster number and an aperiodic time series cluster number.
8. The device according to claim 7, wherein the clustering module comprises:
a processing unit, configured to extract a structural feature set of the time series data according to the time series feature set in the algorithm parameters, and to determine periodic time series data and aperiodic time series data in the time series data according to the time series period preset value in the algorithm parameters;
a clustering unit, configured to cluster the periodic time series data and the aperiodic time series data separately, according to the extracted structural feature set of the time series data and the time series cluster numbers.
9. The device according to claim 8, wherein the processing unit comprises:
a feature extraction subunit, configured to extract the structural feature set of the time series data according to the time series feature set in the algorithm parameters;
a period discrimination subunit, configured to calculate the amplitude spectrum maximum, the amplitude spectrum mean, and the amplitude spectrum standard deviation of the time series data, and to judge whether the difference between the amplitude spectrum maximum and the amplitude spectrum mean is greater than a multiple of the amplitude spectrum standard deviation; if the difference is not greater than the multiple of the amplitude spectrum standard deviation, to determine that the time series data is aperiodic time series data; if the difference is greater than the multiple of the amplitude spectrum standard deviation, to further judge whether the time series period corresponding to the amplitude spectrum maximum is equal to the time series period preset value; if the time series period corresponding to the amplitude spectrum maximum is equal to the time series period preset value, to determine that the time series data is periodic time series data; and if the time series period corresponding to the amplitude spectrum maximum is not equal to the time series period preset value, to determine that the time series data is aperiodic time series data.
10. An electronic device for time series clustering, the electronic device comprising a processor and a memory, wherein the memory is configured to store executable program code, and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform the following steps:
an algorithm component receiving algorithm parameters, sent by a parameter component, for clustering time series data;
the algorithm component reading time series data from a time series database, and clustering the read time series data according to the received algorithm parameters to obtain a clustering result comprising a time series feature matrix and time series cluster labels;
the algorithm component saving the clustering result comprising the time series feature matrix and the time series cluster labels to a clustering result database, and displaying the clustering result through a display component.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710817446.7A CN110019543A (en) | 2017-09-12 | 2017-09-12 | A kind of method and device of Time Series Clustering |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710817446.7A CN110019543A (en) | 2017-09-12 | 2017-09-12 | A kind of method and device of Time Series Clustering |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110019543A true CN110019543A (en) | 2019-07-16 |
Family
ID=67186268
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710817446.7A Pending CN110019543A (en) | 2017-09-12 | 2017-09-12 | A kind of method and device of Time Series Clustering |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110019543A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113887812A (en) * | 2021-10-14 | 2022-01-04 | 广东电网有限责任公司 | Clustering-based small sample load prediction method, device, equipment and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140149412A1 (en) * | 2012-11-26 | 2014-05-29 | Ricoh Company, Ltd. | Information processing apparatus, clustering method, and recording medium storing clustering program |
CN105205112A (en) * | 2015-09-01 | 2015-12-30 | 西安交通大学 | System and method for excavating abnormal features of time series data |
CN105608758A (en) * | 2015-12-17 | 2016-05-25 | 山东鲁能软件技术有限公司 | Big data analysis platform apparatus and method based on algorithm configuration and distributed stream computing |
2017-09-12 CN CN201710817446.7A patent/CN110019543A/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113887812A (en) * | 2021-10-14 | 2022-01-04 | 广东电网有限责任公司 | Clustering-based small sample load prediction method, device, equipment and storage medium |
CN113887812B (en) * | 2021-10-14 | 2023-07-07 | 广东电网有限责任公司 | Clustering-based small sample load prediction method, device, equipment and storage medium |
Similar Documents
Publication | Title |
---|---|
CN110851338B (en) | Abnormality detection method, electronic device, and storage medium |
CN106383766B (en) | System monitoring method and apparatus |
CN107707376B (en) | A kind of method and system of monitoring and alarm |
CN111475804A (en) | Alarm prediction method and system |
CN110096410A (en) | Alarm information processing method, system, computer installation and readable storage medium storing program for executing |
CN106888194A (en) | Intelligent grid IT assets security monitoring systems based on distributed scheduling |
CN108427725A (en) | Data processing method, device and system |
CN110058977A (en) | Monitor control index method for detecting abnormality, device and equipment based on Stream Processing |
CN109684052B (en) | Transaction analysis method, device, equipment and storage medium |
CN107040608A (en) | A kind of data processing method and system |
CN109389518A (en) | Association analysis method and device |
CN107391571A (en) | The processing method and processing device of sensing data |
CN110414778A (en) | Case work dispatching method and device |
CN109960839B (en) | Service link discovery method and system of service support system based on machine learning |
CN115563180A (en) | Dynamic threshold generation method, device, equipment and storage medium |
CN111147306B (en) | Fault analysis method and device of Internet of things equipment and Internet of things platform |
CN111339052A (en) | Unstructured log data processing method and device |
CN113342939B (en) | Data quality monitoring method and device and related equipment |
CN110019543A (en) | A kind of method and device of Time Series Clustering |
CN114500543A (en) | Distributed elastic edge acquisition system and application method thereof |
CN109818808A (en) | Method for diagnosing faults, device and electronic equipment |
CN107065605B (en) | A kind of fault diagnosis and alarm method |
CN109375146A (en) | A kind of filling mining method, system and the terminal device of electricity consumption data |
CN106130929B (en) | The service message automatic processing method and system of internet insurance field based on graph-theoretical algorithm |
CN106506282A (en) | A kind of monitoring method for improving cloud platform monitoring performance and scale |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20190716 |