CN110019543A - Method and device for time series clustering - Google Patents

Info

Publication number
CN110019543A
Authority
CN
China
Prior art keywords
time series
series data
amplitude spectrum
clustering
algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710817446.7A
Other languages
Chinese (zh)
Inventor
刘建伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to CN201710817446.7A
Publication of CN110019543A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification

Abstract

The invention discloses a method and device for time series clustering, relating to the field of intelligent information and communication technology. The method includes: an algorithm component receives, from a parameter component, algorithm parameters for clustering time series data; the algorithm component reads time series data from a time series database and clusters the read data according to the received algorithm parameters, obtaining a cluster result that contains a time series feature matrix and time series cluster labels; the algorithm component saves the cluster result in a cluster result database and displays it through a display component.

Description

Method and device for time series clustering
Technical field
The present invention relates to the field of intelligent ICT (Information and Communication Technology), and in particular to a method and device for time series clustering.
Background art
IT (Internet Technology) clusters are widely used across industries. For telecom operators, for example, the core network, network management centers and data centers all rely on IT clusters. An IT cluster is typically large, with many hardware and software components of many types. It is also an uninterrupted system with strict uptime requirements: when software errors or hardware faults occur, user experience degrades sharply and substantial maintenance costs are incurred. Cluster management and operation and maintenance (O&M) has therefore always been an important and challenging task, which requires continuously monitoring cluster performance data in order to detect latent faults or anomalies.
With the introduction of technologies such as virtualization and SDN (Software Defined Network), traditional IT clusters are moving to the cloud. Cluster scale keeps growing, upper-layer software applications and service types keep multiplying, and the number of performance indicators to be monitored reaches the millions or more. The traditional approach of monitoring against manually set thresholds can no longer meet the demand: labor costs rise while O&M efficiency and accuracy fall. Automated O&M based on machine learning is of great significance for solving this problem and has attracted wide attention in the industry. A key part of automated O&M is applying data mining methods to detect anomalies in performance data. Because cluster performance data comes in many different kinds, and according to the no-free-lunch theorem in machine learning, no single anomaly detection algorithm can handle all time series; a suitable anomaly detection algorithm has to be selected for time series with different characteristics.
Summary of the invention
The technical problem solved by the scheme provided by the embodiments of the present invention is the automatic classification of performance data collected from IT clusters, which provides a basis for selecting a suitable anomaly detection algorithm for each class of data.
A method for time series clustering provided according to an embodiment of the present invention includes:
An algorithm component receives algorithm parameters, sent by a parameter component, for clustering time series data;
The algorithm component reads time series data from a time series database and clusters the read time series data according to the received algorithm parameters, obtaining a cluster result that contains a time series feature matrix and time series cluster labels;
The algorithm component saves the cluster result containing the time series feature matrix and the time series cluster labels in a cluster result database, and displays the cluster result through a display component.
A device for time series clustering provided according to an embodiment of the present invention includes:
A receiving module, configured to receive algorithm parameters for clustering time series data;
A clustering module, configured to read time series data from a time series database and cluster the read time series data according to the received algorithm parameters, obtaining a cluster result that contains a time series feature matrix and time series cluster labels;
A saving and display module, configured to save the cluster result containing the time series feature matrix and the time series cluster labels in a cluster result database, and to display the cluster result through a display component.
An electronic device for time series clustering provided according to an embodiment of the present invention includes a processor and a memory, wherein the memory stores executable program code, and the processor reads the executable program code stored in the memory and runs the corresponding program in order to perform the following steps:
An algorithm component receives algorithm parameters, sent by a parameter component, for clustering time series data;
The algorithm component reads time series data from a time series database and clusters the read time series data according to the received algorithm parameters, obtaining a cluster result that contains a time series feature matrix and time series cluster labels;
The algorithm component saves the cluster result containing the time series feature matrix and the time series cluster labels in a cluster result database, and displays the cluster result through a display component.
According to the scheme provided by the embodiments of the present invention: (1) The parameter component notifies the algorithm component of the algorithm parameters, which makes the algorithm component more flexible. Users can select algorithm parameters suited to their problem and thereby obtain an algorithm service that solves it. (2) Besides providing the cluster result to the display component, the algorithm component can serve multiple application components in a service-oriented manner. These application components may be a front-end display interface, an anomaly detection component (which provides a suitable anomaly detection algorithm for each class of time series), and so on, which improves the reusability of the algorithm component. (3) The algorithm component uses hierarchical clustering: time series are first divided into two major classes, periodic and aperiodic, and cluster analysis is then carried out within each major class, which reduces the complexity of the cluster analysis.
Brief description of the drawings
Fig. 1 is a flowchart of a method for time series clustering provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of a device for time series clustering provided by an embodiment of the present invention;
Fig. 3 is a processing flow diagram of the time series clustering system provided by an embodiment of the present invention;
Fig. 4 is a flowchart of the clustering performed by the algorithm component provided by an embodiment of the present invention;
Fig. 5 is a flowchart of the periodicity discrimination method provided by an embodiment of the present invention;
Fig. 6 shows time series plots of each class of periodic time series provided by an embodiment of the present invention;
Fig. 7 shows time series plots of each class of aperiodic time series provided by an embodiment of the present invention.
Detailed description of the embodiments
Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be understood that the preferred embodiments described below only illustrate and explain the present invention and are not intended to limit it.
Fig. 1 is a flowchart of a method for time series clustering provided by an embodiment of the present invention. As shown in Fig. 1, the method includes:
Step S101: An algorithm component receives algorithm parameters, sent by a parameter component, for clustering time series data;
Step S102: The algorithm component reads time series data from a time series database and clusters the read time series data according to the received algorithm parameters, obtaining a cluster result that contains a time series feature matrix and time series cluster labels;
Step S103: The algorithm component saves the cluster result containing the time series feature matrix and the time series cluster labels in a cluster result database, and displays the cluster result through a display component.
Wherein, the algorithm parameters include a time series period preset value, a time series feature set and time series cluster numbers. The time series feature set includes one or more of the seasonality index, trend index, skewness, kurtosis, autocorrelation coefficient, relative entropy, sample entropy, self-similarity and Lyapunov coefficient. The time series cluster numbers include the number of periodic time series clusters and the number of aperiodic time series clusters.
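The parameter set described above can be pictured as a small validated record. The following is a minimal sketch only: the class name `AlgorithmParams`, the field names and the feature identifiers are hypothetical, and expressing the period preset value in samples is an assumption. The example values are those of the embodiment described later (a 1-day period at 96 samples per day, 7 features, k1 = 4, k2 = 5).

```python
from dataclasses import dataclass
from typing import List

# The nine features named in the text; the identifier spellings are hypothetical.
ALLOWED_FEATURES = {
    "seasonality", "trend", "skewness", "kurtosis", "autocorrelation",
    "relative_entropy", "sample_entropy", "self_similarity", "lyapunov",
}


@dataclass
class AlgorithmParams:
    period_preset: int    # pt, expressed in samples (assumption)
    features: List[str]   # selected time series feature set
    k_periodic: int       # number of periodic clusters (k1)
    k_aperiodic: int      # number of aperiodic clusters (k2)

    def __post_init__(self) -> None:
        unknown = set(self.features) - ALLOWED_FEATURES
        if not self.features or unknown:
            raise ValueError(f"invalid feature set: {sorted(unknown)}")
        if min(self.period_preset, self.k_periodic, self.k_aperiodic) <= 0:
            raise ValueError("period preset and cluster numbers must be positive")


params = AlgorithmParams(
    period_preset=96,
    features=["seasonality", "trend", "skewness", "relative_entropy",
              "sample_entropy", "self_similarity", "lyapunov"],
    k_periodic=4,
    k_aperiodic=5,
)
```

Validating at construction time means the algorithm component can reject a malformed message before it reads any data.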
Wherein, the algorithm component clustering the read time series data according to the received algorithm parameters includes: the algorithm component extracts a structural feature set of the time series data according to the time series feature set in the algorithm parameters, and determines the periodic time series data and the aperiodic time series data in the time series data according to the time series period preset value in the algorithm parameters; the algorithm component then clusters the periodic time series data and the aperiodic time series data separately, according to the extracted structural feature set and the time series cluster numbers.
Specifically, the algorithm component determining the periodic time series data and the aperiodic time series data according to the time series period preset value in the algorithm parameters includes: the algorithm component calculates the amplitude spectrum maximum, the amplitude spectrum mean and the amplitude spectrum standard deviation of the time series data, and judges whether the difference between the amplitude spectrum maximum and the amplitude spectrum mean is greater than a multiple of the amplitude spectrum standard deviation. If the difference is not greater than the multiple of the amplitude spectrum standard deviation, the algorithm component determines that the time series data is aperiodic. If the difference is greater than the multiple of the amplitude spectrum standard deviation, the algorithm component further judges whether the period corresponding to the amplitude spectrum maximum equals the time series period preset value: if it does, the algorithm component determines that the time series data is periodic; if it does not, the algorithm component determines that the time series data is aperiodic.
Specifically, the algorithm component clustering the periodic time series data and the aperiodic time series data separately, according to the extracted structural feature set and the time series cluster numbers, includes: the algorithm component clusters the periodic time series data according to the extracted structural feature set and the number of periodic time series clusters; the algorithm component clusters the aperiodic time series data according to the extracted structural feature set and the number of aperiodic time series clusters.
Fig. 2 is a schematic diagram of a device for time series clustering provided by an embodiment of the present invention. As shown in Fig. 2, the device includes: a receiving module 201, configured to receive algorithm parameters for clustering time series data; a clustering module 202, configured to read time series data from a time series database and cluster the read time series data according to the received algorithm parameters, obtaining a cluster result that contains a time series feature matrix and time series cluster labels; and a saving and display module 203, configured to save the cluster result containing the time series feature matrix and the time series cluster labels in a cluster result database, and to display the cluster result through a display component.
Wherein, the algorithm parameters include a time series period preset value, a time series feature set and time series cluster numbers. The time series feature set includes one or more of the seasonality index, trend index, skewness, kurtosis, autocorrelation coefficient, relative entropy, sample entropy, self-similarity and Lyapunov coefficient. The time series cluster numbers include the number of periodic time series clusters and the number of aperiodic time series clusters.
Wherein, the clustering module 202 includes: a processing unit, configured to extract a structural feature set of the time series data according to the time series feature set in the algorithm parameters, and to determine the periodic time series data and the aperiodic time series data in the time series data according to the time series period preset value in the algorithm parameters; and a clustering unit, configured to cluster the periodic time series data and the aperiodic time series data separately, according to the extracted structural feature set and the time series cluster numbers.
Specifically, the processing unit includes: a feature extraction subunit, configured to extract the structural feature set of the time series data according to the time series feature set in the algorithm parameters; and a period discrimination subunit, configured to calculate the amplitude spectrum maximum, the amplitude spectrum mean and the amplitude spectrum standard deviation of the time series data, and to judge whether the difference between the amplitude spectrum maximum and the amplitude spectrum mean is greater than a multiple of the amplitude spectrum standard deviation. If the difference is not greater than the multiple of the amplitude spectrum standard deviation, the subunit determines that the time series data is aperiodic. If the difference is greater, the subunit further judges whether the period corresponding to the amplitude spectrum maximum equals the time series period preset value: if it does, the time series data is determined to be periodic; if it does not, the time series data is determined to be aperiodic.
Specifically, the clustering unit includes: a first clustering subunit, configured to cluster the periodic time series data according to the extracted structural feature set and the number of periodic time series clusters; and a second clustering subunit, configured to cluster the aperiodic time series data according to the extracted structural feature set and the number of aperiodic time series clusters.
Fig. 3 is a processing flow diagram of the time series clustering system provided by an embodiment of the present invention. As shown in Fig. 3, the system includes a parameter component, an algorithm component, a time series database, a cluster result database and a display component. The flow specifically includes:
Step 301: The parameter component notifies the algorithm component of the relevant algorithm parameters by message. The algorithm parameters include the time series period preset value pt, the selected time series feature set and the time series cluster numbers.
Step 302: The algorithm component reads time series data from the time series database and completes the time series clustering according to the algorithm parameters sent by the parameter component.
Step 303: The algorithm component stores the cluster result, which includes the time series feature matrix, the time series cluster labels, etc., in the cluster result database.
Step 304: The algorithm component sends the cluster result to the display component.
Step 305: The display component presents the result to the user, including time series plots and time series labels.
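Steps 301 to 305 can be sketched as a handful of calls between in-memory stand-ins. Everything named here is hypothetical: the class `AlgorithmComponent`, its methods, the dictionaries standing in for the two databases, and the placeholder clustering function, which simply assigns every series to cluster 0 so that the message flow can be followed without a real algorithm.

```python
from typing import Callable, Dict, List

Series = List[float]
ClusterResult = Dict[str, dict]


class AlgorithmComponent:
    """Hypothetical stand-in for the algorithm component of Fig. 3."""

    def __init__(self, ts_db: Dict[str, Series],
                 result_db: Dict[str, ClusterResult],
                 cluster_fn: Callable[[Dict[str, Series], dict], ClusterResult]) -> None:
        self.ts_db = ts_db            # stands in for the time series database
        self.result_db = result_db    # stands in for the cluster result database
        self.cluster_fn = cluster_fn
        self.params: dict = {}

    def on_params(self, params: dict) -> None:
        """Step 301: receive the algorithm parameters as a message."""
        self.params = params

    def run(self, job_id: str) -> ClusterResult:
        series = dict(self.ts_db)                       # step 302: read the data
        result = self.cluster_fn(series, self.params)   # step 302: cluster it
        self.result_db[job_id] = result                 # step 303: save the result
        return result                                   # step 304: hand to display


def display(result: ClusterResult) -> List[str]:
    """Step 305: render one line per series with its cluster label."""
    return [f"{name}: cluster {label}"
            for name, label in sorted(result["labels"].items())]


def placeholder_cluster(series: Dict[str, Series], params: dict) -> ClusterResult:
    # Placeholder only: one mean-value feature, every series in cluster 0.
    return {
        "features": {name: [sum(v) / len(v)] for name, v in series.items()},
        "labels": {name: 0 for name in series},
    }


ts_db = {"port-1": [1.0, 2.0, 3.0], "port-2": [4.0, 4.0, 4.0]}
result_db: Dict[str, ClusterResult] = {}
algo = AlgorithmComponent(ts_db, result_db, placeholder_cluster)
algo.on_params({"pt": 96, "k1": 4, "k2": 5})
lines = display(algo.run("job-1"))
```

Passing the clustering function in as an argument keeps the flow independent of the algorithm, which mirrors the separation between parameter, algorithm and display components in the text.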
Specifically, in step 302 the algorithm component first divides the time series data into two major classes, periodic time series data and aperiodic time series data, and then carries out cluster analysis within each major class, so as to reduce the complexity of the cluster analysis.
Fig. 4 is a flowchart of the clustering performed by the algorithm component provided by an embodiment of the present invention. As shown in Fig. 4, the flow includes:
Step 401: Read time series data from the time series database and preprocess it, including filling in missing values, removing noise, etc.
Step 402: Extract the structural features of the time series data.
The structural features of the time series data are extracted according to the time series feature set passed in by the parameter component. The time series feature set consists of one or more of the following features: seasonality index, trend index, skewness, kurtosis, autocorrelation coefficient, relative entropy, sample entropy, self-similarity, Lyapunov coefficient.
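As an illustration of step 402, the sketch below computes three of the listed features (skewness, kurtosis and the autocorrelation coefficient) and assembles one feature row per series. The function names and the registry `FEATURES` are hypothetical, the definitions are the standard moment-based ones rather than anything specified by the source, and the lag of 1 for the autocorrelation is an assumption.

```python
from typing import Callable, Dict, List

Series = List[float]


def _central_moment(x: Series, order: int) -> float:
    mean = sum(x) / len(x)
    return sum((v - mean) ** order for v in x) / len(x)


def skewness(x: Series) -> float:
    """Third standardized moment; 0.0 for a constant series."""
    m2 = _central_moment(x, 2)
    return _central_moment(x, 3) / m2 ** 1.5 if m2 > 0 else 0.0


def kurtosis(x: Series) -> float:
    """Fourth standardized moment (not excess kurtosis); 0.0 for a constant series."""
    m2 = _central_moment(x, 2)
    return _central_moment(x, 4) / m2 ** 2 if m2 > 0 else 0.0


def autocorrelation(x: Series, lag: int = 1) -> float:
    """Sample autocorrelation coefficient at the given lag (lag 1 is an assumption)."""
    mean = sum(x) / len(x)
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(len(x) - lag))
    return num / denom if denom > 0 else 0.0


# Hypothetical registry mapping feature names to extractor functions.
FEATURES: Dict[str, Callable[[Series], float]] = {
    "skewness": skewness,
    "kurtosis": kurtosis,
    "autocorrelation": autocorrelation,
}


def feature_matrix(series: Dict[str, Series], names: List[str]) -> Dict[str, List[float]]:
    """One row of structural features per time series, in the order of `names`."""
    return {key: [FEATURES[n](x) for n in names] for key, x in series.items()}


rows = feature_matrix({"ramp": [1.0, 2.0, 3.0, 4.0, 5.0]},
                      ["skewness", "kurtosis", "autocorrelation"])
```

The remaining features in the list (for example sample entropy or the Lyapunov coefficient) would be added to the registry in the same way.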
Step 403: Perform periodicity discrimination on the time series data, first dividing it into two major classes: periodic and aperiodic.
As shown in Fig. 5, the periodicity discrimination method is:
1) Apply a Fourier transform (FFT) to the time series; let the length of the FFT be fft_size.
2) Take the magnitude of the FFT coefficients to obtain the corresponding amplitude spectrum.
3) Apply a 20-times logarithmic transform to the amplitude spectrum, smoothing the spectrum.
4) Over the index range [1, fft_size/4] of the transformed spectrum, find the maximum MAX, the mean m and the standard deviation std. Denote the index corresponding to MAX as Id.
5) If MAX > m + 3.2*std, go to step 6); otherwise the time series is judged to be aperiodic.
6) Calculate the period corresponding to the maximum spectrum point as p = fft_size/(Id*fs) * fs = fft_size/Id, where fs is the sampling frequency. If the period equals the preset value pt, the time series is judged to be periodic and its period p is returned; otherwise it is judged to be aperiodic.
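The six steps above can be sketched in plain Python as follows. This is a sketch under stated assumptions, not the patented implementation: it uses a naive O(n^2) discrete Fourier transform in place of an FFT (the coefficient magnitudes are the same), assumes a base-10 logarithm for the 20-times logarithmic transform, adds a small `eps` to avoid taking the logarithm of zero, and takes fft_size equal to the series length. The names `amplitude_spectrum`, `detect_period`, `factor`, `eps` and `tolerance` are hypothetical.

```python
import cmath
import math
from typing import List, Optional, Tuple


def amplitude_spectrum(x: List[float]) -> List[float]:
    """Magnitudes of the DFT coefficients (naive DFT standing in for an FFT)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n)]


def detect_period(x: List[float], preset_period: float, factor: float = 3.2,
                  eps: float = 1e-12, tolerance: float = 0.0
                  ) -> Tuple[bool, Optional[float]]:
    """Return (is_periodic, period in samples) following steps 1) to 6)."""
    fft_size = len(x)                      # assumption: transform length = series length
    if fft_size < 8:
        raise ValueError("series too short for the [1, fft_size/4] search band")
    amp = amplitude_spectrum(x)                              # steps 1) and 2)
    log_amp = [20.0 * math.log10(a + eps) for a in amp]      # step 3), base 10 assumed
    lo, hi = 1, fft_size // 4
    band = log_amp[lo:hi + 1]                                # step 4)
    max_val = max(band)
    peak_index = lo + band.index(max_val)                    # Id
    mean = sum(band) / len(band)
    std = math.sqrt(sum((v - mean) ** 2 for v in band) / len(band))
    if not max_val > mean + factor * std:                    # step 5)
        return False, None
    period = fft_size / peak_index                           # step 6): p = fft_size / Id
    if abs(period - preset_period) <= tolerance:
        return True, period
    return False, None


# A sine wave with a 16-sample period over 128 samples, and a linear ramp.
wave = [math.sin(2 * math.pi * t / 16) for t in range(128)]
ramp = [float(t) for t in range(128)]
wave_result = detect_period(wave, preset_period=16)
ramp_result = detect_period(ramp, preset_period=16)
```

The `tolerance` parameter defaults to exact equality, as in the text; a small positive value may be useful when the series length is not an exact multiple of the period.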
Step 404: Cluster the time series within each major class using a clustering algorithm; clustering is based on the extracted structural features.
The number of clusters is determined by the time series cluster numbers passed in by the parameter component, i.e. the number of periodic time series clusters k1 and the number of aperiodic time series clusters k2.
Step 405: According to the cluster result, output the class to which each time series belongs.
Besides sending the cluster result to the display component, the algorithm component can also send the result to further application components. For example, the cluster result can be sent to an anomaly detection component, which applies the anomaly detection algorithm that corresponds to the class each time series belongs to.
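One way an anomaly detection component might consume the cluster labels is a lookup table from class label to detector. The sketch below is purely illustrative: the two detectors, their thresholds, the label format `(major class, subclass index)` and the registry `DETECTORS` are all hypothetical and are not described in the source.

```python
from typing import Callable, Dict, List, Tuple

Label = Tuple[str, int]                        # (major class, subclass index)
Detector = Callable[[List[float]], List[int]]  # returns indices of anomalous points


def threshold_detector(limit: float) -> Detector:
    """Flag points whose absolute value exceeds a fixed limit."""
    def detect(x: List[float]) -> List[int]:
        return [i for i, v in enumerate(x) if abs(v) > limit]
    return detect


def jump_detector(max_step: float) -> Detector:
    """Flag points that differ from their predecessor by more than max_step."""
    def detect(x: List[float]) -> List[int]:
        return [i for i in range(1, len(x)) if abs(x[i] - x[i - 1]) > max_step]
    return detect


# Hypothetical mapping from cluster label to the detector used for that class.
DETECTORS: Dict[Label, Detector] = {
    ("periodic", 0): jump_detector(5.0),
    ("aperiodic", 0): threshold_detector(100.0),
}
DEFAULT_DETECTOR = threshold_detector(1000.0)


def detect_anomalies(series: List[float], label: Label) -> List[int]:
    """Dispatch to the detector registered for the series' cluster label."""
    return DETECTORS.get(label, DEFAULT_DETECTOR)(series)


jumps = detect_anomalies([1.0, 2.0, 20.0, 3.0], ("periodic", 0))
spikes = detect_anomalies([1.0, 200.0, 3.0], ("aperiodic", 0))
```

Because the dispatch only depends on the label, new classes can be supported by registering another detector without touching the algorithm component.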
Embodiment:
In this embodiment, the time series data stored in the time series database of Fig. 3 consists of traffic data for 407 network ports. The collection period is 2 weeks and the collection granularity is 15 minutes (i.e. 96 points are collected per day).
The time series period preset value pt that the parameter component passes to the algorithm component in Fig. 3 is 1 day, i.e. the periodicity discrimination algorithm judges whether a time series has a daily period. The feature set passed in is: seasonality index, trend index, skewness, relative entropy, sample entropy, self-similarity and Lyapunov coefficient, so the algorithm component extracts 7 structural features. The number of periodic time series clusters passed in is k1 = 4 and the number of aperiodic time series clusters is k2 = 5, so the periodic time series are divided into 4 classes and the aperiodic time series into 5 classes.
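The figures in this embodiment can be tied to the periodicity formula p = fft_size/Id with a little arithmetic. The sketch below assumes that the transform length equals the series length and that pt is expressed in samples; neither is stated in the source, and the variable names are hypothetical.

```python
# 15-minute granularity gives 96 samples per day, as stated in the text.
SAMPLES_PER_DAY = 24 * 60 // 15
N_DAYS = 14                               # 2 weeks of data

# Assumption: the FFT length equals the number of samples in the series.
fft_size = SAMPLES_PER_DAY * N_DAYS       # 1344

# Assumption: the preset period pt = 1 day is expressed in samples.
pt = SAMPLES_PER_DAY                      # 96

# With p = fft_size / Id, a daily period corresponds to this spectrum index.
expected_peak_index = fft_size // pt      # 14

# Index range searched for the spectrum maximum: [1, fft_size/4].
search_band = (1, fft_size // 4)          # (1, 336)
```

Under these assumptions a series is judged periodic only when its spectral peak falls at index 14, which lies well inside the searched band.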
The basic process of the algorithm component used in this embodiment includes:
Step 1: Read the time series data and preprocess it, i.e. fill in missing values by linear interpolation and remove noise.
Step 2: Extract 7 structural features of the time series data: seasonality index, trend index, skewness, relative entropy, sample entropy, self-similarity and Lyapunov coefficient.
Step 3: Perform periodicity discrimination on the time series data, first dividing it into two major classes: periodic and aperiodic.
Step 4: Cluster the time series within each major class using the K-means clustering algorithm; the periodic data is clustered into 4 classes and the aperiodic data into 5 classes.
Step 5: According to the cluster result, output the class to which each time series belongs.
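Step 4 can be sketched as a basic K-means run once per major class. This is a minimal Lloyd-style implementation with seeded random initialization, offered as an illustration rather than the implementation used in the embodiment; the function names, the `seed` and `iters` parameters, and the label format `(major class, subclass index)` are hypothetical.

```python
import random
from typing import Dict, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def _dist2(a: Vector, b: Vector) -> float:
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans(points: List[Vector], k: int, iters: int = 100, seed: int = 0) -> List[int]:
    """Basic K-means; returns a cluster index for each point."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels: Optional[List[int]] = None
    for _ in range(max(1, iters)):
        new_labels = [min(range(k), key=lambda c: _dist2(p, centers[c]))
                      for p in points]
        if new_labels == labels:      # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:               # keep the old center if a cluster empties
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels or []


def cluster_two_level(features: Dict[str, Vector], is_periodic: Dict[str, bool],
                      k1: int, k2: int) -> Dict[str, Tuple[str, int]]:
    """Cluster periodic and aperiodic series separately, with k1 and k2 clusters."""
    result: Dict[str, Tuple[str, int]] = {}
    for major, flag, k in (("periodic", True, k1), ("aperiodic", False, k2)):
        names = sorted(n for n in features if is_periodic[n] == flag)
        if not names:
            continue
        sub = kmeans([features[n] for n in names], min(k, len(names)))
        result.update({n: (major, lab) for n, lab in zip(names, sub)})
    return result


features = {"a": [0.0, 0.0], "b": [0.0, 0.1], "c": [10.0, 10.0], "d": [10.0, 10.1],
            "e": [5.0, 5.0], "f": [5.0, 5.1], "g": [-5.0, -5.0]}
is_periodic = {"a": True, "b": True, "c": True, "d": True,
               "e": False, "f": False, "g": False}
labels = cluster_two_level(features, is_periodic, k1=2, k2=2)
```

In the embodiment the call would use k1 = 4 and k2 = 5 on the 7-dimensional feature rows; the toy data here only demonstrates that the two major classes are clustered independently.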
The periodicity discrimination method of step 3 proceeds as follows:
1) Apply a Fourier transform (FFT) to the time series; the length of the FFT is fft_size.
2) Take the magnitude of the FFT coefficients to obtain the corresponding amplitude spectrum.
3) Apply a 20-times logarithmic transform to the amplitude spectrum, smoothing the spectrum.
4) Over the index range [1, fft_size/4] of the transformed spectrum, find the maximum MAX, the mean m and the standard deviation std. Denote the index corresponding to MAX as Id.
5) If MAX > m + 3.2*std, go to step 6); otherwise the time series is judged to be aperiodic.
6) Calculate the period corresponding to the maximum spectrum point as p = fft_size/(Id*fs) * fs = fft_size/Id, where fs is the sampling frequency. If the period equals the preset value pt, the time series is judged to be periodic and its period p is returned; otherwise it is judged to be aperiodic.
Using the periodicity discrimination algorithm, 165 of the 407 time series are classified as periodic data and 242 as aperiodic data. Cluster analysis is then carried out on the data of each major class using the k-means algorithm. Fig. 6 shows typical waveforms of each class of periodic time series; Fig. 7 shows typical waveforms of each class of aperiodic time series. The number of time series in each subclass is summarized in Table 1.
Table 1: Summary of the number of time series in each subclass
After clustering is complete, the cluster result (the feature vector and class label of each time series) is stored in the cluster result database. In this embodiment the display component performs the front-end display function and presents the cluster result to the user, i.e. it displays the waveforms of each class of time series as in Fig. 6 and Fig. 7.
In other embodiments, different implementation details may be chosen to realize the technical solution in the summary of the invention. For example, the period preset value pt passed in by application component 1 is 1 week; the passed-in number of periodic clusters k1 is 2 and the number of aperiodic clusters is 4; the selected time series feature set is the seasonality index, trend index, autocorrelation coefficient, relative entropy and sample entropy; and other application components are added that call the cluster result of the algorithm component.
An electronic device for time series clustering provided according to an embodiment of the present invention includes a processor and a memory, wherein the memory stores executable program code, and the processor reads the executable program code stored in the memory and runs the corresponding program in order to perform the following steps:
An algorithm component receives algorithm parameters, sent by a parameter component, for clustering time series data;
The algorithm component reads time series data from a time series database and clusters the read time series data according to the received algorithm parameters, obtaining a cluster result that contains a time series feature matrix and time series cluster labels;
The algorithm component saves the cluster result containing the time series feature matrix and the time series cluster labels in a cluster result database, and displays the cluster result through a display component.
A computer storage medium provided according to an embodiment of the present invention stores a time series clustering program which, when executed by a processor, performs the following steps:
An algorithm component receives algorithm parameters, sent by a parameter component, for clustering time series data;
The algorithm component reads time series data from a time series database and clusters the read time series data according to the received algorithm parameters, obtaining a cluster result that contains a time series feature matrix and time series cluster labels;
The algorithm component saves the cluster result containing the time series feature matrix and the time series cluster labels in a cluster result database, and displays the cluster result through a display component.
According to the scheme provided by the embodiments of the present invention, the parameter component passes the relevant parameters to the algorithm component, which allows the time series clustering algorithm to be deployed flexibly. The algorithm component uses hierarchical clustering, which reduces the complexity of the cluster analysis. The algorithm component provides the cluster result to other application components as a service, which improves its reusability.
Although the present invention has been described in detail above, it is not limited thereto, and those skilled in the art can make various modifications according to the principles of the present invention. Therefore, all modifications made according to the principles of the present invention should be understood as falling within the protection scope of the present invention.

Claims (10)

1. a kind of method of Time Series Clustering, comprising:
Algorithm assembly receives the algorithm parameter for being clustered to time series data that parameter component is sent;
Algorithm assembly reads time series data from time series database, according to the received algorithm parameter of institute to read time series data Clustering processing is carried out, the cluster result comprising temporal aspect matrix and Time Series Clustering label is obtained;
The cluster result comprising temporal aspect matrix and Time Series Clustering label is saved in cluster result data by algorithm assembly In library, and the cluster result is shown by display component.
2. according to the method described in claim 1, the algorithm parameter includes timing cycles preset value, temporal aspect collection with timely Sequence clusters number;The temporal aspect collection includes seasonal indicator, tendency index, the degree of bias, kurtosis, auto-correlation coefficient, opposite One or more of entropy, Sample Entropy, self-similarity and Liapunov coefficient;When the Time Series Clustering number includes the period Sequence clusters number and aperiodic Time Series Clustering number.
3. according to the method described in claim 2, the algorithm assembly is according to the received algorithm parameter of institute to read timing Data carry out clustering processing
Algorithm assembly extracts the structure characteristic collection of the time series data according to the temporal aspect collection in the algorithm parameter, and According to the timing cycles preset value in the algorithm parameter, period time series data in the time series data and aperiodic is determined Time series data;
Algorithm assembly according to taken out time series data structure characteristic collection and the Time Series Clustering number, respectively to institute the period when Ordinal number evidence and aperiodic time series data carry out clustering processing.
4. The method according to claim 3, wherein the algorithm assembly determining the periodic time series data and the aperiodic time series data in the time series data according to the preset time series period in the algorithm parameters includes:
The algorithm assembly calculates the amplitude spectrum maximum, the amplitude spectrum mean, and the amplitude spectrum standard deviation of the time series data, and judges whether the difference between the amplitude spectrum maximum and the amplitude spectrum mean is greater than a multiple of the amplitude spectrum standard deviation;
If the difference between the amplitude spectrum maximum and the amplitude spectrum mean is not greater than the multiple of the amplitude spectrum standard deviation, the algorithm assembly determines that the time series data is aperiodic time series data;
If the difference between the amplitude spectrum maximum and the amplitude spectrum mean is greater than the multiple of the amplitude spectrum standard deviation, the algorithm assembly further judges whether the time series period corresponding to the amplitude spectrum maximum equals the preset time series period;
If the time series period corresponding to the amplitude spectrum maximum equals the preset time series period, the algorithm assembly determines that the time series data is periodic time series data;
If the time series period corresponding to the amplitude spectrum maximum does not equal the preset time series period, the algorithm assembly determines that the time series data is aperiodic time series data.
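The two-stage periodicity test of claim 4 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the amplitude spectrum is taken from a plain discrete Fourier transform of the mean-removed series, the multiple defaults to 3.0 (the claim does not give a value), and the period at the peak is computed as the series length divided by the peak frequency index, rounded to an integer. The names `amplitude_spectrum` and `is_periodic` are hypothetical.

```python
import cmath
import math
import statistics


def amplitude_spectrum(xs):
    """DFT amplitudes for frequency bins 1 .. n//2 of the mean-removed series."""
    n = len(xs)
    mean = sum(xs) / n
    centered = [x - mean for x in xs]
    amps = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            c * cmath.exp(-2j * math.pi * k * t / n)
            for t, c in enumerate(centered)
        )
        amps.append(abs(coeff))
    return amps


def is_periodic(xs, preset_period, multiple=3.0):
    """Return True only if both stages of the test pass."""
    amps = amplitude_spectrum(xs)
    peak = max(amps)
    mean_amp = sum(amps) / len(amps)
    std_amp = statistics.pstdev(amps)
    # Stage 1: the peak must exceed the mean by more than multiple * std.
    if peak - mean_amp <= multiple * std_amp:
        return False
    # Stage 2: the period at the peak must equal the preset period.
    k = amps.index(peak) + 1  # bins start at frequency index 1
    period = round(len(xs) / k)  # assumption: period in samples, rounded
    return period == preset_period


# A sine wave with a period of 8 samples, observed for 64 samples.
sine = [math.sin(2 * math.pi * t / 8) for t in range(64)]
matches_preset = is_periodic(sine, preset_period=8)
wrong_preset = is_periodic(sine, preset_period=16)
flat = is_periodic([5.0] * 64, preset_period=8)
```

Under this test, a series with a strong cycle whose period differs from the preset is still treated as aperiodic, which follows the last branch of the claim.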
5. The method according to claim 3, wherein the algorithm assembly clustering the periodic time series data and the aperiodic time series data separately, according to the extracted structural feature set of the time series data and the number of time series clusters, includes:
The algorithm assembly clusters the periodic time series data into the number of periodic time series clusters, according to the extracted structural feature set of the time series data;
The algorithm assembly clusters the aperiodic time series data into the number of aperiodic time series clusters, according to the extracted structural feature set of the time series data.
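Claim 5 clusters the periodic and aperiodic groups independently, each with its own cluster count. The claims do not name a clustering algorithm, so the sketch below substitutes a plain k-means with a deterministic initialization purely for illustration. The names `kmeans` and `cluster_by_group` and the sample feature values are hypothetical.

```python
def kmeans(rows, k, iters=20):
    """Plain k-means on feature rows, initialized from evenly spaced rows."""
    if not 1 <= k <= len(rows):
        raise ValueError("k must be between 1 and the number of rows")
    step = len(rows) // k
    centers = [list(rows[i * step]) for i in range(k)]
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centers[c])),
            )
            for row in rows
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_by_group(features, periodic_flags, k_periodic, k_aperiodic):
    """Cluster each group on its own; returns {series_id: (group, label)}."""
    result = {}
    groups = (("periodic", True, k_periodic), ("aperiodic", False, k_aperiodic))
    for group, flag, k in groups:
        ids = [sid for sid in features if periodic_flags[sid] == flag]
        if not ids:
            continue
        labels = kmeans([features[sid] for sid in ids], k)
        for sid, label in zip(ids, labels):
            result[sid] = (group, label)
    return result


features = {
    "a": [0.0, 0.1], "b": [0.1, 0.0], "c": [5.0, 5.1], "d": [5.1, 5.0],
    "e": [1.0, 1.0], "f": [1.1, 0.9],
}
periodic_flags = {"a": True, "b": True, "c": True, "d": True, "e": False, "f": False}
result = cluster_by_group(features, periodic_flags, k_periodic=2, k_aperiodic=1)
```

Because labels are assigned per group, the pair (group, label) rather than the label alone identifies a cluster in the combined result.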
6. A device for time series clustering, comprising:
A receiving module, configured to receive algorithm parameters for clustering time series data;
A cluster module, configured to read time series data from a time series database and cluster the read time series data according to the received algorithm parameters, obtaining a cluster result that comprises a time series feature matrix and time series cluster labels;
A saving and display module, configured to save the cluster result comprising the time series feature matrix and the time series cluster labels in a cluster result database, and to display the cluster result through a display component.
7. The device according to claim 6, wherein the algorithm parameters include a preset time series period, a time series feature set, and a number of time series clusters; the time series feature set includes one or more of a seasonality index, a trend index, skewness, kurtosis, an autocorrelation coefficient, relative entropy, sample entropy, self-similarity, and a Lyapunov coefficient; and the number of time series clusters includes a number of periodic time series clusters and a number of aperiodic time series clusters.
8. The device according to claim 7, wherein the cluster module includes:
A processing unit, configured to extract a structural feature set of the time series data according to the time series feature set in the algorithm parameters, and to determine the periodic time series data and the aperiodic time series data in the time series data according to the preset time series period in the algorithm parameters;
A cluster unit, configured to cluster the periodic time series data and the aperiodic time series data separately, according to the extracted structural feature set of the time series data and the number of time series clusters.
9. The device according to claim 8, wherein the processing unit includes:
A feature extraction subunit, configured to extract the structural feature set of the time series data according to the time series feature set in the algorithm parameters;
A period discrimination subunit, configured to calculate the amplitude spectrum maximum, the amplitude spectrum mean, and the amplitude spectrum standard deviation of the time series data, and to judge whether the difference between the amplitude spectrum maximum and the amplitude spectrum mean is greater than a multiple of the amplitude spectrum standard deviation; if the difference is not greater than the multiple of the amplitude spectrum standard deviation, to determine that the time series data is aperiodic time series data; if the difference is greater than the multiple of the amplitude spectrum standard deviation, to further judge whether the time series period corresponding to the amplitude spectrum maximum equals the preset time series period; if the time series period corresponding to the amplitude spectrum maximum equals the preset time series period, to determine that the time series data is periodic time series data; and if the time series period corresponding to the amplitude spectrum maximum does not equal the preset time series period, to determine that the time series data is aperiodic time series data.
10. An electronic device for time series clustering, comprising a processor and a memory, wherein the memory is configured to store executable program code, and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform the following steps:
The algorithm assembly receives algorithm parameters, sent by a parameter component, for clustering time series data;
The algorithm assembly reads time series data from a time series database and clusters the read time series data according to the received algorithm parameters, obtaining a cluster result that comprises a time series feature matrix and time series cluster labels;
The algorithm assembly saves the cluster result comprising the time series feature matrix and the time series cluster labels in a cluster result database, and the cluster result is displayed by a display component.
CN201710817446.7A 2017-09-12 2017-09-12 A kind of method and device of Time Series Clustering Pending CN110019543A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710817446.7A CN110019543A (en) 2017-09-12 2017-09-12 A kind of method and device of Time Series Clustering

Publications (1)

Publication Number Publication Date
CN110019543A (en) 2019-07-16

Family

ID=67186268

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN201710817446.7A Pending CN110019543A (en) 2017-09-12 2017-09-12 A kind of method and device of Time Series Clustering

Country Status (1)

Country Link
CN (1) CN110019543A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140149412A1 (en) * 2012-11-26 2014-05-29 Ricoh Company, Ltd. Information processing apparatus, clustering method, and recording medium storing clustering program
CN105205112A (en) * 2015-09-01 2015-12-30 西安交通大学 System and method for excavating abnormal features of time series data
CN105608758A (en) * 2015-12-17 2016-05-25 山东鲁能软件技术有限公司 Big data analysis platform apparatus and method based on algorithm configuration and distributed stream computing

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113887812A (en) * 2021-10-14 2022-01-04 广东电网有限责任公司 Clustering-based small sample load prediction method, device, equipment and storage medium
CN113887812B (en) * 2021-10-14 2023-07-07 广东电网有限责任公司 Clustering-based small sample load prediction method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110851338B (en) Abnormality detection method, electronic device, and storage medium
CN106383766B (en) System monitoring method and apparatus
CN107707376B (en) A kind of method and system of monitoring and alarm
CN111475804A (en) Alarm prediction method and system
CN110096410A (en) Alarm information processing method, system, computer installation and readable storage medium storing program for executing
CN106888194A (en) Intelligent grid IT assets security monitoring systems based on distributed scheduling
CN108427725A (en) Data processing method, device and system
CN110058977A (en) Monitor control index method for detecting abnormality, device and equipment based on Stream Processing
CN109684052B (en) Transaction analysis method, device, equipment and storage medium
CN107040608A (en) A kind of data processing method and system
CN109389518A (en) Association analysis method and device
CN107391571A (en) The processing method and processing device of sensing data
CN110414778A (en) Case work dispatching method and device
CN109960839B (en) Service link discovery method and system of service support system based on machine learning
CN115563180A (en) Dynamic threshold generation method, device, equipment and storage medium
CN111147306B (en) Fault analysis method and device of Internet of things equipment and Internet of things platform
CN111339052A (en) Unstructured log data processing method and device
CN113342939B (en) Data quality monitoring method and device and related equipment
CN110019543A (en) A kind of method and device of Time Series Clustering
CN114500543A (en) Distributed elastic edge acquisition system and application method thereof
CN109818808A (en) Method for diagnosing faults, device and electronic equipment
CN107065605B (en) A kind of fault diagnosis and alarm method
CN109375146A (en) A kind of filling mining method, system and the terminal device of electricity consumption data
CN106130929B (en) The service message automatic processing method and system of internet insurance field based on graph-theoretical algorithm
CN106506282A (en) A kind of monitoring method for improving cloud platform monitoring performance and scale

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190716