CN111488924A - Multivariate time sequence data clustering method - Google Patents

Multivariate time sequence data clustering method

Info

Publication number
CN111488924A
CN111488924A · CN202010265442.4A
Authority
CN
China
Prior art keywords
data
clustering
value
model
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010265442.4A
Other languages
Chinese (zh)
Other versions
CN111488924B (en)
Inventor
王婷
崔运鹏
刘娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Agricultural Information Institute of CAAS
Original Assignee
Agricultural Information Institute of CAAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agricultural Information Institute of CAAS filed Critical Agricultural Information Institute of CAAS
Priority to CN202010265442.4A priority Critical patent/CN111488924B/en
Publication of CN111488924A publication Critical patent/CN111488924A/en
Application granted granted Critical
Publication of CN111488924B publication Critical patent/CN111488924B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a multivariate time series data clustering method, which comprises: carrying out normalization preprocessing on multivariate time series data; constructing a sparse self-encoder as a deep learning unsupervised learning model and performing feature extraction on the multivariate time series data to construct new feature sequences; acquiring the clustering K value of the new feature sequences of the sample data; calculating the distance between new feature sequences of different samples based on the Euclidean distance; clustering the set of new feature sequences of the sample data; and analyzing potential patterns of the multivariate time series data according to the clustering result. By combining the sparse self-encoder model with the clustering method, the invention improves the efficiency of processing large-scale data: the sparse self-encoder model improves the performance of extracting new feature sequences from multivariate time series data, and a multivariate distance calculation model constructed from the Euclidean distance realizes the clustering of the multivariate time series data.

Description

Multivariate time sequence data clustering method
Technical Field
The invention relates to the field of data clustering, in particular to a multivariate time series data clustering method.
Background
With the rapid development of the Internet of Things, research based on time series data is widely applied in multiple fields such as finance and medical treatment. Clustering is an effective method for analyzing time series data: by mining the potential patterns of a time series, its characteristics can be analyzed, and application problems of time series data can be further studied on this basis.
At present, time series clustering methods mainly include the following: (1) Partition-based time series data clustering. The number of categories and the initial cluster center points are first determined, and sample points are then assigned to different categories by calculating the distance between each sample point and the cluster center points until convergence. (2) Density-based time series data clustering. The category radius and the number of samples within a category are first determined, and clustering proceeds until the density of neighboring regions exceeds a set threshold. (3) Hierarchy-based time series data clustering. This can be performed top-down or bottom-up: the former takes all samples as the root node and recursively splits it until single-sample classes appear; the latter starts from single samples and merges them until a stopping condition is met. These methods usually cannot accurately and comprehensively mine the inherent characteristics of time series data, and research on time series data remains relatively limited, especially for mining and analyzing the potential patterns of multivariate time series data. A multivariate time series data clustering method is therefore developed here.
Disclosure of Invention
The invention aims to provide a multivariate time series clustering method which combines an unsupervised learning model, the sparse self-encoder, with the traditional Kmeans clustering method. A sparse self-encoder model is constructed with the deep learning LSTM model as its basic unit; a new feature sequence set is extracted from each univariate time series through the sparse self-encoder model; a multivariate distance calculation method is constructed from the Euclidean distance to calculate the distances between the multivariate time series data of different samples; and the new feature sequences of all samples are then clustered with the Kmeans clustering method, so that the potential patterns of the multivariate time series data can be effectively mined on the basis of the clustering result.
In order to achieve the purpose, the invention adopts the following technical scheme:
the invention comprises the following steps:
s10, preprocessing the multivariate time sequence data, wherein the preprocessing comprises the steps of carrying out validation and normalization operation on the data;
s20, constructing a sparse self-encoder of a deep learning unsupervised learning model by taking a deep learning L STM model as a basic unit, and performing feature extraction on multivariate time series data to construct a new feature sequence;
s30, acquiring a cluster K value of the new characteristic sequence of the sample data;
s40, calculating the distance between new feature sequences of different sample data based on Euclidean distance;
s50, clustering the new characteristic sequence set of the sample data;
s60, analyzing the potential mode of the multivariate time sequence data according to the clustering result, averaging all the sample point data in each category to obtain the average value of each time point in the multivariate time sequence data, and acquiring a new multivariate average time sequence with the category as the unit.
Further, the self-encoder model comprises an encoder and a decoder: new feature values of the data are extracted by encoding the input, an output is then obtained by decoding, and making the output equal to the input is taken as the model optimization target. The sparse self-encoder adds a sparse term to the optimization function to constrain the model parameters during training, and the training process is as follows:
(1) taking the data points of the time series one by one as the input of the LSTM units in the encoder, and taking the sequence obtained after the last data point has been input as the new feature sequence of the sample data;
(2) taking the new feature sequence of the sample data as the input of the decoder, and training the sparse self-encoder model with a mean square error function plus a sparse term as the optimization function. The optimization function is calculated as follows:
J_sparse = (1/m) * Σ_{i=1..m} ||x_i − x̂_i||² + β * Σ_{j=1..s} KL(ρ || ρ̂_j)
KL(ρ || ρ̂_j) = ρ·log(ρ/ρ̂_j) + (1−ρ)·log((1−ρ)/(1−ρ̂_j))
where m is the number of samples, x_i and x̂_i are the input and reconstructed sequences, ρ is the target sparsity, ρ̂_j is the average activation of hidden unit j, and β weights the sparse term.
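As a concrete illustration of this optimization function, the following is a minimal NumPy sketch of a mean-square-error loss with a KL-divergence sparse penalty on the average hidden activation; the function name, the choice of KL penalty, and the default values of ρ and β are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def sparse_autoencoder_loss(x, x_hat, hidden, rho=0.05, beta=1.0):
    """Mean squared reconstruction error plus a KL-divergence sparsity penalty.

    x, x_hat : arrays of shape (n_samples, n_features), input and reconstruction
    hidden   : array of shape (n_samples, n_hidden), encoder activations in (0, 1)
    rho      : target average activation (sparsity level); illustrative default
    beta     : weight of the sparse term; illustrative default
    """
    mse = np.mean(np.sum((x - x_hat) ** 2, axis=1))
    rho_hat = np.clip(hidden.mean(axis=0), 1e-8, 1 - 1e-8)  # average activation per hidden unit
    kl = np.sum(rho * np.log(rho / rho_hat)
                + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    return mse + beta * kl
```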
Further, the clustering K value of the multivariate time series data is obtained through the following steps:
(1) selecting a specific K value within the value range (1–100), randomly generating, according to a uniform distribution, the same number of samples as the initial samples in the specific three-dimensional region where the samples are located, and clustering with the Kmeans method to obtain W_K, calculated as follows:
W_K = Σ_{r=1..K} (1/(2·n_r)) · D_r,  with  D_r = Σ_{i,i'∈C_r} d(i, i')
where C_r is the r-th cluster, n_r is the number of samples in C_r, and d(i, i') is the distance between samples i and i';
(2) obtaining s_K by repeating step (1) on the uniformly generated reference samples, where n takes the value 100, according to the following formulas:
Gap(K) = (1/n) · Σ_{b=1..n} log(W_{Kb}) − log(W_K)
s_K = sd_K · sqrt(1 + 1/n),  with  sd_K = sqrt( (1/n) · Σ_{b=1..n} ( log(W_{Kb}) − (1/n)·Σ_{b=1..n} log(W_{Kb}) )² )
where W_{Kb} denotes the W_K value obtained on the b-th of the n sets of reference samples;
(3) performing steps (1) and (2) for all K values in the value range, and selecting as the optimal number of clusters the K value at which W_K drops the fastest, according to the following criterion:
K* = min{ K : Gap(K) ≥ Gap(K+1) − s_{K+1} }
Further, the validation comprises deleting data whose proportion of missing values is larger than 80%, and the normalization comprises mapping the data of the different medication types of each case to the interval (0, 1) with the min–max normalization method, according to the following formula:
x' = (x − x_min) / (x_max − x_min)
where x_min and x_max are the minimum and maximum values of the corresponding variable.
Further, the new feature sequences of all sample data are clustered through the following steps:
(1) firstly, randomly dividing all sample points into K categories according to K values;
(2) calculating a new category center point for each category, and clustering all sample points again according to the distance between each sample point and each category center point;
(3) step (2) is repeated until the cluster assignments no longer change.
The effectiveness of the method provided by the invention is evaluated by taking the silhouette coefficient SC as the evaluation standard of the clustering performance on the multivariate time series data.
SC(i) = ( b(i) − a(i) ) / max{ a(i), b(i) }
Where a (i) represents the average distance of sample i to other samples in the same cluster, and b (i) represents the average distance of sample i to all samples in other clusters.
Compared with the prior art, the invention has the beneficial effects that:
the multivariate time sequence data clustering method provided by the invention fully utilizes the high-performance feature extraction of the unsupervised learning model sparse self-encoder on large-scale data in deep learning and the excellent sequence memory performance of the L STM model on time sequence data in deep learning by combining a new deep learning method and a traditional clustering method, and effectively solves the problem that the traditional Kmeans clustering method cannot well process the large-scale data, thereby better mining and analyzing the potential pattern of the multivariate time sequence data.
Drawings
FIG. 1 is a flow chart of a multivariate time series data clustering method;
FIG. 2 is a block diagram of a sparse self-encoder model;
FIG. 3 is a result of performance evaluation of a multivariate time series data clustering method;
FIG. 4 is a potential pattern mining result of multivariate time series data.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments.
As shown in FIG. 1, this embodiment applies the multivariate time series data clustering method to multivariate time series data acquired by a medical institution, namely the medication data of patients for multiple drug types. The data are first preprocessed; new feature sequences are then constructed through feature extraction; the clustering K value and the distance measure between the new feature sequences are then obtained; finally, the new feature sequences are clustered with the Kmeans clustering method, and the potential patterns of the original data are analyzed according to the clustering result.
Step S10: multivariate time series data preprocessing
The medical institution acquires medication data of patients for various drug types, and the data are subjected to validation and normalization to construct the data set to be clustered. Taking the Medicare data set as an example, which contains 90-day administration data of about 320,000 cases for two types of drugs (referred to simply as drug A and drug B), the following preprocessing is performed:
and (4) activating. The case data with the missing value proportion of more than 80 percent in the data is deleted, and the number of cases is reduced from about 32 ten thousands to about 31 ten thousands.
Normalization. The data of the different medication types of each case are mapped to the interval (0, 1) with the min–max normalization method, according to the following formula:
x' = (x − x_min) / (x_max − x_min)
where x_min and x_max are the minimum and maximum values of the corresponding medication variable.
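To make the validation and normalization concrete, the following is a minimal pandas sketch under the assumption that each case is stored as one row of daily doses per drug; the column layout, the zero-filling of remaining missing values, and the function name are illustrative assumptions, not taken from the patent.

```python
import pandas as pd

def preprocess(df, missing_threshold=0.8):
    """Validate and min-max normalize 90-day medication data.

    df: DataFrame indexed by case id, one column per (drug, day) dose value.
    """
    # Validation: drop cases whose proportion of missing values exceeds the threshold.
    keep = df.isna().mean(axis=1) <= missing_threshold
    df = df.loc[keep].fillna(0.0)          # zero-filling is an illustrative choice

    # Min-max normalization per column, mapping doses into [0, 1].
    col_min, col_max = df.min(), df.max()
    rng = (col_max - col_min).replace(0, 1.0)   # avoid division by zero for constant columns
    return (df - col_min) / rng
```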
step S20: construction of new feature sequences by feature extraction of multivariate time series data
In this step, a sparse self-encoder, a deep learning unsupervised learning model, is constructed with the deep learning Long Short-Term Memory (LSTM) model as its basic unit, and feature extraction is performed on the multivariate time series data to construct the new feature sequences of the sample data.
The LSTM is a special deep learning RNN model that alleviates the vanishing-gradient problem which appears when training on long sequences, and it has better sequence memory than an ordinary RNN model. An LSTM contains three gates: (1) the update (input) gate, which controls the input; (2) the forget gate, which controls how much of the past content is retained; and (3) the output gate, which controls the output. The model parameters are updated as follows:
c~<t> = tanh(Wc[a<t-1>, x<t>] + bc)
u = σ(Wu[a<t-1>, x<t>] + bu)
f = σ(Wf[a<t-1>, x<t>] + bf)
o = σ(Wo[a<t-1>, x<t>] + bo)
c<t> = u * c~<t> + f * c<t-1>
a<t> = o * tanh(c<t>)
where x<t> is the input at time t, a<t> the hidden state, c<t> the cell state, σ the sigmoid function, and u, f, o the update, forget and output gate activations.
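For clarity, the following is a minimal NumPy sketch of a single LSTM step following the update formulas above; the weight layout and names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, a_prev, c_prev, W, b):
    """One LSTM step.

    x_t    : input at time t, shape (n_x,)
    a_prev : previous hidden state a<t-1>, shape (n_a,)
    c_prev : previous cell state c<t-1>, shape (n_a,)
    W, b   : dicts of gate weights W['c'|'u'|'f'|'o'], each (n_a, n_a + n_x), and biases (n_a,)
    """
    concat = np.concatenate([a_prev, x_t])           # [a<t-1>, x<t>]
    c_tilde = np.tanh(W['c'] @ concat + b['c'])      # candidate cell state
    u = sigmoid(W['u'] @ concat + b['u'])            # update gate
    f = sigmoid(W['f'] @ concat + b['f'])            # forget gate
    o = sigmoid(W['o'] @ concat + b['o'])            # output gate
    c_t = u * c_tilde + f * c_prev                   # new cell state
    a_t = o * np.tanh(c_t)                           # new hidden state
    return a_t, c_t
```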
The sparse self-encoder model comprises an encoder part and a decoder part, as shown in FIG. 2: part A is the encoder part of the sparse self-encoder model, and part B is the decoder part.
The input data are first encoded to extract new feature values, and an output is then obtained by decoding, with output equal to input taken as the model optimization target. The sparse self-encoder model is trained by adding a sparse term to the optimization function of the self-encoder model to constrain the model parameters, and the training process is as follows:
The data points of the time series are taken one by one as the input of the LSTM units in the encoder, and the sequence obtained after the last data point has been input is taken as the new feature sequence of the sample data.
The new feature sequence of the time series data is then taken as the input of the decoder, and the sparse self-encoder model is trained with a mean square error function plus a sparse term as the optimization function. The optimization function is calculated as follows:
J_sparse = (1/m) * Σ_{i=1..m} ||x_i − x̂_i||² + β * Σ_{j=1..s} KL(ρ || ρ̂_j)
KL(ρ || ρ̂_j) = ρ·log(ρ/ρ̂_j) + (1−ρ)·log((1−ρ)/(1−ρ̂_j))
where m is the number of samples, x_i and x̂_i are the input and reconstructed sequences, ρ is the target sparsity, ρ̂_j is the average activation of hidden unit j, and β weights the sparse term.
In this case, the 90-day administration data of drug A and drug B of each case are respectively converted into 50-dimensional new feature sequences by feature extraction.
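The following is a minimal sketch of such an LSTM-based sparse self-encoder in Keras, one per drug series; here the sparse term is approximated with an L1 activity regularizer on the 50-dimensional latent features rather than a KL penalty, and all layer sizes, hyperparameters, and names are illustrative assumptions rather than the patent's exact configuration.

```python
from tensorflow.keras import layers, Model, regularizers

def build_lstm_sparse_autoencoder(timesteps=90, n_features=1, latent_dim=50, sparsity=1e-4):
    """LSTM encoder-decoder with a sparsity penalty on the latent representation."""
    inputs = layers.Input(shape=(timesteps, n_features))
    # Encoder: the hidden state after the last time step is the new feature sequence.
    encoded = layers.LSTM(latent_dim,
                          activity_regularizer=regularizers.l1(sparsity))(inputs)
    # Decoder: repeat the latent vector and reconstruct the input sequence.
    repeated = layers.RepeatVector(timesteps)(encoded)
    decoded = layers.LSTM(latent_dim, return_sequences=True)(repeated)
    outputs = layers.TimeDistributed(layers.Dense(n_features))(decoded)

    autoencoder = Model(inputs, outputs)
    encoder = Model(inputs, encoded)
    autoencoder.compile(optimizer="adam", loss="mse")   # MSE reconstruction + sparse (L1) term
    return autoencoder, encoder

# Usage sketch: one autoencoder per drug; x_a has shape (n_cases, 90, 1).
# ae_a, enc_a = build_lstm_sparse_autoencoder()
# ae_a.fit(x_a, x_a, epochs=20, batch_size=128)
# features_a = enc_a.predict(x_a)   # (n_cases, 50) new feature sequence for drug A
```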
Step S30: and acquiring a clustering K value of the new sample data characteristic sequence set.
In this step, the Gap statistic method is used to acquire the clustering K value of the multivariate time series data. The specific process is as follows:
setting a value range of the K value;
A specific K value is selected from the value range; the same number of samples as the initial samples are randomly generated, according to a uniform distribution, in the specific three-dimensional region where the samples are located; and Kmeans clustering is applied to obtain W_K, calculated as follows:
W_K = Σ_{r=1..K} (1/(2·n_r)) · D_r,  with  D_r = Σ_{i,i'∈C_r} d(i, i')
where C_r is the r-th cluster, n_r is the number of samples in C_r, and d(i, i') is the distance between samples i and i'.
the sk value is obtained by repeating the second step 2-5 times, and the calculation formula is as follows:
Figure BDA0002441105510000101
The second and third steps are repeated for all K values in the value range, and the K value at which W_K drops the fastest is selected as the optimal number of clusters, according to the following criterion:
K* = min{ K : Gap(K) ≥ Gap(K+1) − s_{K+1} }
In this case, based on the above Gap statistic calculation, 4 is selected as the clustering K value, as shown in FIG. 3.
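A minimal sketch of this Gap statistic procedure follows, assuming scikit-learn KMeans, uniform reference samples drawn from the bounding box of the feature set, and the standard selection rule Gap(K) ≥ Gap(K+1) − s_{K+1}; the number of reference sets and all names are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def within_dispersion(x, labels):
    """W_K: sum over clusters of D_r / (2 n_r), the within-cluster dispersion."""
    w = 0.0
    for r in np.unique(labels):
        pts = x[labels == r]
        center = pts.mean(axis=0)
        # For squared Euclidean distances, D_r / (2 n_r) equals the sum of
        # squared distances of the cluster's points to the cluster mean.
        w += ((pts - center) ** 2).sum()
    return w

def gap_statistic(x, k_max=10, n_ref=20, random_state=0):
    """Return the smallest K with Gap(K) >= Gap(K+1) - s_{K+1}."""
    rng = np.random.default_rng(random_state)
    lo, hi = x.min(axis=0), x.max(axis=0)
    gaps, sks = [], []
    for k in range(1, k_max + 1):
        labels = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit_predict(x)
        log_wk = np.log(within_dispersion(x, labels))
        refs = [rng.uniform(lo, hi, size=x.shape) for _ in range(n_ref)]
        log_wkb = np.array([
            np.log(within_dispersion(
                r, KMeans(n_clusters=k, n_init=10, random_state=random_state).fit_predict(r)))
            for r in refs
        ])
        gaps.append(log_wkb.mean() - log_wk)
        sks.append(log_wkb.std() * np.sqrt(1.0 + 1.0 / n_ref))
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - sks[k]:
            return k
    return k_max
```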
Step S40: calculating the distance between new characteristic sequences of different sample data
In this step, a multivariate distance calculation model is constructed from the Euclidean distance, and the distance between the new feature sequences of different samples is calculated as follows:
d(i, j) = sqrt( Σ_{v=1..V} Σ_{t=1..T} ( z_{i,v,t} − z_{j,v,t} )² )
where z_{i,v,t} is the t-th element of the new feature sequence of sample i for variable v, V is the number of variables (here, drug types), and T is the length of the new feature sequence (here, 50).
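A minimal sketch of this multivariate distance, assuming the per-drug feature sequences have been stacked into a single array per sample; names and shapes are illustrative.

```python
import numpy as np

def multivariate_euclidean(z):
    """Pairwise distances between samples.

    z : array of shape (n_samples, n_variables, seq_len), e.g. (n_cases, 2, 50),
        holding the new feature sequence of each drug for each case.
    Returns an (n_samples, n_samples) distance matrix.
    """
    flat = z.reshape(z.shape[0], -1)                  # sum runs over variables and positions
    diff = flat[:, None, :] - flat[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))
```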
step S50: clustering new characteristic sequence set of sample data based on Kmeans clustering method
In this step, the new feature sequences of all sample data are clustered. The specific process is as follows:
(1) firstly, randomly dividing all sample points into K categories according to K values;
(2) calculating a new category center point for each category, and clustering all sample points again according to the distance between each sample point and each category center point;
(3) step (2) is repeated until the cluster assignments no longer change.
The effectiveness of the method provided by the invention is evaluated by taking the silhouette coefficient SC as the evaluation standard of the clustering performance on the multivariate time series data.
SC(i) = ( b(i) − a(i) ) / max{ a(i), b(i) }
Where a (i) represents the average distance of sample i to other samples in the same cluster, and b (i) represents the average distance of sample i to all samples in other clusters.
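The following sketch clusters the stacked feature sequences with scikit-learn's KMeans, whose Euclidean metric over the flattened multivariate features matches the distance defined above, and evaluates the result with the silhouette coefficient; all names are illustrative.

```python
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def cluster_and_score(z, k, random_state=0):
    """z: (n_samples, n_variables, seq_len) feature sequences; k: cluster number from the Gap statistic."""
    flat = z.reshape(z.shape[0], -1)     # Euclidean distance over all variables and positions
    labels = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit_predict(flat)
    sc = silhouette_score(flat, labels, metric="euclidean")   # mean silhouette coefficient SC
    return labels, sc
```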
As shown in Table 1, the SC value of the method provided by the invention is higher than that of the other existing methods, and the SC value is highest, i.e. the clustering performance is best, when the Euclidean distance is used as the clustering metric.
TABLE 1 clustering performance results of different clustering methods under different distance metrics
Distance metric   Hierarchical clustering   k-means   bi-kmeans   k-medoids   The invention
Euclidean         0.65                      0.56      0.69        0.63        0.88
Pearson           0.41                      0.49      0.65        0.59        0.72
LCSS              0.55                      0.52      0.67        0.53        0.70
DTW               0.63                      0.54      0.61        0.47        0.67
EDR               0.57                      0.58      0.59        0.51        0.66
Step S60: analyzing potential patterns of multivariate time sequence data according to clustering results
All sample point data in each category are averaged to obtain the average value at each time point of the multivariate time series data, yielding a new multivariate average time series per category, on the basis of which the potential patterns of the multivariate time series data are further studied.
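A minimal sketch of this per-category averaging, assuming the normalized dose series and the cluster labels from the previous step are available; names and shapes are illustrative.

```python
import numpy as np

def cluster_mean_series(x, labels):
    """x: (n_cases, n_drugs, n_days) normalized dose series; labels: (n_cases,) cluster ids.

    Returns a dict mapping each cluster id to its (n_drugs, n_days) average time series,
    i.e. the latent medication pattern of that cluster.
    """
    return {c: x[labels == c].mean(axis=0) for c in np.unique(labels)}
```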
According to the above analysis method, the potential patterns of the administration data of the two drugs in this case are shown in FIG. 4 and can be divided into 4 types: (1) Type A, i.e. ultra-low-dose administration: the dosage of both drugs is about 0, and the cases account for 32.3% of the total; (2) Type B, i.e. low-dose administration: the OPI dosage is less than 30% and the BZD dosage is less than 2%, and the cases account for 57.5% of the total; (3) Type C, i.e. low-dose BZD with ultra-high-dose OPI administration: the OPI dosage lies in the interval (30, 50) and the BZD dosage in the interval (13, 19), and the cases account for 5.0% of the total; (4) Type D, i.e. high-dose administration: the OPI dosage is more than 220% and the BZD dosage is more than 5%, and the cases account for 5.2% of the total.
The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent replacement or modification of the technical solution and inventive concept of the present invention made by a person skilled in the art shall fall within the scope of protection of the present invention.

Claims (5)

1. A multivariate time series data clustering method is characterized by comprising the following steps:
s10, preprocessing the multivariate time sequence data, wherein the preprocessing comprises the steps of carrying out validation and normalization operation on the data;
s20, constructing a sparse self-encoder of a deep learning unsupervised learning model by taking a deep learning L STM model as a basic unit, and performing feature extraction on multivariate time series data to construct a new feature sequence;
s30, acquiring a cluster K value of the new characteristic sequence of the sample data;
s40, calculating the distance between new feature sequences of different sample data based on Euclidean distance;
s50, clustering the new characteristic sequence set of the sample data;
s60, analyzing the potential mode of the multivariate time sequence data according to the clustering result, averaging all the sample point data in each category to obtain the average value of each time point in the multivariate time sequence data, and acquiring a new multivariate average time sequence with the category as the unit.
2. The method of claim 1, wherein the self-encoder model comprises an encoder and a decoder: new feature values of the data are extracted by encoding the input, an output is then obtained by decoding, and making the output equal to the input is taken as the model optimization target; the sparse self-encoder adds a sparse term to the optimization function to constrain the model parameters during training, and the training process is as follows:
(1) taking the data points of the time series one by one as the input of the LSTM units in the encoder, and taking the sequence obtained after the last data point has been input as the new feature sequence of the sample data;
(2) taking the new feature sequence of the sample data as the input of the decoder, and training the sparse self-encoder model with a mean square error function plus a sparse term as the optimization function, the optimization function being calculated as follows:
J_sparse = (1/m) * Σ_{i=1..m} ||x_i − x̂_i||² + β * Σ_{j=1..s} KL(ρ || ρ̂_j)
KL(ρ || ρ̂_j) = ρ·log(ρ/ρ̂_j) + (1−ρ)·log((1−ρ)/(1−ρ̂_j))
ρ̂_j = (1/m) * Σ_{i=1..m} h_j(x_i)
where m is the number of samples, x_i and x̂_i are the input and reconstructed sequences, h_j(x_i) is the activation of hidden unit j, ρ is the target sparsity, and β weights the sparse term.
3. The method as claimed in claim 1, wherein the clustering K value of the multivariate time series data is obtained through the following steps:
(1) selecting a specific K value within the value range (1–100), randomly generating, according to a uniform distribution, the same number of samples as the initial samples in the specific three-dimensional region where the samples are located, and clustering with the Kmeans method to obtain W_K, calculated as follows:
W_K = Σ_{r=1..K} (1/(2·n_r)) · D_r,  with  D_r = Σ_{i,i'∈C_r} d(i, i')
where C_r is the r-th cluster, n_r is the number of samples in C_r, and d(i, i') is the distance between samples i and i';
(2) obtaining s_K, where n takes the value 100, according to the following formulas:
Gap(K) = (1/n) · Σ_{b=1..n} log(W_{Kb}) − log(W_K)
sd_K = sqrt( (1/n) · Σ_{b=1..n} ( log(W_{Kb}) − (1/n)·Σ_{b=1..n} log(W_{Kb}) )² )
s_K = sd_K · sqrt(1 + 1/n)
where W_{Kb} denotes the W_K value obtained on the b-th of the n sets of uniformly generated reference samples;
(3) repeating steps (1) and (2) for all K values in the value range, and selecting as the optimal number of clusters the K value at which W_K drops the fastest, according to the following criterion:
K* = min{ K : Gap(K) ≥ Gap(K+1) − s_{K+1} }.
4. The multivariate time series data clustering method as claimed in claim 1, wherein the validation comprises deleting data whose proportion of missing values is greater than 80%, and the normalization comprises mapping the data of the different medication types of each case to the interval (0, 1) with the min–max normalization method, according to the following formula:
x' = (x − x_min) / (x_max − x_min)
where x_min and x_max are the minimum and maximum values of the corresponding variable.
5. The method according to claim 1, wherein the new feature sequences of all sample data are clustered through the following steps:
(1) firstly, randomly dividing all sample points into K categories according to K values;
(2) calculating a new category center point for each category, and clustering all sample points again according to the distance between each sample point and each category center point;
(3) step (2) is repeated until the cluster assignments no longer change.
The effectiveness of the method is evaluated by taking the silhouette coefficient SC as the evaluation standard of the clustering performance on the multivariate time series data:
SC(i) = ( b(i) − a(i) ) / max{ a(i), b(i) }
SC = (1/N) · Σ_{i=1..N} SC(i)
Where a (i) represents the average distance of sample i to other samples in the same cluster, and b (i) represents the average distance of sample i to all samples in other clusters.
CN202010265442.4A 2020-04-07 2020-04-07 Multivariable time sequence data clustering method Active CN111488924B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010265442.4A CN111488924B (en) 2020-04-07 2020-04-07 Multivariable time sequence data clustering method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010265442.4A CN111488924B (en) 2020-04-07 2020-04-07 Multivariable time sequence data clustering method

Publications (2)

Publication Number Publication Date
CN111488924A true CN111488924A (en) 2020-08-04
CN111488924B CN111488924B (en) 2024-04-26

Family

ID=71811758

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010265442.4A Active CN111488924B (en) 2020-04-07 2020-04-07 Multivariable time sequence data clustering method

Country Status (1)

Country Link
CN (1) CN111488924B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112345261A (en) * 2020-10-29 2021-02-09 南京航空航天大学 Aero-engine pumping system abnormity detection method based on improved DBSCAN algorithm

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3188111A1 (en) * 2015-12-28 2017-07-05 Deutsche Telekom AG A method for extracting latent context patterns from sensors
CN109472321A (en) * 2018-12-03 2019-03-15 北京工业大学 A kind of prediction towards time series type surface water quality big data and assessment models construction method
CN109636061A (en) * 2018-12-25 2019-04-16 深圳市南山区人民医院 Training method, device, equipment and the storage medium of medical insurance Fraud Prediction network
CN109919189A (en) * 2019-01-29 2019-06-21 华南理工大学 A kind of depth K mean cluster method towards time series data
CN110070145A (en) * 2019-04-30 2019-07-30 天津开发区精诺瀚海数据科技有限公司 LSTM wheel hub single-item energy consumption prediction based on increment cluster
CN110459292A (en) * 2019-07-02 2019-11-15 南京邮电大学 A kind of risk management stage division based on cluster and PNN

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3188111A1 (en) * 2015-12-28 2017-07-05 Deutsche Telekom AG A method for extracting latent context patterns from sensors
CN109472321A (en) * 2018-12-03 2019-03-15 北京工业大学 A kind of prediction towards time series type surface water quality big data and assessment models construction method
CN109636061A (en) * 2018-12-25 2019-04-16 深圳市南山区人民医院 Training method, device, equipment and the storage medium of medical insurance Fraud Prediction network
CN109919189A (en) * 2019-01-29 2019-06-21 华南理工大学 A kind of depth K mean cluster method towards time series data
CN110070145A (en) * 2019-04-30 2019-07-30 天津开发区精诺瀚海数据科技有限公司 LSTM wheel hub single-item energy consumption prediction based on increment cluster
CN110459292A (en) * 2019-07-02 2019-11-15 南京邮电大学 A kind of risk management stage division based on cluster and PNN

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhang Xiaolong; Qi Linhai: "Research on classification of distribution network transformer areas integrating sparse denoising autoencoders and clustering algorithms", Electric Power Information and Communication Technology, no. 12, pages 15-23 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112345261A (en) * 2020-10-29 2021-02-09 南京航空航天大学 Aero-engine pumping system abnormity detection method based on improved DBSCAN algorithm
CN112345261B (en) * 2020-10-29 2022-05-03 南京航空航天大学 Aero-engine pumping system abnormity detection method based on improved DBSCAN algorithm

Also Published As

Publication number Publication date
CN111488924B (en) 2024-04-26

Similar Documents

Publication Publication Date Title
CN107016438B (en) System based on traditional Chinese medicine syndrome differentiation artificial neural network algorithm model
Xia et al. Research in clustering algorithm for diseases analysis
CN111000553B (en) Intelligent classification method for electrocardiogram data based on voting ensemble learning
WO2019041628A1 (en) Method for mining multivariate time series association rule based on eclat
CN108763590B (en) Data clustering method based on double-variant weighted kernel FCM algorithm
CN107122643B (en) Identity recognition method based on feature fusion of PPG signal and respiratory signal
CN107103048A (en) Medicine information matching process and system
Wong et al. Herd clustering: A synergistic data clustering approach using collective intelligence
CN107203686A (en) medicine information difference processing method and system
CN111488924B (en) Multivariable time sequence data clustering method
CN110335160B (en) Medical care migration behavior prediction method and system based on grouping and attention improvement Bi-GRU
Peng et al. The health care fraud detection using the pharmacopoeia spectrum tree and neural network analytic contribution hierarchy process
Gossmann et al. Test data reuse for the evaluation of continuously evolving classification algorithms using the area under the receiver operating characteristic curve
Tzacheva et al. Support confidence and utility of action rules triggered by meta-actions
Idris et al. Applications of machine learning for prediction of liver disease
Sebayang et al. Optimization on Purity K-means using variant distance measure
US20240170104A1 (en) Method and system for predicting adverse drug-drug interactions by recovering the multi-attribute information of drugs, and medium
CN104616027A (en) Non-adjacent graph structure sparse face recognizing method
Vidyasagar Probabilistic methods in cancer biology
Yang et al. Clustering inter-arrival time of health care encounters for high utilizers
Yazdi et al. Hierarchical tree clustering of fuzzy number
Thirumagal et al. Lung cancer classification using exponential mean saturation linear unit activation function in various generative adversarial network models
Egho et al. Healthcare trajectory mining by combining multidimensional component and itemsets
CN114298126A (en) Brain function network classification method based on condition mutual information and kernel density estimation
Pedrycz et al. Genetic design of feature spaces for pattern classifiers

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant