CN111428768A - Hellinger distance-Gaussian mixture model-based clustering method - Google Patents
- Publication number
- CN111428768A (application CN202010190288.9A)
- Authority
- CN
- China
- Prior art keywords
- mixture model
- gaussian mixture
- gaussian
- sample
- parameters
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0635—Risk analysis of enterprise or organisation activities
Abstract
The invention discloses a clustering method based on a Hellinger distance-Gaussian mixture model, applied to the fields of mechanical fault diagnosis and cluster analysis. To solve the problem of low recognition accuracy on unlabeled data in the prior art, the method improves the clustering capability of the Gaussian mixture model: it introduces a minimized regularization term based on the Hellinger distance, which measures the distance between probability distributions of the Gaussian mixture model in the data manifold space and constrains the update process of the posterior probability, and it gradually updates the Gaussian mixture model parameters in combination with a generalized expectation-maximization algorithm, so that the probability that the resulting mixture model's distribution generates the given data is maximized. The method thereby realizes automatic learning and clustering of data, accurately judges the category information of unlabeled data, and provides a feasible approach for intelligent learning on big data.
Description
Technical Field
The invention belongs to the field of mechanical fault diagnosis and cluster analysis, and particularly relates to a data clustering technology based on a Hellinger distance-Gaussian mixture model.
Background
The growing scale and complexity of modern mechanical equipment make it highly susceptible to various faults during continuous operation. To ensure safe and reliable operation, potential fault risks must therefore be detected early through monitoring and diagnosis, avoiding possible accidents and the corresponding maintenance costs and maximizing the equipment's utilization. In recent years, with the emergence of the concept of "big data", data-driven intelligent fault diagnosis methods have been widely applied. Such methods do not require investigating the physical failure mechanism of the equipment; by learning the statistical rules and internal characteristics of large amounts of data, they can automatically judge the health state of the monitored equipment, providing a useful tool for online monitoring and fault diagnosis of industrial equipment.
Intelligent fault diagnosis provides an important means for online monitoring and health prediction of large-scale and rotating equipment by automatically extracting the fault information implicit in large-scale monitoring data. However, most existing diagnosis methods assume that the monitoring data contain complete examples of typical faults with clear state labels, an assumption that is difficult to satisfy in engineering practice. During actual operation, equipment cannot be frequently shut down to detect faults and label its state, so the monitored data carry little or even no label information, and the equipment state corresponding to the data is unknown. There is therefore a need for unsupervised learning methods that realize accurate intelligent diagnosis of equipment.
Unsupervised learning is a machine learning approach that discovers intrinsic regularities or structures in unlabeled data; within it, clustering groups a large amount of given data into several categories according to the similarity or distance of their features. Existing clustering methods fall into two categories: hard clustering (e.g., K-means), which assigns each sample to exactly one class, and soft clustering (e.g., Gaussian mixture models), which assigns class memberships probabilistically and can simultaneously mine the longitudinal structure (similarity) and transverse structure (dimensionality reduction) of the data, yielding more accurate clustering results. A Gaussian mixture model can fit the distribution of arbitrary data by linearly combining several Gaussian distribution functions, but existing Gaussian mixture models are affected by factors such as parameter initialization and computational complexity, and research on the algorithm and its clustering applications remains limited. The question, therefore, is how to improve the Gaussian mixture model, optimize its parameters in combination with the expectation-maximization algorithm, and improve its ability to recognize sample classes.
Disclosure of Invention
To solve the above technical problems, the invention provides a clustering method based on a Hellinger distance-Gaussian mixture model, which introduces a regularization term based on the Hellinger distance on top of maximizing the probability of the samples, constructs the internal manifold structure of the samples, and optimizes the model through a generalized expectation-maximization algorithm, thereby realizing automatic judgment of data categories. The steps are described as follows:
The data features to be classified form the sample set X = {x_i, i = 1, …, n}, which contains n samples; each sample x_i includes d-dimensional features.
S1, parameter setting and initialization.
1) Set K components in the Gaussian mixture model and initialize the Gaussian model parameters with the K-means algorithm.
2) Set the regularization coefficient λ.
3) Set the update coefficient γ, with initial value 0.9.
4) Set the number l of nearest neighbors.
5) Initialize the iteration number t to 1, i.e., t = 1.
6) Set the iteration termination value to a small value.
S2, constructing the model optimization objective function: define the objective function for Gaussian mixture model parameter optimization. In the parameter optimization process, the Hellinger distance is introduced to measure the closeness between two distributions.
The constructed Gaussian mixture model is composed of K Gaussian distributions:

P(x_i | Θ) = Σ_{k=1}^{K} π_k N_k(x_i | μ_k, Σ_k),

where Θ = (π_1, μ_1, Σ_1, …, π_K, μ_K, Σ_K) are the Gaussian mixture model parameters, μ_k and Σ_k (k = 1, …, K) are the mean and covariance of the kth Gaussian distribution, N_k(x_i | μ_k, Σ_k) is the Gaussian density of the kth component model, and π_k is the corresponding mixing coefficient, satisfying π_k ≥ 0 and Σ_{k=1}^{K} π_k = 1.
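The mixture density defined above can be evaluated directly. The following sketch is illustrative only and is not part of the patent: the two-component parameters are arbitrary toy values, and the function name `gmm_density` is our own.

```python
# Illustrative sketch: evaluate P(x | Theta) = sum_k pi_k * N_k(x | mu_k, Sigma_k)
# for made-up two-component parameters (not from the patent).
import numpy as np
from scipy.stats import multivariate_normal

def gmm_density(x, pis, mus, covs):
    """Mixture density: weighted sum of K Gaussian component densities."""
    return sum(pi * multivariate_normal.pdf(x, mean=mu, cov=cov)
               for pi, mu, cov in zip(pis, mus, covs))

# Two 1-D components with arbitrary toy parameters.
pis = [0.4, 0.6]
mus = [np.array([0.0]), np.array([3.0])]
covs = [np.array([[1.0]]), np.array([[1.0]])]
p = gmm_density(np.array([0.0]), pis, mus, covs)
```

At x = 0 the first component dominates, since the second component's mean lies three standard deviations away.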
To realize data clustering, the Gaussian mixture model parameters Θ are updated by iterative operations. Thus X is defined as the observed sample set and Z = {z_i, i = 1, …, n} as the set of latent variables, so that X and Z form the complete sample set. On the basis of maximizing the complete-data log-likelihood function, a regularization term is introduced to form the optimization objective function, defined as:

max_Θ { log P(X, Z | Θ) − λR }
where λ is the regularization coefficient and R is the regularization term, into which the Hellinger distance is incorporated. The Hellinger distance is typically used in order statistics and asymptotic statistics. The square of the Hellinger distance between probability distributions P_i and P_j is:

H²(P_i, P_j) = (1/2) Σ_{k=1}^{K} (√P(k | x_i) − √P(k | x_j))²,

where P(k | x_i) and P(k | x_j) are the posterior probabilities of samples x_i and x_j being generated by the kth Gaussian component, respectively. The Laplacian matrix L is L = D − W, where D is the diagonal matrix with entries D_ii = Σ_j W_ij, and T denotes transposition.
For sample x_i, its l nearest neighbors (l ≤ n − 1) can be determined from the Hellinger distance. In the nearest-neighbor graph, the weight W_ij between sample x_i and an adjacent sample x_j is defined as:

W_ij = 1 if x_i ∈ N_l(x_j) or x_j ∈ N_l(x_i), and W_ij = 0 otherwise,

where N_l(x_j) represents the set of l nearest-neighbor samples of x_j.
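The squared Hellinger distance between posterior distributions, the resulting l-nearest-neighbor weights, and the Laplacian L = D − W can be sketched as follows. This is illustrative Python: the 0/1 indicator form of W_ij is an assumed common choice (the patent's weight formula is reproduced here only implicitly), and the function names are our own.

```python
import numpy as np

def hellinger_sq(p, q):
    """Squared Hellinger distance between two discrete distributions:
    H^2(P, Q) = 1/2 * sum_k (sqrt(p_k) - sqrt(q_k))^2, in [0, 1]."""
    return 0.5 * float(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

def neighbor_graph(post, l):
    """0/1 l-nearest-neighbor weights from pairwise Hellinger distances,
    plus the graph Laplacian L = D - W with D_ii = sum_j W_ij.
    (The indicator form of W_ij is an assumption.)"""
    n = post.shape[0]
    dist = np.array([[hellinger_sq(post[i], post[j]) for j in range(n)]
                     for i in range(n)])
    W = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(dist[i])
        W[i, order[1:l + 1]] = 1.0        # order[0] is the sample itself
    W = np.maximum(W, W.T)                # symmetrize the neighbor relation
    D = np.diag(W.sum(axis=1))
    return W, D - W
```

Each row of `post` holds one sample's posterior probabilities P(k | x_i) over the K components; each Laplacian row sums to zero by construction.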
S3, calculating the posterior probability of the samples: according to the mixture model parameters Θ^(t-1), calculate the posterior probability using the generalized expectation-maximization algorithm.

From the Gaussian mixture model parameters Θ^(t-1) obtained in the (t-1)th iteration, the posterior probability is calculated as:

P(k | x_i) = π_k^(t-1) N_k(x_i | μ_k^(t-1), Σ_k^(t-1)) / Σ_{j=1}^{K} π_j^(t-1) N_j(x_i | μ_j^(t-1), Σ_j^(t-1))
On this basis, the generalized expectation-maximization algorithm defines a Q function for the iterative computation of the model parameters, expressed as:

Q(Θ, Θ^(t-1)) = Σ_{i=1}^{n} Σ_{k=1}^{K} P(k | x_i) [log π_k + log N_k(x_i | μ_k, Σ_k)]

The objective of the iterative optimization is, separately, to maximize Q(Θ, Θ^(t-1)) and to minimize the regularization term R.
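The posterior calculation of S3 corresponds to the standard (unregularized) E-step responsibility. A minimal sketch, assuming SciPy for the Gaussian density; the demo parameters are made-up toy values.

```python
import numpy as np
from scipy.stats import multivariate_normal

def e_step(X, pis, mus, covs):
    """E-step of S3: posterior P(k | x_i) proportional to
    pi_k * N_k(x_i | mu_k, Sigma_k), normalized over k."""
    n, K = X.shape[0], len(pis)
    R = np.empty((n, K))
    for k in range(K):
        R[:, k] = pis[k] * multivariate_normal.pdf(X, mean=mus[k], cov=covs[k])
    return R / R.sum(axis=1, keepdims=True)

# Toy demo: two well-separated components, one point at each mean.
X = np.array([[0.0, 0.0], [4.0, 4.0]])
R = e_step(X,
           pis=[0.5, 0.5],
           mus=[np.zeros(2), np.full(2, 4.0)],
           covs=[np.eye(2), np.eye(2)])
```

Each row of `R` sums to 1, and a point sitting on a component mean is assigned to that component almost certainly.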
S4, updating the model parameters: update the posterior probability and the Gaussian mixture model parameters using the generalized expectation-maximization algorithm.
First, the regularization term R is minimized; applying the Newton-Raphson method yields the updated posterior probability.
Second, maximize Q(Θ, Θ^(t-1)) to obtain the updated Gaussian mixture model parameters Θ^t:

μ_k^t = Σ_{i=1}^{n} P(k | x_i) x_i / Σ_{i=1}^{n} P(k | x_i)

Σ_k^t = Σ_{i=1}^{n} P(k | x_i)(x_i − μ_k^t)(x_i − μ_k^t)^T / Σ_{i=1}^{n} P(k | x_i)

π_k^t = Σ_{i=1}^{n} P(k | x_i) / n
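Maximizing Q(Θ, Θ^(t-1)) gives closed-form M-step updates. The sketch below implements the standard GMM M-step only (without the patent's Hellinger-regularized posterior), with made-up demo data and a hard 0/1 responsibility matrix for clarity.

```python
import numpy as np

def m_step(X, R):
    """Closed-form M-step: update (pi_k, mu_k, Sigma_k) from the posterior
    matrix R, where R[i, k] = P(k | x_i). Standard GMM updates."""
    n, d = X.shape
    Nk = R.sum(axis=0)                       # effective samples per component
    pis = Nk / n                             # pi_k^t
    mus = (R.T @ X) / Nk[:, None]            # mu_k^t
    covs = []
    for k in range(R.shape[1]):
        Xc = X - mus[k]
        covs.append((R[:, k, None] * Xc).T @ Xc / Nk[k]
                    + 1e-6 * np.eye(d))      # small jitter for invertibility
    return pis, mus, covs

# Toy demo: two points per component, hard responsibilities.
X = np.array([[0.0, 0.0], [0.2, 0.0], [4.0, 4.0], [4.2, 4.0]])
R = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
pis, mus, covs = m_step(X, R)
```

With hard responsibilities the updated means reduce to per-cluster sample averages.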
S5, calculating the regularized likelihood function value.
S6, judging iteration termination:
1) If the regularized likelihood function value decreases after the update, set the update coefficient 0.9γ → γ (i.e., the current update coefficient γ is multiplied by 0.9 to give the update coefficient for the next iteration) and return to S4.
2) If the change in the regularized likelihood function value is smaller than the iteration termination value, stop the iteration, take the Gaussian mixture model parameters Θ^t, and output the posterior probabilities P(k | x_i) (i = 1, …, n; k = 1, …, K); otherwise increment the iteration number, i.e., t ← t + 1, and return to S3.
S7, data category judgment: for each sample, the label k (k = 1, …, K) of the Gaussian component with the maximum posterior probability is taken as the clustering result of that sample.
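The steps S1-S7 can be sketched end-to-end as a plain EM loop on synthetic data. This is an assumption-laden simplification: the Hellinger regularization term, the γ-damped posterior update, and the K-means initialization are all omitted, so it is ordinary maximum-likelihood GMM clustering rather than the patented HGMM.

```python
# Simplified S1-S7 loop: ordinary EM for a GMM (no Hellinger regularization).
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(60, 2)),   # synthetic blob near (0, 0)
               rng.normal(4.0, 0.5, size=(60, 2))])  # synthetic blob near (4, 4)

K = 2
n, d = X.shape
pis = np.full(K, 1.0 / K)
mus = X[[0, 60]].copy()          # crude init: one seed point from each blob
covs = [np.eye(d) for _ in range(K)]

prev_ll = -np.inf
for t in range(200):
    # E-step (S3): posterior P(k | x_i)
    dens = np.column_stack(
        [pis[k] * multivariate_normal.pdf(X, mus[k], covs[k]) for k in range(K)])
    ll = np.log(dens.sum(axis=1)).sum()      # log-likelihood (S5 analogue)
    R = dens / dens.sum(axis=1, keepdims=True)
    # M-step (S4): closed-form parameter updates
    Nk = R.sum(axis=0)
    pis = Nk / n
    mus = (R.T @ X) / Nk[:, None]
    covs = [((R[:, k, None] * (X - mus[k])).T @ (X - mus[k])) / Nk[k]
            + 1e-6 * np.eye(d) for k in range(K)]
    if ll - prev_ll < 1e-5:                  # termination check (S6)
        break
    prev_ll = ll

labels = R.argmax(axis=1)                    # S7: argmax-posterior assignment
```

On well-separated blobs the loop converges in a handful of iterations and recovers the two generating clusters.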
The beneficial effects of the invention are as follows: the invention provides a clustering method based on a Hellinger distance-Gaussian mixture model. A data clustering algorithm is constructed using the unsupervised learning capability of the Gaussian mixture model, in which each cluster is determined by one Gaussian distribution and each data point arises from the combined action of several probabilistic clusters. Neighboring samples on the data manifold structure are defined through the Hellinger distance and the regularization term, while the Gaussian distribution parameters and coefficients of the mixture model are gradually updated in combination with the generalized expectation-maximization algorithm, so that the probability that the distribution determined by the mixture model generates the given data is maximized. This realizes automatic learning and clustering of data and accurately judges the category information of unlabeled data. The method not only extends probabilistic clustering algorithms but also improves the ability to mine the latent structure of unlabeled data, and can be applied to intelligent diagnosis of industrial data.
Drawings
FIG. 1 is a flow chart of the clustering method based on Hellinger distance-Gaussian mixture model of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Fig. 1 is a flow chart of the clustering method based on the Hellinger distance-gaussian mixture model according to the present invention.
The Iris data set is used in this embodiment to illustrate the implementation results. The data set, a classic data set for multivariate analysis, contains 150 samples, each with 4 attributes (features): sepal length and width and petal length and width. The data belong to 3 categories, Iris setosa, Iris versicolor, and Iris virginica, with 50 samples per category.
The parameter settings and initial value settings of the gaussian mixture model are as follows:
1) The number of Gaussian components is K = 3, and the model parameters are initialized with the K-means algorithm.
2) The regularization coefficient is λ = 0.1.
3) The initial value of the update coefficient is γ = 0.9.
4) The number of nearest neighbors is l = 2.
5) The iteration termination value is 10⁻⁵.
According to the method, the samples are input into the Gaussian mixture model, and the model parameter values are updated by iterative operations until the termination condition is met. For each sample to be classified, the model outputs the posterior probability values computed by the 1st, 2nd, and 3rd Gaussian components at iteration termination, and the label of the Gaussian component with the maximum value is taken as the sample's category information. For example, for sample x_1, the posterior probability values output by the 1st, 2nd, and 3rd Gaussian components of the mixture model are (2.66×10⁻⁴⁰, 7.98×10⁻²⁸, 1), so its cluster label is (0, 0, 1); the true class of this sample is Iris setosa, with true label (1, 0, 0). The correspondence between cluster labels and true labels over all samples can be determined with the Kuhn-Munkres algorithm; the cluster obtained by the 3rd Gaussian component corresponds to the 1st class (Iris setosa), so the classification result for sample x_1 is correct.
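For comparison, the conventional-GMM baseline of this embodiment can be approximated with scikit-learn's `GaussianMixture` on the same Iris data. This is a baseline sketch only: it implements the standard model without the patent's Hellinger regularization, so its results need not match the patented method.

```python
# Conventional-GMM baseline on Iris (not the patented HGMM).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture

X, y = load_iris(return_X_y=True)              # 150 samples, 4 features, 3 classes
gmm = GaussianMixture(n_components=3, covariance_type='full',
                      init_params='kmeans',    # K-means initialization, as in S1
                      n_init=5, random_state=0)
labels = gmm.fit_predict(X)                    # argmax_k P(k | x_i), as in S7
post = gmm.predict_proba(X)                    # posterior P(k | x_i) per sample
```

The `post` matrix plays the role of the per-sample posterior probabilities described above; each of its rows sums to 1.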
With the above method, the accuracy of the cluster analysis can be checked against the class information given by the Iris data set, i.e., the recognition accuracy, computed as:

Accuracy = (number of correctly clustered samples / total number of samples) × 100%
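The Kuhn-Munkres matching of cluster labels to true labels, followed by the accuracy computation, can be sketched as follows. This is illustrative Python using SciPy's Hungarian-algorithm implementation; the tiny label arrays are made-up examples, and the function name is our own.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(true, pred):
    """Best label-permutation accuracy via the Kuhn-Munkres (Hungarian)
    algorithm: match each cluster to a true class, then count hits."""
    K = max(true.max(), pred.max()) + 1
    C = np.zeros((K, K), dtype=int)         # C[c, t]: cluster c hits true class t
    for t, p in zip(true, pred):
        C[p, t] += 1
    rows, cols = linear_sum_assignment(-C)  # negate to maximize matched counts
    return C[rows, cols].sum() / true.size

true = np.array([0, 0, 1, 1, 2, 2])
pred = np.array([2, 2, 0, 0, 1, 1])         # same partition, permuted labels
acc = clustering_accuracy(true, pred)       # a perfect (relabelled) clustering
```

A clustering that is correct up to label permutation, as in this toy example, scores 100%.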
the results of the recognition accuracy of the two clustering models with different sample characteristics are compared in table 1. One of the models is a clustering method based on a Hellinger distance-Gaussian mixture model, which is provided by the invention and is abbreviated as HGMM; another approach is the conventional gaussian mixture model (no regularization term based on Hellinger distance is introduced). Meanwhile, considering the influence of the sensitivity and the correlation of the sample features on the cluster analysis, the second row in table 1 lists the cluster analysis results of all the sample features (4 features) selected, and the third row and the fourth row respectively list the highest and lowest analysis results of some features (3 features selected).
TABLE 1 comparison of classification correctness of two clustering models using different sample characteristics
Sample features | HGMM | GMM
---|---|---
1,2,3,4 | 98% | 77%
1,3,4 | 97% | 61%
1,2,4 | 88% | 61%
Note: HGMM: Hellinger distance-Gaussian mixture model
GMM: conventional Gaussian mixture model
"1": attribute "sepal length" (unit: cm)
"2": attribute "sepal width" (unit: cm)
"3": attribute "petal length" (unit: cm)
"4": attribute "petal width" (unit: cm)
The table shows that the improved method proposed by the invention significantly enhances the clustering capability of the Gaussian mixture model, with recognition accuracy on the unlabeled data of at most 98% and at least 88%. Moreover, the proposed method achieves high recognition accuracy without reducing the multidimensional features, yields an intelligent classification model through unsupervised learning, and can be further extended to unsupervised learning on other data.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to help the reader understand the principles of the invention, and that the invention is not limited to the specifically recited embodiments and examples. Various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of the claims of the present invention.
Claims (6)
1. The data clustering method based on the Hellinger distance-Gaussian mixture model is characterized by comprising the following steps of:
s1, parameter setting and initialization: set the initial values of the Gaussian mixture model parameters and the initial and set values of the related parameters, wherein the initial Gaussian mixture model parameters comprise: the number K of Gaussian distributions in the mixture model and the initial value of each Gaussian distribution's parameters, namely the mean μ_k^0 and covariance Σ_k^0 and the corresponding mixing coefficient π_k^0, satisfying Σ_{k=1}^{K} π_k^0 = 1, so that the initial Gaussian mixture model parameter is Θ^0 = (π_1^0, μ_1^0, Σ_1^0, …, π_K^0, μ_K^0, Σ_K^0); set the other initial and set values, namely the regularization coefficient λ, the initial value of the update coefficient γ, the neighbor number l, and the iteration termination value; and initialize the iteration number t to 1, i.e., t = 1;
s2, constructing a model optimization objective function: defining an objective function for optimizing parameters of the Gaussian mixture model, and introducing a regularization term to update the parameters of the Gaussian mixture model, wherein the approximation degree between two Gaussian distributions is calculated by using a Hellinger distance;
s3, calculating the posterior probability of the sample: calculating the posterior probability of the sample according to the Gaussian mixture model parameters obtained in the previous iteration;
s4, updating Gaussian mixture model parameters: updating the posterior probability and the Gaussian mixture model parameters by adopting a generalized expectation-maximization algorithm;
s5, calculating a regularization likelihood function value;
s6, judging iteration termination: comparing regularization likelihood function values before and after updating of the Gaussian mixture model parameters, and continuing the iteration process of the steps S3-S5 until an iteration termination condition is met;
s7, data type judgment: and for each sample, taking a Gaussian component label corresponding to the maximum posterior probability as the clustering result of the sample.
2. The Hellinger distance-gaussian mixture model-based clustering method according to claim 1, wherein the step S2 is implemented by:
the Gaussian mixture model to be optimized is composed of K Gaussian distributions:

P(x_i | Θ) = Σ_{k=1}^{K} π_k N_k(x_i | μ_k, Σ_k),

where Θ = (π_1, μ_1, Σ_1, …, π_K, μ_K, Σ_K) are the Gaussian mixture model parameters, μ_k and Σ_k are the mean and covariance of the kth Gaussian distribution, N_k(x_i | μ_k, Σ_k) is the corresponding Gaussian density, π_k is the corresponding mixing coefficient satisfying Σ_{k=1}^{K} π_k = 1, and x_i represents one sample in the sample set X, i = 1, …, n, each sample x_i including d-dimensional features;
to achieve data clustering, the Gaussian mixture model parameters Θ are updated by iterative operations, so X is defined as the observed sample set and Z = {z_i, i = 1, …, n} as the set of latent variables, X and Z forming the complete sample set; on the basis of maximizing the log-likelihood function of the complete sample set, a regularization term is introduced to form the optimization objective function, defined as:

max_Θ { log P(X, Z | Θ) − λR }
where λ is the regularization coefficient and R is the regularization term, into which the Hellinger distance is introduced; the square of the Hellinger distance H(P_i, P_j) between probability distributions P_i and P_j is:

H²(P_i, P_j) = (1/2) Σ_{k=1}^{K} (√P(k | x_i) − √P(k | x_j))²,

where P(k | x_i) and P(k | x_j) are the posterior probabilities of samples x_i and x_j being generated by the kth Gaussian component, respectively; the Laplacian matrix L may be represented as L = D − W, where D is the diagonal matrix with entries D_ii = Σ_j W_ij, and T represents transposition;
for sample x_i, its l nearest neighbors (l ≤ n − 1) can be determined from the Hellinger distance; in the nearest-neighbor graph, the weight W_ij between sample x_i and an adjacent sample x_j is defined as:

W_ij = 1 if x_i ∈ N_l(x_j) or x_j ∈ N_l(x_i), and W_ij = 0 otherwise, where N_l(x_j) represents the set of l nearest-neighbor samples of x_j.
3. The Hellinger distance-gaussian mixture model-based clustering method according to claim 1, wherein the step S3 is implemented by:
from the Gaussian mixture model parameters Θ^(t-1) obtained in the (t-1)th iteration, the posterior probability is calculated as:

P(k | x_i) = π_k^(t-1) N_k(x_i | μ_k^(t-1), Σ_k^(t-1)) / Σ_{j=1}^{K} π_j^(t-1) N_j(x_i | μ_j^(t-1), Σ_j^(t-1))

on this basis, the generalized expectation-maximization algorithm defines a Q function for the iterative computation of the model parameters, expressed as:

Q(Θ, Θ^(t-1)) = Σ_{i=1}^{n} Σ_{k=1}^{K} P(k | x_i) [log π_k + log N_k(x_i | μ_k, Σ_k)]

the objective of the iterative optimization is, separately, to maximize Q(Θ, Θ^(t-1)) and to minimize the regularization term R.
4. The Hellinger distance-Gaussian mixture model-based clustering method according to claim 1, wherein in step S4 the posterior probability and the Gaussian mixture model parameters are respectively updated using the generalized expectation-maximization algorithm, implemented in two steps:
first, the regularization term R is minimized, and the Newton-Raphson method is applied to obtain the updated posterior probability;
second, Q(Θ, Θ^(t-1)) is maximized to obtain the updated Gaussian mixture model parameters Θ^t:

μ_k^t = Σ_{i=1}^{n} P(k | x_i) x_i / Σ_{i=1}^{n} P(k | x_i), Σ_k^t = Σ_{i=1}^{n} P(k | x_i)(x_i − μ_k^t)(x_i − μ_k^t)^T / Σ_{i=1}^{n} P(k | x_i), π_k^t = Σ_{i=1}^{n} P(k | x_i) / n.
6. The Hellinger distance-Gaussian mixture model-based clustering method according to claim 1, wherein step S6 specifically comprises: comparing the regularized likelihood function values calculated in step S5 to determine the flow of the procedure.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010190288.9A CN111428768A (en) | 2020-03-18 | 2020-03-18 | Hellinger distance-Gaussian mixture model-based clustering method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010190288.9A CN111428768A (en) | 2020-03-18 | 2020-03-18 | Hellinger distance-Gaussian mixture model-based clustering method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111428768A true CN111428768A (en) | 2020-07-17 |
Family
ID=71546416
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010190288.9A Pending CN111428768A (en) | 2020-03-18 | 2020-03-18 | Hellinger distance-Gaussian mixture model-based clustering method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111428768A (en) |
- 2020-03-18: application CN202010190288.9A filed in CN; patent CN111428768A, status Pending
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111898954B (en) * | 2020-07-31 | 2024-01-12 | 沙师弟(重庆)网络科技有限公司 | Vehicle matching method based on improved Gaussian mixture model clustering |
CN111898954A (en) * | 2020-07-31 | 2020-11-06 | 沙师弟(重庆)网络科技有限公司 | Vehicle matching method based on improved Gaussian mixture model clustering |
CN112428263A (en) * | 2020-10-16 | 2021-03-02 | 北京理工大学 | Mechanical arm control method and device and cluster model training method |
CN113243891A (en) * | 2021-05-31 | 2021-08-13 | 平安科技(深圳)有限公司 | Mild cognitive impairment recognition method and device, computer equipment and storage medium |
CN113312851A (en) * | 2021-06-16 | 2021-08-27 | 华电山东新能源有限公司 | Early warning method for temperature abnormity of main bearing of wind driven generator |
CN113569910A (en) * | 2021-06-25 | 2021-10-29 | 石化盈科信息技术有限责任公司 | Account type identification method and device, computer equipment and storage medium |
CN113889192A (en) * | 2021-09-29 | 2022-01-04 | 西安热工研究院有限公司 | Single cell RNA-seq data clustering method based on deep noise reduction self-encoder |
CN113889192B (en) * | 2021-09-29 | 2024-02-27 | 西安热工研究院有限公司 | Single-cell RNA-seq data clustering method based on deep noise reduction self-encoder |
CN114139621A (en) * | 2021-11-29 | 2022-03-04 | 国家电网有限公司大数据中心 | Method, device, equipment and storage medium for determining model classification performance identification |
CN116400426B (en) * | 2023-06-06 | 2023-08-29 | 山东省煤田地质局第三勘探队 | Electromagnetic method-based data survey system |
CN116400426A (en) * | 2023-06-06 | 2023-07-07 | 山东省煤田地质局第三勘探队 | Electromagnetic method-based data survey system |
CN117077535A (en) * | 2023-08-31 | 2023-11-17 | 广东电白建设集团有限公司 | High formwork construction monitoring method based on Gaussian mixture clustering algorithm |
CN117727373A (en) * | 2023-12-01 | 2024-03-19 | 海南大学 | Sample and feature double weighting-based intelligent C-means clustering method for feature reduction |
CN117727373B (en) * | 2023-12-01 | 2024-05-31 | 海南大学 | Sample and feature double weighting-based intelligent C-means clustering method for feature reduction |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111428768A (en) | Hellinger distance-Gaussian mixture model-based clustering method | |
Yu et al. | Online fault diagnosis in industrial processes using multimodel exponential discriminant analysis algorithm | |
CN111368920B (en) | Quantum twin neural network-based classification method and face recognition method thereof | |
CN107203785A (en) | Multipath Gaussian kernel Fuzzy c-Means Clustering Algorithm | |
CN113326731A (en) | Cross-domain pedestrian re-identification algorithm based on momentum network guidance | |
CN111027636B (en) | Unsupervised feature selection method and system based on multi-label learning | |
CN115131618A (en) | Semi-supervised image classification method based on causal reasoning | |
Ververidis et al. | Information loss of the mahalanobis distance in high dimensions: Application to feature selection | |
CN105160598B (en) | Power grid service classification method based on improved EM algorithm | |
CN113222072A (en) | Lung X-ray image classification method based on K-means clustering and GAN | |
CN108921853A (en) | Image partition method based on super-pixel and clustering of immunity sparse spectrums | |
CN117131449A (en) | Data management-oriented anomaly identification method and system with propagation learning capability | |
CN105893956A (en) | Online target matching method based on multi-feature adaptive measure learning | |
CN115393631A (en) | Hyperspectral image classification method based on Bayesian layer graph convolution neural network | |
CN110175631A (en) | A kind of multiple view clustering method based on common Learning Subspaces structure and cluster oriental matrix | |
Sadeghi et al. | Deep clustering with self-supervision using pairwise data similarities | |
CN112465016A (en) | Partial multi-mark learning method based on optimal distance between two adjacent marks | |
CN115344693B (en) | Clustering method based on fusion of traditional algorithm and neural network algorithm | |
Nguyen et al. | Robust product classification with instance-dependent noise | |
CN115510964B (en) | Computer calculation method for liquid chromatograph scientific instrument | |
CN112257787B (en) | Image semi-supervised classification method based on generation type dual-condition confrontation network structure | |
CN114881172A (en) | Software vulnerability automatic classification method based on weighted word vector and neural network | |
WO2021128521A1 (en) | Automatic industry classification method and system | |
Li et al. | A BYY scale-incremental EM algorithm for Gaussian mixture learning | |
CN112906751A (en) | Method for identifying abnormal value through unsupervised learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |