CN111428768A - Hellinger distance-Gaussian mixture model-based clustering method - Google Patents

Hellinger distance-Gaussian mixture model-based clustering method Download PDF

Info

Publication number
CN111428768A
CN111428768A CN202010190288.9A CN202010190288A CN111428768A CN 111428768 A CN111428768 A CN 111428768A CN 202010190288 A CN202010190288 A CN 202010190288A CN 111428768 A CN111428768 A CN 111428768A
Authority
CN
China
Prior art keywords
mixture model
gaussian mixture
gaussian
sample
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010190288.9A
Other languages
Chinese (zh)
Inventor
郭伟
何茂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Guangdong Electronic Information Engineering Research Institute of UESTC
Original Assignee
University of Electronic Science and Technology of China
Guangdong Electronic Information Engineering Research Institute of UESTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China, Guangdong Electronic Information Engineering Research Institute of UESTC filed Critical University of Electronic Science and Technology of China
Priority to CN202010190288.9A priority Critical patent/CN111428768A/en
Publication of CN111428768A publication Critical patent/CN111428768A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0635Risk analysis of enterprise or organisation activities

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Physics & Mathematics (AREA)
  • Educational Administration (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Game Theory and Decision Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Development Economics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Evolutionary Computation (AREA)

Abstract

The invention discloses a clustering method based on a Hellinger distance-Gaussian mixture model, which is applied to the field of mechanical fault diagnosis and cluster analysis. In order to solve the problem of low identification precision of label-free data in the prior art, the method improves the clustering capability of a Gaussian mixture model, introduces a regular term which is minimized based on Hellinger distance and used for measuring the distance between probability distributions of the Gaussian mixture model in a data manifold space, and introduces a regular term which is minimized based on Hellinger distance and used for restricting the updating process of posterior probability, and gradually updates the parameters of the Gaussian mixture model by combining with a generalized expectation maximization algorithm, so that the probability of generating given data by the probability distribution of the obtained mixture model is maximized, thereby realizing automatic learning and clustering of data, accurately judging the category information of label-free data, and providing a feasible method for intelligent learning of big data.

Description

Hellinger distance-Gaussian mixture model-based clustering method
Technical Field
The invention belongs to the field of mechanical fault diagnosis and cluster analysis, and particularly relates to a data clustering technology based on a Hellinger distance-Gaussian mixture model.
Background
The upsizing and complication of modern mechanical equipment make it highly susceptible to various faults during continuous operation, and therefore, in order to ensure safe and reliable operation of the equipment, it is necessary to detect potential fault risks early through monitoring and diagnosis, avoid possible accidents and corresponding maintenance consumption, and maximize the use efficiency of the equipment. In recent years, with the proposal of the concept of "big data", data-driven intelligent fault diagnosis methods are popularized and applied. According to the method, the physical failure mechanism of the equipment does not need to be searched, the health state of the monitoring equipment can be automatically judged by learning the statistical rules and the internal characteristics of a large amount of data, and a useful tool is provided for online monitoring and fault monitoring of industrial equipment.
The intelligent fault diagnosis provides an important means for online monitoring and health prediction of large-scale equipment and rotating equipment by automatically extracting fault information implicit in large-scale monitoring data, but the existing large-scale diagnosis methods are based on the assumption that typical faults of the monitoring data are complete and state marks are clear, which is difficult to meet in engineering practice. In the actual operation process of the equipment, in order to ensure continuous and safe operation of the equipment, the equipment cannot be frequently stopped to detect faults and mark the equipment state, so that the monitored data only has little or even no marking information, and the equipment state corresponding to the related data cannot be known. Therefore, there is a need to implement accurate intelligent diagnosis of devices using unsupervised learning methods.
Unsupervised learning is a machine learning method that finds its intrinsic regularity or structure from unlabeled data, where clustering can group a given large amount of data into several categories according to their similarity or distance of features. Existing clustering methods can be divided into two categories: the hard clustering (such as K-means) and the soft clustering (such as Gaussian mixture models) can be used, the former judges that the sample can only belong to one class, and the latter divides the class to simultaneously mine the longitudinal structure (similarity) and the transverse structure (dimension reduction) of the data, so that a more accurate clustering result is obtained. The Gaussian mixture model can fit the distribution of any data by linearly combining a plurality of Gaussian distribution functions, but the existing Gaussian mixture model is influenced by factors such as parameter initialization and complex operation, and the algorithm research and clustering application are still few. Therefore, how to improve the gaussian mixture model, the model parameters are optimized by combining the expectation maximization algorithm, and the recognition capability of the model parameters on the sample class is improved.
Disclosure of Invention
In order to solve the technical problems, the invention provides a clustering method based on a Hellinger distance-Gaussian mixture model, which introduces a regularization term based on the Hellinger distance on the basis of maximizing the probability distribution of a sample, constructs an internal manifold structure of the sample through a generalized expectation maximization algorithm, and further realizes the automatic judgment of the data category. The steps are described as follows:
data feature composition sample set X to be classified is X ═ { XiI-1, …, n }, and contains n samples, each sample xiIncluding d-dimensional features.
And S1, setting parameters and initializing.
1) Setting K components in the Gaussian mixture model, and initializing Gaussian model parameters by adopting a K-mean algorithm
Figure BDA0002415624570000021
2) A regularization coefficient λ is set.
3) An update coefficient γ is set, and its initial value is set to 0.9.
4) And setting the number l of adjacent neighbors.
5) The iteration number t is initialized to 1, i.e., t equals 1.
6) The iteration end value is set to a smaller value.
S2, constructing a model optimization objective function: and defining an objective function of the parameter optimization of the Gaussian mixture model. In the parameter optimization process, Hellinger distance is introduced to calculate the closeness degree between the two distributions.
The constructed gaussian mixture model is composed of K gaussian distributions,
Figure BDA0002415624570000022
wherein, theta is (pi)111,…,πKKK) Is a parameter of the Gaussian mixture model, μkSum-sigmak(K1, …, K) is the mean and covariance of the kth gaussian, Nk(xiμkk) Is the Gaussian distribution density of the kth partial model, pikIs the corresponding mixing coefficient and satisfies
Figure BDA0002415624570000023
In order to realize data clustering, the gaussian mixture model parameters Θ are updated by iterative operations. Thus, X is defined as the observation sample set, and Z ═ ZiI is 1, …, n, and X and Z form a complete sample set, and on the basis of maximizing a complete sample log-likelihood function, a regularization term is introduced to form an optimization objective function, which is defined as follows:
Figure BDA0002415624570000024
wherein, the lambda is a regularization coefficient,
Figure BDA0002415624570000025
for the regularization term, the Hellinger distance is incorporated herein into the regularization term. Hellinger distance is typically used in order and asymptotic statistics, then probability distribution PiAnd PjThe square of the Hellinger distance between is:
Figure BDA0002415624570000031
and satisfies h (P)i,Pj) Less than or equal to 1. Regularization term
Figure BDA0002415624570000032
Is shown as
Figure BDA0002415624570000033
Wherein, P (k | x)i) And P (k | x)j) Are respectively a sample xiAnd xjThe posterior probability generated by the k-th Gaussian component, the Laplace matrix L is L ═ D-W, where the relationship between matrices D and W is
Figure BDA0002415624570000034
Figure BDA0002415624570000035
T denotes transposition.
For sample xiFrom the Hellinger distance, its l nearest neighbors, l ∈ { n-1 }. sample x in the nearest neighbor graph can be determinediSample x adjacent to itjW of the weight betweenijIs defined as:
Figure BDA0002415624570000036
wherein the content of the first and second substances,
Figure BDA0002415624570000037
represents a sample xjI neighbor sample sets.
S3, calculating the posterior probability of the sample: according to the parameters theta of the mixed modelt-1And calculating the posterior probability by adopting a generalized expectation maximization algorithm.
Obtaining a Gaussian mixture model parameter theta according to the t-1 iterationt-1Calculating posterior probability
Figure BDA0002415624570000038
On the basis, a Q function is defined by adopting a generalized expectation maximization algorithm for iterative operation of model parameters, and the Q function is expressed as
Figure BDA0002415624570000041
The objective of iterative optimization is to maximize Q (Θ, Θ) separatelyt-1) And minimizing the regularization term
Figure BDA0002415624570000042
Figure BDA0002415624570000043
S4, updating model parameters: and updating the posterior probability and the Gaussian mixture model parameters by adopting a generalized expected maximum algorithm.
First, minimize the regularization term
Figure BDA0002415624570000044
(equation (17)), applying Newton-Laprison's method to obtain the update of the posterior probability as:
Figure BDA0002415624570000045
second, maximize Q (Θ, Θ)t-1) Updated gaussian mixture model parameters Θ can be obtainedt
Figure BDA0002415624570000046
Figure BDA0002415624570000047
Figure BDA0002415624570000048
And S5, calculating a regularization likelihood function value.
Figure BDA0002415624570000049
S6, judging iteration termination:
1) if it is not
Figure BDA00024156245700000410
The update coefficient 0.9 γ → γ is set (i.e., the current update coefficient γ is multiplied by 0.9 as the update coefficient γ of the next iteration), returning to S4.
2) If it is not
Figure BDA00024156245700000411
Stopping iteration, and determining the parameters theta of the Gaussian mixture modeltOutputs the posterior probability P (k | x)i) (i ═ 1, …, n; k ═ 1, …, K); otherwise, adding 1 to the iteration number, namely t ← t +1, and returning to S3.
S7, data type judgment: for each sample, a gaussian component label K (K is 1, …, K) corresponding to the maximum posterior probability is taken as the clustering result of the sample.
The invention has the beneficial effects that: the invention provides a clustering method based on a Hellinger distance-Gaussian mixture model, which is characterized in that a data clustering algorithm is constructed by using the unsupervised learning capability of the Gaussian mixture model, each cluster is determined by one Gaussian distribution, and each data is formed by the comprehensive action of a plurality of probability clusters; the neighbor samples are defined on a data manifold structure through Hellinger distance and regularization terms, and simultaneously Gaussian distribution parameters and coefficients in the hybrid model are gradually updated by combining a generalized expectation maximization algorithm, so that the probability of generating given data by probability distribution determined by the hybrid model is the maximum, automatic learning and clustering of the data are realized, and the class information of the data without labels can be accurately judged. The method not only expands the probabilistic clustering algorithm, but also can improve the mining capability of the potential structure of the unmarked data, and can be applied to intelligent diagnosis of industrial data.
Drawings
FIG. 1 is a flow chart of the clustering method based on Hellinger distance-Gaussian mixture model of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Fig. 1 is a flow chart of the clustering method based on the Hellinger distance-gaussian mixture model according to the present invention.
The Iris data set is used in the embodiment to clarify the implementation result. The dataset is also called iris data set, and is a type of dataset for multi-variable analysis. It contains 150 data samples, each data containing 4 attributes (features), namely calyx length and width, petal length and width; the categories to which the data belongs are of 3 types: irises Iris (Iris Setosa), Iris versicolor Iris (Iris versicolor) and irises Virginica (Iris Virginica), 50 data per category.
The parameter settings and initial value settings of the gaussian mixture model are as follows:
1) the Gaussian component K is 3, and a K mean value algorithm is adopted to initialize model parameters
Figure BDA0002415624570000051
2) The regularization coefficient λ is 0.1.
3) The initial value of the update coefficient is γ equal to 0.9.
4) The number of neighbors l is 2.
5) Iteration end value of 10-5
According to the method, the sample is input into the Gaussian mixture model, and the model parameter value is updated through iterative operation until the termination condition is met. For each sample to be classified, the model outputs posterior probability values obtained by calculating 1 st, 2 nd and 3 rd Gaussian components when iteration is terminated, and the label of the Gaussian component corresponding to the maximum value is taken as the class information of the sample. For example, for sample x1The posterior probability values of the outputs of the 1 st, 2 nd and 3 rd Gaussian components in the mixed model are (2.66 × 10)-40,7.98×10-281), then the cluster label is (0,0, 1); the true class of this sample is Iris tectorum (Iris Setosa), and correspondingly, the true label is (1,0, 0). The corresponding relation between the clustering label and the real label can be determined by adopting a Kuhn-Munkres algorithm on all samples, the cluster obtained by the 3 rd Gaussian component corresponds to the 1 st Iris pallida, and therefore, the sample x is subjected to1The classification result of (2) is correct.
By adopting the method, the accuracy of the clustering analysis can be checked by contrasting the class information of the sample given by the Iris data set, namely the identification accuracy is identified, and the calculation formula is as follows:
Figure BDA0002415624570000061
the results of the recognition accuracy of the two clustering models with different sample characteristics are compared in table 1. One of the models is a clustering method based on a Hellinger distance-Gaussian mixture model, which is provided by the invention and is abbreviated as HGMM; another approach is the conventional gaussian mixture model (no regularization term based on Hellinger distance is introduced). Meanwhile, considering the influence of the sensitivity and the correlation of the sample features on the cluster analysis, the second row in table 1 lists the cluster analysis results of all the sample features (4 features) selected, and the third row and the fourth row respectively list the highest and lowest analysis results of some features (3 features selected).
TABLE 1 comparison of classification correctness of two clustering models using different sample characteristics
Sample characterization HGMM GMM
1,2,3,4 98% 77%
1,3,4 97% 61%
1,2,4 88% 61%
Note: HGMM: hellinger distance-Gaussian mixture model (Hellinger distance Gaussian mixture model)
GMM: gaussian mixture Model (Gaussian Mixed Model)
"1": attribute value "length of calyx" (sepal length, unit: cm)
"2": attribute value "length of calyx" (sepal width, unit: cm)
"3": attribute value "petal length" (unit: centimeter)
"4": attribute value "petal width" (Total width, unit: centimeter)
The table shows that the clustering capability of the Gaussian mixture model can be remarkably improved by adopting the improved method provided by the invention, and the recognition accuracy of the label-free data is 98% at most and 88% at least; meanwhile, the method provided by the invention can obtain higher identification accuracy without reducing multidimensional characteristics, can obtain an intelligent classification model through unsupervised learning, and can be further popularized and applied to unsupervised learning of other data.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention and are to be construed as being without limitation to such specifically recited embodiments and examples. Various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.

Claims (6)

1. The data clustering method based on the Hellinger distance-Gaussian mixture model is characterized by comprising the following steps of:
s1, parameter setting and initialization: setting initial values of parameters of a Gaussian mixture model, and initial values and set values of related parameters, wherein the initial values of the parameters of the Gaussian mixture model comprise: the number K of Gaussian distributions in the mixed model, and the initial value of each Gaussian distribution parameter, namely the mean value
Figure FDA0002415624560000011
Sum covariance
Figure FDA0002415624560000012
And the mixing coefficient corresponding to the Gaussian distribution
Figure FDA0002415624560000013
And satisfy
Figure FDA0002415624560000014
The initial value of the Gaussian mixture model parameter is
Figure FDA0002415624560000015
Setting other parameter initial values and set values, namely a regularization coefficient lambda, an initial value of an updating coefficient gamma, a neighbor number l and an iteration termination value, and initializing an iteration sequence number t to 1, namely t is 1;
s2, constructing a model optimization objective function: defining an objective function for optimizing parameters of the Gaussian mixture model, and introducing a regularization term to update the parameters of the Gaussian mixture model, wherein the approximation degree between two Gaussian distributions is calculated by using a Hellinger distance;
s3, calculating the posterior probability of the sample: calculating the posterior probability of the sample according to the Gaussian mixture model parameters obtained in the previous iteration;
s4, updating Gaussian mixture model parameters: updating posterior probability and Gaussian mixture model parameters by adopting a generalized expectation maximum algorithm;
s5, calculating a regularization likelihood function value;
s6, judging iteration termination: comparing regularization likelihood function values before and after updating of the Gaussian mixture model parameters, and continuing the iteration process of the steps S3-S5 until an iteration termination condition is met;
s7, data type judgment: and for each sample, taking a Gaussian component label corresponding to the maximum posterior probability as the clustering result of the sample.
2. The Hellinger distance-gaussian mixture model-based clustering method according to claim 1, wherein the step S2 is implemented by:
the gaussian mixture model to be optimized is composed of K gaussian distributions,
Figure FDA0002415624560000016
wherein, theta is (pi)111,…,πKKK) Is a parameter of the Gaussian mixture model, μkSum-sigmakIs the mean and covariance of the kth Gaussian distribution, Nk(xikk) For a corresponding Gaussian distribution density, pikIs the corresponding mixing coefficient and satisfies
Figure FDA0002415624560000017
xiRepresents one sample in the sample set X, i ═ 1, …, n, each sample XiIncluding d-dimensional features;
to achieve data clustering, the gaussian mixture model parameters Θ are updated by iterative operations, so X is defined as the observation sample set, and Z ═ ZiI is 1, …, n, and X and Z form a complete sample set, and on the basis of maximizing the log-likelihood function of the complete sample set, a regularization term is introduced to form an optimization objective function, which is defined as follows:
Figure FDA0002415624560000021
wherein, the lambda is a regularization coefficient,
Figure FDA0002415624560000022
for the regularization term, where the Hellinger distance is introduced into the regularization term, then the probability distribution PiAnd PjH (P) of Hellinger distance betweeni,Pj) The square of (d) is:
Figure FDA0002415624560000023
and satisfies h (P)i,Pj) 1 ≦ regular term
Figure FDA0002415624560000024
Is shown as
Figure FDA0002415624560000025
Wherein, P (k | x)i) And P (k | x)j) Are respectively a sample xiAnd xjThe posterior probability generated by the kth gaussian component, laplacian matrix L may be represented as L ═ D-W, where the relationship of matrices D and W is
Figure FDA0002415624560000026
Figure FDA0002415624560000027
T represents transposition;
for sample xiFrom Hellinger distance, its l nearest neighbors, l ∈ { n-1}, can be determined, sample x in the nearest neighbor graphiSample x adjacent to itjW of the weight betweenijIs defined as:
Figure FDA0002415624560000028
wherein the content of the first and second substances,
Figure FDA0002415624560000029
and
Figure FDA00024156245600000210
respectively represent samples xiAnd xjI neighbor sample sets.
3. The Hellinger distance-gaussian mixture model-based clustering method according to claim 1, wherein the step S3 is implemented by:
obtaining a Gaussian mixture model parameter theta according to the t-1 iterationt-1The posterior probability is calculated as:
Figure FDA0002415624560000031
on the basis, a Q function is defined by adopting a generalized expectation maximization algorithm and is used for iterative operation of model parameters, and the method is represented as follows:
Figure FDA0002415624560000032
the objective of iterative optimization is to maximize Q (Θ, Θ) separatelyt-1) And minimizing the regularization term
Figure FDA0002415624560000033
Figure FDA0002415624560000034
4. The Hellinger distance-gaussian mixture model-based clustering method according to claim 1, wherein the posterior probability and the gaussian mixture model parameters are respectively updated by adopting a generalized expectation-maximization algorithm in the step S4, and the method is implemented in two steps:
first, minimize the regularization term RkThe update of the posterior probability obtained by applying the Newton-Laplacian method is as follows:
Figure FDA0002415624560000035
second, maximize Q (Θ, Θ)t-1) Obtaining updated Gaussian mixture model parameters thetat
Figure FDA0002415624560000036
Figure FDA0002415624560000037
Figure FDA0002415624560000038
5. The Hellinger distance-gaussian mixture model-based clustering method according to claim 1, wherein the regularized likelihood function values need to be calculated in step S5, and the corresponding calculation formula is:
Figure FDA0002415624560000041
6. the Hellinger distance-Gaussian mixture model-based clustering method according to claim 1, wherein the step S6 specifically comprises: comparing the regularization likelihood function values calculated in step S5, and determining the flow direction of the program:
1) if it is not
Figure FDA0002415624560000042
Then the update coefficient 0.9 γ → γ is set, returning to S4;
2) if it is not
Figure FDA0002415624560000043
Stopping iteration, and determining the parameters theta of the Gaussian mixture modeltOutputs the posterior probability P (k | x)i) (i ═ 1, …, n; k ═ 1, …, K); otherwise, adding 1 to the iteration number, namely t ← t +1, and returning to S3.
CN202010190288.9A 2020-03-18 2020-03-18 Hellinger distance-Gaussian mixture model-based clustering method Pending CN111428768A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010190288.9A CN111428768A (en) 2020-03-18 2020-03-18 Hellinger distance-Gaussian mixture model-based clustering method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010190288.9A CN111428768A (en) 2020-03-18 2020-03-18 Hellinger distance-Gaussian mixture model-based clustering method

Publications (1)

Publication Number Publication Date
CN111428768A true CN111428768A (en) 2020-07-17

Family

ID=71546416

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010190288.9A Pending CN111428768A (en) 2020-03-18 2020-03-18 Hellinger distance-Gaussian mixture model-based clustering method

Country Status (1)

Country Link
CN (1) CN111428768A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111898954A (en) * 2020-07-31 2020-11-06 沙师弟(重庆)网络科技有限公司 Vehicle matching method based on improved Gaussian mixture model clustering
CN112428263A (en) * 2020-10-16 2021-03-02 北京理工大学 Mechanical arm control method and device and cluster model training method
CN113243891A (en) * 2021-05-31 2021-08-13 平安科技(深圳)有限公司 Mild cognitive impairment recognition method and device, computer equipment and storage medium
CN113312851A (en) * 2021-06-16 2021-08-27 华电山东新能源有限公司 Early warning method for temperature abnormity of main bearing of wind driven generator
CN113569910A (en) * 2021-06-25 2021-10-29 石化盈科信息技术有限责任公司 Account type identification method and device, computer equipment and storage medium
CN113889192A (en) * 2021-09-29 2022-01-04 西安热工研究院有限公司 Single cell RNA-seq data clustering method based on deep noise reduction self-encoder
CN114139621A (en) * 2021-11-29 2022-03-04 国家电网有限公司大数据中心 Method, device, equipment and storage medium for determining model classification performance identification
CN116400426A (en) * 2023-06-06 2023-07-07 山东省煤田地质局第三勘探队 Electromagnetic method-based data survey system
CN117077535A (en) * 2023-08-31 2023-11-17 广东电白建设集团有限公司 High formwork construction monitoring method based on Gaussian mixture clustering algorithm
CN117727373A (en) * 2023-12-01 2024-03-19 海南大学 Sample and feature double weighting-based intelligent C-means clustering method for feature reduction

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111898954B (en) * 2020-07-31 2024-01-12 沙师弟(重庆)网络科技有限公司 Vehicle matching method based on improved Gaussian mixture model clustering
CN111898954A (en) * 2020-07-31 2020-11-06 沙师弟(重庆)网络科技有限公司 Vehicle matching method based on improved Gaussian mixture model clustering
CN112428263A (en) * 2020-10-16 2021-03-02 北京理工大学 Mechanical arm control method and device and cluster model training method
CN113243891A (en) * 2021-05-31 2021-08-13 平安科技(深圳)有限公司 Mild cognitive impairment recognition method and device, computer equipment and storage medium
CN113312851A (en) * 2021-06-16 2021-08-27 华电山东新能源有限公司 Early warning method for temperature abnormity of main bearing of wind driven generator
CN113569910A (en) * 2021-06-25 2021-10-29 石化盈科信息技术有限责任公司 Account type identification method and device, computer equipment and storage medium
CN113889192A (en) * 2021-09-29 2022-01-04 西安热工研究院有限公司 Single cell RNA-seq data clustering method based on deep noise reduction self-encoder
CN113889192B (en) * 2021-09-29 2024-02-27 西安热工研究院有限公司 Single-cell RNA-seq data clustering method based on deep noise reduction self-encoder
CN114139621A (en) * 2021-11-29 2022-03-04 国家电网有限公司大数据中心 Method, device, equipment and storage medium for determining model classification performance identification
CN116400426B (en) * 2023-06-06 2023-08-29 山东省煤田地质局第三勘探队 Electromagnetic method-based data survey system
CN116400426A (en) * 2023-06-06 2023-07-07 山东省煤田地质局第三勘探队 Electromagnetic method-based data survey system
CN117077535A (en) * 2023-08-31 2023-11-17 广东电白建设集团有限公司 High formwork construction monitoring method based on Gaussian mixture clustering algorithm
CN117727373A (en) * 2023-12-01 2024-03-19 海南大学 Sample and feature double weighting-based intelligent C-means clustering method for feature reduction
CN117727373B (en) * 2023-12-01 2024-05-31 海南大学 Sample and feature double weighting-based intelligent C-means clustering method for feature reduction

Similar Documents

Publication Publication Date Title
CN111428768A (en) Hellinger distance-Gaussian mixture model-based clustering method
Yu et al. Online fault diagnosis in industrial processes using multimodel exponential discriminant analysis algorithm
CN111368920B (en) Quantum twin neural network-based classification method and face recognition method thereof
CN107203785A (en) Multipath Gaussian kernel Fuzzy c-Means Clustering Algorithm
CN113326731A (en) Cross-domain pedestrian re-identification algorithm based on momentum network guidance
CN111027636B (en) Unsupervised feature selection method and system based on multi-label learning
CN115131618A (en) Semi-supervised image classification method based on causal reasoning
Ververidis et al. Information loss of the mahalanobis distance in high dimensions: Application to feature selection
CN105160598B (en) Power grid service classification method based on improved EM algorithm
CN113222072A (en) Lung X-ray image classification method based on K-means clustering and GAN
CN108921853A (en) Image partition method based on super-pixel and clustering of immunity sparse spectrums
CN117131449A (en) Data management-oriented anomaly identification method and system with propagation learning capability
CN105893956A (en) Online target matching method based on multi-feature adaptive measure learning
CN115393631A (en) Hyperspectral image classification method based on Bayesian layer graph convolution neural network
CN110175631A (en) A kind of multiple view clustering method based on common Learning Subspaces structure and cluster oriental matrix
Sadeghi et al. Deep clustering with self-supervision using pairwise data similarities
CN112465016A (en) Partial multi-mark learning method based on optimal distance between two adjacent marks
CN115344693B (en) Clustering method based on fusion of traditional algorithm and neural network algorithm
Nguyen et al. Robust product classification with instance-dependent noise
CN115510964B (en) Computer calculation method for liquid chromatograph scientific instrument
CN112257787B (en) Image semi-supervised classification method based on generation type dual-condition confrontation network structure
CN114881172A (en) Software vulnerability automatic classification method based on weighted word vector and neural network
WO2021128521A1 (en) Automatic industry classification method and system
Li et al. A BYY scale-incremental EM algorithm for Gaussian mixture learning
CN112906751A (en) Method for identifying abnormal value through unsupervised learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination