CN107330458A - Fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers - Google Patents

Fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers

Info

Publication number
CN107330458A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710503214.4A
Other languages
Chinese (zh)
Inventor
李学刚
狄岚
李斌
李通明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changzhou College of Information Technology CCIT
Original Assignee
Changzhou College of Information Technology CCIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2017-06-27
Filing date
2017-06-27
Publication date
2017-11-07
Application filed by Changzhou College of Information Technology CCIT
Priority to CN201710503214.4A
Publication of CN107330458A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a fuzzy C-means (FCM) clustering method with minimum-variance-optimized initial cluster centers, belonging to the technical field of data mining and pattern recognition. The method comprises the following steps: inputting a data set and the distance relations between its sample points for clustering; clustering the target data set with the clustering method to obtain cluster labels; and evaluating the cluster labels obtained from clustering against the original labels according to evaluation indices. The invention addresses the problem that the clustering result of fuzzy C-means depends strongly on its initial cluster centers, so that an optimal solution cannot be guaranteed. The initial cluster centers are selected before the FCM algorithm is run: using the sample variance as heuristic information and a neighborhood radius around each sample, the K minimum-variance sample points located in different regions are chosen as the initial cluster centers. The algorithm requires no parameters to be set.

Description

Fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers
Technical field
The present invention relates to a clustering method for data sets, and in particular to a fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers. It belongs to the technical field of data mining and pattern recognition.
Background
Conventional FCM algorithms choose their initial cluster centers at random, so the clustering result tends to be unstable and the cluster centers may even converge to a local extremum. To solve this problem, the initial cluster centers can be optimized by a minimum-variance criterion based on the compactness of the sample distribution. The initialization algorithm uses the spatial distribution of the samples: it computes the variance of each sample to obtain compactness information, then selects the minimum-variance sample point, together with the sample points within a certain range of it, as an initial cluster center, which yields an improved fuzzy clustering algorithm.
FCM uses an iterative descent algorithm, which is a local search procedure. It is therefore sensitive to the initial cluster centers, and the final result is not necessarily the globally optimal partition. If good cluster centers can be chosen and the samples are assigned to these initial cluster centers by the nearest-neighbor rule to produce an initial clustering, the clustering result can reach the global optimum. Therefore, based on the principle that the sample at the center of each cluster has minimum variance, an FCM clustering algorithm with initial cluster centers optimized by minimum sample variance is proposed.
Summary of the invention
The main object of the present invention is to provide a fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers, solving the problem that an optimal solution cannot be obtained because the initial cluster centers are uncertain.
The object of the present invention can be achieved by the following technical solution:
A fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers, comprising the following steps:
Step S1: input the data set and the distance relations between its sample points for clustering;
Step S2: cluster the target data set with the clustering method to obtain cluster labels;
Step S3: evaluate the cluster labels obtained from clustering against the original labels according to evaluation indices.
Further, in step S1, the input data sets are artificial (simulated) data sets and UCI data sets, and the number of cluster categories is determined by the respective artificial data set or UCI data set.
Further, in step S2, the target data set is clustered and cluster labels are set for the target data set and its pixels. Setting the cluster labels comprises:
Step S21: set labels according to the actual positions of the samples in the target data set, and set the number of labels for the artificial data sets and UCI data sets;
Step S22: apply the FCM algorithm to the data set formed by the labeled data to obtain the membership matrix U and the cluster centers V after clustering.
Further, step S22 specifically comprises the following steps:
Step S221: first determine the number of cluster categories c;
Step S222: set the maximum number of iterations Max_t and the maximum error threshold ε;
Step S223: take the membership matrix U and the cluster centers V obtained by FCM clustering as the initial membership and cluster centers of the FCM algorithm, and set the initial iteration count t = 1;
Step S224: update the membership matrix and the cluster center matrix using the iterative optimization formulas.
Further, in step S224, the iterative optimization formulas are:
$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \frac{d_{ij}(x_j, v_i)}{d_{kj}(x_j, v_k)} \right)^{2/(m-1)}}$$
$$v_i = \frac{\sum_{j=1}^{n} (u_{ij})^m x_j}{\sum_{j=1}^{n} (u_{ij})^m}$$
where u is the membership matrix, d is the distance between a sample and a cluster center, v is a cluster center, m is the fuzzy exponent, and x is a sample;
The iteration continues until t reaches the maximum number of iterations Max_t or until $\|U^{(t+1)} - U^{(t)}\|_{Frobenius} < \varepsilon$, at which point the method terminates and the current U and V are the optimal solution of the method.
Further, obtaining the cluster centers V comprises the following steps:
Step S2231: compute the variance of every sample $x_i$ in the sample set, find the sample $x_i^1$ with minimum variance in the data set W, and set $x_i^1$ as the initial cluster center $v_1$ of the first cluster; compute $r_m$, half the root-mean-square distance of the data-set samples, and let:
$$c = 1,$$
$$W_1 = \{ x_j \mid d(x_j, x_i^1) < r_m,\ j = 1, 2, \ldots, n \},$$
$$W = W - W_1;$$
Step S2232: if c < K, let c = c + 1, find the sample $x_i^c$ with minimum variance in the data set W, set it as the initial cluster center $v_c$ of the c-th cluster, and let:
$$W_c = \{ x_j \mid d(x_j, x_i^c) < \mathrm{cmean},\ j = 1, 2, \ldots, n,\ x_j \notin W_r,\ r = 1, 2, \ldots, c-1 \},$$
$$W = W - W_c;$$
Otherwise, the K initial cluster centers $V_0 = [v_1, v_2, \ldots, v_K]$ have been found.
Further, the FCM algorithm comprises the following steps:
Step S2233: set the fuzzy exponent m (1 ≤ m); take the K initial cluster centers $V_0 = [v_1, v_2, \ldots, v_K]$ initialized in step S2231; set the convergence precision ε > 0 and the maximum number of iterations $t_{max}$; let the iteration count k = 0;
Step S2234: compute $U^{(k+1)}$:
$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \frac{d_{ij}(x_j, v_i)}{d_{kj}(x_j, v_k)} \right)^{2/(m-1)}}$$
Step S2235: compute $V^{(k+1)}$:
$$v_i = \frac{\sum_{j=1}^{n} (u_{ij})^m x_j}{\sum_{j=1}^{n} (u_{ij})^m}$$
where u is the membership matrix, d is the distance between a sample and a cluster center, v is a cluster center, m is the fuzzy exponent, and x is a sample;
Step S2236: if $\|V^{(k)} - V^{(k+1)}\| \le \varepsilon$, stop iterating; otherwise let k = k + 1 and return to step S2234;
Step S2237: when the algorithm terminates, the resulting membership matrix U and cluster centers V are the optimal clustering solution.
Further, the labels obtained from clustering are evaluated against the original labels according to evaluation indices; the performance evaluation indices include the NMI (normalized mutual information) index and the Rand index (RandIndex).
Further, the NMI index is:
$$\mathrm{NMI} = \frac{\sum_{i=1}^{c} \sum_{j=1}^{c} N_{i,j} \log \frac{N \times N_{i,j}}{N_i \times N_j}}{\sqrt{\left( \sum_{i=1}^{c} N_i \log \frac{N_i}{N} \right) \times \left( \sum_{j=1}^{c} N_j \log \frac{N_j}{N} \right)}}$$
where $N_{i,j}$ denotes the degree of agreement between the i-th cluster and class j (the number of samples they share);
N denotes the total number of samples;
$N_i$ denotes the number of samples in the i-th cluster;
$N_j$ denotes the number of samples in the j-th class.
Further, the Rand index is:
$$\mathrm{RI} = \frac{f_{00} + f_{11}}{N(N-1)/2}$$
where $f_{00}$ denotes the number of pairs of data points that have different class labels and belong to different clusters;
$f_{11}$ denotes the number of pairs of data points that have the same class label and belong to the same cluster;
N denotes the total number of samples.
Advantageous effects of the present invention: the fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers provided by the invention addresses the problem that the clustering result of fuzzy C-means depends strongly on its initial cluster centers, so that an optimal solution cannot be guaranteed. The invention selects the initial cluster centers before running the FCM algorithm and thereby proposes a new C-means clustering method with minimum-variance-optimized initial cluster centers. The FCM initial cluster centers are chosen using the sample variance as heuristic information and a neighborhood radius around each sample: the K minimum-variance sample points located in different regions are selected as the initial cluster centers. The algorithm requires no parameters to be set.
Brief description of the drawings
Fig. 1 is a schematic flow chart of a preferred embodiment of the fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers according to the present invention.
Detailed description of the embodiment
To make the technical solution of the present invention clearer to those skilled in the art, the invention is described in further detail below with reference to an embodiment and the accompanying drawing, but implementations of the invention are not limited thereto.
As shown in Fig. 1, the fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers provided by this embodiment comprises the following steps:
Step S1: input the data set and the distance relations between its sample points for clustering;
Step S2: cluster the target data set with the clustering method to obtain cluster labels;
Step S3: evaluate the cluster labels obtained from clustering against the original labels according to evaluation indices.
Further, in this embodiment, in step S1, the input data sets are artificial (simulated) data sets and UCI data sets, and the number of cluster categories is determined by the respective artificial data set or UCI data set.
Further, in this embodiment, in step S2, the target data set is clustered and cluster labels are set for the target data set and its pixels. Setting the cluster labels comprises:
Step S21: set labels according to the actual positions of the samples in the target data set, and set the number of labels for the artificial data sets and UCI data sets;
Step S22: apply the FCM algorithm to the data set formed by the labeled data to obtain the membership matrix U and the cluster centers V after clustering.
Further, in this embodiment, step S22 specifically comprises the following steps:
Step S221: first determine the number of cluster categories c;
Step S222: set the maximum number of iterations Max_t and the maximum error threshold ε;
Step S223: take the membership matrix U and the cluster centers V obtained by FCM clustering as the initial membership and cluster centers of the FCM algorithm, and set the initial iteration count t = 1;
Step S224: update the membership matrix and the cluster center matrix using the iterative optimization formulas.
Further, in this embodiment, in step S224, the iterative optimization formulas are:
$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \frac{d_{ij}(x_j, v_i)}{d_{kj}(x_j, v_k)} \right)^{2/(m-1)}}$$
$$v_i = \frac{\sum_{j=1}^{n} (u_{ij})^m x_j}{\sum_{j=1}^{n} (u_{ij})^m}$$
where u is the membership matrix, d is the distance between a sample and a cluster center, v is a cluster center, m is the fuzzy exponent, and x is a sample;
The iteration continues until t reaches the maximum number of iterations Max_t or until $\|U^{(t+1)} - U^{(t)}\|_{Frobenius} < \varepsilon$, at which point the method terminates and the current U and V are the optimal solution of the method.
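The two update formulas of step S224 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function names `update_membership` and `update_centers`, the use of Euclidean distance, the array layout (U has one row per cluster and one column per sample), and the small `eps` guard against zero distances are all introduced here for illustration.

```python
import numpy as np

def update_membership(X, V, m=2.0, eps=1e-12):
    """Membership update: u_ij = 1 / sum_k (d(x_j, v_i) / d(x_j, v_k))^(2/(m-1)).

    X has shape (n, p), V has shape (c, p); the result U has shape (c, n).
    """
    # d[i, j] = Euclidean distance between center v_i and sample x_j (assumed metric).
    d = np.linalg.norm(V[:, None, :] - X[None, :, :], axis=2)
    d = np.maximum(d, eps)  # hypothetical guard: a sample may coincide with a center
    # ratio[i, k, j] = d[i, j] / d[k, j]; summing over k gives the denominator.
    ratio = d[:, None, :] / d[None, :, :]
    return 1.0 / np.sum(ratio ** (2.0 / (m - 1.0)), axis=1)

def update_centers(X, U, m=2.0):
    """Center update: v_i = sum_j u_ij^m x_j / sum_j u_ij^m."""
    W = U ** m
    return (W @ X) / W.sum(axis=1, keepdims=True)
```

With these two functions, the stopping test on the membership matrix can be written as `np.linalg.norm(U_new - U) < eps_stop`, since `np.linalg.norm` applied to a 2-D array without an `ord` argument returns the Frobenius norm.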
Further, in this embodiment, obtaining the cluster centers V comprises the following steps:
Step S2231: compute the variance of every sample $x_i$ in the sample set, find the sample $x_i^1$ with minimum variance in the data set W, and set $x_i^1$ as the initial cluster center $v_1$ of the first cluster; compute $r_m$, half the root-mean-square distance of the data-set samples, and let:
$$c = 1,$$
$$W_1 = \{ x_j \mid d(x_j, x_i^1) < r_m,\ j = 1, 2, \ldots, n \},$$
$$W = W - W_1;$$
Step S2232: if c < K, let c = c + 1, find the sample $x_i^c$ with minimum variance in the data set W, set it as the initial cluster center $v_c$ of the c-th cluster, and let:
$$W_c = \{ x_j \mid d(x_j, x_i^c) < \mathrm{cmean},\ j = 1, 2, \ldots, n,\ x_j \notin W_r,\ r = 1, 2, \ldots, c-1 \},$$
$$W = W - W_c;$$
Otherwise, the K initial cluster centers $V_0 = [v_1, v_2, \ldots, v_K]$ have been found.
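The selection procedure of steps S2231 and S2232 could look roughly like the sketch below. Several points are assumptions, because the text does not spell them out: the "variance of a sample" is taken here to be the variance of its Euclidean distances to all samples, the root-mean-square distance is taken over all pairs of distinct samples, and a single radius is used in every round (the text writes $r_m$ for the first round and an undefined `cmean` for later rounds). The function name `min_variance_centers` is hypothetical.

```python
import numpy as np

def min_variance_centers(X, K, radius=None):
    """Pick up to K initial centers: repeatedly take the minimum-variance sample
    left in W, then remove every sample closer to it than `radius` from W."""
    n = X.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    # Assumed definition: variance of the distances from x_i to all samples.
    var = ((D - D.mean(axis=1, keepdims=True)) ** 2).sum(axis=1) / (n - 1)
    if radius is None:
        # Assumed reading of r_m: half the RMS distance over pairs of distinct samples.
        radius = 0.5 * np.sqrt((D ** 2).sum() / (n * (n - 1)))
    remaining = np.ones(n, dtype=bool)  # boolean mask for the shrinking set W
    chosen = []
    while len(chosen) < K and remaining.any():
        idx = np.flatnonzero(remaining)
        i = idx[np.argmin(var[idx])]    # minimum-variance sample still in W
        chosen.append(i)
        remaining &= D[i] >= radius     # W = W - W_c
    return X[chosen]
```

If W becomes empty before K samples have been chosen, this sketch returns fewer than K centers; how the method should behave in that case is not stated in the text.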
Further, in this embodiment, the FCM algorithm comprises the following steps:
Step S2233: set the fuzzy exponent m (1 ≤ m); take the K initial cluster centers $V_0 = [v_1, v_2, \ldots, v_K]$ initialized in step S2231; set the convergence precision ε > 0 and the maximum number of iterations $t_{max}$; let the iteration count k = 0;
Step S2234: compute $U^{(k+1)}$:
$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \frac{d_{ij}(x_j, v_i)}{d_{kj}(x_j, v_k)} \right)^{2/(m-1)}}$$
Step S2235: compute $V^{(k+1)}$:
$$v_i = \frac{\sum_{j=1}^{n} (u_{ij})^m x_j}{\sum_{j=1}^{n} (u_{ij})^m}$$
where u is the membership matrix, d is the distance between a sample and a cluster center, v is a cluster center, m is the fuzzy exponent, and x is a sample;
Step S2236: if $\|V^{(k)} - V^{(k+1)}\| \le \varepsilon$, stop iterating; otherwise let k = k + 1 and return to step S2234;
Step S2237: when the algorithm terminates, the resulting membership matrix U and cluster centers V are the optimal clustering solution.
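Steps S2233 to S2237 amount to a loop that alternates the membership and center updates, starting from given initial centers, until the centers stop moving. The sketch below is one possible reading, with assumptions labeled: it uses Euclidean distance, requires m > 1 (the exponent 2/(m-1) is undefined at m = 1), and computes the membership in the algebraically equivalent form $u_{ij} = d_{ij}^{-2/(m-1)} / \sum_k d_{kj}^{-2/(m-1)}$. The function name `fcm` and its default values are hypothetical.

```python
import numpy as np

def fcm(X, V0, m=2.0, eps=1e-5, t_max=100):
    """Fuzzy C-means from given initial centers V0; returns (U, V).

    X: (n, p) samples, V0: (c, p) initial centers, U: (c, n) memberships.
    """
    V = np.asarray(V0, dtype=float)
    for _ in range(t_max):
        # Distances between centers and samples, floored to avoid division by zero.
        d = np.maximum(np.linalg.norm(V[:, None, :] - X[None, :, :], axis=2), 1e-12)
        # Membership update (step S2234), written in normalized inverse-distance form.
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=0, keepdims=True)
        # Center update (step S2235).
        W = U ** m
        V_new = (W @ X) / W.sum(axis=1, keepdims=True)
        # Stopping rule on the change of the centers (step S2236).
        converged = np.linalg.norm(V_new - V) <= eps
        V = V_new
        if converged:
            break
    return U, V
```

A hard label for each sample can be read off as `U.argmax(axis=0)`, which is the form needed when the result is compared with the original labels.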
Further, the labels obtained from clustering are evaluated against the original labels according to evaluation indices; the performance evaluation indices include the NMI (normalized mutual information) index and the Rand index (RandIndex).
Further, in this embodiment, the NMI index is:
$$\mathrm{NMI} = \frac{\sum_{i=1}^{c} \sum_{j=1}^{c} N_{i,j} \log \frac{N \times N_{i,j}}{N_i \times N_j}}{\sqrt{\left( \sum_{i=1}^{c} N_i \log \frac{N_i}{N} \right) \times \left( \sum_{j=1}^{c} N_j \log \frac{N_j}{N} \right)}}$$
where $N_{i,j}$ denotes the degree of agreement between the i-th cluster and class j (the number of samples they share);
N denotes the total number of samples;
$N_i$ denotes the number of samples in the i-th cluster;
$N_j$ denotes the number of samples in the j-th class.
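The NMI formula can be evaluated directly from the contingency table of cluster labels against class labels. The sketch below is illustrative only: the function name `nmi` is hypothetical, $N_{i,j}$ is taken to be the count of samples shared by cluster i and class j, empty cells are skipped (treating 0 log 0 as 0), and returning 0.0 when either labeling has a single group is a convention chosen here, not something the text specifies.

```python
import numpy as np

def nmi(labels_true, labels_pred):
    """NMI computed from the cluster-by-class contingency table."""
    _, t = np.unique(np.asarray(labels_true).ravel(), return_inverse=True)
    _, p = np.unique(np.asarray(labels_pred).ravel(), return_inverse=True)
    n = t.size
    table = np.zeros((p.max() + 1, t.max() + 1))
    np.add.at(table, (p, t), 1)          # table[i, j] = N_ij
    n_i = table.sum(axis=1)              # samples per cluster
    n_j = table.sum(axis=0)              # samples per class
    nz = table > 0                       # skip empty cells (0 * log 0 := 0)
    expected = np.outer(n_i, n_j)
    num = np.sum(table[nz] * np.log(n * table[nz] / expected[nz]))
    den = np.sqrt(np.sum(n_i * np.log(n_i / n)) * np.sum(n_j * np.log(n_j / n)))
    # Assumed convention: report 0.0 when a labeling has only one group.
    return float(num / den) if den > 0 else 0.0
```

Because the index depends only on the contingency table, it is unchanged when cluster labels are permuted, which matters here since FCM cluster numbers carry no meaning of their own.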
Further, in this embodiment, the Rand index is:
$$\mathrm{RI} = \frac{f_{00} + f_{11}}{N(N-1)/2}$$
where $f_{00}$ denotes the number of pairs of data points that have different class labels and belong to different clusters;
$f_{11}$ denotes the number of pairs of data points that have the same class label and belong to the same cluster;
N denotes the total number of samples.
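As a small worked illustration of the Rand index, the sketch below counts agreeing pairs by brute force. The function name `rand_index` is hypothetical, and the quadratic pair loop is chosen for clarity rather than speed.

```python
from itertools import combinations

def rand_index(labels_true, labels_pred):
    """RI = (f00 + f11) / (N (N - 1) / 2), counted over all pairs of samples."""
    n = len(labels_true)
    agree = 0
    for a, b in combinations(range(n), 2):
        same_class = labels_true[a] == labels_true[b]
        same_cluster = labels_pred[a] == labels_pred[b]
        # f11: same class and same cluster; f00: different class and different cluster.
        if same_class == same_cluster:
            agree += 1
    return agree / (n * (n - 1) / 2)
```

For example, with class labels `[0, 0, 1, 1]` and cluster labels `[0, 1, 0, 1]`, only two of the six pairs agree (both counted in $f_{00}$), so the index is 2/6.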
Further, in this embodiment, to verify the effectiveness of the C-means clustering method with minimum-variance-optimized initial cluster centers proposed in this embodiment, the experiments are divided into three parts, using noise-free simulated data sets, data sets with noise and outliers, and real UCI data sets, respectively. The experimental results of the method of the present invention (named in this passage as the large-center-margin kernel possibilistic C-means clustering method) are compared with those of the fuzzy c-means clustering algorithm, the possibilistic c-means (PCM) clustering algorithm, the kernel-based fuzzy C-means clustering algorithm and the kernel-based possibilistic C-means clustering algorithm. The comparison shows that the present invention improves both the clustering performance on data sets with blurred boundaries and the robustness to noise.
In summary, in this embodiment, the fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers provided by this embodiment addresses the problem that the clustering result of fuzzy C-means depends strongly on its initial cluster centers, so that an optimal solution cannot be guaranteed. The invention selects the initial cluster centers before running the FCM algorithm and thereby proposes a new C-means clustering method with minimum-variance-optimized initial cluster centers. The FCM initial cluster centers are chosen using the sample variance as heuristic information and a neighborhood radius around each sample: the K minimum-variance sample points located in different regions are selected as the initial cluster centers. The algorithm requires no parameters to be set.
The above is only a further embodiment of the present invention, but the scope of protection of the invention is not limited thereto. Any equivalent substitution or modification made by a person skilled in the art, within the scope disclosed by the invention and according to its technical solution and concept, falls within the scope of protection of the present invention.

Claims (10)

1. A fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers, characterized by comprising the following steps:
Step S1: input the data set and the distance relations between its sample points for clustering;
Step S2: cluster the target data set with the clustering method to obtain cluster labels;
Step S3: evaluate the cluster labels obtained from clustering against the original labels according to evaluation indices.
2. The fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers according to claim 1, characterized in that, in step S1, the input data sets are artificial (simulated) data sets and UCI data sets, and the number of cluster categories is determined by the respective artificial data set or UCI data set.
3. The fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers according to claim 2, characterized in that, in step S2, the target data set is clustered and cluster labels are set for the target data set and its pixels, and setting the cluster labels comprises:
Step S21: set labels according to the actual positions of the samples in the target data set, and set the number of labels for the artificial data sets and UCI data sets;
Step S22: apply the FCM algorithm to the data set formed by the labeled data to obtain the membership matrix U and the cluster centers V after clustering.
4. The fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers according to claim 3, characterized in that step S22 specifically comprises the following steps:
Step S221: first determine the number of cluster categories c;
Step S222: set the maximum number of iterations Max_t and the maximum error threshold ε;
Step S223: take the membership matrix U and the cluster centers V obtained by FCM clustering as the initial membership and cluster centers of the FCM algorithm, and set the initial iteration count t = 1;
Step S224: update the membership matrix and the cluster center matrix using the iterative optimization formulas.
5. The fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers according to claim 4, characterized in that, in step S224, the iterative optimization formulas are:
$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \frac{d_{ij}(x_j, v_i)}{d_{kj}(x_j, v_k)} \right)^{2/(m-1)}}$$
$$v_i = \frac{\sum_{j=1}^{n} (u_{ij})^m x_j}{\sum_{j=1}^{n} (u_{ij})^m}$$
where u is the membership matrix, d is the distance between a sample and a cluster center, v is a cluster center, m is the fuzzy exponent, and x is a sample;
The iteration continues until t reaches the maximum number of iterations Max_t or until $\|U^{(t+1)} - U^{(t)}\|_{Frobenius} < \varepsilon$, at which point the method terminates and the current U and V are the optimal solution of the method.
6. The fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers according to claim 4, characterized in that obtaining the cluster centers V comprises the following steps:
Step S2231: compute the variance of every sample $x_i$ in the sample set, find the sample $x_i^1$ with minimum variance in the data set W, and set $x_i^1$ as the initial cluster center $v_1$ of the first cluster; compute $r_m$, half the root-mean-square distance of the data-set samples, and let:
$$c = 1,$$
$$W_1 = \{ x_j \mid d(x_j, x_i^1) < r_m,\ j = 1, 2, \ldots, n \},$$
$$W = W - W_1;$$
Step S2232: if c < K, let c = c + 1, find the sample $x_i^c$ with minimum variance in the data set W, set it as the initial cluster center $v_c$ of the c-th cluster, and let:
$$W_c = \{ x_j \mid d(x_j, x_i^c) < \mathrm{cmean},\ j = 1, 2, \ldots, n,\ x_j \notin W_r,\ r = 1, 2, \ldots, c-1 \},$$
$$W = W - W_c;$$
Otherwise, the K initial cluster centers $V_0 = [v_1, v_2, \ldots, v_K]$ have been found.
7. The fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers according to claim 6, characterized in that the FCM algorithm comprises the following steps:
Step S2233: set the fuzzy exponent m (1 ≤ m); take the K initial cluster centers $V_0 = [v_1, v_2, \ldots, v_K]$ initialized in step S2231; set the convergence precision ε > 0 and the maximum number of iterations $t_{max}$; let the iteration count k = 0;
Step S2234: compute $U^{(k+1)}$:
$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \frac{d_{ij}(x_j, v_i)}{d_{kj}(x_j, v_k)} \right)^{2/(m-1)}}$$
Step S2235: compute $V^{(k+1)}$:
$$v_i = \frac{\sum_{j=1}^{n} (u_{ij})^m x_j}{\sum_{j=1}^{n} (u_{ij})^m}$$
where u is the membership matrix, d is the distance between a sample and a cluster center, v is a cluster center, m is the fuzzy exponent, and x is a sample;
Step S2236: if $\|V^{(k)} - V^{(k+1)}\| \le \varepsilon$, stop iterating; otherwise let k = k + 1 and return to step S2234;
Step S2237: when the algorithm terminates, the resulting membership matrix U and cluster centers V are the optimal clustering solution.
8. The fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers according to claim 7, characterized in that the labels obtained from clustering are evaluated against the original labels according to evaluation indices, and the performance evaluation indices include the NMI (normalized mutual information) index and the Rand index (RandIndex).
9. The fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers according to claim 8, characterized in that the NMI index is:
$$\mathrm{NMI} = \frac{\sum_{i=1}^{c} \sum_{j=1}^{c} N_{i,j} \log \frac{N \times N_{i,j}}{N_i \times N_j}}{\sqrt{\left( \sum_{i=1}^{c} N_i \log \frac{N_i}{N} \right) \times \left( \sum_{j=1}^{c} N_j \log \frac{N_j}{N} \right)}}$$
where $N_{i,j}$ denotes the degree of agreement between the i-th cluster and class j (the number of samples they share);
N denotes the total number of samples;
$N_i$ denotes the number of samples in the i-th cluster;
$N_j$ denotes the number of samples in the j-th class.
10. The fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers according to claim 8, characterized in that the Rand index is:
$$\mathrm{RI} = \frac{f_{00} + f_{11}}{N(N-1)/2}$$
where $f_{00}$ denotes the number of pairs of data points that have different class labels and belong to different clusters;
$f_{11}$ denotes the number of pairs of data points that have the same class label and belong to the same cluster;
N denotes the total number of samples.
CN201710503214.4A 2017-06-27 2017-06-27 Fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers Pending CN107330458A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710503214.4A CN107330458A (en) 2017-06-27 2017-06-27 Fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710503214.4A CN107330458A (en) 2017-06-27 2017-06-27 Fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers

Publications (1)

Publication Number Publication Date
CN107330458A true CN107330458A (en) 2017-11-07

Family

ID=60198141

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710503214.4A Pending CN107330458A (en) 2017-06-27 2017-06-27 Fuzzy C-means clustering method with minimum-variance-optimized initial cluster centers

Country Status (1)

Country Link
CN (1) CN107330458A (en)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229593A (en) * 2018-03-28 2018-06-29 浙江工贸职业技术学院 A kind of improved fuzzy clustering image partition method
CN108510180A (en) * 2018-03-28 2018-09-07 电子科技大学 The computational methods of performance interval residing for a kind of production equipment
CN108510180B (en) * 2018-03-28 2021-08-06 电子科技大学 Method for calculating performance interval of production equipment
CN108830317A (en) * 2018-06-08 2018-11-16 绍兴文理学院 The quick fine obtaining value method of open mine side slope ROCK MASS JOINT occurrence based on digital photogrammetry
CN108830317B (en) * 2018-06-08 2022-04-15 宁波大学 Rapid and fine evaluation method for joint attitude of surface mine slope rock mass based on digital photogrammetry
CN109858858A (en) * 2019-01-21 2019-06-07 中国人民解放军陆军工程大学 A kind of classification of underground logistic network nodal points and site selection system and method
CN110096630A (en) * 2019-05-06 2019-08-06 吉林农业大学 Big data processing method of the one kind based on clustering
CN110633371A (en) * 2019-09-23 2019-12-31 北京安信天行科技有限公司 Log classification method and system
CN110880015A (en) * 2019-10-16 2020-03-13 河南工业大学 Distributed integrated clustering analysis method based on fuzzy C-means
CN110880015B (en) * 2019-10-16 2023-04-07 河南工业大学 Distributed integrated clustering analysis method based on fuzzy C-means
CN110956204A (en) * 2019-11-18 2020-04-03 济南大学 Gaussian mixture model data clustering method based on transfer learning
CN111353379B (en) * 2020-01-06 2023-04-07 西南电子技术研究所(中国电子科技集团公司第十研究所) Signal measurement feature matching and labeling method based on weight clustering
CN111353379A (en) * 2020-01-06 2020-06-30 西南电子技术研究所(中国电子科技集团公司第十研究所) Signal measurement feature matching and labeling method based on weight clustering
CN111401412B (en) * 2020-02-29 2022-06-14 同济大学 Distributed soft clustering method based on average consensus algorithm in Internet of things environment
CN111401412A (en) * 2020-02-29 2020-07-10 同济大学 Distributed soft clustering method based on average consensus algorithm in Internet of things environment
CN111881502A (en) * 2020-07-27 2020-11-03 中铁二院工程集团有限责任公司 Bridge state discrimination method based on fuzzy clustering analysis
CN112270355A (en) * 2020-10-28 2021-01-26 长沙理工大学 Active safety prediction method based on big data technology and SAE-GRU
CN112270355B (en) * 2020-10-28 2023-12-05 长沙理工大学 Active safety prediction method based on big data technology and SAE-GRU
CN112541528A (en) * 2020-12-02 2021-03-23 国家电网有限公司 Power transmission and transformation project cost prediction index optimization method based on fuzzy clustering
CN113761682A (en) * 2021-09-06 2021-12-07 重庆大学 Method for determining and optimizing dynamic pressure difference index of support in real time
CN114398493A (en) * 2021-12-29 2022-04-26 中国人民解放军92728部队 Unmanned aerial vehicle type spectrum construction method based on fuzzy clustering and cost-effectiveness value
CN114625945A (en) * 2022-03-16 2022-06-14 云南升玥信息技术有限公司 Medical treatment disease data management system based on artificial intelligence
CN117610991A (en) * 2023-11-15 2024-02-27 国网冀北电力有限公司经济技术研究院 Reliability analysis method, device, equipment and medium for power communication network
CN117727373B (en) * 2023-12-01 2024-05-31 海南大学 Sample and feature double weighting-based intelligent C-means clustering method for feature reduction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 2017-11-07)