CN109102021A - Multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions - Google Patents

Multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions Download PDF

Info

Publication number
CN109102021A
Authority
CN
China
Prior art keywords
kernel
multiple kernel
clustering
k-means clustering
mutual kernel completion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810910362.2A
Other languages
Chinese (zh)
Inventor
郑军
刘新旺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jushi Technology (Shanghai) Co., Ltd.
Original Assignee
Jushi Technology (Shanghai) Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jushi Technology (Shanghai) Co., Ltd.
Priority to CN201810910362.2A
Publication of CN109102021A
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/23 - Clustering techniques
    • G06F 18/285 - Selection of pattern recognition techniques, e.g. of classifiers in a multi-classifier system

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions. The method fuses imputation with clustering: the missing kernel entries are imputed under the guidance of the clustering, the clustering is performed with the imputed kernels, and when the incomplete kernels are imputed, each incomplete kernel is filled in using the information of the other incomplete kernel matrices simultaneously. The specific steps of the method include: 1) obtaining the target data samples and the target number of clusters, and mapping the target data samples into a multiple-kernel space; 2) introducing the mutual kernel completion operation, and establishing the mutual-completion multiple-kernel k-means clustering optimization objective function; 3) solving the mutual-completion multiple-kernel k-means clustering optimization objective function to realize the clustering. Compared with the prior art, the present invention has the advantage of good clustering performance.

Description

Multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions
Technical field
The invention belongs to the technical field of computer vision and pattern recognition and relates to a multi-view clustering method, in particular to a multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions.
Background art
In recent years, a large amount of research has been devoted to designing effective multi-view clustering algorithms. Their aim is to cluster the data by optimally combining a set of base kernels. For example, the document "Multiple kernel clustering" (B. Zhao, J. T. Kwok, and C. Zhang, in SDM, 2009, pp. 638-649) proposes to simultaneously find the maximum-margin hyperplane, the best cluster labels and the optimal kernel. The document "Optimized data fusion for kernel k-means clustering" (S. Yu, L.-C. Tranchevent, X. Liu, W. Glänzel, J. A. K. Suykens, B. De Moor, and Y. Moreau, IEEE TPAMI, vol. 34, no. 5, pp. 1031-1039, 2012) proposes a new kernel k-means optimization algorithm to combine multiple data sources for data analysis. In the document "Localized data fusion for kernel k-means clustering with application to cancer biology" (M. Gönen and A. A. Margolin, in NIPS, 2014, pp. 1305-1313), the combination weights of the kernels can be adjusted adaptively to capture the characteristics of the samples. Replacing the squared error of k-means with the l2,1-norm, the document "Robust multiple kernel k-means clustering using l21-norm" (L. Du, P. Zhou, L. Shi, H. Wang, M. Fan, W. Wang, and Y.-D. Shen, in IJCAI, 2015, pp. 3476-3482) designs a robust multiple-kernel k-means algorithm that simultaneously finds the optimal cluster labels and the optimal kernel combination. The document "Multiple kernel k-means clustering with matrix-induced regularization" (X. Liu, Y. Dou, J. Yin, L. Wang, and E. Zhu, in AAAI, 2016, pp. 1888-1894) designs a matrix-induced regularization method to reduce redundancy while promoting the diversity of the selected kernels. These algorithms have been applied in a variety of applications and exhibit good clustering performance.
The above multiple-kernel clustering algorithms generally assume that all base kernels are complete, that is, that no rows or columns of any base kernel are missing. In some practical applications, such as the prediction of Alzheimer's disease and the diagnosis of heart disease, it is very common for some views of a sample to be missing, which leads to missing rows and columns in the corresponding base kernels. The presence of incomplete base kernels makes it extremely difficult to perform clustering using the information of all views. A straightforward remedy is a two-stage approach: first impute the missing kernel entries with some imputation algorithm, and then cluster with a standard clustering algorithm. Commonly used imputation algorithms include zero filling, mean filling, k-nearest-neighbour filling and expectation-maximization filling. Recently, more advanced imputation algorithms have been proposed. The document "Multiview clustering with incomplete views" (A. Trivedi, P. Rai, H. Daumé III, and S. L. DuVall, in NIPS 2010: Machine Learning for Social Computing Workshop, Whistler, Canada, 2010) uses another complete view to construct a complete kernel matrix for the missing view. By exploiting the connections between multiple views, the document "Multi-view learning with incomplete views" (C. Xu, D. Tao, and C. Xu, IEEE Trans. Image Processing, vol. 24, no. 12, pp. 5812-5825, 2015) proposes a method that performs multi-view learning with missing views; this algorithm assumes that the different views are generated from a shared subspace. The document "Multiple incomplete views clustering via weighted nonnegative matrix factorization with l2,1 regularization" (W. Shao, L. He, and P. S. Yu, in ECML PKDD, 2015, pp. 318-334) proposes an incomplete multi-view algorithm that learns the feature matrices of all views and generates a consensus matrix, so that the difference between each view and the consensus matrix is minimized. In addition, by modelling the within-view and between-view relationships of the kernel values, the document "Multi-view kernel completion" (S. Bhadra, S. Kaski, and J. Rousu, in arXiv:1602.02518, 2016) proposes an algorithm to predict the missing rows and columns of the base kernels. Although they show good clustering performance in various applications, the aforementioned "two-stage" algorithms share a common drawback: they separate the imputation and the clustering processes, which prevents the two processes from coordinating with each other and thus degrades the clustering performance. Meanwhile, if the available information of the other kernels and the redundancy between kernels are not fully taken into account during clustering, the clustering quality may be adversely affected.
Summary of the invention
The object of the present invention is to overcome the above-mentioned drawbacks of the prior art and to provide a multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions.
The object of the present invention can be achieved by the following technical solution:
A multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions, wherein the method fuses imputation with clustering: the missing kernel entries are imputed under the guidance of the clustering, the clustering is performed with the imputed kernels, and when the incomplete kernels are imputed, each incomplete kernel is filled in using the information of the other incomplete kernel matrices simultaneously. The specific steps of the method include:
1) obtaining the target data samples and the target number of clusters, and mapping the target data samples into a multiple-kernel space;
2) introducing the mutual kernel completion operation, and establishing the mutual-completion multiple-kernel k-means clustering optimization objective function;
3) solving the mutual-completion multiple-kernel k-means clustering optimization objective function in a cyclic manner, thereby realizing the clustering.
Further, the mutual-completion multiple-kernel k-means clustering optimization objective function is specifically:
where H denotes an intermediate variable (the relaxed cluster indicator matrix), β denotes the kernel coefficients, K_p denotes the p-th kernel matrix, m denotes the total number of kernels, I denotes the identity matrix, n denotes the number of samples, k denotes the number of clusters, λ is the regularization parameter, 1 denotes the all-ones column vector, s_p denotes the index set of the observed samples of the p-th kernel, and K_p(s_p, s_p) denotes the corresponding (observed) sub-kernel matrix.
Further, the missing data are specifically missing rows and columns of the base kernels.
Further, in step 3), a three-step alternating method is used to solve the mutual-completion multiple-kernel k-means clustering optimization objective function.
Further, the three-step alternating method specifically includes:
i) fixing β and the kernel matrices {K_p}, optimizing H;
ii) fixing β and H, optimizing {K_p};
iii) fixing H and {K_p}, optimizing β.
Further, when H is optimized, the mutual-completion multiple-kernel k-means clustering optimization objective function is converted into a traditional kernel k-means clustering problem.
Further, coordinate descent is used to optimize {K_p}.
Further, the specific steps of the coordinate descent are:
101) for each K_p, keeping the other kernel matrices {K_q}_{q≠p} fixed, and obtaining, from the mutual-completion multiple-kernel k-means clustering optimization objective function, the sub-problem that optimizes each K_p:
where T denotes the matrix assembled from the quantities held fixed (H, β and the other kernels);
102) obtaining an approximate optimization problem of the optimization problem in step 101):
The solution of the approximate problem is obtained and mapped onto the positive semi-definite cone by singular value decomposition, thereby obtaining the optimal K_p.
Further, when β is optimized, the incomplete multiple-kernel k-means clustering optimization objective function is converted into a quadratic programming problem with linear constraints, specifically:
where d = [d_1, …, d_m]^T is a column vector with entries d_p = Tr(K_p(I_n - HH^T)); the coefficient matrix of the quadratic term has the value m-1 on its diagonal and m-2 elsewhere; M measures the cross-correlation of each pair of kernels, with M_pq = Tr(K_p K_q); f = M1_m - diag(M); and 1_m denotes the m-dimensional all-ones column vector.
Further, the termination condition of the cyclic process is:
where β^(t) and β^(t-1) respectively denote the kernel coefficients at iterations t and t-1, m is the number of base kernels, and ε_0 is the preset precision.
Compared with the prior art, the present invention has the following advantages:
1) The present invention jointly optimizes the imputation and the clustering, so that the clustering result can guide the imputation process; the imputation is thereby aimed directly at the final clustering goal, and the clustering performance is good.
2) The present invention further improves the clustering performance by promoting the mutual completion of the incomplete kernel matrices, fully taking into account the available information of the other kernels and the redundancy between the kernels.
Brief description of the drawings
Fig. 1 is a flow diagram of the present invention;
Fig. 2 compares the performance of different clustering algorithms on the Cornell, Texas, Washington and Wisconsin data sets;
Fig. 3 compares the performance of different clustering algorithms on the Caltech101 data set;
Fig. 4 compares the performance of different clustering algorithms on the CCV data set;
Fig. 5 compares the alignment between the original kernels and the kernels imputed by different algorithms under different missing ratios;
Fig. 6 compares the performance of different clustering algorithms on the Flower17 and Flower102 data sets.
Specific embodiment
The present invention is described in detail below with reference to the accompanying drawings and a specific embodiment. The present embodiment is implemented on the premise of the technical solution of the present invention, and detailed implementation methods and specific operation procedures are given, but the scope of protection of the present invention is not limited to the following embodiment.
One, kernel k-means clustering
Let {x_i}_{i=1}^n denote a set of n samples and let φ(·) denote a feature mapping that maps x into a reproducing kernel Hilbert space H. The objective of kernel k-means clustering is to minimize the sum-of-squares error over the cluster assignment matrix Z ∈ {0,1}^{n×k}, which can be expressed as the following optimization problem:
where n_c and μ_c respectively denote the size and the centre of the c-th cluster.
The optimization problem described by equation (1) can be rewritten in the following matrix-vector form:
where K is the kernel matrix with elements K_ij = φ(x_i)^T φ(x_j), and 1_n denotes the all-ones column vector.
The variable Z in equation (2) is discrete, which makes it difficult to optimize. This is usually handled by defining a relaxed indicator matrix H that is allowed to take real values, so that problem (2) can be rewritten as equation (3):
where I_k is an identity matrix of size k × k.
The optimal solution of equation (3) is obtained by computing the k eigenvectors of K corresponding to its largest eigenvalues.
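For readability, a standard trace formulation of kernel k-means that is consistent with the description above is sketched below; the exact symbols of the omitted formulas (1)-(3) may differ, and the relaxation H = Z L^{1/2} with L = diag(n_1^{-1}, …, n_k^{-1}) is taken from the common form of this derivation rather than from the source.

```latex
% matrix-vector form of kernel k-means (cf. equation (2))
\min_{Z\in\{0,1\}^{n\times k}}\ \mathrm{Tr}(K) - \mathrm{Tr}\!\left(L^{1/2} Z^{\top} K Z L^{1/2}\right)
\quad \text{s.t.}\ Z\,\mathbf{1}_k = \mathbf{1}_n,\qquad L=\mathrm{diag}\!\left(n_1^{-1},\dots,n_k^{-1}\right)

% relaxing H = Z L^{1/2} to real values gives (cf. equation (3))
\min_{H\in\mathbb{R}^{n\times k}}\ \mathrm{Tr}\!\left(K\,(I_n - H H^{\top})\right)
\quad \text{s.t.}\ H^{\top} H = I_k
```

The optimum of the relaxed problem is attained by the top-k eigenvectors of K, as stated above.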
Two, multiple-kernel k-means clustering
In a multiple-kernel setting, each sample has several feature representations defined by a group of feature mappings {φ_p(·)}_{p=1}^m. Specifically, each sample is represented as φ_β(x) = [β_1 φ_1(x)^T, …, β_m φ_m(x)^T]^T, where β = [β_1, …, β_m]^T denotes the coefficients of the m base kernels. These coefficients are optimized during learning. Based on the definition of φ_β(x), a combined kernel function can be expressed as:
Substituting the K_β computed by equation (4) for the kernel matrix K in equation (3), the objective function of multiple-kernel clustering can be written as:
This problem can be solved by alternately updating H and β: i) optimize H with β fixed: with the kernel coefficients β fixed, H can be obtained by solving the kernel k-means clustering problem of equation (3); ii) optimize β with H fixed: with H fixed, β can be obtained by solving the following quadratic programming problem with linear constraints:
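The combined kernel of equation (4), the multiple-kernel objective of equation (5) and the β-subproblem are commonly written as follows; this is a sketch consistent with the text above, and the simplex constraint on β is assumed from the standard formulation rather than reproduced from the omitted formulas.

```latex
K_{\beta}(x_i,x_j) = \phi_{\beta}(x_i)^{\top}\phi_{\beta}(x_j)
                   = \sum_{p=1}^{m} \beta_p^{2}\, K_p(x_i,x_j)                      % cf. (4)

\min_{H,\;\beta}\ \mathrm{Tr}\!\left(K_{\beta}\,(I_n - H H^{\top})\right)
\quad \text{s.t.}\ H^{\top}H = I_k,\ \ \beta^{\top}\mathbf{1}_m = 1,\ \ \beta_p \ge 0   % cf. (5)

% with H fixed, the beta-subproblem reduces to
\min_{\beta}\ \sum_{p=1}^{m} \beta_p^{2}\,\mathrm{Tr}\!\left(K_p\,(I_n - H H^{\top})\right)
\quad \text{s.t.}\ \beta^{\top}\mathbf{1}_m = 1,\ \ \beta_p \ge 0
```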
Three, the method for the present invention
The present invention provides the mutual polishing multicore k- mean cluster machine learning method of core under a kind of deletion condition, and this method will Filling is blended with cluster, and filling missing core, is clustered with the core of filling under the guidance of cluster, and lacks core in filling When (Incomplete Kernel), while each missing core is filled up using other missing nuclear matrix information, missing core is specially The missing of base core row and column.As shown in Figure 1, this method specific steps include:
1) target data sample and cluster number of targets are obtained, the target data sample is mapped into multicore space.
2) the mutual polishing of core (Mutual Kernel Complement) operation is introduced, it is poly- to establish the mutual polishing multicore k- mean value of core Class optimization object function, specifically:
Wherein, H indicates an intermediate parameters, and β indicates core coefficient, KpIndicate p-th of nuclear matrix, m indicates that total nucleus number, I indicate Unit matrix, n indicate that number of samples, k indicate cluster number of clusters, and λ is regularization parameter,Indicate that all elements are all 1 Column vector, spIndicate the index of this p-th of core,Indicate daughter nucleus matrix.
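The formula itself is not reproduced in this text. A sketch of the joint objective that is consistent with the variables defined above and with the related published formulation is given below; the exact form, in particular of the mutual-completion regularization term R, is an assumption.

```latex
\min_{H,\ \beta,\ \{K_p\}_{p=1}^{m}}\
   \mathrm{Tr}\!\left(K_{\beta}\,(I_n - H H^{\top})\right) \;+\; \lambda\, R(K_1,\dots,K_m),
\qquad K_{\beta} = \sum_{p=1}^{m}\beta_p^{2} K_p,

\text{s.t.}\quad H^{\top}H = I_k,\qquad
 \beta^{\top}\mathbf{1}_m = 1,\ \ \beta_p \ge 0,\qquad
 K_p \succeq 0,\qquad
 K_p(s_p, s_p) = K_p^{(\mathrm{obs})},
```

where K_p^(obs) is the observed sub-kernel matrix indexed by s_p, and R couples each K_p to the other kernels so that the incomplete kernels complete one another; its explicit form is the one given by the omitted formula (7).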
The above formulation has the following advantages: a) the objective is direct and aims at the final clustering task; kernel imputation and clustering are fused into one unified learning framework, and the imputation is treated as a by-product; b) the problem is posed in a multiple-kernel scenario, so that a large number of base kernels can be handled naturally and combined adaptively to realize the clustering; c) it does not require any base kernel to be complete, which many existing completion methods do require; moreover, once the number of clusters is given, the method needs no further parameters; d) the formulation introduces the mutual kernel completion operation, which promotes the mutual completion of the incomplete kernel matrices and further improves the clustering performance.
3) Use a three-step alternating method to solve the mutual-completion multiple-kernel k-means clustering optimization objective function in a cyclic fashion, thereby realizing the clustering.
i) Fix β and {K_p}, optimize H. With β and the kernels fixed, the optimization of equation (7) with respect to H reduces to a traditional kernel k-means problem, which can be solved efficiently with existing toolkits.
ii) Fix β and H, optimize {K_p}. The present invention uses coordinate descent for this optimization, with the following specific steps:
101) For each K_p, keep the other kernel matrices {K_q}_{q≠p} fixed, and obtain, from the mutual-completion multiple-kernel k-means clustering optimization objective function, the sub-problem that optimizes each K_p:
where T denotes the matrix assembled from the quantities held fixed (H, β and the other kernels).
102) The optimization problem expressed by equation (8) is itself a positive semi-definite program and can be solved with existing convex optimization toolkits such as CVX. However, the high time complexity of semi-definite programming limits its application to large-scale problems. In order to reduce the computational burden, the present invention derives an approximate optimization problem of the problem in step 101):
The solution of the approximate problem is mapped onto the positive semi-definite cone by singular value decomposition, thereby obtaining the optimal K_p. The completion of each K_p thus relies simultaneously on the clustering result H and on the combination of the other kernels.
The optimal solution of equation (9) can be obtained simply by filling the missing elements of K_p with the corresponding part of T. After the solution of equation (9) is obtained, singular value decomposition is used to map it onto the positive semi-definite cone so as to satisfy the second constraint of equation (8). This significantly improves the computational efficiency of the proposed algorithm and makes medium- and large-scale applications feasible.
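A minimal numerical sketch of this fill-and-project step is given below. It assumes that the matrix T has already been assembled upstream and that the missing rows and columns of K_p are indicated by a boolean mask; the function name fill_and_project_psd and the toy data are illustrative and not taken from the source.

```python
import numpy as np

def fill_and_project_psd(K_p, T, observed):
    """Fill the missing rows/columns of K_p from T, then project onto the PSD cone.

    K_p      : (n, n) kernel matrix, observed block assumed already in place
    T        : (n, n) matrix assembled upstream from the fixed quantities (H, beta, other kernels)
    observed : (n,) boolean mask, True where the sample is observed for this kernel
    """
    K = K_p.copy()
    missing = ~observed
    # Every entry that touches a missing sample is taken from the corresponding entry of T.
    K[missing, :] = T[missing, :]
    K[:, missing] = T[:, missing]
    # Symmetrize, then clip negative eigenvalues to return to the PSD cone
    # (a standard projection; the source describes an SVD-based mapping).
    K = 0.5 * (K + K.T)
    w, V = np.linalg.eigh(K)
    K_psd = (V * np.clip(w, 0.0, None)) @ V.T
    return 0.5 * (K_psd + K_psd.T)

# Tiny usage example with synthetic data.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(6, 3))
    K_true = X @ X.T                                              # a complete linear kernel
    observed = np.array([True, True, True, True, False, False])
    K_p = np.where(np.outer(observed, observed), K_true, 0.0)     # zero-filled incomplete kernel
    T = K_true + 0.1 * rng.normal(size=(6, 6))                    # stand-in for the assembled T
    K_filled = fill_and_project_psd(K_p, T, observed)
    print(np.linalg.eigvalsh(K_filled).min() >= -1e-9)            # PSD check
```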
iii) Fix H and {K_p}, optimize β. The incomplete multiple-kernel k-means clustering optimization objective function is converted into a quadratic programming problem with linear constraints, specifically:
where d = [d_1, …, d_m]^T is a column vector with entries d_p = Tr(K_p(I_n - HH^T)); the coefficient matrix of the quadratic term has the value m-1 on its diagonal and m-2 elsewhere; M measures the cross-correlation of each pair of kernels, with M_pq = Tr(K_p K_q); f = M1_m - diag(M); and 1_m denotes the m-dimensional all-ones column vector.
It can be seen from equation (10) that the cross-correlation of the base kernels is incorporated through M, which helps to reduce redundancy and enhance the diversity of the selected kernels, and in turn also improves the clustering performance.
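A sketch of solving such a linearly constrained quadratic program over the probability simplex is given below; it assumes that the quadratic and linear coefficients (here called Q and c) have already been assembled from d, M, f and λ according to the omitted formula (10), which is not reproduced here.

```python
import numpy as np
from scipy.optimize import minimize

def solve_beta_qp(Q, c):
    """Minimize 0.5 * beta^T Q beta + c^T beta  s.t.  sum(beta) = 1, beta >= 0.

    Q : (m, m) symmetric matrix, c : (m,) vector, both assumed assembled upstream.
    """
    m = Q.shape[0]
    beta0 = np.full(m, 1.0 / m)                       # uniform starting point
    obj  = lambda b: 0.5 * b @ Q @ b + c @ b
    grad = lambda b: Q @ b + c
    cons = ({"type": "eq", "fun": lambda b: b.sum() - 1.0},)
    bounds = [(0.0, None)] * m
    res = minimize(obj, beta0, jac=grad, bounds=bounds,
                   constraints=cons, method="SLSQP")
    return res.x

# Usage with toy coefficients (illustrative only).
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.normal(size=(4, 4))
    Q = A @ A.T + 4 * np.eye(4)     # symmetric positive definite
    c = rng.normal(size=4)
    beta = solve_beta_qp(Q, c)
    print(beta.round(4), beta.sum())
```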
The multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions (MKKM-IK-MKC) described above proceeds as follows:
1: Input:
2: Output: H,
3: Initialization: β^(0) = 1_m/m,
4: repeat
5:
6: Given β and {K_p} from the previous iteration, update H^(t) by formula (3).
7: Using H^(t) and β, update each K_p^(t) by formula (8).
8: Given H^(t) and {K_p^(t)}, update β^(t) by formula (10).
9: t = t + 1.
10: until the termination condition is satisfied.
In MKKM-IK-MKC, the missing entries of each kernel are initially filled with 0, and β^(t) denotes the kernel coefficients at the t-th iteration. The computational complexity of each iteration of this procedure is O(n^3 + mn^3 + n^3), where n and m respectively denote the total number of samples and the number of base kernels. In this method, since each kernel is updated independently, the K_p can be computed in parallel, which improves the computational speed.
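An end-to-end numerical sketch of this alternating procedure is given below, under simplifying assumptions that are not taken from the source: the influence of the other kernels is represented only through a consensus kernel used to fill the missing block, the β-update minimizes the plain weighted-trace term over the simplex, and the λ-dependent terms are omitted. Function and variable names are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

def top_k_eigenvectors(K, k):
    """H-update: the relaxed kernel k-means solution is the top-k eigenvectors of K."""
    w, V = np.linalg.eigh(K)
    return V[:, np.argsort(w)[::-1][:k]]

def project_psd(K):
    """Project a symmetric matrix onto the PSD cone by clipping negative eigenvalues."""
    K = 0.5 * (K + K.T)
    w, V = np.linalg.eigh(K)
    return (V * np.clip(w, 0.0, None)) @ V.T

def mkkm_ik_mkc_sketch(kernels, observed, k, eps0=1e-4, max_iter=50):
    """Simplified alternating loop: update H, then each incomplete K_p, then beta.

    kernels  : list of (n, n) base kernels, missing rows/columns zero-filled
    observed : list of (n,) boolean masks (True = sample observed for that kernel)
    """
    m, n = len(kernels), kernels[0].shape[0]
    Ks = [K.copy() for K in kernels]
    beta = np.full(m, 1.0 / m)
    for _ in range(max_iter):
        beta_old = beta.copy()
        K_beta = sum(b * b * K for b, K in zip(beta, Ks))
        H = top_k_eigenvectors(K_beta, k)                       # step i)
        for p in range(m):                                      # step ii), one kernel at a time
            others = sum(b * b * K for q, (b, K) in enumerate(zip(beta, Ks)) if q != p)
            T = others / max(sum(beta[q] ** 2 for q in range(m) if q != p), 1e-12)
            miss = ~observed[p]
            Kp = Ks[p].copy()
            Kp[miss, :] = T[miss, :]
            Kp[:, miss] = T[:, miss]
            Ks[p] = project_psd(Kp)
        d = np.array([np.trace(K @ (np.eye(n) - H @ H.T)) for K in Ks])
        obj = lambda b: np.sum(d * b * b)                       # step iii), simplified QP
        res = minimize(obj, beta, bounds=[(0.0, None)] * m,
                       constraints=({"type": "eq", "fun": lambda b: b.sum() - 1.0},),
                       method="SLSQP")
        beta = res.x
        if np.linalg.norm(beta - beta_old) / m <= eps0:         # assumed stopping rule
            break
    return H, Ks, beta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (10, 4)), rng.normal(4, 1, (10, 4))])
    kernels, observed = [], []
    for _ in range(3):                       # three synthetic views / base kernels
        mask = rng.random(20) > 0.2          # roughly 20% missing samples per view
        K = np.where(np.outer(mask, mask), X @ X.T, 0.0)
        kernels.append(K)
        observed.append(mask)
    H, Ks, beta = mkkm_ik_mkc_sketch(kernels, observed, k=2)
    print("kernel weights:", beta.round(3))
```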
Four, experiments
The present embodiment is evaluated on 13 widely used multiple-kernel learning data sets, namely Cornell, Texas, Washington, Wisconsin, Oxford Flower17, Flower102, Columbia Consumer Video (CCV) and Caltech101; the original features of the first four data sets are available. On each of these data sets the present embodiment applies a linear kernel to the features of each view to obtain two kernel matrices. For CCV, three base kernels are generated by applying a Gaussian kernel to the SIFT, STIP and MFCC features, and the width of each of the three Gaussian kernels is set to the mean pairwise sample distance. For the Flower17, Flower102 and Caltech101 data sets, all kernel matrices are pre-computed and can be downloaded from public websites. Caltech101-5 denotes the setting in which the number of samples belonging to each cluster is 5.
Several common imputation algorithms are compared, including zero filling (ZF), mean filling (MF), k-nearest-neighbour filling (KNN), alignment-maximization filling (AF) and partial multi-view clustering (PVC). In the experiments these two-stage methods are denoted MKKM+ZF, MKKM+MF, MKKM+KNN and MKKM+AF. In addition, the clustering method that only considers imputation and clustering jointly is denoted MKKM-IK, and extensive comparisons are carried out under three different initialization conditions, i.e. MKKM-IK+ZF, MKKM-IK+MF and MKKM-IK+KNN. The clustering method of the present invention, i.e. MKKM-IK with mutual kernel completion, is denoted MKKM-IK-MKC. The present embodiment compares the method of the present invention with the above existing methods.
All base kernels are centred and normalized. For all data sets, the true number of clusters is known and is taken as the number of classes. In order to generate incomplete kernels, the present embodiment generates the index vectors {s_p}_{p=1}^m as follows. First, round(ε·n) samples are selected, where round(·) denotes the rounding function. For each selected sample, a random vector v = (v_1, …, v_m) ∈ [0,1]^m and a scalar v_0 (v_0 ∈ [0,1]) are generated. If v_p ≥ v_0, then the p-th view of that sample is present. If none of v_1, …, v_m satisfies the condition, a new vector v is generated, so as to ensure that at least one view of every sample is available. After the above steps, an index vector s_p is obtained for each view, which lists the samples for which the p-th view is present. The parameter ε denotes the missing ratio and controls the percentage of samples with missing views in the experiments; in the comparisons it affects the performance of the algorithms. Intuitively, the larger ε is, the worse the performance of the clustering algorithms. Specifically, ε is set to [0.1:0.1:0.9] on all data sets.
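A sketch of this missing-pattern generation protocol, following the description above (the function name and the boolean-matrix return format are illustrative):

```python
import numpy as np

def generate_missing_pattern(n, m, eps, rng=None):
    """Return a boolean (n, m) matrix: entry (i, p) is True if view p of sample i is observed."""
    rng = np.random.default_rng() if rng is None else rng
    present = np.ones((n, m), dtype=bool)                 # unselected samples keep all views
    chosen = rng.choice(n, size=int(round(eps * n)), replace=False)
    for i in chosen:
        while True:
            v = rng.random(m)                             # v = (v_1, ..., v_m) in [0, 1]^m
            v0 = rng.random()                             # threshold v_0 in [0, 1]
            keep = v >= v0                                # view p kept iff v_p >= v_0
            if keep.any():                                # redraw until at least one view survives
                present[i] = keep
                break
    return present

# s_p (the observed-sample index vector of the p-th view) is then present[:, p].nonzero()[0].
pattern = generate_missing_pattern(n=100, m=3, eps=0.5, rng=np.random.default_rng(0))
print(pattern.mean(axis=0))   # fraction of observed samples per view
```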
Table 1. Performance comparison of different clustering algorithms on the Cornell, Texas, Washington and Wisconsin data sets
The present embodiment uses clustering accuracy (ACC), normalized mutual information (NMI) and purity as the evaluation indices for assessing clustering performance, as shown in Table 1. For all algorithms, each experiment is repeated with 50 different random initializations in order to reduce the randomness introduced by the k-means algorithm, and the best result is then chosen. Meanwhile, 10 missing patterns are generated at random by the above method to obtain the reported results.
(1) Experimental results on the WebKB data sets
The present embodiment is tested on four WebKB data sets, namely Cornell, Texas, Washington and Wisconsin. The comparison with the PVC algorithm requires access to the original features, and PVC can only handle clustering tasks with two views. Table 1 and Fig. 2 show the comparison of ACC, NMI, purity and the standard deviations, with the best results shown in bold. It can be observed from Table 1 and Fig. 2 that the MKKM-IK-MKC of the present invention achieves the best results on all data sets, and the improvement over the existing algorithms is very significant. For example, in terms of overall clustering accuracy, it outperforms the second-best algorithm (PVC) on the Texas data set by nearly 5 percentage points.
(2) Experimental results on the Caltech101 data set
Caltech101 is widely used as a standard data set to assess the performance of multiple-kernel clustering algorithms. The algorithms mentioned above are evaluated on this data set, where the number of samples per cluster is varied over 5, 10, …, 30. The overall ACC, NMI and purity results are reported in Table 2. From the three indices ACC, NMI and purity it can be clearly seen that the present invention is markedly superior to the others. These results best demonstrate the effectiveness of completing the kernels during clustering.
Table 2. Performance comparison of different clustering algorithms on the Caltech101 data set
The clustering results of the different algorithms under different missing ratios are shown in Fig. 3, which shows that the MKKM-IK-MKC of the present invention is significantly more effective than the existing methods. Taking sub-figure (3d) as an example, under different initializations the MKKM-IK-MKC of the present invention achieves better results overall. From sub-figures (3a)-(3p) it can be clearly seen that, as the number of samples increases, the improvement achieved by the method of the present invention grows and its clustering performance is significantly better than that of the other algorithms.
(3) Experimental results on the Flower17 and Flower102 data sets
The clustering performance of the above algorithms is also compared on the Flower17 and Flower102 data sets; Flower17 and Flower102 are likewise widely used standard data sets in multiple-kernel learning. The overall ACC, NMI and purity indices, reported in Table 3, again demonstrate the effectiveness of the MKKM-IK algorithm of the present invention.
Table 3. Performance comparison of different clustering algorithms on the Flower17 and Flower102 data sets
As shown in Fig. 4, the method of the present invention achieves better clustering performance. Taking sub-figure (4a) as an example, when the missing ratio is 0.1, the clustering accuracy of MKKM-IK-MKC exceeds that of the second-best method by more than 10 percentage points. When the missing ratio varies, the advantage of the method of the present invention is still maintained.
(4) Experimental results on the CCV data set
The performance of the method of the present invention on the CCV data set is shown in Table 4 and Fig. 5. It can be seen from the results that, in terms of the three indices ACC, NMI and purity, the method of the present invention is significantly better than the comparison algorithms.
Table 4. Performance comparison of different clustering algorithms on the CCV data set
(5) Alignment between the original kernels and the imputed kernels
The present embodiment computes the degree of alignment between the original kernels and the imputed kernels, using the similarity of the kernel matrices to measure the degree of alignment. The results under different missing ratios are shown in Fig. 6, and the overall alignment and the standard deviations are reported in Table 5. These results show that the present invention not only obtains better clustering results, but also produces better imputation results by following the idea that the imputation should serve the clustering.
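The alignment measure is not spelled out in this text; a commonly used kernel-alignment score that matches the description "similarity of the kernel matrices" is sketched below as an assumption, not necessarily the exact measure used.

```python
import numpy as np

def kernel_alignment(K1, K2):
    """Normalized Frobenius inner product <K1, K2>_F / (||K1||_F * ||K2||_F).

    A common kernel-similarity score; the source does not spell out its exact measure.
    """
    return np.sum(K1 * K2) / (np.linalg.norm(K1) * np.linalg.norm(K2))

# e.g. alignment between an original kernel and its imputed counterpart:
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))
K_original = X @ X.T
K_imputed = K_original + 0.05 * rng.normal(size=(5, 5))
print(kernel_alignment(K_original, K_imputed))   # values close to 1 indicate good imputation
```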
Table 5. Summary of the comparison of the different clustering algorithms on all data sets
Existing multiple-kernel clustering algorithms show good performance in various applications, but they cannot effectively handle the problem of missing kernels. The present invention jointly optimizes the kernel imputation and the clustering to solve this problem. This allows the two learning processes to be seamlessly fused, and the introduction of the mutual completion operation yields better clustering results. Extensive experiments on multiple public data sets demonstrate that the clustering results are significantly improved.
The preferred embodiment of the present invention has been described in detail above. It should be understood that those skilled in the art can make many modifications and variations according to the concept of the present invention without creative effort. Therefore, any technical solution that can be obtained by a person skilled in the art through logical analysis, reasoning or limited experimentation on the basis of the prior art in accordance with the concept of the present invention shall fall within the scope of protection determined by the claims.

Claims (10)

1. A multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions, characterized in that the method fuses imputation with clustering: the missing kernel entries are imputed under the guidance of the clustering, the clustering is performed with the imputed kernels, and, when the incomplete kernels are imputed, each incomplete kernel is filled in using the information of the other incomplete kernel matrices simultaneously; the specific steps of the method include:
1) obtaining the target data samples and the target number of clusters, and mapping the target data samples into a multiple-kernel space;
2) introducing the mutual kernel completion operation, and establishing the mutual-completion multiple-kernel k-means clustering optimization objective function;
3) solving the mutual-completion multiple-kernel k-means clustering optimization objective function in a cyclic manner, thereby realizing the clustering.
2. The multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions according to claim 1, characterized in that the mutual-completion multiple-kernel k-means clustering optimization objective function is specifically:
where H denotes an intermediate variable (the relaxed cluster indicator matrix), β denotes the kernel coefficients, K_p denotes the p-th kernel matrix, m denotes the total number of kernels, I denotes the identity matrix, n denotes the number of samples, k denotes the number of clusters, λ is the regularization parameter, 1 denotes the all-ones column vector, s_p denotes the index set of the observed samples of the p-th kernel, and K_p(s_p, s_p) denotes the corresponding sub-kernel matrix.
3. The multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions according to claim 1, characterized in that the missing data are specifically missing rows and columns of the base kernels.
4. The multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions according to claim 2, characterized in that, in step 3), a three-step alternating method is used to solve the mutual-completion multiple-kernel k-means clustering optimization objective function.
5. The multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions according to claim 4, characterized in that the three-step alternating method specifically comprises:
i) fixing β and the kernel matrices {K_p}, optimizing H;
ii) fixing β and H, optimizing {K_p};
iii) fixing H and {K_p}, optimizing β.
6. The multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions according to claim 5, characterized in that, when H is optimized, the mutual-completion multiple-kernel k-means clustering optimization objective function is converted into a traditional kernel k-means clustering problem.
7. The multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions according to claim 5, characterized in that coordinate descent is used to optimize {K_p}.
8. The multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions according to claim 7, characterized in that the specific steps of the coordinate descent comprise:
101) for each K_p, keeping the other kernel matrices {K_q}_{q≠p} fixed, and obtaining, from the mutual-completion multiple-kernel k-means clustering optimization objective function, the sub-problem that optimizes each K_p:
where T denotes the matrix assembled from the quantities held fixed;
102) obtaining an approximate optimization problem of the optimization problem in step 101):
obtaining the solution of the approximate problem and mapping the solution onto the positive semi-definite cone by singular value decomposition, thereby obtaining the optimal K_p.
9. The multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions according to claim 5, characterized in that, when β is optimized, the incomplete multiple-kernel k-means clustering optimization objective function is converted into a quadratic programming problem with linear constraints, specifically:
where d = [d_1, …, d_m]^T is a column vector with entries d_p = Tr(K_p(I_n - HH^T)); the coefficient matrix of the quadratic term has the value m-1 on its diagonal and m-2 elsewhere; M measures the cross-correlation of each pair of kernels, with M_pq = Tr(K_p K_q); f = M1_m - diag(M); and 1_m denotes the m-dimensional all-ones column vector.
10. The multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions according to claim 1, characterized in that the termination condition of the cyclic process is:
where β^(t) and β^(t-1) respectively denote the kernel coefficients at iterations t and t-1, m is the number of base kernels, and ε_0 is the preset precision.
CN201810910362.2A 2018-08-10 2018-08-10 Multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions Pending CN109102021A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810910362.2A CN109102021A (en) 2018-08-10 2018-08-10 Multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810910362.2A CN109102021A (en) 2018-08-10 2018-08-10 Multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions

Publications (1)

Publication Number Publication Date
CN109102021A true CN109102021A (en) 2018-12-28

Family

ID=64849338

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810910362.2A Pending CN109102021A (en) 2018-08-10 2018-08-10 Multiple-kernel k-means clustering machine learning method with mutual kernel completion under incomplete-kernel conditions

Country Status (1)

Country Link
CN (1) CN109102021A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110188812A (en) * 2019-05-24 2019-08-30 长沙理工大学 A kind of multicore clustering method of quick processing missing isomeric data
CN113378900A (en) * 2021-05-31 2021-09-10 长沙理工大学 Large-scale irregular KPI time sequence anomaly detection method based on clustering
WO2022199432A1 (en) * 2021-03-25 2022-09-29 浙江师范大学 Deep deletion clustering machine learning method and system based on optimal transmission
WO2022253153A1 (en) * 2021-06-01 2022-12-08 浙江师范大学 Later-fusion multiple kernel clustering machine learning method and system based on proxy graph improvement

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103514368A (en) * 2013-09-18 2014-01-15 燕山大学 Real-time and stage theory line loss fast estimation method with clustering technique adopted
CN107729943A (en) * 2017-10-23 2018-02-23 辽宁大学 The missing data fuzzy clustering algorithm of feedback of the information extreme learning machine optimization valuation and its application

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103514368A (en) * 2013-09-18 2014-01-15 燕山大学 Real-time and stage theory line loss fast estimation method with clustering technique adopted
CN107729943A (en) * 2017-10-23 2018-02-23 辽宁大学 The missing data fuzzy clustering algorithm of feedback of the information extreme learning machine optimization valuation and its application

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XINWANG LIU et al.: "Multiple Kernel k-means with Incomplete Kernels", Pattern Analysis and Machine Intelligence *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110188812A (en) * 2019-05-24 2019-08-30 长沙理工大学 A kind of multicore clustering method of quick processing missing isomeric data
WO2022199432A1 (en) * 2021-03-25 2022-09-29 浙江师范大学 Deep deletion clustering machine learning method and system based on optimal transmission
CN113378900A (en) * 2021-05-31 2021-09-10 长沙理工大学 Large-scale irregular KPI time sequence anomaly detection method based on clustering
CN113378900B (en) * 2021-05-31 2022-07-15 长沙理工大学 Large-scale irregular KPI time sequence anomaly detection method based on clustering
WO2022253153A1 (en) * 2021-06-01 2022-12-08 浙江师范大学 Later-fusion multiple kernel clustering machine learning method and system based on proxy graph improvement

Similar Documents

Publication Publication Date Title
Li et al. Incomplete multi-view clustering with joint partition and graph learning
Wang et al. Multiple graph regularized nonnegative matrix factorization
Hwang et al. Hexagan: Generative adversarial nets for real world classification
CN109102021A (en) The mutual polishing multicore k- mean cluster machine learning method of core under deletion condition
Li et al. Dynamic structure embedded online multiple-output regression for streaming data
CN108021930B (en) Self-adaptive multi-view image classification method and system
CN109214429A (en) Localized loss multiple view based on matrix guidance regularization clusters machine learning method
Cui et al. Internet financing credit risk evaluation using multiple structural interacting elastic net feature selection
Guo et al. Towards accurate and compact architectures via neural architecture transformer
Zhang et al. Dual-constrained deep semi-supervised coupled factorization network with enriched prior
CN110188825A (en) Image clustering method, system, equipment and medium based on discrete multiple view cluster
Wang et al. Generative partial multi-view clustering
CN109117881A (en) A kind of multiple view cluster machine learning method with missing core
CN116629352A (en) Hundred million-level parameter optimizing platform
Liang et al. Incomplete multiview clustering with cross-view feature transformation
Besarabov et al. Predicting digital asset market based on blockchain activity data
CN114579640A (en) Financial time sequence prediction system and method based on generating type countermeasure network
Sinha et al. Neural architecture search using covariance matrix adaptation evolution strategy
Xu et al. Head pose estimation using improved label distribution learning with fewer annotations
Lu et al. App-net: Auxiliary-point-based push and pull operations for efficient point cloud classification
Zhang et al. Quantum-inspired spectral-spatial pyramid network for hyperspectral image classification
Hu et al. Consensus multiple kernel K-means clustering with late fusion alignment and matrix-induced regularization
Wu et al. Active 3-D shape cosegmentation with graph convolutional networks
Li et al. Anchor-based sparse subspace incomplete multi-view clustering
Hu et al. Optimizing resource allocation for data-parallel jobs via gcn-based prediction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20181228)