CN106934415A - K-means initial cluster center selection method based on a Delaunay triangulation network - Google Patents


Info

Publication number
CN106934415A
Authority
CN
China
Prior art keywords
point
cluster center
initial cluster
mixing
triangulation network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710090315.3A
Other languages
Chinese (zh)
Inventor
马燕
杨杰
韦高洁
张相芬
李顺宝
张玉萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Normal University
University of Shanghai for Science and Technology
Original Assignee
Shanghai Normal University
Application filed by Shanghai Normal University
Priority to CN201710090315.3A
Publication of CN106934415A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering

Abstract

The invention discloses a K-means initial cluster center selection method based on a Delaunay triangulation network. The data set to be clustered is represented as a Delaunay triangulation network, and a representative point is computed for each triangle in the network. The mixing distance between two representative points is defined as the sum of their densities multiplied by their Euclidean distance. The 1st initial cluster center is selected from all representative points and added to the initial cluster center set C; the 2nd initial cluster center is then selected and also added to C. Next, for each remaining representative point, the mixing distance to every initial cluster center in C is computed and the minimum is kept; the representative point with the largest of these minimum mixing distances is added to C. This selection is repeated until the number of elements in C equals K.

Description

K-means initial cluster center selection method based on a Delaunay triangulation network
Technical field
The present invention relates to the field of computer classification, and in particular to a K-means initial cluster center selection method based on a Delaunay triangulation network.
Background
Clustering is an unsupervised data analysis method that, without prior knowledge, groups samples sensibly according to their characteristics; it is widely used in data mining. The guiding principle is that data within the same group should be as similar as possible and data in different groups as dissimilar as possible. That is, the greater the similarity within groups and the smaller the similarity between groups, the better the clustering. Clustering algorithms can be divided into partition-based, density-based, hierarchical, grid-based, and model-based types. As a partition-based algorithm, K-means is widely used because it is simple and efficient.
The basic steps of the K-means clustering algorithm are as follows:
First step: randomly select K data objects from a data set containing n data objects as the initial cluster centers, where K (K ≥ 2) is the predetermined number of clusters;
Second step: assign each data object in the data set to the nearest class according to the minimum-distance principle;
Third step: compute the mean of the data objects in each cluster as the new cluster center;
Fourth step: repeat the second and third steps until the cluster centers no longer change.
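The four steps above can be sketched in plain Python as follows. This is a minimal illustration, not code from the patent: the function name `kmeans`, the `seed` and `max_iter` parameters, and the handling of empty clusters are assumptions made for the example.

```python
import random

def kmeans(points, k, seed=0, max_iter=100):
    """Plain K-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # First step: pick K data objects at random as the initial centers.
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Second step: assign each object to its nearest center.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Third step: recompute each center as the mean of its cluster.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                # Assumption: an empty cluster keeps its previous center.
                new_centers.append(centers[c])
        # Fourth step: stop once the centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels
```

Because the first step draws the centers at random, different seeds can give different results on less clearly separated data, which is the instability the invention sets out to remove.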
The K-means clustering algorithm is fast and simple, but because the initial cluster centers are chosen at random, the method has the following problems: 1) if the initial cluster center of one class comes from another class, the clustering result easily falls into a local optimum and cannot reach the global optimum; 2) the clustering result depends on the choice of initial cluster centers, which makes it unstable; 3) initial cluster centers that are too close to each other lead to wrong clustering results.
To overcome these drawbacks, many researchers have proposed improved methods. The CCIA algorithm is based on the data compression principle: it runs K-means on each attribute of the data to obtain a number of data patterns and finally merges them. It performs well overall, but its complexity grows with the dimension of the data objects. Another method, based on the kd-tree, replaces the density of each data point with the density of its bounding box. This method has the following shortcomings: first, the replacement cannot accurately express the density distribution of the data points; second, if all data points in a bounding box have the same value for some attribute, the density of that bounding box is infinite and the result is meaningless. There is also the K-means++ algorithm, which takes the distances between data points into account but has the following shortcomings: first, the first initial center is chosen at random, which makes the final result unstable; second, no density is defined for the data points, so the clustering result is easily affected by outliers.
Therefore, those skilled in the art are working to develop a K-means initial cluster center selection method based on a Delaunay triangulation network that overcomes the drawback of randomly selecting initial cluster centers in the traditional K-means method, improves clustering accuracy, and avoids the influence of outliers.
Summary of the invention
In view of the above defects of the prior art, the technical problem to be solved by the invention is to overcome the drawback of randomly selecting initial cluster centers in the traditional K-means method, to improve clustering accuracy, and to avoid the influence of outliers.
To achieve the above object, the invention provides a K-means initial cluster center selection method based on a Delaunay triangulation network, comprising the following steps:
Step 1, represent the data set to be clustered as a Delaunay triangulation network, so that each data point in the data set corresponds one-to-one with a node of the network;
Step 2, compute the mean of the three vertices of each triangle in the Delaunay triangulation network, and take the mean as the representative point of the triangle;
Step 3, compute the reciprocal of the area of the triangle containing each representative point, and take it as the density of that representative point;
Step 4, compute the sum of the densities of two representative points and the Euclidean distance between them, and take the product of the two as the mixing distance between the two representative points;
Step 5, select the representative point with the highest density among all representative points as the 1st initial cluster center, and add it to the initial cluster center set C;
Step 6, select the representative point with the largest mixing distance to the 1st initial cluster center as the 2nd initial cluster center, and add it to C;
Step 7, for each remaining representative point, compute its mixing distance to every initial cluster center in C and keep the minimum; then, among all these minimum mixing distances, pick the representative point corresponding to the largest one and add it to C; repeat this selection until the number of elements in C equals K.
Further, the specific method of step 1 includes:
Let the data set to be clustered be X={x1,x2,...,xn}, containing n data objects. Build a Delaunay triangulation network G=(V, E) for X, where V={v1,v2,...,vn} is the set of nodes of G and E is the set of edges of G. Each data object xi ∈ X corresponds one-to-one with a node vi ∈ V, so the number of nodes in G equals the number of data objects in X, and the distance between two nodes of G equals the Euclidean distance between the corresponding data objects, i.e. d(vi,vj)=d(xi,xj).
Further, the specific method of step 2 includes:
Let the three vertices of a triangle T in G be vi, vj, vk, corresponding one-to-one to the data objects xi, xj, xk in the data set to be clustered, where xi=(xi1,xi2,…,xid), xj=(xj1,xj2,…,xjd), xk=(xk1,xk2,…,xkd), and d is the attribute dimension of the data objects. The mean of the three vertices is $r=\left(\frac{x_{i1}+x_{j1}+x_{k1}}{3},\frac{x_{i2}+x_{j2}+x_{k2}}{3},\ldots,\frac{x_{id}+x_{jd}+x_{kd}}{3}\right)$, and this mean is taken as the representative point of triangle T.
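As a small illustration of step 2, the representative point is the attribute-wise mean of the three vertices. The function name `representative_point` is hypothetical; the sample vertices are those of triangle t1 (x28, x26, x25) listed in the embodiment.

```python
def representative_point(xi, xj, xk):
    """Attribute-wise mean of the three vertices of a triangle."""
    return tuple((a + b + c) / 3 for a, b, c in zip(xi, xj, xk))

# Triangle t1 from the embodiment: x28, x26, x25.
r1 = representative_point((1.91, 0.1), (2.03, 0.03), (1.86, 0.19))
print(tuple(round(v, 2) for v in r1))  # (1.93, 0.11)
```

The same function works for any attribute dimension d, since `zip` walks the attributes of the three vertices in parallel.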
Further, the specific method of step 3 includes:
Let the three vertices of the triangle T containing the representative point r be vi, vj, vk, corresponding one-to-one to the data objects xi, xj, xk in the data set to be clustered, where xi=(xi1,xi2,…,xid), xj=(xj1,xj2,…,xjd), xk=(xk1,xk2,...,xkd), and d is the attribute dimension of the data objects. The three side lengths of triangle T are then:
$a=d(x_i,x_j)=\sqrt{\sum_{l=1}^{d}(x_{il}-x_{jl})^2}$, $b=d(x_i,x_k)=\sqrt{\sum_{l=1}^{d}(x_{il}-x_{kl})^2}$, $c=d(x_j,x_k)=\sqrt{\sum_{l=1}^{d}(x_{jl}-x_{kl})^2}$.
The semi-perimeter of the triangle is $p=\frac{a+b+c}{2}$, so the area of triangle T is $S=\sqrt{p(p-a)(p-b)(p-c)}$. Finally, the reciprocal of the area, $\rho=\frac{1}{S}$, is taken as the density of the representative point r.
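Step 3 can be sketched as below, assuming the area is obtained from the side lengths and semi-perimeter with Heron's formula, as the text describes. The function name `triangle_density`, the clamp against tiny negative rounding errors, and the error raised for a zero-area triangle are assumptions of this sketch rather than part of the patent.

```python
import math

def triangle_density(xi, xj, xk):
    """Density of a triangle's representative point: reciprocal of its area."""
    # Side lengths as Euclidean distances between the vertices.
    a = math.dist(xi, xj)
    b = math.dist(xi, xk)
    c = math.dist(xj, xk)
    # Semi-perimeter, then Heron's formula for the area.
    p = (a + b + c) / 2
    # Assumption: clamp at 0 to absorb rounding error on very thin triangles.
    area = math.sqrt(max(p * (p - a) * (p - b) * (p - c), 0.0))
    if area == 0.0:
        # Assumption: collinear vertices have no finite density.
        raise ValueError("degenerate triangle: density is undefined")
    return 1.0 / area
```

Since only side lengths enter the formula, the computation does not depend on the attribute dimension d. A small triangle (closely packed data) yields a high density, and a large one a low density.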
Further, the specific method of step 4 includes:
Let the densities of two representative points r1, r2 be ρ1 and ρ2, and let the Euclidean distance between r1 and r2 be d12; the mixing distance between r1 and r2 is then h=(ρ1+ρ2)×d12.
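The mixing distance of step 4 is a one-line computation; the sketch below uses a hypothetical function name `mixing_distance` and takes the two densities as already computed.

```python
import math

def mixing_distance(r1, rho1, r2, rho2):
    """h = (rho1 + rho2) * d12, with d12 the Euclidean distance between r1 and r2."""
    return (rho1 + rho2) * math.dist(r1, r2)

# Two points 5 apart with densities 2 and 3.
print(mixing_distance((0, 0), 2, (3, 4), 3))  # 25.0
```

Two representative points are far apart in this measure when they are both geometrically distant and located in dense regions, which is what steers the later steps away from sparse outlier triangles.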
Further, the specific method of step 5 includes:
Let the set of all representative points be R={r1,r2,...,rt}, where t is the number of triangles in the constructed Delaunay triangulation network. First, compute the densities of all representative points as in step 3. Then select the representative point with the highest density from R as the 1st initial cluster center c1 and add it to the initial cluster center set C, i.e. C={c1}. Remove this point from R and renumber the remaining representative points, giving R={r1,r2,...,rt-1}.
Further, the specific method of step 6 includes:
Compute the mixing distance between each representative point in R={r1,r2,...,rt-1} and the 1st initial cluster center, take the representative point with the largest mixing distance as the 2nd initial cluster center c2, and add it to the initial cluster center set C, i.e. C={c1,c2}. Remove this point from R and renumber the remaining representative points, giving R={r1,r2,...,rt-2}.
Further, the specific method of step 7 includes:
Step 71, select r1 from the remaining representative point set R, compute its mixing distance to each initial cluster center in C, and keep the smallest of these mixing distances, denoted h1min;
Step 72, select r2 from R, compute its mixing distance to each initial cluster center in C, and keep the smallest of these mixing distances, denoted h2min; continue in the same way up to the last representative point rt-2 in R, whose smallest mixing distance to the initial cluster centers in C is denoted h(t-2)min;
Step 73, among all the minimum mixing distances h1min, h2min, ..., h(t-2)min, pick the representative point corresponding to the largest one and add it to C; repeat steps 71 to 73 on the remaining representative points until the number of elements in C equals K.
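Steps 5 to 7 amount to a greedy max-min selection over the representative points. The sketch below is one possible reading of those steps, not the patent's own code: the function name `select_initial_centers`, the use of indices instead of renumbering R, and the tie-breaking (first maximum wins) are assumptions.

```python
import math

def select_initial_centers(reps, densities, k):
    """Return the indices of k representative points chosen by steps 5 to 7."""
    if not 2 <= k <= len(reps):
        raise ValueError("k must be between 2 and the number of representative points")

    def h(i, j):
        # Mixing distance: sum of densities times Euclidean distance.
        return (densities[i] + densities[j]) * math.dist(reps[i], reps[j])

    remaining = list(range(len(reps)))
    # Step 5: the densest representative point is the 1st center.
    first = max(remaining, key=lambda i: densities[i])
    chosen = [first]
    remaining.remove(first)
    # Step 6: the point with the largest mixing distance to the 1st center.
    second = max(remaining, key=lambda i: h(i, first))
    chosen.append(second)
    remaining.remove(second)
    # Step 7: repeatedly take the point whose minimum mixing distance
    # to the centers chosen so far is largest, until k centers exist.
    while len(chosen) < k:
        nxt = max(remaining, key=lambda i: min(h(i, c) for c in chosen))
        chosen.append(nxt)
        remaining.remove(nxt)
    return chosen
```

For example, `select_initial_centers([(0, 0), (1, 0), (10, 0), (5, 8)], [5, 1, 1, 1], 3)` picks index 0 first (highest density), then index 2, then index 3. Step 6 is the special case of step 7 with a single center in C, so the two could share one loop; they are kept separate here to mirror the text.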
Technical effect
The method overcomes the drawback of randomly selecting initial cluster centers in the traditional K-means method, improves clustering accuracy, and avoids the influence of outliers.
The concept, specific structure, and technical effects of the invention are further described below with reference to the accompanying drawings, so that its purpose, features, and effects can be fully understood.
Brief description of the drawings
Fig. 1 is a flow chart of the K-means initial cluster center selection method based on a Delaunay triangulation network according to a preferred embodiment of the invention.
Fig. 2 shows the 60 data objects of the preferred embodiment in a plane rectangular coordinate system.
Fig. 3 shows the triangulation network formed by the 108 triangles of the preferred embodiment in a plane rectangular coordinate system.
Fig. 4 shows the 108 representative points of the preferred embodiment in a plane rectangular coordinate system.
Fig. 5 shows the initial cluster centers selected in the preferred embodiment in a plane rectangular coordinate system.
Fig. 6 shows the initial cluster centers selected in the preferred embodiment on a data set with outliers.
Detailed description
To improve the accuracy of K-means clustering, the initial cluster centers can be selected with the method proposed by the invention.
A specific implementation follows the procedure below:
Given a data set X={x1,x2,...,xn} containing n data objects, build a Delaunay triangulation network for X and compute the representative point set R={r1,r2,...,rt}, where t is the number of triangles in the constructed network. Compute the density of each representative point, and take the sum of the densities of two representative points multiplied by their Euclidean distance as the mixing distance between them. Then select the representative point with the highest density among all representative points as the 1st initial cluster center and add it to the initial cluster center set C; next select the representative point with the largest mixing distance to the 1st initial cluster center as the 2nd initial cluster center and add it to C. After that, for each remaining representative point, compute its mixing distance to every initial cluster center in C and keep the minimum; among all these minimum mixing distances, pick the representative point corresponding to the largest one and add it to C. Repeat this selection until the number of elements in C equals K.
As shown in Fig. 1, the K-means initial cluster center selection method based on a Delaunay triangulation network according to a preferred embodiment of the invention is implemented through the following steps:
Step 1, represent the data set to be clustered as a Delaunay triangulation network, so that each data point in the data set corresponds one-to-one with a node of the network. The specific operations are:
Let the data set to be clustered be X={x1,x2,...,xn}, containing n data objects. Build a Delaunay triangulation network G=(V, E) for X, where V={v1,v2,...,vn} is the set of nodes of G and E is the set of edges of G. Each data object xi ∈ X corresponds one-to-one with a node vi ∈ V, so the number of nodes in G equals the number of data objects in X, and the distance between two nodes of G equals the Euclidean distance between the corresponding data objects, i.e. d(vi,vj)=d(xi,xj).
Step 2, compute the mean of the three vertices of each triangle in the Delaunay triangulation network, and take the mean as the representative point of the triangle. The specific operations are:
Let the three vertices of a triangle T in G be vi, vj, vk, corresponding one-to-one to the data objects xi, xj, xk in the data set to be clustered, where xi=(xi1,xi2,...,xid), xj=(xj1,xj2,...,xjd), xk=(xk1,xk2,...,xkd), and d is the attribute dimension of the data objects. The mean of the three vertices is $r=\left(\frac{x_{i1}+x_{j1}+x_{k1}}{3},\frac{x_{i2}+x_{j2}+x_{k2}}{3},\ldots,\frac{x_{id}+x_{jd}+x_{kd}}{3}\right)$, and this mean is taken as the representative point of triangle T.
Step 3, compute the reciprocal of the area of the triangle containing each representative point, and take it as the density of that representative point. The specific operations are:
Let the three vertices of the triangle T containing the representative point r be vi, vj, vk, corresponding one-to-one to the data objects xi, xj, xk in the data set to be clustered, where xi=(xi1,xi2,...,xid), xj=(xj1,xj2,...,xjd), xk=(xk1,xk2,...,xkd), and d is the attribute dimension of the data objects. The three side lengths of triangle T are then:
$a=d(x_i,x_j)=\sqrt{\sum_{l=1}^{d}(x_{il}-x_{jl})^2}$, $b=d(x_i,x_k)=\sqrt{\sum_{l=1}^{d}(x_{il}-x_{kl})^2}$, $c=d(x_j,x_k)=\sqrt{\sum_{l=1}^{d}(x_{jl}-x_{kl})^2}$.
The semi-perimeter of the triangle is $p=\frac{a+b+c}{2}$, so the area of triangle T is $S=\sqrt{p(p-a)(p-b)(p-c)}$. Finally, the reciprocal of the area, $\rho=\frac{1}{S}$, is taken as the density of the representative point r.
Step 4, compute the sum of the densities of two representative points and the Euclidean distance between them, and take the product of the two as the mixing distance between the two representative points. The specific operations are:
Let the densities of two representative points r1, r2 be ρ1 and ρ2, and let the Euclidean distance between r1 and r2 be d12; the mixing distance between r1 and r2 is then h=(ρ1+ρ2)×d12.
Step 5, select the representative point with the highest density among all representative points as the 1st initial cluster center, and add it to the initial cluster center set C. The specific operations are:
Let the set of all representative points be R={r1,r2,…,rt}, where t is the number of triangles in the constructed Delaunay triangulation network. First, compute the densities of all representative points as in step 3. Then select the representative point with the highest density from R as the 1st initial cluster center c1 and add it to the initial cluster center set C, i.e. C={c1}. Remove this point from R and renumber the remaining representative points, giving R={r1,r2,…,rt-1}.
Step 6, select the representative point with the largest mixing distance to the 1st initial cluster center as the 2nd initial cluster center, and add it to C. The specific operations are:
Compute the mixing distance between each representative point in R={r1,r2,…,rt-1} and the 1st initial cluster center, take the representative point with the largest mixing distance as the 2nd initial cluster center c2, and add it to the initial cluster center set C, i.e. C={c1,c2}. Remove this point from R and renumber the remaining representative points, giving R={r1,r2,…,rt-2}.
Step 7, for each remaining representative point, compute its mixing distance to every initial cluster center in C and keep the minimum; then, among all these minimum mixing distances, pick the representative point corresponding to the largest one and add it to C; repeat this selection until the number of elements in C equals K. The specific operations are:
Step 71, select r1 from the remaining representative point set R, compute its mixing distance to each initial cluster center in C, and keep the smallest of these mixing distances, denoted h1min;
Step 72, select r2 from R, compute its mixing distance to each initial cluster center in C, and keep the smallest of these mixing distances, denoted h2min; continue in the same way up to the last representative point rt-2 in R, whose smallest mixing distance to the initial cluster centers in C is denoted h(t-2)min;
Step 73, among all the minimum mixing distances h1min, h2min, ..., h(t-2)min, pick the representative point corresponding to the largest one and add it to C; repeat steps 71 to 73 on the remaining representative points until the number of elements in C equals K.
Embodiment
(1) A data set X={x1,x2,...,x60} containing 60 data objects is generated manually. The number of classes of the data set is K=3, and each data object has 2 attributes. The attributes of all data objects are listed below:
x1(-0.15,0.77),x2(-0.04,0.3),x3(0.47,-0.1),x4(-0.27,0.14),x5(0.25,- 0.33),x6(0.1,0.59),x7(-0.07,0.3),x8(-0.33,0.25),x9(-0.09,-0.28),x10(-0.03,0.1), x11(-0.46,-0.18),x12(0.06,-0.1),x13(-0.26,-0.18),x14(-0.03,-0.32),x15(0.11,- 0.29),x16(-0.29,-0.07),x17(-0.09,-0.54),x18(0.11,0.19),x19(-0.58,-0.04),x20 (0.33,0.22),x21(2.09,-0.16),x22(2.16,-0.06),x23(1.53,0.01),x24(1.68,-0.02),x25 (1.86,0.19),x26(2.03,0.03),x27(2.36,0.57),x28(1.91,0.1),x29(2.4,0.57),x30(2.15,- 0.23),x31(2.37,0.17),x32(2.04,-0.08),x33(1.79,0.19),x34(1.53,0.19),x35(2.05,- 0.69),x36(2.26,-0.42),x37(1.91,-0.46),x38(1.83,0.13),x39(1.9,0.46),x40(1.65,- 0.1),x41(1.26,1.8),x42(1.17,2.57),x43(0.67,1.66),x44(1.13,2.06),x45(0.76,1.52), x46(1.48,1.77),x47(0.99,1.81),x48(1.52,2.13),x49(0.87,2.3),x50(1.19,2.1),x51 (0.98,1.88),x52(0.36,2.26),x53(0.69,2.25),x54(1.19,2.04),x55(0.98,2.18),x56 (0.65,2.13),x57(0.8,1.69),x58(1.08,2.24),x59(0.69,1.79),x60(1.31,1.81);
For convenience, each data object is treated as a point in a plane rectangular coordinate system, with its 2 attributes as the 2 coordinates of the point. Fig. 2 shows the 60 data objects as points in the plane rectangular coordinate system.
(2) Build the Delaunay triangulation network G=(V, E) for the data set X={x1,x2,...,x60}, as shown in Fig. 3. All Delaunay triangles built for X={x1,x2,...,x60} are listed below:
t1(v28,v26,v25),t2(v12,v13,v9),t3(v8,v7,v1),t4(v12,v16,v13),t5(v4,v16,v10),t6 (v18,v2,v10),t7(v12,v10,v16),t8(v7,v8,v4),t9(v2,v6,v7),t10(v6,v1,v7),t11(v43,v59,v52),t12 (v12,v5,v3),t13(v39,v34,v33),t14(v19,v11,v16),t15(v55,v58,v49),t16(v53,v52,v56),t17(v20,v3, v34),t18(v54,v60,v48),t19(v3,v23,v34),t20(v26,v39,v25),t21(v34,v45,v20),t22(v33,v38,v25), t23(v24,v40,v32),t24(v35,v5,v17),t25(v39,v26,v31),t26(v32,v28,v24),t27(v13,v16,v11),t28 (v17,v13,v11),t29(v9,v14,v12),t30(v17,v9,v13),t31(v14,v15,v12),t32(v17,v14,v9),t33(v17,v15, v14),t34(v17,v5,v15),t35(v3,v37,v40),t36(v5,v12,v15),t37(v18,v10,v12),t38(v18,v20,v6),t39 (v18,v12,v20),t40(v12,v3,v20),t41(v5,v37,v3),t42(v1,v19,v8),t43(v16,v4,v19),t44(v6,v45, v1),t45(v7,v4,v10),t46(v8,v19,v4),t47(v52,v1,v43),t48(v52,v19,v1),t49(v10,v2,v7),t50(v18, v6,v2),t51(v55,v49,v56),t52(v20,v45,v6),t53(v59,v56,v52),t54(v45,v57,v43),t55(v44,v54,v50), t56(v49,v53,v56),t57(v49,v42,v53),t58(v42,v49,v58),t59(v42,v52,v53),t60(v59,v51,v56),t61 (v59,v57,v51),t62(v47,v45,v41),t63(v43,v1,v45),t64(v47,v57,v45),t65(v59,v43,v57),t66(v54, v51,v41),t67(v55,v56,v51),t68(v51,v47,v41),t69(v51,v57,v47),t70(v55,v44,v58),t71(v51,v44, v55),t72(v51,v54,v44),t73(v50,v48,v58),t74(v46,v48,v60),t75(v44,v50,v58),t76(v54,v41,v60), t77(v45,v46,v41),t78(v41,v46,v60),t79(v50,v54,v48),t80(v58,v48,v42),t81(v22,v26,v32),t82 (v46,v39,v27),t83(v46,v45,v39),t84(v46,v27,v29),t85(v46,v29,v48),t86(v31,v26,v22),t87(v38, v28,v25),t88(v38,v33,v24),t89(v31,v22,v30),t90(v31,v27,v39),t91(v27,v31,v29),t92(v30,v37, v36),t93(v24,v34,v23),t94(v39,v45,v34),t95(v38,v24,v28),t96(v23,v3,v40),t97(v39,v33,v25), t98(v34,v24,v33),t99(v32,v40,v37),t100(v24,v23,v40),t101(v21,v37,v30),t102(v22,v32,v21),t103 (v26,v28,v32),t104(v37,v35,v36),t105(v37,v5,v35),t106(v22,v21,v30),t107(v32,v37,v21),t108 (v30,v36,v31);
The Delaunay triangulation network contains 108 triangles in total. For example, triangle t1 consists of the three nodes v28, v26, v25. Because each data object xi ∈ X corresponds one-to-one with a node vi ∈ V of the triangulation network G, t1 can be regarded as formed by the three data objects x28, x26, x25, whose 2-dimensional attributes are (1.91,0.1), (2.03,0.03) and (1.86,0.19) respectively. For convenience below, the nodes of all the Delaunay triangles above are rewritten as data objects:
t1(x28,x26,x25),t2(x12,x13,x9),t3(x8,x7,x1),t4(x12,x16,x13),t5(x4,x16,x10),t6 (x18,x2,x10),t7(x12,x10,x16),t8(x7,x8,x4),t9(x2,x6,x7),t10(x6,x1,x7),t11(x43,x59,x52),t12 (x12,x5,x3),t13(x39,x34,x33),t14(x19,x11,x16),t15(x55,x58,x49),t16(x53,x52,x56),t17(x20,x3, x34),t18(x54,x60,x48),t19(x3,x23,x34),t20(x26,x39,x25),t21(x34,x45,x20),t22(x33,x38,x25), t23(x24,x40,x32),t24(x35,x5,x17),t25(x39,x26,x31),t26(x32,x28,x24),t27(x13,x16,x11),t28 (x17,x13,x11),t29(x9,x14,x12),t30(x17,x9,x13),t31(x14,x15,x12),t32(x17,x14,x9),t33(x17,x15, x14),t34(x17,x5,x15),t35(x3,x37,x40),t36(x5,x12,x15),t37(x18,x10,x12),t38(x18,x20,x6),t39 (x18,x12,x20),t40(x12,x3,x20),t41(x5,x37,x3),t42(x1,x19,x8),t43(x16,x4,x19),t44(x6,x45, x1),t45(x7,x4,x10),t46(x8,x19,x4),t47(x52,x1,x43),t48(x52,x19,x1),t49(x10,x2,x7),t50(x18, x6,x2),t51(x55,x49,x56),t52(x20,x45,x6),t53(x59,x56,x52),t54(x45,x57,x43),t55(x44,x54,x50), t56(x49,x53,x56),t57(x49,x42,x53),t58(x42,x49,x58),t59(x42,x52,x53),t60(x59,x51,x56),t61 (x59,x57,x51),t62(x47,x45,x41),t63(x43,x1,x45),t64(x47,x57,x45),t65(x59,x43,x57),t66(x54, x51,x41),t67(x55,x56,x51),t68(x51,x47,x41),t69(x51,x57,x47),t70(x55,x44,x58),t71(x51,x44, x55),t72(x51,x54,x44),t73(x50,x48,x58),t74(x46,x48,x60),t75(x44,x50,x58),t76(x54,x41,x60), t77(x45,x46,x41),t78(x41,x46,x60),t79(x50,x54,x48),t80(x58,x48,x42),t81(x22,x26,x32),t82 (x46,x39,x27),t83(x46,x45,x39),t84(x46,x27,x29),t85(x46,x29,x48),t86(x31,x26,x22),t87(x38, x28,x25),t88(x38,x33,x24),t89(x31,x22,x30),t90(x31,x27,x39),t91(x27,x31,x29),t92(x30,x37, x36),t93(x24,x34,x23),t94(x39,x45,x34),t95(x38,x24,x28),t96(x23,x3,x40),t97(x39,x33,x25), t98(x34,x24,x33),t99(x32,x40,x37),t100(x24,x23,x40),t101(x21,x37,x30),t102(x22,x32,x21),t103 (x26,x28,x32),t104(x37,x35,x36),t105(x37,x5,x35),t106(x22,x21,x30),t107(x32,x37,x21),t108 (x30,x36,x31)。
(3) Compute the mean of the three vertices of each of the 108 triangles t1-t108 and take it as the representative point of that triangle, as shown in Fig. 4. For example, triangle t1 is formed by the three data objects x28, x26, x25, whose 2-dimensional attributes are (1.91,0.1), (2.03,0.03) and (1.86,0.19) respectively, so the representative point r1 of t1 is $r_1=\left(\frac{1.91+2.03+1.86}{3},\frac{0.1+0.03+0.19}{3}\right)\approx(1.93,0.11)$. The representative points of t1-t108 are listed below:
r1(1.93,0.11),r2(-0.10,-0.19),r3(-0.18,0.44),r4(-0.16,-0.12),r5(-0.20, 0.06),r6(0.01,0.20),r7(-0.09,-0.02),r8(-0.22,0.23),r9(0.00,0.40),r10(-0.04, 0.55),r11(0.57,1.90),r12(0.26,-0.18),r13(1.74,0.28),r14(-0.44,-0.10),r15(0.98, 2.24),r16(0.57,2.21),r17(0.78,0.10),r18(1.34,1.99),r19(1.18,0.03),r20(1.93, 0.23),r21(0.87,0.64),r22(1.83,0.17),r23(1.79,-0.07),r24(0.74,-0.52),r25(2.10, 0.22),r26(1.88,0.00),r27(-0.34,-0.14),r28(-0.27,-0.30),r29(-0.02,-0.23),r30(- 0.15,-0.33),r31(0.05,-0.24),r32(-0.07,-0.38),r33(0.00,-0.38),r34(0.09,-0.39),r35 (1.34,-0.22),r36(0.14,-0.24),r37(0.05,0.06),r38(0.18,0.33),r39(0.17,0.10),r40 (0.29,0.01),r41(0.88,-0.30),r42(-0.35,0.33),r43(-0.38,0.01),r44(0.24,0.96),r45(- 0.12,0.18),r46(-0.39,0.12),r47(0.29,1.56),r48(-0.12,1.00),r49(-0.05,0.23),r50 (0.06,0.36),r51(0.83,2.20),r52(0.40,0.78),r53(0.57,2.06),r54(0.74,1.62),r55 (1.17,2.07),r56(0.74,2.23),r57(0.91,2.37),r58(1.04,2.37),r59(0.74,2.36),r60 (0.77,1.93),r61(0.82,1.79),r62(1.00,1.71),r63(0.43,1.32),r64(0.85,1.67),r65 (0.72,1.71),r66(1.14,1.91),r67(0.87,2.06),r68(1.08,1.83),r69(0.92,1.79),r70 (1.06,2.16),r71(1.03,2.04),r72(1.10,1.99),r73(1.26,2.16),r74(1.44,1.90),r75 (1.13,2.13),r76(1.25,1.88),r77(1.17,1.70),r78(1.35,1.79),r79(1.30,2.09),r80 (1.26,2.31),r81(2.08,-0.04),r82(1.91,0.93),r83(1.38,1.25),r84(2.08,0.97),r85 (1.80,1.49),r86(2.19,0.05),r87(1.87,0.14),r88(1.77,0.10),r89(2.23,-0.04),r90 (2.21,0.40),r91(2.38,0.44),r92(2.11,-0.37),r93(1.58,0.06),r94(1.40,0.72),r95 (1.81,0.07),r96(1.22,-0.06),r97(1.85,0.28),r98(1.67,0.12),r99(1.87,-0.21),r100 (1.62,-0.04),r101(2.05,-0.28),r102(2.10,-0.10),r103(1.99,0.02),r104(2.07,-0.52), r105(1.40,-0.49),r106(2.13,-0.15),r107(2.01,-0.23),r108(2.26,-0.16)。
(4) Compute the reciprocal of the area of the triangle containing each of the 108 representative points r1-r108 and take it as the density of that representative point. For example, r1 (1.93, 0.11) lies in triangle t1, which is formed by the three data objects x28 (1.91, 0.1), x26 (2.03, 0.03) and x25 (1.86, 0.19). First, compute the length of the side joining x28 and x26: $a = \sqrt{(1.91-2.03)^2 + (0.1-0.03)^2} \approx 0.139$. Then compute the length of the side joining x28 and x25: $b = \sqrt{(1.91-1.86)^2 + (0.1-0.19)^2} \approx 0.103$. Next, compute the length of the side joining x26 and x25: $c = \sqrt{(2.03-1.86)^2 + (0.03-0.19)^2} \approx 0.233$. Then compute the semi-perimeter of triangle t1, $p = \frac{a+b+c}{2} \approx 0.238$, so that the area of t1 is $S = \sqrt{p(p-a)(p-b)(p-c)} \approx 0.0036$.
Finally, the reciprocal of the area S, i.e. $\rho_1 = 1/S \approx 277.78$, is taken as the density of the representative point r1. The densities of the representative points r1-r108, computed in the same way, are as follows:
ρ1=277.78, ρ2=43.86, ρ3=15.85, ρ4=53.19, ρ5=39.06, ρ6=69.44, ρ7=29.76, ρ8=63.29, ρ9=232.56, ρ10=19.38, ρ11=38.31, ρ12=21.23, ρ13=28.49, ρ14=54.05, ρ15= 107.53,ρ16=50., ρ17=5.26, ρ18=23.04, ρ19=10.47, ρ20=38.17, ρ21=1.27, ρ22=476.19, ρ23=65.36, ρ24=4., ρ25=12.17, ρ26=35.09, ρ27=90.91, ρ28=27.78, ρ29=119.05, ρ30= 45.25,ρ31=70.92, ρ32=128.21, ρ33=68.97, ρ34=46.51, ρ35=4.71, ρ36=81.3, ρ37=55.56, ρ38=22.68, ρ39=32.15, ρ40=15.24, ρ41=4.87, ρ42=25.64, ρ43=32.47, ρ44=5.69, ρ45= 43.1,ρ46=44.44, ρ47=2.6, ρ48=8.7, ρ49=333.33, ρ50=33.9, ρ51=44.44, ρ52=4.36, ρ53= 21.41,ρ54=96.15, ρ55=555.56, ρ56=102.04, ρ57=59.52, ρ58=26.74, ρ59=18.12, ρ60= 19.57,ρ61=51.28, ρ62=24.81, ρ63=10.27, ρ64=72.99, ρ65=123.46, ρ66=32.47, ρ67= 20.2,ρ68=106.38, ρ69=136.99, ρ70=95.24, ρ71=44.44, ρ72=144.93, ρ73=40.32, ρ74= 31.85,ρ75=156.25, ρ76=156.25, ρ77=26.11, ρ78=526.32, ρ79=101.01, ρ80=12.89, ρ81= 149.25,ρ82=3.08, ρ83=1.91, ρ84=41.67, ρ85=5.27, ρ86=40.98, ρ87=344.83, ρ88= 133.33,ρ89=59.88, ρ90=10.8, ρ91=125, ρ92=28.25, ρ93=74.07, ρ94=2.86, ρ95=120.48, ρ96=15.38, ρ97=106.38, ρ98=36.63, ρ99=13.74, ρ100=156.25, ρ101=65.36, ρ102=188.68, ρ103=158.73, ρ104=23.2, ρ105=5.5, ρ106=181.82, ρ107=68.03, ρ108=23.31.
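The density computation of step (4) can be sketched as below, using Heron's formula exactly as in the worked example. The function name `triangle_density` is hypothetical. Note that with the two-decimal coordinates printed above the sketch gives about 274 for t1 rather than the listed 277.78; the listed value equals 1/0.0036, which is consistent with the area having been rounded before inversion, though the source does not say so.

```python
import math

def triangle_density(p1, p2, p3):
    """Reciprocal of the triangle area, with the area from Heron's formula."""
    a = math.dist(p1, p2)
    b = math.dist(p1, p3)
    c = math.dist(p2, p3)
    p = (a + b + c) / 2.0  # semi-perimeter
    # Clamp at zero: rounding can make the product slightly negative
    # for very thin triangles.
    area = math.sqrt(max(p * (p - a) * (p - b) * (p - c), 0.0))
    if area == 0.0:
        raise ValueError("degenerate (zero-area) triangle has no finite density")
    return 1.0 / area

# Triangle t1: x28, x26, x25 with the rounded coordinates from the text.
rho1 = triangle_density((1.91, 0.10), (2.03, 0.03), (1.86, 0.19))
print(round(rho1, 1))
```

Small triangles (tightly packed data objects) therefore receive high densities, which is what makes the reciprocal area usable as a local density estimate.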
(5) Let R = {r1, r2, ..., r108} be the set of all representative points, where 108 is the number of triangles in the constructed Delaunay triangulation network. Select from R the representative point with the highest density as the 1st initial cluster center c1. Here the density of r55, ρ55 = 555.56, is the maximum, so r55 (1.17, 2.07) is added to the initial cluster center set C as the 1st initial cluster center c1, marked with "★" in Fig. 5; that is, C = {c1} = {(1.17, 2.07)}. The highest-density representative point r55 is then removed from R, giving R = {r1, r2, ..., r54, r56, r57, ..., r108}.
(6) For each representative point in R = {r1, r2, ..., r54, r56, r57, ..., r108}, compute its mixing distance to the first initial cluster center. For example, to compute the mixing distance between r1 and r55, first compute the sum of their densities, ρ1 + ρ55 = 277.78 + 555.56 = 833.34. Then compute the Euclidean distance between r1 (1.93, 0.11) and r55 (1.17, 2.07): $d_{1(55)} = \sqrt{(1.93-1.17)^2 + (0.11-2.07)^2} \approx 2.10$. The mixing distance between r1 and r55 is then h1(55) = (ρ1 + ρ55) × d1(55) = 833.34 × 2.10 = 1750.01. Computing the mixing distance to the first initial cluster center r55 in the same way for every representative point in R = {r1, r2, ..., r54, r56, r57, ..., r108} gives the following results:
r1: 1750.01, r2: 1552.5, r3: 1211.39, r4: 1558.4, r5: 1444.93, r6: 1375, r7: 1428.18, r8: 1429.54, r9: 1607.76, r10: 1115.38, r11: 368.2, r12: 1401.6, r13: 1098.01, r14: 1645.95, r15: 165.77, r16: 375.45, r17: 1127.25, r18: 109.93, r19: 1154.7, r20: 1181.52, r21: 812.97, r22: 2073.82, r23: 1384.65, r24: 1471.64, r25: 1175.2, r26: 1293.52, r27: 1732.54, r28: 1615.85, r29: 1747.24, r30: 1646.22, r31: 1610.05, r32: 1880.37, r33: 1698.72, r34: 1619.57, r35: 1288.62, r36: 1611.26, r37: 1405.58, r38: 1156.48, r39: 1298.84, r40: 1278.59, r41: 1339.43, r42: 1342.57, r43: 1517.12, r44: 813.81, r45: 1370.93, r46: 1500, r47: 569.32, r48: 947.96, r49: 1964.45, r50: 1202.5, r51: 216, r52: 839.88, r53: 346.18, r54: 404.06, r56: 302.5, r57: 246.03, r58: 192.16, r59: 298.31, r60: 241.55, r61: 273.08, r62: 232.15, r63: 594.12, r64: 320.56, r65: 393.83, r66: 94.08, r67: 172.73, r68: 172.1, r69: 263.17, r70: 91.11, r71: 84, r72: 77.05, r73: 77.46, r74: 187.97, r75: 49.83, r76: 149.48, r77: 215.22, r78: 357.02, r79: 85.35, r80: 147.8, r81: 1621.06, r82: 759.75, r83: 473.85, r84: 854.04, r85: 482.31, r86: 1348.18, r87: 1845.8, r88: 1419.11, r89: 1452.44, r90: 1115.73, r91: 1381.54, r92: 1523.74, r93: 1290.74, r94: 765.04, r95: 1419.68, r96: 1216.1, r97: 1264.31, r98: 1190.3, r99: 1360.63, r100: 1537.51, r101: 1558.51, r102: 1756.41, r103: 1578.58, r104: 1585.8, r105: 1441.92, r106: 1784.46, r107: 1527.8, r108: 1435.6.
Among these, the mixing distance between r22 and r55, 2073.82, is the largest. Therefore r22 (1.83, 0.17) is taken as the 2nd initial cluster center c2, marked with "▲" in Fig. 5, and is added to the initial cluster center set C, i.e. C = {c1, c2} = {(1.17, 2.07), (1.83, 0.17)}. r22 is then removed from R, giving R = {r1, r2, …, r21, r23, …, r54, r56, …, r108}.
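The mixing distance used in step (6) can be sketched as a one-line function. The name `mixing_distance` and its argument order are assumptions made for illustration. With the rounded inputs from the text the unrounded result is about 1751.8; the worked example arrives at 1750.01 because it rounds the Euclidean distance to 2.10 before multiplying.

```python
import math

def mixing_distance(r_a, rho_a, r_b, rho_b):
    """Sum of the two densities times the Euclidean distance between the points."""
    return (rho_a + rho_b) * math.dist(r_a, r_b)

# r1 and r55 with their listed densities.
h_1_55 = mixing_distance((1.93, 0.11), 277.78, (1.17, 2.07), 555.56)
print(round(h_1_55, 2))
```

Because the density sum multiplies the distance, two dense representative points that are far apart score highest, which steers the second center toward another dense region rather than toward an isolated point.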
(7) For each representative point in R, compute its mixing distances to the initial cluster centers c1 and c2 in the initial cluster center set, and keep the smaller of the two. For example, take r1 from R and compute its mixing distances to c1 (1.17, 2.07) and c2 (1.83, 0.17), obtaining 1750.01 and 90.48; the minimum mixing distance is therefore 90.48. Computing, in the same way, the mixing distances from every representative point in R to c1 and c2 and keeping the minimum gives the following results:
r1: 90.48, r2: 1019.3, r3: 998.84, r4: 1064.05, r5: 1045.96, r6: 993.05, r7: 976.48, r8: 1105.93, r9: 1304.1, r10: 946.54, r11: 368.2, r12: 800.85, r13: 70.66, r14: 1214.25, r15: 165.77, r16: 375.45, r17: 505.52, r18: 109.93, r19: 321.2, r20: 61.72, r21: 510.88, r23: 129.97, r24: 619.45, r25: 131.86, r26: 92.03, r27: 1241.95, r28: 1083.54, r29: 1125, r30: 1063.74, r31: 1001.21, r32: 1196.71, r33: 1041.26, r34: 956.54, r35: 302.97, r36: 970.03, r37: 946.52, r38: 828.12, r39: 843.84, r40: 761.72, r41: 509.92, r42: 1099.01, r43: 1129.23, r44: 813.81, r45: 1012.62, r46: 1155.8, r47: 569.32, r48: 947.96, r49: 1521.9, r50: 907.96, r51: 216, r52: 744.85, r53: 346.18, r54: 404.06, r56: 302.5, r57: 246.03, r58: 192.16, r59: 298.31, r60: 241.55, r61: 273.08, r62: 232.15, r63: 594.12, r64: 320.56, r65: 393.83, r66: 94.08, r67: 172.73, r68: 172.1, r69: 263.17, r70: 91.11, r71: 84, r72: 77.05, r73: 77.46, r74: 187.97, r75: 49.83, r76: 149.48, r77: 215.22, r78: 357.02, r79: 85.35, r80: 147.8, r81: 206.4, r82: 364.25, r83: 473.85, r84: 435, r85: 482.31, r86: 196.52, r87: 41.05, r88: 54.86, r89: 241.23, r90: 214.28, r91: 366.73, r92: 307.71, r93: 148.57, r94: 335.34, r95: 59.67, r96: 319.52, r97: 64.08, r98: 87.18, r99: 186.17, r100: 189.73, r101: 270.78, r102: 252.65, r103: 139.68, r104: 364.55, r105: 380.54, r106: 289.52, r107: 239.46, r108: 269.73.
The largest of these minimum distances is then selected. Here 1521.9 is the maximum, and the corresponding representative point is r49, so r49 (-0.05, 0.23) is added to the initial cluster center set C as the 3rd initial cluster center c3, marked with "●" in Fig. 5; that is, C = {c1, c2, c3} = {(1.17, 2.07), (1.83, 0.17), (-0.05, 0.23)}. At this point the number of elements in the initial cluster center set C equals K, so the algorithm terminates.
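Steps (5) to (7) together amount to a greedy selection: start from the densest representative point, then repeatedly add the point whose smallest mixing distance to the centers chosen so far is largest. The sketch below is one possible reading of that procedure, not the patent's own implementation; `select_initial_centers` is a hypothetical name, and only six of the 108 representative points are used (with their listed coordinates and densities) to keep the example short. With a single center in C the minimum is just the distance to c1, so step (6) falls out as a special case of step (7).

```python
import math

def select_initial_centers(points, densities, k):
    """Return indices of k initial centers chosen by the density / max-min rule."""
    def h(i, j):
        # Mixing distance: density sum times Euclidean distance.
        return (densities[i] + densities[j]) * math.dist(points[i], points[j])

    remaining = list(range(len(points)))
    first = max(remaining, key=lambda i: densities[i])  # densest point
    centers = [first]
    remaining.remove(first)
    while len(centers) < k and remaining:
        # For each candidate keep its smallest mixing distance to any chosen
        # center, then take the candidate whose smallest distance is largest.
        nxt = max(remaining, key=lambda i: min(h(i, c) for c in centers))
        centers.append(nxt)
        remaining.remove(nxt)
    return centers

# Subset of the representative points and densities listed in the text.
labels = ["r1", "r22", "r49", "r55", "r78", "r87"]
points = [(1.93, 0.11), (1.83, 0.17), (-0.05, 0.23),
          (1.17, 2.07), (1.35, 1.79), (1.87, 0.14)]
densities = [277.78, 476.19, 333.33, 555.56, 526.32, 344.83]

chosen = [labels[i] for i in select_initial_centers(points, densities, 3)]
print(chosen)  # ['r55', 'r22', 'r49']
```

On this subset the selection reproduces the order found in the worked example (r55, then r22, then r49), although the intermediate distances differ slightly from the listed ones because the inputs here are rounded to two decimals.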
To verify the effectiveness of the method proposed by the present invention, its clustering performance on real data sets is given below.
Four data sets, Wine, Soybean-small, Iris and Haberman, were selected from the UCI data sets; Table 1 lists the relevant information for these 4 data sets:
Table 1. Information on the 4 data sets
The initial cluster center selection method proposed by the present invention and the random initial cluster center selection method were each applied to the K-means clustering algorithm, and the clustering results were evaluated using the classification accuracy (AC) as the evaluation index. The accuracy reported for the random method is the average over 10 random clustering runs. The results are given in Table 2. The classification accuracy (AC) is defined as follows: $AC = \frac{1}{N}\sum_{i=1}^{K} a_i$
where K is the number of classes in the data set, N is the total number of data objects in the data set, and a_i is the number of data objects correctly assigned to the i-th class.
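The AC measure can be sketched as follows. The text does not state how cluster indices are matched to class labels before counting a_i, so this sketch assumes the best one-to-one matching found by trying every permutation, which is only practical for small K; the function name `classification_accuracy` and the label encoding 0..K-1 are likewise assumptions.

```python
from itertools import permutations

def classification_accuracy(true_labels, cluster_labels, k):
    """AC = (sum of a_i) / N under the best one-to-one cluster-to-class matching."""
    n = len(true_labels)
    best_correct = 0
    for perm in permutations(range(k)):
        # perm[c] is the class that cluster c is mapped to in this trial.
        correct = sum(1 for t, c in zip(true_labels, cluster_labels)
                      if perm[c] == t)
        best_correct = max(best_correct, correct)
    return best_correct / n

true_labels = [0, 0, 1, 1, 2, 2]
cluster_labels = [1, 1, 0, 0, 2, 0]  # cluster ids are arbitrary
ac = classification_accuracy(true_labels, cluster_labels, 3)
print(round(ac, 4))  # 0.8333, i.e. 5 of 6 objects correctly assigned
```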
Table 2. AC values of the 4 data sets under the two initial cluster center selection methods
As Table 2 shows, when the initial cluster center selection method based on the Delaunay triangulation network proposed by the present invention is applied to the K-means clustering algorithm, its classification accuracy (AC) is clearly higher than that of the random method.
Selecting the initial cluster centers with the method of the invention also avoids the influence of outliers. To the artificially generated data set X = {x1, x2, ..., x60} of 60 data objects used in the preceding embodiment, 3 further data objects that are outliers were added, with attributes (-0.5, 2.5), (1.0, 0.8) and (2.25, 2.5), shown as "▲" in Fig. 6. The method of the invention was then used to select initial cluster centers for this data set, yielding the 3 initial cluster centers (1.17, 2.07), (1.83, 0.17) and (-0.05, 0.23), marked with "★" in Fig. 6. This result is identical to the initial cluster centers selected in the embodiment and is unaffected by the outliers. If the initial cluster centers are chosen at random, by contrast, an outlier may well be selected as an initial cluster center. Because the clustering result of the K-means algorithm depends on the choice of initial cluster centers, feeding an outlier to the K-means algorithm as an initial cluster center can lead to an incorrect clustering result.
The preferred embodiments of the present invention have been described in detail above. It should be understood that a person of ordinary skill in the art can make many modifications and variations according to the concept of the present invention without creative effort. Therefore, any technical solution that a person skilled in the art can obtain through logical analysis, reasoning or limited experimentation on the basis of the prior art, under the concept of the present invention, shall fall within the scope of protection defined by the claims.

Claims (8)

1. A K-means initial cluster center selection method based on a Delaunay triangulation network, characterized by comprising the following steps:
Step 1: represent the data set to be clustered as a Delaunay triangulation network, such that each data point in the data set to be clustered corresponds one-to-one to a node in the Delaunay triangulation network;
Step 2: compute the mean of the three vertices of each triangle in the Delaunay triangulation network, and take the mean as the representative point of that triangle;
Step 3: compute the reciprocal of the area of the triangle containing each representative point, and take that reciprocal as the density of the representative point;
Step 4: compute the sum of the densities of two representative points and the Euclidean distance between them, and take the product of the two as the mixing distance between the two representative points;
Step 5: select, from all the representative points, the representative point with the highest density as the 1st initial cluster center, and add it to the initial cluster center set C;
Step 6: select the representative point with the largest mixing distance to the 1st initial cluster center as the 2nd initial cluster center, and add it to the initial cluster center set C;
Step 7: for each remaining representative point, compute its mixing distance to each initial cluster center in the initial cluster center set C and keep the minimum mixing distance; then select, among all the minimum mixing distances, the representative point corresponding to the largest one and add it to the initial cluster center set C; representative points satisfying this condition are repeatedly selected and added to the initial cluster center set C until the number of elements in the initial cluster center set C equals K.
2. The K-means initial cluster center selection method based on a Delaunay triangulation network according to claim 1, characterized in that step 1 specifically comprises:
The data set to be clustered is denoted X = {x1, x2, ..., xn} and contains n data objects; a Delaunay triangulation network G = (V, E) is constructed for the data set X, in which each data object xi ∈ X corresponds one-to-one to a node vi ∈ V of the triangulation network G, and the distance between two nodes of the triangulation network G equals the Euclidean distance between the corresponding data objects, i.e. d(vi, vj) = d(xi, xj).
3. The K-means initial cluster center selection method based on a Delaunay triangulation network according to claim 1, characterized in that step 2 specifically comprises:
The three vertices of a triangle T in the triangulation network G are denoted vi, vj, vk, and correspond one-to-one to the three data objects xi, xj, xk in the data set to be clustered, where xi = (xi1, xi2, …, xid), xj = (xj1, xj2, …, xjd), xk = (xk1, xk2, …, xkd); the mean of the three vertices is computed as $r = \left(\frac{x_{i1}+x_{j1}+x_{k1}}{3}, \frac{x_{i2}+x_{j2}+x_{k2}}{3}, \ldots, \frac{x_{id}+x_{jd}+x_{kd}}{3}\right)$, and this mean is taken as the representative point of the triangle T.
4. The K-means initial cluster center selection method based on a Delaunay triangulation network according to claim 1, characterized in that step 3 specifically comprises:
The three vertices of the triangle T containing the representative point r are denoted vi, vj, vk, and correspond one-to-one to the three data objects xi, xj, xk in the data set to be clustered, where xi = (xi1, xi2, …, xid), xj = (xj1, xj2, …, xjd), xk = (xk1, xk2, ..., xkd); the three side lengths of the triangle T are then:
$a = \sqrt{(x_{i1}-x_{j1})^2 + (x_{i2}-x_{j2})^2 + \cdots + (x_{id}-x_{jd})^2}$,
$b = \sqrt{(x_{i1}-x_{k1})^2 + (x_{i2}-x_{k2})^2 + \cdots + (x_{id}-x_{kd})^2}$,
$c = \sqrt{(x_{j1}-x_{k1})^2 + (x_{j2}-x_{k2})^2 + \cdots + (x_{jd}-x_{kd})^2}$,
The semi-perimeter of the triangle is computed as $p = \frac{a+b+c}{2}$, and the area S of the triangle T is obtained as $S = \sqrt{p(p-a)(p-b)(p-c)}$. Finally, the reciprocal of the area S, i.e. $\rho = 1/S$, is taken as the density of the representative point r.
5. The K-means initial cluster center selection method based on a Delaunay triangulation network according to claim 1, characterized in that step 4 specifically comprises:
The densities of two representative points r1 and r2 are denoted ρ1 and ρ2 respectively, and the Euclidean distance between the representative point r1 and the representative point r2 is denoted d12; the mixing distance between the representative point r1 and the representative point r2 is then h = (ρ1 + ρ2) × d12.
6. The K-means initial cluster center selection method based on a Delaunay triangulation network according to claim 1, characterized in that step 5 specifically comprises:
The set of all representative points is denoted R = {r1, r2, ..., rt}; first, the densities of all representative points are computed according to step 3; then the representative point with the highest density is selected from the set R as the 1st initial cluster center c1 and added to the initial cluster center set C, i.e. C = {c1}; the highest-density representative point is then removed from the set R and the representative point set is re-indexed, giving R = {r1, r2, ..., rt-1}.
7. The K-means initial cluster center selection method based on a Delaunay triangulation network according to claim 1, characterized in that step 6 specifically comprises:
The mixing distance between each representative point in the representative point set R = {r1, r2, ..., rt-1} and the first initial cluster center is computed; the representative point with the largest mixing distance is taken as the 2nd initial cluster center c2 and added to the initial cluster center set C, i.e. C = {c1, c2}; that representative point is then removed from the set R and the representative point set is re-indexed, giving R = {r1, r2, ..., rt-2}.
8. The K-means initial cluster center selection method based on a Delaunay triangulation network according to claim 1, characterized in that step 7 specifically comprises:
Step 71: select r1 from the remaining representative point set R, compute its mixing distance to each initial cluster center in the initial cluster center set C, and select the minimum of these mixing distances, denoted h1min;
Step 72: select r2 from R, compute its mixing distance to each initial cluster center in the initial cluster center set C, and select the minimum of these mixing distances, denoted h2min; continue in this way until the last representative point rt-2 has been selected from R, its mixing distance to each initial cluster center in the initial cluster center set C has been computed, and the minimum of these mixing distances, denoted h(t-2)min, has been selected;
Step 73: among all the minimum mixing distances h1min, h2min, ..., h(t-2)min, select the representative point corresponding to the largest one and add it to the initial cluster center set C; representative points satisfying this condition are repeatedly selected from the remaining representative points and added to the initial cluster center set C until the number of elements in the initial cluster center set C equals K.
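The claims do not prescribe how the Delaunay triangulation network of step 1 is constructed. For small 2-dimensional data sets, one way to sketch it without external libraries is the brute-force empty-circumcircle test below: a triple of points forms a Delaunay triangle if no other point lies strictly inside its circumcircle. All function names here are hypothetical, the approach assumes points in general position, and its O(n^4) cost makes it an illustration only; a practical implementation would more likely call an established routine such as `scipy.spatial.Delaunay`.

```python
from itertools import combinations

EPS = 1e-12

def orient(a, b, c):
    """Twice the signed area of triangle abc (positive if counterclockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def in_circumcircle(a, b, c, d):
    """True if d lies strictly inside the circumcircle of triangle abc."""
    rows = []
    for p in (a, b, c):
        dx, dy = p[0] - d[0], p[1] - d[1]
        rows.append((dx, dy, dx * dx + dy * dy))
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = rows
    det = (a1 * (b2 * c3 - b3 * c2)
           - a2 * (b1 * c3 - b3 * c1)
           + a3 * (b1 * c2 - b2 * c1))
    # The determinant's sign depends on the vertex order, so multiply by the
    # orientation to make the test independent of it.
    return det * orient(a, b, c) > EPS

def delaunay_bruteforce(points):
    """Index triples of all triangles whose circumcircle contains no other point."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        a, b, c = points[i], points[j], points[k]
        if abs(orient(a, b, c)) < EPS:
            continue  # collinear points do not form a triangle
        others = (points[m] for m in range(len(points)) if m not in (i, j, k))
        if not any(in_circumcircle(a, b, c, d) for d in others):
            triangles.append((i, j, k))
    return triangles

# Four corners of a square plus its center: the center splits it into 4 triangles.
pts = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0), (2.0, 2.0)]
tris = delaunay_bruteforce(pts)
print(tris)  # [(0, 1, 4), (0, 3, 4), (1, 2, 4), (2, 3, 4)]
```

Each returned index triple identifies the three data objects of one triangle, from which the representative point (step 2) and the density (step 3) can then be computed.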

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710090315.3A CN106934415A (en) 2017-02-20 2017-02-20 A kind of K means initial cluster center choosing methods based on Delaunay triangulation network

Publications (1)

Publication Number Publication Date
CN106934415A true CN106934415A (en) 2017-07-07

Family

ID=59423857

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710090315.3A Pending CN106934415A (en) 2017-02-20 2017-02-20 A kind of K means initial cluster center choosing methods based on Delaunay triangulation network

Country Status (1)

Country Link
CN (1) CN106934415A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108304849A (en) * 2018-01-15 2018-07-20 浙江理工大学 A kind of bird plumage color character extracting method
CN108376541A (en) * 2018-02-07 2018-08-07 桂林电子科技大学 A kind of domestic environment sound based on active noise reduction inhibits the location mode of signal
CN110378415A (en) * 2019-07-19 2019-10-25 浙江理工大学 A kind of SAR image sorting algorithm
CN111985530A (en) * 2020-07-08 2020-11-24 上海师范大学 Classification method
CN111985530B (en) * 2020-07-08 2023-12-08 上海师范大学 Classification method

Legal Events

Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20170707)