CN109086831A - Hybrid Clustering Algorithm based on Fuzzy C-Means Algorithm and artificial bee colony clustering algorithm - Google Patents

Info

Publication number
CN109086831A
Authority
CN
China
Prior art keywords
nectar source
bee
algorithm
stage
fuzzy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810935647.1A
Other languages
Chinese (zh)
Inventor
李宏伟
卫建华
田智慧
赫晓慧
郭恒亮
王晓蕾
赵姗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/23: Clustering techniques
    • G06F18/232: Non-hierarchical techniques
    • G06F18/2321: Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213: Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/004: Artificial life, i.e. computing arrangements simulating life
    • G06N3/006: Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]

Abstract

The present invention relates to the technical field of artificial bee colony algorithms, and in particular to a hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm. The algorithm comprises an initialization stage, a lead bee stage, a follow bee stage, and a scout bee stage, and further comprises the following steps. Step 1: after the follow bee stage, determine whether the algorithm is in its first cycle; if so, execute Step 2; if not, execute Step 3. Step 2: use the current optimal solution as the initial cluster centers of the fuzzy C-means clustering algorithm and optimize it; if the quality of the optimized solution is higher than that of the current optimal solution, replace the current optimal solution with the optimized solution, otherwise discard the optimized solution; at the same time, increase the iteration count of the corresponding nectar source by 1, and then enter the scout bee stage. Step 3: determine whether the optimal solution changed after the follow bee stage; if so, execute Step 2; if not, enter the scout bee stage. The algorithm provided by the present invention offers high clustering accuracy, fast convergence, and high optimization precision.

Description

Hybrid Clustering Algorithm based on Fuzzy C-Means Algorithm and artificial bee colony clustering algorithm
Technical field
The present invention relates to the technical field of artificial bee colony algorithms, and in particular to a hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm.
Background
About the fuzzy C-means algorithm:
In 1974, Dunn proposed the fuzzy C-means (FCM) clustering algorithm on the basis of Bezdek's research. It is widely used in fields such as geospatial information, image processing, and data mining. The biggest difference between the fuzzy C-means algorithm and the hard C-means (HCM) algorithm lies in the membership degree of objects: hard C-means requires the membership degree of an object to be either 0 or 1, whereas fuzzy C-means allows any value in [0, 1], including 0 and 1. This gives objects greater flexibility: an object may belong to class $C_1$ and also to class $C_2$, only to different degrees.
The basic process of the fuzzy C-means clustering algorithm is as follows. First, analyze the distribution of the objects in the data set and, based on that distribution, set a suitable number of clusters c and fuzzy exponent m. Then randomly select c objects from the data set as the initial cluster centers. Next, iterate: compute the partition matrix, which holds the membership degree of every object to every class, and determine the next generation of cluster centers from the partition matrix and the data set. Finally, when the objective function converges to the required precision or the membership degrees of the objects remain stable, stop iterating and obtain the final cluster centers; the data set is then partitioned fuzzily according to the partition matrix.
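The objective function of the fuzzy C-means algorithm is defined as follows:
$$F(X, U, C) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} d_{ij}^{2} \tag{1.1}$$
$$d_{ij} = \lVert x_j - v_i \rVert \tag{1.2}$$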
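where $C = \{C_1, C_2, \ldots, C_c\}$ is the set of classes, $d_{ij}$ is the distance from object $x_j$ to the cluster center $v_i$ of the i-th subclass, and U is an n × c partition matrix, the set of all $u_{ij}$. $u_{ij}$ is the membership degree of the j-th object $x_j$ to the i-th class, with $u_{ij} \in [0, 1]$. $u_{ij}$ satisfies the following constraint, so that no class is empty and no class contains every object completely:
$$0 < \sum_{j=1}^{n} u_{ij} < n, \quad i = 1, 2, \ldots, c \tag{1.3}$$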
Meanwhile each object is 1 to the sum of degree of membership of all classes, i.e.,
In the formulas, m is the fuzziness parameter; m ∈ [1, +∞) controls the degree of fuzziness of the algorithm:
When $m \to 1^{+}$, $u_{ij} \to 1$ or 0, and the FCM algorithm degenerates into the HCM algorithm;
When $m \to +\infty$, $u_{ij} \to 1/c$, and the clustering result of the FCM algorithm is at maximum fuzziness. In other words, the fuzziness of the algorithm increases as m increases. The value of m is usually 2.
F(X, U, C) is the weighted within-class sum of squared errors; the FCM algorithm minimizes the objective function F(X, U, C) through repeated iteration.
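The effect of m on the membership degrees can be checked numerically. The following is a minimal sketch, assuming the standard FCM membership rule $u_{ij} = 1 / \sum_k (d_{ij}/d_{kj})^{2/(m-1)}$; the function name `memberships` and the sample distances are hypothetical and chosen only for illustration.

```python
def memberships(distances, m):
    """Membership of one object in each cluster, given its distance to each center.

    Assumes the standard FCM membership rule; all distances must be positive.
    """
    exponent = 2.0 / (m - 1.0)
    result = []
    for d_i in distances:
        total = sum((d_i / d_k) ** exponent for d_k in distances)
        result.append(1.0 / total)
    return result


# One object at distances 1, 2 and 4 from three cluster centers (made-up values).
dists = [1.0, 2.0, 4.0]
near_hard = memberships(dists, 1.05)   # m close to 1: almost a hard assignment
standard = memberships(dists, 2.0)     # the usual choice m = 2
very_fuzzy = memberships(dists, 50.0)  # large m: memberships approach 1/c
```

With m close to 1 the nearest cluster receives almost all of the membership, while with a large m the three values move toward 1/3 each, matching the two limits described above.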
The specific steps of the fuzzy C-means algorithm are as follows:
Step1: Parameter initialization. Set the number of clusters c and the fuzzy exponent m (1 < m < +∞), which usually takes the value 2. Initialize the cluster centers to obtain $V^{(0)} = \{v_1, v_2, \ldots, v_c\}$. Set the convergence precision ε (ε > 0) and the iteration counter k = 0.
Step2: Calculate the membership matrix U. Using the current cluster center set $V^{(k)}$ (initially $V^{(0)}$), calculate the distance from every object in the data set to every cluster center, then update the membership matrix U according to formula (1.5), i.e.
$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( d_{ij} / d_{kj} \right)^{2/(m-1)}} \tag{1.5}$$
Step3: Update the cluster center set $V^{(k)}$. Let k = k + 1 and, using the membership matrix U, calculate for each class the weighted average of all objects and take it as the new cluster center, i.e.
$$v_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} x_j}{\sum_{j=1}^{n} u_{ij}^{m}} \tag{1.6}$$
Step4: Repeat Step2 and Step3 until the cluster center sets of the last two iterations satisfy the following condition:
$$\lVert V^{(k+1)} - V^{(k)} \rVert < \varepsilon \tag{1.7}$$
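The four steps above can be sketched in plain Python. This is an illustrative sketch rather than the patent's implementation: the function names, the default parameter values, the `max_iter` cap, and the small distance clamp (used to avoid division by zero when an object coincides with a center) are all assumptions.

```python
import math
import random


def update_memberships(data, centers, m):
    """Formula (1.5): U[i][j] is the membership of object j in cluster i."""
    exponent = 2.0 / (m - 1.0)
    # Assumption: clamp distances so an object lying on a center does not divide by zero.
    dist = [[max(math.dist(x, v), 1e-12) for x in data] for v in centers]
    return [[1.0 / sum((row[j] / other[j]) ** exponent for other in dist)
             for j in range(len(data))] for row in dist]


def update_centers(data, U, m):
    """Formula (1.6): membership-weighted mean of all objects, per cluster."""
    dim = len(data[0])
    centers = []
    for row in U:
        w = [u ** m for u in row]
        total = sum(w)
        centers.append([sum(wj * x[t] for wj, x in zip(w, data)) / total
                        for t in range(dim)])
    return centers


def objective(data, centers, U, m):
    """Formula (1.1): weighted within-class sum of squared errors."""
    return sum(U[i][j] ** m * math.dist(x, v) ** 2
               for i, v in enumerate(centers) for j, x in enumerate(data))


def fcm(data, c, m=2.0, eps=1e-5, max_iter=100, centers=None, seed=0):
    """Return (centers, U, objective value) for a list of numeric tuples."""
    if centers is None:
        centers = random.Random(seed).sample(data, c)  # Step1: c random objects
    centers = [list(v) for v in centers]
    for _ in range(max_iter):
        U = update_memberships(data, centers, m)       # Step2
        new_centers = update_centers(data, U, m)       # Step3
        shift = math.sqrt(sum(math.dist(a, b) ** 2
                              for a, b in zip(centers, new_centers)))
        centers = new_centers
        if shift < eps:                                # Step4, formula (1.7)
            break
    U = update_memberships(data, centers, m)
    return centers, U, objective(data, centers, U, m)


# Two small, well-separated groups of made-up points.
blobs = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centers, U, value = fcm(blobs, 2, centers=[(0.0, 0.0), (5.0, 5.0)])
```

The optional `centers` argument lets a caller supply initial cluster centers instead of random objects, which is the hook a hybrid scheme would use to hand FCM a solution found by another search method.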
The standard artificial bee colony algorithm:
As shown in Fig. 2, the standard artificial bee colony algorithm comprises four stages: the initialization stage, the lead bee (employed bee) stage, the follow bee (onlooker bee) stage, and the scout bee stage.
(1) Initialization stage
The initialization stage comprises parameter initialization and generation of the initial nectar sources. The artificial bee colony algorithm has three important parameters: the number of nectar sources SN, the maximum number of cycles of the algorithm MaxCycle, and the maximum number of iterations of a nectar source, limit. In the initial stage, the algorithm randomly generates SN initial nectar sources by formula (2.1) and then calculates the fitness value of each nectar source.
$$x_{ij} = x_j^{\min} + \mathrm{rand}(0, 1) \left( x_j^{\max} - x_j^{\min} \right) \tag{2.1}$$
where i ∈ {1, 2, …, SN} is the index of the nectar source; j ∈ {1, 2, …, D} is the index of the dimension of the nectar source; $x_{ij}$ is the value of the j-th dimension of solution $x_i$; $[x_j^{\min}, x_j^{\max}]$ is the value range of the j-th dimension variable; and rand(0, 1) is a random function with values in (0, 1).
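Formula (2.1) can be sketched as follows. The function name `init_nectar_sources`, its argument names, and the example bounds are assumptions made for illustration.

```python
import random


def init_nectar_sources(sn, lower, upper, rng=None):
    """Formula (2.1): draw SN nectar sources uniformly inside [lower, upper].

    `lower` and `upper` hold the per-dimension bounds x_j^min and x_j^max.
    """
    rng = rng or random.Random()
    return [[lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
            for _ in range(sn)]


# Five two-dimensional nectar sources with made-up bounds.
sources = init_nectar_sources(5, [0.0, -1.0], [1.0, 1.0], random.Random(42))
```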
(2) Lead bee stage
The number of lead bees equals the number of nectar sources. Starting from the initial nectar sources, the lead bees look for nectar sources of higher quality: each performs a neighborhood search near its nectar source by formula (2.2) and generates a new nectar source.
$$v_{ij} = x_{ij} + r \times (x_{ij} - x_{kj}) \tag{2.2}$$
where $v_{ij}$ denotes the new nectar source. Formula (2.2) shows that the new nectar source is obtained from the current nectar source $x_{ij}$ and a neighboring nectar source $x_{kj}$ by changing the value of the j-th dimension of the current nectar source. r is a random number in [-1, 1]; k ∈ {1, 2, …, SN} and j ∈ {1, 2, …, D} are both selected at random, with k ≠ i. j is the dimension being updated: in the lead bee stage, the artificial bee colony algorithm updates one randomly selected dimension to obtain the new nectar source. For the new nectar source $v_{ij}$, if $v_{ij} > x_j^{\max}$, set $v_{ij} = x_j^{\max}$; if $v_{ij} < x_j^{\min}$, set $v_{ij} = x_j^{\min}$. If the fitness value of the new nectar source is greater than that of the old nectar source, the new nectar source replaces the old one; otherwise the lead bee keeps the old nectar source.
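The neighborhood search of formula (2.2), the boundary handling, and the greedy replacement can be sketched as below. The helper names `neighbor_search` and `greedy_select` and the way the fitness function is passed in are illustrative assumptions.

```python
import random


def neighbor_search(sources, i, lower, upper, rng):
    """Formula (2.2): perturb one random dimension j of nectar source i.

    A different source k is picked at random, and the result is clipped to
    the per-dimension bounds. The input list is not modified.
    """
    k = rng.choice([idx for idx in range(len(sources)) if idx != i])
    j = rng.randrange(len(sources[i]))
    r = rng.uniform(-1.0, 1.0)
    candidate = list(sources[i])
    value = sources[i][j] + r * (sources[i][j] - sources[k][j])
    candidate[j] = min(max(value, lower[j]), upper[j])
    return candidate


def greedy_select(old, new, fitness):
    """Keep the new nectar source only if its fitness is strictly higher."""
    return new if fitness(new) > fitness(old) else old
```

Because only one dimension changes per search, each candidate stays close to its parent, which is why the text calls this a neighborhood search.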
(3) Follow bee stage
After finding nectar sources, the lead bees return to the hive, and the share of each nectar source's fitness value in the sum of the fitness values of all nectar sources is calculated by formula (2.3). A follow bee decides whether to search the nectar source of a given lead bee according to a random number generated by the system: if the fitness share of a nectar source is greater than the random number, the follow bee selects that nectar source. This selection strategy is called roulette selection.
$$p_i = \frac{fit_i}{\sum_{n=1}^{SN} fit_n} \tag{2.3}$$
where $fit_i$ is the fitness value of the i-th nectar source. In this stage a follow bee selects a nectar source and performs a neighborhood search; as in the lead bee stage, a new nectar source is generated by formula (2.2). If the fitness value of the new nectar source is higher, the new nectar source is kept; otherwise the old nectar source is retained.
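A minimal sketch of roulette selection based on formula (2.3) follows. It uses the common cumulative-sum form of the roulette wheel, which is an assumption about implementation detail rather than something the text specifies; the function names are hypothetical.

```python
import random


def selection_probabilities(fitness_values):
    """Formula (2.3): share of each fitness value in the total."""
    total = sum(fitness_values)
    return [f / total for f in fitness_values]


def roulette_select(probabilities, rng):
    """Pick an index with probability equal to its share (cumulative-sum wheel)."""
    threshold = rng.random()
    cumulative = 0.0
    for index, p in enumerate(probabilities):
        cumulative += p
        if threshold < cumulative:
            return index
    return len(probabilities) - 1  # guard against floating-point rounding
```

A nectar source with three times the fitness of another is picked about three times as often, so better sources attract more follow bees without excluding weaker ones.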
(4) Scout bee stage
If the fitness value of a nectar source has still not improved after limit neighborhood searches, the current nectar source is regarded as a local optimum. The lead bee corresponding to this nectar source abandons it and becomes a scout bee. The scout bee finds a new nectar source by random search according to formula (2.1), starts searching around this new nectar source, and turns back into a lead bee.
Finally, determine whether the number of cycles of the algorithm has reached the maximum number of cycles MaxCycle. If it has, terminate the program; if not, return to the second stage, where the lead bees continue the neighborhood search and update the nectar sources.
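The scout bee stage can be sketched with a per-source counter of unsuccessful searches. The counter list `trials`, the `>=` comparison, and the function name `scout_phase` are assumptions; the text only states that a nectar source is abandoned when it has not improved after limit searches.

```python
import random


def scout_phase(sources, trials, limit, lower, upper, rng):
    """Replace every nectar source that failed to improve `limit` times.

    The replacement is drawn with formula (2.1) and its counter is reset.
    Both lists are updated in place and returned for convenience.
    """
    for i in range(len(sources)):
        if trials[i] >= limit:
            sources[i] = [lo + rng.random() * (hi - lo)
                          for lo, hi in zip(lower, upper)]
            trials[i] = 0
    return sources, trials
```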
Summary of the invention
The purpose of the present invention is to provide a hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm, which can accelerate the convergence speed of the algorithm.
To achieve the above technical purpose, the technical solution adopted by the present invention is as follows:
A hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm, comprising an initialization stage, a lead bee stage, a follow bee stage, and a scout bee stage, characterized in that the following steps are further included between the follow bee stage and the scout bee stage:
Step 1: After the follow bee stage, determine whether the algorithm is in its first cycle. If it is the first cycle, execute Step 2; if it is not the first cycle, execute Step 3.
Step 2: Use the current optimal solution as the initial cluster centers of the fuzzy C-means clustering algorithm and optimize it. If the quality of the optimized solution is higher than that of the current optimal solution, replace the current optimal solution with the optimized solution; otherwise discard the optimized solution. At the same time, increase the iteration count of the corresponding nectar source by 1, and then enter the scout bee stage.
Step 3: Determine whether the optimal solution changed after the follow bee stage. If it changed, execute Step 2; if it did not change, enter the scout bee stage.
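The decision logic of Steps 1 to 3 can be sketched independently of the clustering details. In this sketch the names `should_run_fcm` and `refine_best` are hypothetical, `fcm_refine` and `fitness` are caller-supplied functions, and the increment of the nectar source's iteration count is left to the caller.

```python
def should_run_fcm(cycle, best_before, best_after):
    """Steps 1 and 3: run FCM in the first cycle, or when the best solution changed."""
    return cycle == 1 or best_after != best_before


def refine_best(best, best_fitness, fcm_refine, fitness):
    """Step 2: accept the FCM-optimized solution only if it is better.

    Returns (solution, fitness, improved). The caller is expected to add 1
    to the iteration count of the corresponding nectar source afterwards.
    """
    candidate = fcm_refine(best)
    candidate_fitness = fitness(candidate)
    if candidate_fitness > best_fitness:
        return candidate, candidate_fitness, True
    return best, best_fitness, False
```

Skipping FCM when the best solution is unchanged avoids re-optimizing a solution that FCM has already processed, which is where the saving in computation comes from.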
Further, the formula for generating a new nectar source in the lead bee stage and/or the follow bee stage is:
$$v_{ij} = x_{ij} + \theta \times (x_{ij} - x_{kj})$$
where θ is the nonlinear change factor, $v_{ij}$ denotes the new nectar source, $x_{ij}$ the current nectar source, and $x_{kj}$ a neighboring nectar source; k ∈ {1, 2, …, SN} and j ∈ {1, 2, …, D} are both selected at random, with k ≠ i; j is the dimension being updated.
Further, the nonlinear change factor θ is determined by the following quantities:
the coefficients m and n; Cycle, the current cycle number; MaxCycle, the maximum number of cycles; and α, which takes the value -1 when rand < 0.5 and 1 when rand ≥ 0.5, where rand is a random function.
Further, the value ranges of m and n are m ∈ [1, 1.5] and n ∈ [0, 0.2], respectively.
Further, the follow bee stage comprises the following steps:
Sort the nectar sources of the lead bees from low to high by fitness value, and assign a weight to each nectar source;
According to the weighted fitness values, the follow bees select nectar sources by roulette selection and perform a neighborhood search to generate new nectar sources.
Further, the weight of a nectar source is calculated by a formula in which
w(i) denotes the weight of the nectar source, with values in [0, 1], and SN denotes the number of lead bees.
Further, after a new nectar source is generated in the lead bee stage and/or the follow bee stage, if the fitness value of the new nectar source is greater than that of the old nectar source, the new nectar source replaces the old one; otherwise the old nectar source is retained.
Further, the following step is included after the scout bee stage: determine whether the number of cycles of the algorithm has reached the maximum number of cycles MaxCycle; if it has, terminate the program; if not, return to the lead bee stage and continue the neighborhood search to update the nectar sources.
The present invention has the following beneficial effects:
1. The present invention adds to the original clustering algorithm a step that judges whether the optimal solution has improved, which further accelerates the convergence of the algorithm.
2. The present invention replaces the random number r in the nectar source update formula with the nonlinear change factor θ, which changes nonlinearly as the algorithm runs. In the initial stage of the algorithm, θ is relatively large, so the update step of the nectar sources is large, the bees search a wider range, and the diversity of the population is better. In the later stage of the algorithm, the bee colony gradually approaches the optimal nectar source and a narrower search is needed; θ slowly decreases and so does the update step, which favors a careful search for better nectar sources near the current one. Combining the improved nectar source update formula with the improved hybrid clustering algorithm further improves the optimization precision of the algorithm.
3. The present invention assigns a weight to each nectar source in order to avoid the drawbacks of plain roulette selection: high randomness, low efficiency, and the possibility that high-quality nectar sources are missed. When nectar sources are selected in the follow bee stage, the nectar sources of the lead bees are sorted by quality from low to high and each is assigned a weight. The higher the quality of a nectar source, the larger its index i, the higher its weight, and the higher its probability of being selected. In the later stage of the algorithm, although the fitness values of all nectar sources converge, the weight of a high-quality nectar source is still higher than that of a low-quality one, so high-quality nectar sources still stand out and obtain more optimization opportunities.
Brief description of the drawings
Fig. 1 is a flow chart of the hybrid clustering algorithm;
Fig. 2 is a plot of a clustering run on the IRIS data.
Detailed description of the embodiments
The present invention is described in detail below through specific embodiments with reference to the drawings. It should be noted that, where no conflict arises, the embodiments of the present invention and the features in them may be combined with one another, and the scope of protection of the present invention is not limited thereto.
Embodiment 1
The foraging behavior of honeybees in the artificial bee colony algorithm corresponds one-to-one to the search for optimal cluster centers in a clustering algorithm; Table 1 lists this correspondence. In the artificial bee colony algorithm, the position of a nectar source corresponds to a possible set of cluster centers in the clustering process, the quality of a nectar source corresponds to the value of the evaluation function, the speed at which the colony finds and collects nectar corresponds to the speed of finding the optimal cluster centers, and the nectar source of best quality corresponds to the optimal cluster centers.
Table 1: Correspondence between the search for optimal cluster centers and honeybee foraging behavior
Let the sample space be $X = \{x_1, x_2, \ldots, x_n\}$, where $x_i$ is a d-dimensional vector. Each nectar source in the artificial bee colony algorithm corresponds to a set of cluster centers $V = \{v_1, v_2, \ldots, v_c\}$, where $v_j$ is a vector of the same dimension as $x_i$; the higher the quality of the nectar source, the better the cluster centers. To evaluate the quality of each nectar source (each set of cluster centers), the fitness function of the artificial bee colony algorithm is defined as:
$$fit_i = \frac{1}{1 + F(X, U, C)} \tag{3.1}$$
where F(X, U, C) is the objective function defined in formula (1.1), that is, the objective function of the fuzzy C-means clustering algorithm. The higher the quality of the nectar source, the better the set of cluster centers, the smaller the value of F(X, U, C), the better the clustering, and the higher the value of $fit_i$. The artificial bee colony algorithm extended in this way is referred to below as the artificial bee colony clustering algorithm (Artificial Bee Colony Clustering algorithm, CABC).
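The fitness function (3.1) can be sketched as follows, assuming that a nectar source stores its c cluster centers as one flat vector of length c × d and that the memberships used in F are those of the standard FCM rule. The function names and sample points are hypothetical.

```python
import math


def fcm_objective(data, centers, m=2.0):
    """F(X, U, C): weighted within-class sum of squared errors.

    Memberships follow the standard FCM rule; distances are clamped
    (an assumption) to avoid division by zero.
    """
    exponent = 2.0 / (m - 1.0)
    dist = [[max(math.dist(x, v), 1e-12) for x in data] for v in centers]
    total = 0.0
    for row in dist:
        for j in range(len(data)):
            u = 1.0 / sum((row[j] / other[j]) ** exponent for other in dist)
            total += (u ** m) * row[j] ** 2
    return total


def fitness(data, nectar_source, c, m=2.0):
    """Formula (3.1): fit = 1 / (1 + F) for a flat nectar source of length c*d."""
    d = len(data[0])
    centers = [nectar_source[i * d:(i + 1) * d] for i in range(c)]
    return 1.0 / (1.0 + fcm_objective(data, centers, m))


# Made-up points in two groups; one good and one poor set of cluster centers.
blobs = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
good = fitness(blobs, [0.033, 0.033, 5.033, 5.033], c=2)
poor = fitness(blobs, [2.5, 2.5, 2.6, 2.6], c=2)
```

Since F is non-negative, the fitness always lies in (0, 1], and centers that sit inside the groups score close to 1 while centers placed between the groups score close to 0.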
As shown in Fig. 1, the specific steps of the artificial bee colony clustering algorithm are as follows:
Step1: Set the number of clusters c, and set the number of nectar sources, the number of lead bees, and the number of follow bees to SN. If the sample attribute dimension is d, set the dimension of a nectar source to D = c × d, the maximum number of iterations of each nectar source to Limit = SN × D, the maximum number of cycles of the algorithm to MaxCycle, and the current cycle number Cycle to 0.
Step2: Randomly generate SN initial nectar sources as initial cluster centers according to the initial nectar source formula, then calculate the membership matrix U and the fitness value of each nectar source, and record the nectar source with the highest fitness value.
The initial nectar source formula is:
$$x_{ij} = x_j^{\min} + \mathrm{rand}(0, 1) \left( x_j^{\max} - x_j^{\min} \right)$$
where i ∈ {1, 2, …, SN} denotes the i-th nectar source; j ∈ {1, 2, …, D} denotes the dimension of the nectar source; $x_{ij}$ is the value of the j-th dimension of solution $x_i$; $[x_j^{\min}, x_j^{\max}]$ is the value range of the j-th dimension variable; and rand(0, 1) is a random function with values in (0, 1).
Step3: Add 1 to the current cycle number Cycle.
Step4: The lead bees perform a neighborhood search and generate new nectar sources $v_{ij}$. The algorithm then updates the membership matrix U according to formula (1.5) and calculates the fitness value of each new nectar source. If the fitness value of the new nectar source is greater than that of the old nectar source, the new nectar source replaces the old one; otherwise the old nectar source is retained.
The formula for generating a new nectar source is:
$$v_{ij} = x_{ij} + r \times (x_{ij} - x_{kj}) \tag{2.2}$$
where $v_{ij}$ denotes the new nectar source, $x_{ij}$ the current nectar source, and $x_{kj}$ a neighboring nectar source; r is a random number in [-1, 1]; k ∈ {1, 2, …, SN} and j ∈ {1, 2, …, D} are both selected at random, with k ≠ i. j is the dimension being updated: in the lead bee stage, the artificial bee colony algorithm updates one randomly selected dimension to obtain the new nectar source. For the new nectar source $v_{ij}$, if $v_{ij} > x_j^{\max}$, set $v_{ij} = x_j^{\max}$; if $v_{ij} < x_j^{\min}$, set $v_{ij} = x_j^{\min}$.
Step5: Calculate the share of each nectar source's fitness value in the sum of the fitness values of all nectar sources.
Step6: The follow bees select nectar sources by roulette selection and then perform a neighborhood search to generate new nectar sources. The algorithm then updates the membership matrix U and calculates the fitness value of each new nectar source. If the fitness value of the new nectar source is greater than that of the old nectar source, the new nectar source replaces the old one; otherwise the old nectar source is retained.
The follow bees generate new nectar sources with the same formula as the lead bees, namely formula (2.2), $v_{ij} = x_{ij} + r \times (x_{ij} - x_{kj})$.
The membership matrix U, an n × c partition matrix formed by the set of all $u_{ij}$, is updated according to the following formula:
$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \frac{\lVert x_j - v_i \rVert}{\lVert x_j - v_k \rVert} \right)^{2/(m-1)}}$$
where c is the set number of clusters and k ∈ {1, 2, …, c} indexes the clusters; $x_j$ denotes an object; V denotes the set of cluster centers; $v_i$ and $v_k$ denote the cluster centers of the i-th and k-th subclasses; and m is the fuzziness parameter, m ∈ [1, +∞), which controls the degree of fuzziness of the algorithm. When $m \to 1^{+}$, $u_{ij} \to 1$ or 0; when $m \to +\infty$, $u_{ij} \to 1/c$. The fuzziness of the algorithm increases as m increases. The value of m is usually 2.
Step7: Determine whether the current cycle Cycle is the first cycle of the algorithm.
If it is the first cycle, then after the follow bee stage the current optimal solution is used as the initial cluster centers of the fuzzy C-means clustering algorithm and optimized. If the quality of the optimized solution is higher than that of the current optimal solution, the optimized solution replaces the current optimal solution; otherwise the optimized solution is discarded. At the same time, the iteration count of the corresponding nectar source is increased by 1.
If it is not the first cycle, there are two cases: (1) if the optimal solution changed after the follow bee stage, the current optimal solution is used as the initial cluster centers of the fuzzy C-means clustering algorithm and optimized; if the quality of the optimized solution is higher than that of the current optimal solution, the optimized solution replaces the current optimal solution, otherwise the optimized solution is discarded, and at the same time the iteration count of the corresponding nectar source is increased by 1; (2) if the optimal solution did not change after the follow bee stage, the fuzzy C-means algorithm is not executed.
Step8: If the fitness value of a nectar source has still not improved after the maximum number of iterations, the lead bee corresponding to that nectar source becomes a scout bee, the algorithm randomly generates a new nectar source according to the initial nectar source formula, and the scout bee turns back into a lead bee.
Step9: Add 1 to the current cycle number Cycle and determine whether Cycle is greater than the maximum number of cycles MaxCycle. If it is, the algorithm has reached the maximum number of cycles: stop iterating, end the algorithm, and output the optimal cluster centers, the membership matrix, and the maximum fitness value. Otherwise, go to Step4 and continue the cycle.
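Steps 1 to 9 can be put together in a compact end-to-end sketch. This is one reading of the steps, not the patent's reference implementation: the function names, the default parameter values, the use of the data's per-attribute minimum and maximum as bounds, treating "the optimal solution changed" as "the best fitness improved", and adding 1 to the iteration count after every FCM call are all assumptions.

```python
import math
import random


def split(src, c):
    """View a flat nectar source of length c*d as c cluster centers."""
    d = len(src) // c
    return [src[i * d:(i + 1) * d] for i in range(c)]


def memberships(data, centers, m):
    """Formula (1.5); distances are clamped to avoid division by zero."""
    e = 2.0 / (m - 1.0)
    dist = [[max(math.dist(x, v), 1e-12) for x in data] for v in centers]
    return [[1.0 / sum((row[j] / other[j]) ** e for other in dist)
             for j in range(len(data))] for row in dist]


def fitness(data, src, c, m):
    """Formula (3.1): 1 / (1 + F), with F from formula (1.1)."""
    centers = split(src, c)
    U = memberships(data, centers, m)
    F = sum(U[i][j] ** m * math.dist(x, v) ** 2
            for i, v in enumerate(centers) for j, x in enumerate(data))
    return 1.0 / (1.0 + F)


def fcm_refine(data, src, c, m, eps=1e-5, max_iter=100):
    """Run FCM from the centers encoded in src; return a flat vector."""
    centers = split(src, c)
    for _ in range(max_iter):
        new = []
        for row in memberships(data, centers, m):
            w = [u ** m for u in row]
            new.append([sum(wj * x[t] for wj, x in zip(w, data)) / sum(w)
                        for t in range(len(data[0]))])
        shift = max(math.dist(a, b) for a, b in zip(centers, new))
        centers = new
        if shift < eps:
            break
    return [value for v in centers for value in v]


def cabc_fcm(data, c, sn=10, max_cycle=30, m=2.0, seed=0):
    rng = random.Random(seed)
    d = len(data[0])
    dim = c * d  # Step1: D = c * d
    lo = [min(x[t] for x in data) for t in range(d)] * c
    hi = [max(x[t] for x in data) for t in range(d)] * c
    limit = sn * dim

    def random_source():
        return [lo[j] + rng.random() * (hi[j] - lo[j]) for j in range(dim)]

    sources = [random_source() for _ in range(sn)]  # Step2
    fits = [fitness(data, s, c, m) for s in sources]
    trials = [0] * sn
    best_fit = max(fits)
    best = list(sources[fits.index(best_fit)])

    def search(i):
        """Change one random dimension, clip to bounds, keep if better."""
        k = rng.choice([t for t in range(sn) if t != i])
        j = rng.randrange(dim)
        cand = list(sources[i])
        step = rng.uniform(-1.0, 1.0) * (cand[j] - sources[k][j])
        cand[j] = min(max(cand[j] + step, lo[j]), hi[j])
        f = fitness(data, cand, c, m)
        if f > fits[i]:
            sources[i], fits[i], trials[i] = cand, f, 0
        else:
            trials[i] += 1

    for cycle in range(1, max_cycle + 1):
        for i in range(sn):  # Step4: lead bees
            search(i)
        for _ in range(sn):  # Step5-6: follow bees, roulette selection
            pick = rng.choices(range(sn), weights=fits)[0]
            search(pick)
        b = fits.index(max(fits))  # Step7
        changed = fits[b] > best_fit
        if changed:
            best, best_fit = list(sources[b]), fits[b]
        if cycle == 1 or changed:
            refined = fcm_refine(data, best, c, m)
            f = fitness(data, refined, c, m)
            if f > best_fit:
                best, best_fit = refined, f
                sources[b], fits[b] = list(refined), f
            trials[b] += 1
        for i in range(sn):  # Step8: scouts
            if trials[i] >= limit:
                sources[i] = random_source()
                fits[i], trials[i] = fitness(data, sources[i], c, m), 0
    return split(best, c), best_fit
```

A call such as `cabc_fcm(points, c=2)`, where `points` is a list of equal-length numeric tuples, returns the best cluster centers found and their fitness value. The FCM call is the expensive part of each cycle, so guarding it with the first-cycle and best-changed checks keeps the number of FCM runs small.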
Embodiment 2
This embodiment differs from Embodiment 1 in that, before the follow bees select nectar sources by roulette selection in Step6, the following steps are also included:
After finding nectar sources, the lead bees return to the hive. The nectar sources are first sorted from low to high by fitness value, and each nectar source is then assigned a weight according to formula (2.4). The follow bees select nectar sources and then update them according to formula (2.2).
The weight of a new nectar source is calculated by formula (2.4),
where SN denotes the number of lead bees and w(i) denotes the weight of the i-th new nectar source.
It follows from the formula that w(i) takes values in [0, 1]. The higher the quality of a nectar source, the larger its index i, the higher the weight assigned to it, and the higher its probability of being selected. In the later stage of the algorithm, although the fitness values of all nectar sources converge, the weight of a high-quality nectar source is still higher than that of a low-quality one, so high-quality nectar sources still stand out and obtain more optimization opportunities.
According to the weighted fitness values, the follow bees select nectar sources by roulette selection and then perform a neighborhood search according to formula (2.2) to generate new nectar sources. If the fitness value of the new nectar source is greater than that of the old nectar source, the follow bee keeps the new nectar source; otherwise it discards the new nectar source and retains the old one.
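The rank-based weighting can be illustrated with a short sketch. Formula (2.4) is not reproduced in this text, so the weight used here, w(i) = i / SN with i the rank from low to high, is purely a stand-in assumption that satisfies the stated properties (values in [0, 1], increasing with rank). Multiplying the weight by the fitness value to obtain the "weighted fitness" is likewise an assumption, as are the function names.

```python
def rank_weights(fitness_values):
    """Assign each nectar source a weight from its rank (lowest fitness = rank 1).

    ASSUMPTION: w(i) = i / SN is a placeholder, not the patent's formula (2.4).
    """
    sn = len(fitness_values)
    order = sorted(range(sn), key=lambda idx: fitness_values[idx])
    weights = [0.0] * sn
    for rank, idx in enumerate(order, start=1):
        weights[idx] = rank / sn
    return weights


def weighted_probabilities(fitness_values):
    """Roulette probabilities computed from weight * fitness (assumed combination)."""
    weights = rank_weights(fitness_values)
    weighted = [w * f for w, f in zip(weights, fitness_values)]
    total = sum(weighted)
    return [v / total for v in weighted]


# Nearly equal fitness values, as in the later stage of a run (made-up numbers).
late_stage = [0.50, 0.51, 0.49]
probs = weighted_probabilities(late_stage)
```

With plain roulette selection these three sources would each be chosen about a third of the time; with rank weights the best source is clearly preferred even though the raw fitness values are almost identical.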
Embodiment 3
This embodiment differs from Embodiment 1 in that the formula by which the lead bees generate new nectar sources $v_{ij}$ through neighborhood search in Step4, and/or the formula by which the follow bees generate new nectar sources $v_{ij}$ through neighborhood search in Step6, is further improved as follows:
$$v_{ij} = x_{ij} + \theta \times (x_{ij} - x_{kj}) \tag{2.5}$$
where $v_{ij}$ denotes the new nectar source, $x_{ij}$ the current nectar source, $x_{kj}$ a neighboring nectar source, and θ the nonlinear change factor; k ∈ {1, 2, …, SN} and j ∈ {1, 2, …, D} are both selected at random, with k ≠ i. j is the dimension being updated.
The new nectar source $v_{ij}$ is obtained from the current nectar source $x_{ij}$ and the neighboring nectar source $x_{kj}$ by changing the value of the j-th dimension of the current nectar source. In the lead bee stage, the artificial bee colony algorithm updates one randomly selected dimension to obtain the new nectar source.
For the new nectar source $v_{ij}$, if $v_{ij} > x_j^{\max}$, set $v_{ij} = x_j^{\max}$; if $v_{ij} < x_j^{\min}$, set $v_{ij} = x_j^{\min}$. That is, if the new value exceeds the maximum, the maximum is used as the updated value; if it is below the minimum, the minimum is used. If the fitness value of the new nectar source is greater than that of the old nectar source, the new nectar source replaces the old one; otherwise the lead bee keeps the old nectar source.
The nonlinear change factor θ in formula (2.5) is determined by the following quantities:
m and n are dimensionless coefficients; Cycle is the current cycle number; MaxCycle is the maximum number of cycles; α is a sign parameter.
The value of the parameter α is:
$$\alpha = \begin{cases} -1, & rand < 0.5 \\ 1, & rand \ge 0.5 \end{cases}$$
where rand is the random function with range (0, 1) used in the initial nectar source formula.
In the standard artificial bee colony algorithm, nectar sources in the follow bee stage are updated according to formula (2.2), in which a random factor r with range [-1, 1] controls the update step. This search is too random and cannot effectively ensure that the search range of the nectar sources changes appropriately as the algorithm proceeds.
The present invention therefore proposes, based on the characteristics of the artificial bee colony algorithm, the nonlinear change factor θ, which changes the update step of the nectar sources nonlinearly with the progress of the iteration (Cycle). In the improved artificial bee colony algorithm, nectar sources in the follow bee stage are updated according to formula (2.5), and θ changes nonlinearly as the algorithm runs. In the initial stage of the algorithm, θ is relatively large, so the update step is large, the bees search a wider range, and the diversity of the population is better. In the later stage of the algorithm, the bee colony gradually approaches the optimal nectar source and a narrower search is needed; θ slowly decreases and so does the update step, which favors a careful search for better new nectar sources near the current one and improves the optimization precision of the algorithm.
Preferably, the effect is relatively good when m takes a value in [1, 1.5] and n takes a value in [0, 0.2].
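The update of formula (2.5) and the sign rule for α can be sketched as below. The expression for θ itself is not reproduced in this text, so the sketch takes θ as an input rather than computing it; the function names are hypothetical.

```python
def alpha_sign(rand_value):
    """Sign parameter: -1 when rand < 0.5, otherwise 1."""
    return -1 if rand_value < 0.5 else 1


def update_dimension(x_ij, x_kj, theta, lower, upper):
    """Formula (2.5) for one dimension, clipped to [lower, upper].

    `theta` is supplied by the caller because its formula is not given here.
    """
    value = x_ij + theta * (x_ij - x_kj)
    return min(max(value, lower), upper)
```

For example, `update_dimension(1.0, 0.0, 0.5, 0.0, 2.0)` moves the value half of the distance between the two nectar sources away from the neighbor, while a large θ is cut off at the upper bound.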
Embodiment 4
This embodiment differs from Embodiment 3 in that, before the following bees select nectar sources by roulette-wheel selection in Step6, it further includes the steps described in Embodiment 2:
After the leading bees have searched for nectar sources, they return to the hive. The nectar sources are first sorted from low to high by fitness value, and each nectar source is then assigned a weight according to formula (2.4). The following bees select nectar sources and then update them according to formula (2.5).
The weight of a new nectar source is calculated as follows:
where SN denotes the number of leading bees and w(i) denotes the weight of the i-th new nectar source.
As the above formula shows, w(i) lies in [0, 1]. The higher the quality of a nectar source, the larger its i value, the higher its assigned weight, and the higher its probability of being selected. In the later stage of the algorithm, although the fitness values of all nectar sources tend to converge, high-quality nectar sources still carry higher weights than poor-quality ones, so they can still stand out and obtain more optimization opportunities.
Based on the weighted fitness values, the following bees select nectar sources by roulette-wheel selection and then perform a neighborhood search according to formula (2.5) to generate new nectar sources. If the fitness value of a new nectar source is greater than that of the old one, the following bee keeps the new nectar source; otherwise it discards the new nectar source and keeps the old one.
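The rank-then-roulette procedure can be sketched in Python as follows. Formula (2.4) is not reproduced in this text, so the weight used here, w(i) = i / SN for rank i counted from the worst source, is only an assumed stand-in that satisfies the stated properties (values in [0, 1], higher rank gives higher weight). Spinning the roulette wheel directly on these weights is also an assumption, and all function names are hypothetical.

```python
import random

def rank_weights(fitness):
    """Assumed rank-based weights: sort sources from low to high fitness and
    give rank i (1 = worst, SN = best) the weight i / SN."""
    sn = len(fitness)
    order = sorted(range(sn), key=lambda idx: fitness[idx])  # low to high
    weights = [0.0] * sn
    for rank, idx in enumerate(order, start=1):
        weights[idx] = rank / sn
    return weights

def roulette_select(weights, rng=random):
    """Pick an index with probability proportional to its weight."""
    pick = rng.random() * sum(weights)
    running = 0.0
    for idx, w in enumerate(weights):
        running += w
        if pick < running:
            return idx
    return len(weights) - 1  # guard against floating-point round-off

def greedy_keep(old, old_fit, new, new_fit):
    """Keep the new nectar source only if its fitness is strictly greater."""
    return (new, new_fit) if new_fit > old_fit else (old, old_fit)
```

Because the weights depend only on rank, nearly identical fitness values late in a run still produce clearly different selection probabilities, which is the behaviour the paragraph above describes.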
Experimental results and analysis
(1) Experimental results:
To compare the performance of the artificial bee colony clustering algorithm (CABC algorithm) with that of the hybrid clustering algorithm of this embodiment (ICABC_FCM algorithm), experiments were run on five commonly used datasets from the UCI database: IRIS, BUPA, WDBC, Wine, and Thyroid. The composition of the experimental samples is shown in Table 4:
Table 4: Composition of the experimental datasets
Dataset name    Number of samples    Number of classes    Dimension
IRIS            150                  3                    4
BUPA            345                  2                    6
WDBC            569                  2                    30
Wine            178                  3                    13
Thyroid         215                  3                    5
IRIS dataset: consists of attribute data for samples of three iris species. The dataset contains 150 samples in total, 50 per class, and each sample has four attributes: sepal length, sepal width, petal length, and petal width.
BUPA dataset: records of male patients with liver disorders. The dataset contains 345 samples in total, each with 6 attributes, of which the first 5 are blood test results. The samples fall into two classes: 114 in the first class and 231 in the second.
WDBC dataset: contains 569 samples, each with 30 attributes. The samples fall into two classes, Benign and Malignant: the Benign class has 357 samples and the Malignant class has 212.
Wine dataset: contains 178 samples, each with 13 attributes representing 13 chemical properties of wines from the same region. The samples fall into three classes: the first class has 59 samples, the second has 71, and the third has 48.
Thyroid dataset: a thyroid dataset of 215 samples, each with 5 attributes. The samples fall into three classes: the first class has 150 samples, the second has 35, and the third has 30.
The CABC, ICABC, and ICABC_FCM algorithms were each used to cluster the IRIS, BUPA, WDBC, Wine, and Thyroid datasets.
For every algorithm the fuzzy weighting exponent is m = 2. In the ICABC_FCM algorithm, the minimum error allowed in the FCM stage is ε = 10^-3. The number of bees is 20, the maximum number of cycles MaxCycle is 2000, Limit is dimension × (SN/2), and the dimension of a nectar source equals the attribute dimension of a sample multiplied by the number of clusters. Each algorithm was run 20 times and the average was taken as the final result.
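As an illustration of the FCM stage with these settings, the following pure-Python sketch refines a set of cluster centers using the textbook fuzzy C-means updates with m = 2 and ε = 10^-3. It assumes that a nectar source is a flat vector holding all cluster centers one after another (consistent with the dimension stated above) and that ε is compared against the change in the objective function between iterations; the stopping rule, the `max_iter` cap, and all function names are assumptions rather than details taken from the patent.

```python
def decode(source, n_clusters):
    """Split a flat nectar-source vector into n_clusters center vectors."""
    dim = len(source) // n_clusters
    return [list(source[c * dim:(c + 1) * dim]) for c in range(n_clusters)]

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def memberships(points, centers, m):
    """u[p][c] is proportional to (squared distance)^(-1/(m-1)), normalized per point."""
    rows = []
    for p in points:
        inv = [max(sqdist(p, c), 1e-12) ** (-1.0 / (m - 1.0)) for c in centers]
        total = sum(inv)
        rows.append([v / total for v in inv])
    return rows

def objective(points, centers, u, m):
    """Fuzzy C-means objective: sum of u^m times squared distance."""
    return sum(u[p][c] ** m * sqdist(points[p], centers[c])
               for p in range(len(points)) for c in range(len(centers)))

def fcm_refine(points, centers, m=2.0, eps=1e-3, max_iter=100):
    """Alternate center and membership updates until the objective changes by < eps."""
    centers = [list(c) for c in centers]
    u = memberships(points, centers, m)
    j = objective(points, centers, u, m)
    for _ in range(max_iter):
        new_centers = []
        for c in range(len(centers)):
            w = [row[c] ** m for row in u]  # fuzzified memberships for cluster c
            total = sum(w)
            new_centers.append([sum(wi * p[d] for wi, p in zip(w, points)) / total
                                for d in range(len(points[0]))])
        centers = new_centers
        u = memberships(points, centers, m)
        new_j = objective(points, centers, u, m)
        converged = abs(j - new_j) < eps
        j = new_j
        if converged:
            break
    return centers, u, j
```

For example, `fcm_refine(points, decode(best_source, 3))` would refine a three-cluster nectar source, where `best_source` is a hypothetical name for the current optimal solution.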
Fig. 2 shows the run plot for IRIS data clustering.
The clustering accuracy for each dataset is shown in Table 5:
Table 5: Average clustering accuracy for each dataset
The clustering results for each dataset are shown in Table 6 below:
Table 6: Objective function values for each dataset
Analysis of results:
Fig. 2 is the run plot for IRIS data clustering. As the figure shows, the objective function value of the CABC algorithm only begins to stabilize after about 180 cycles, whereas the hybrid algorithm already has a good solution after a single cycle and its objective function value stabilizes after a few cycles, so the algorithm converges very quickly.
As Table 5 shows, the clustering accuracy of the hybrid algorithm is higher than that of the CABC algorithm, indicating that the ICABC_FCM algorithm outperforms the CABC algorithm in clustering accuracy.
As Table 6 shows, both the mean and the standard deviation of the clustering results of the ICABC_FCM algorithm are smaller than those of the CABC algorithm, indicating that the ICABC_FCM algorithm is superior to the CABC algorithm in overall optimization accuracy and stability. Comparing the clustering results of the ICABC_FCM and ICABC algorithms, the ICABC_FCM algorithm is also better than the ICABC algorithm in overall optimization accuracy and stability.
The above analysis shows that the ICABC_FCM algorithm is superior to the CABC algorithm in clustering accuracy, convergence speed, optimization accuracy, and algorithm stability.
Finally, it should be noted that the above embodiments merely illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. A hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm, comprising an initialization stage, a leading-bee stage, a following-bee stage, and an investigation-bee stage, characterized in that it further comprises the following steps:
Step 1: after the following-bee stage, determine whether the algorithm is in its first cycle; if it is the first cycle, execute Step 2; if it is not the first cycle, execute Step 3;
Step 2: use the current optimal solution as the initial cluster centers of the fuzzy C-means clustering algorithm and optimize it; if the quality of the optimized solution is higher than that of the current optimal solution, replace the current optimal solution with the optimized solution, otherwise discard the optimized solution, and at the same time add 1 to the iteration count of the corresponding nectar source; then enter the investigation-bee stage;
Step 3: determine whether the optimal solution changed after the following-bee stage; if it changed, execute Step 2; if it did not change, enter the investigation-bee stage.
2. The hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm according to claim 1, characterized in that the formula for generating a new nectar source in the leading-bee stage and/or the following-bee stage is:
$v_{ij} = x_{ij} + \theta \times (x_{ij} - x_{kj})$
where $\theta$ is the nonlinear change factor, $v_{ij}$ denotes the new nectar source, $x_{ij}$ denotes the current nectar source, and $x_{kj}$ denotes a neighboring nectar source; $i, k \in \{1, 2, \ldots, SN\}$, where SN is the number of nectar sources, and $k \neq i$; $j \in \{1, 2, \ldots, D\}$ denotes the dimension of the nectar source being updated.
3. The hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm according to claim 2, characterized in that the nonlinear change factor θ is:
where m and n are coefficients, Cycle denotes the current iteration number, MaxCycle denotes the maximum number of iterations, and rand is a random function.
4. The hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm according to claim 3, characterized in that the value ranges of m and n are, respectively, m ∈ [1, 1.5] and n ∈ [0, 0.2].
5. The hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm according to claim 1, characterized in that the following-bee stage comprises the following steps:
sorting the nectar sources of the leading bees from low to high by fitness value, and assigning a weight to each nectar source;
based on the weighted fitness values, having the following bees select nectar sources by roulette-wheel selection and perform a neighborhood search to generate new nectar sources.
6. The hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm according to claim 5, characterized in that the weight of a nectar source is calculated as:
where w(i) denotes the weight of the nectar source, with a value range of [0, 1], and SN denotes the number of leading bees.
7. The hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm according to claim 1, characterized in that, after a new nectar source is generated in the leading-bee stage and/or the following-bee stage, if the fitness value of the new nectar source is greater than that of the old nectar source, the old nectar source is replaced with the new nectar source; otherwise the old nectar source is retained.
8. The hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm according to claim 1, characterized in that the following step is further included after the investigation-bee stage: determine whether the number of cycles of the algorithm has reached the maximum number of cycles; if it has, terminate the program; if it has not, return to the leading-bee stage and continue the neighborhood search to update the nectar sources.
CN201810935647.1A 2018-08-16 2018-08-16 Hybrid Clustering Algorithm based on Fuzzy C-Means Algorithm and artificial bee colony clustering algorithm Pending CN109086831A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810935647.1A CN109086831A (en) 2018-08-16 2018-08-16 Hybrid Clustering Algorithm based on Fuzzy C-Means Algorithm and artificial bee colony clustering algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810935647.1A CN109086831A (en) 2018-08-16 2018-08-16 Hybrid Clustering Algorithm based on Fuzzy C-Means Algorithm and artificial bee colony clustering algorithm

Publications (1)

Publication Number Publication Date
CN109086831A true CN109086831A (en) 2018-12-25

Family

ID=64793549

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810935647.1A Pending CN109086831A (en) 2018-08-16 2018-08-16 Hybrid Clustering Algorithm based on Fuzzy C-Means Algorithm and artificial bee colony clustering algorithm

Country Status (1)

Country Link
CN (1) CN109086831A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111123324A (en) * 2019-12-31 2020-05-08 杭州电子科技大学 DGPS integer ambiguity searching method based on improved ant colony algorithm
CN112530529A (en) * 2020-12-09 2021-03-19 合肥工业大学 Gas concentration prediction method, system, equipment and storage medium thereof
CN112530529B (en) * 2020-12-09 2024-01-26 合肥工业大学 Gas concentration prediction method, system, equipment and storage medium thereof
CN113177583A (en) * 2021-04-16 2021-07-27 中国人民解放军空军工程大学 Aerial target clustering method
CN113177583B (en) * 2021-04-16 2022-10-18 中国人民解放军空军工程大学 Aerial target clustering method
US11769033B2 (en) 2021-06-02 2023-09-26 Imam Abdulrahman Bin Faisal University System, computer readable storage medium, and method for segmentation and enhancement of brain MRI images
CN117251755A (en) * 2023-11-17 2023-12-19 核工业北京地质研究院 Clustering method of seismic attributes
CN117251755B (en) * 2023-11-17 2024-02-27 核工业北京地质研究院 Clustering method of seismic attributes

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181225