CN109086831A - Hybrid Clustering Algorithm based on Fuzzy C-Means Algorithm and artificial bee colony clustering algorithm - Google Patents
- Publication number
- CN109086831A (application CN201810935647.1A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
Abstract
The invention relates to the technical field of artificial bee colony algorithms, and more particularly to a hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm. The algorithm comprises an initialization stage, an employed bee (leading bee) stage, an onlooker bee (following bee) stage and a scout bee stage, and further comprises the following steps. Step 1: after the onlooker bee stage, judge whether the algorithm is in its first cycle; if so, execute step 2; if not, execute step 3. Step 2: use the current optimal solution as the initial cluster centers of the fuzzy C-means clustering algorithm and optimize it; if the quality of the optimized solution is higher than that of the current optimal solution, replace the current optimal solution with the optimized solution, otherwise discard the optimized solution; at the same time, increase the iteration count of the corresponding nectar source by 1, then enter the scout bee stage. Step 3: judge whether the optimal solution changed after the onlooker bee stage; if so, execute step 2; if not, enter the scout bee stage. The algorithm provided by the invention achieves high clustering accuracy, fast convergence and high optimization precision.
Description
Technical field
The invention relates to the technical field of artificial bee colony algorithms, and in particular to a hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm.
Background art
About the fuzzy C-means algorithm:
In 1974, Dunn proposed the fuzzy C-means (FCM) clustering algorithm on the basis of Bezdek's research; it is widely used in many fields such as geospatial information, image processing and data mining. The biggest difference between the fuzzy C-means algorithm and the hard C-means algorithm lies in the membership degree of an object: hard C-means requires the membership degree of an object to be either 0 or 1, whereas fuzzy C-means allows it to take any value in [0,1], including 0 and 1. This gives fuzzy C-means greater flexibility: an object may belong to both class $C_1$ and class $C_2$, only to different degrees.
The basic process of the fuzzy C-means clustering algorithm is as follows. First, analyze the distribution of the objects in the data set and set a suitable number of clusters c and fuzzy exponent m accordingly. Then randomly select c objects from the data set as the initial cluster centers. Next, iterate: compute the partition matrix, which contains the membership degree of each object in every class, and determine the new generation of cluster centers from the partition matrix and the data set. Finally, when the objective function converges to the required precision or the membership degrees of the objects remain stable, stop iterating; this yields the final cluster centers, and the data set is fuzzily partitioned according to the partition matrix.
The objective function of the fuzzy C-means algorithm is defined as follows:
$$F(X,U,C)=\sum_{i=1}^{c}\sum_{j=1}^{n}u_{ij}^{m}d_{ij}^{2} \tag{1.1}$$
$$d_{ij}=\lVert x_j-v_i\rVert \tag{1.2}$$
where $C=\{C_1,C_2,\ldots,C_c\}$ denotes the set of clusters, $d_{ij}$ is the distance from object $x_j$ to the cluster center $v_i$ of the i-th subclass, and $U$ is an $n\times c$ partition matrix, the set of all $u_{ij}$. $u_{ij}$ denotes the degree of membership of the j-th object $x_j$ in the i-th class, with $u_{ij}\in[0,1]$. $u_{ij}$ satisfies the following constraint:
$$0<\sum_{j=1}^{n}u_{ij}<n,\quad i=1,2,\ldots,c \tag{1.3}$$
Meanwhile, the membership degrees of each object over all classes sum to 1, i.e.
$$\sum_{i=1}^{c}u_{ij}=1,\quad j=1,2,\ldots,n \tag{1.4}$$
In these formulas, m is the fuzziness parameter; $m\in(1,+\infty)$ controls the degree of fuzziness of the algorithm:
When $m\to 1^{+}$, $u_{ij}\to 1$ or 0, and the FCM algorithm degenerates into the hard C-means (HCM) algorithm;
When $m\to+\infty$, $u_{ij}\to 1/c$, and the fuzziness of the FCM clustering result is at its maximum; that is, the fuzziness of the algorithm increases as m increases. The value of m is usually 2.
$F(X,U,C)$ is the within-class weighted sum of squared errors; the FCM algorithm minimizes the objective function $F(X,U,C)$ by repeated iteration.
The specific steps of the fuzzy C-means algorithm are as follows:
Step 1: parameter initialization. Set the number of clusters $c$ and the fuzzy exponent $m$ ($1<m<+\infty$), which usually takes the value 2. Initialize the cluster centers to obtain $V^{(0)}=\{v_1,v_2,\ldots,v_c\}$. Set the convergence precision $\varepsilon$ ($\varepsilon>0$) and the iteration counter $k=0$.
Step 2: compute the membership matrix $U$. According to the current cluster center set $V^{(k)}$, compute the distance from every object in the data set to each cluster center, then update the membership matrix $U$ according to formula (1.5), where the sum runs over all $c$ cluster centers $v_k$:
$$u_{ij}=\frac{1}{\sum_{k=1}^{c}\left(\dfrac{\lVert x_j-v_i\rVert}{\lVert x_j-v_k\rVert}\right)^{\frac{2}{m-1}}} \tag{1.5}$$
Step 3: update the cluster center set. Increase the iteration counter by 1, compute from the membership matrix $U$ the weighted average of all objects for each class, and take it as the new cluster center:
$$v_i=\frac{\sum_{j=1}^{n}u_{ij}^{m}x_j}{\sum_{j=1}^{n}u_{ij}^{m}} \tag{1.6}$$
Step 4: repeat Step 2 and Step 3 until the cluster center sets of the last two iterations satisfy the following condition:
$$\lVert V^{(k+1)}-V^{(k)}\rVert<\varepsilon \tag{1.7}$$
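The four steps above can be sketched in plain Python. This is only an illustrative sketch, not the patent's implementation: the function name `fcm`, its parameters, the `max_iter` safety cap and the handling of an object that coincides exactly with a center are assumptions made for the example.

```python
import math
import random


def fcm(data, c, m=2.0, eps=1e-5, max_iter=100, init_centers=None, seed=0):
    """Fuzzy C-means on a list of equal-length numeric tuples.

    Returns (centers, U, objective); U[i][j] is the membership of object j
    in cluster i. All names here are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Step 1: initial centers (given, or c objects drawn from the data).
    centers = [list(v) for v in (init_centers or rng.sample(data, c))]
    power = 2.0 / (m - 1.0)
    U = []
    for _ in range(max_iter):
        # Step 2: membership update (the standard FCM form of formula 1.5).
        U = [[0.0] * len(data) for _ in range(c)]
        for j, x in enumerate(data):
            d = [math.dist(x, v) for v in centers]
            zeros = [i for i, dij in enumerate(d) if dij == 0.0]
            if zeros:
                # Assumption: an object lying on a center belongs fully to it.
                for i in zeros:
                    U[i][j] = 1.0 / len(zeros)
                continue
            for i in range(c):
                U[i][j] = 1.0 / sum((d[i] / d[k]) ** power for k in range(c))
        # Step 3: each new center is the u^m-weighted mean of all objects.
        new_centers = []
        for i in range(c):
            w = [U[i][j] ** m for j in range(len(data))]
            total = sum(w)
            new_centers.append([
                sum(wj * x[a] for wj, x in zip(w, data)) / total
                for a in range(len(data[0]))
            ])
        # Step 4: stop once the centers move less than eps (formula 1.7).
        shift = math.sqrt(sum(math.dist(a, b) ** 2
                              for a, b in zip(new_centers, centers)))
        centers = new_centers
        if shift < eps:
            break
    # Objective (1.1), evaluated with the last memberships and final centers.
    objective = sum(U[i][j] ** m * math.dist(x, centers[i]) ** 2
                    for i in range(c) for j, x in enumerate(data))
    return centers, U, objective
```

Passing `init_centers` makes it possible to start the iteration from an externally supplied solution instead of randomly chosen objects.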
The standard artificial bee colony algorithm:
As shown in Fig. 2, the standard artificial bee colony algorithm comprises 4 stages: the initialization stage, the employed bee (leading bee) stage, the onlooker bee (following bee) stage and the scout bee stage.
(1) Initialization stage
The initialization stage comprises parameter initialization and generation of the initial nectar sources. The artificial bee colony algorithm has 3 important parameters: the number of nectar sources SN, the maximum number of cycles of the algorithm MaxCycle, and the maximum number of iterations per nectar source, limit. At the start of the algorithm, SN initial nectar sources are randomly generated by formula (2.1), and the fitness value of each nectar source is then computed.
$$x_{ij}=x_j^{\min}+\mathrm{rand}(0,1)\,(x_j^{\max}-x_j^{\min}) \tag{2.1}$$
where $i\in\{1,2,\ldots,SN\}$ indexes the nectar sources; $j\in\{1,2,\ldots,D\}$ indexes the dimensions of a nectar source; $x_{ij}$ is the value of the j-th dimension of solution $x_i$; and $[x_j^{\min},x_j^{\max}]$ is the value range of the j-th dimension variable.
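The random initialization of formula (2.1) can be illustrated with a short sketch. The function name `init_nectar_sources` and the representation of the bounds as two lists `lower` and `upper` are assumptions made for the example.

```python
import random


def init_nectar_sources(sn, lower, upper, rng=None):
    """Draw sn nectar sources uniformly inside the per-dimension bounds.

    lower[j] and upper[j] play the role of the minimum and maximum of the
    j-th dimension. random() returns values in [0, 1), which is used here
    as the rand(0, 1) of the formula.
    """
    rng = rng or random.Random()
    return [
        [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
        for _ in range(sn)
    ]
```

For example, `init_nectar_sources(5, [0.0, -1.0], [1.0, 1.0])` yields five two-dimensional candidate solutions inside the given box.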
(2) Employed bee stage
The number of employed bees equals the number of nectar sources. Starting from the initial nectar sources, the employed bees look for nectar sources of higher quality by performing a neighborhood search around each nectar source according to formula (2.2), which generates a new nectar source.
$$v_{ij}=x_{ij}+r\times(x_{ij}-x_{kj}) \tag{2.2}$$
where $v_{ij}$ denotes the new nectar source. As formula (2.2) shows, the new nectar source is obtained from the current nectar source $x_{ij}$ and a neighboring nectar source $x_{kj}$ by changing the value of the j-th dimension of the current nectar source. $r$ is a random number in [-1,1]; $k\in\{1,2,\ldots,SN\}$ and $j\in\{1,2,\ldots,D\}$ are both randomly selected, with $k\neq i$. $j$ is the dimension being updated: in the employed bee stage, the artificial bee colony algorithm randomly selects one dimension to update and thereby obtains a new nectar source. For the new nectar source $v_{ij}$, if $v_{ij}>x_j^{\max}$, set $v_{ij}=x_j^{\max}$; if $v_{ij}<x_j^{\min}$, set $v_{ij}=x_j^{\min}$. If the fitness value of the new nectar source is greater than that of the old nectar source, the old nectar source is replaced with the new one; otherwise the employed bee keeps the old nectar source.
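A single neighborhood-search move of this kind might look as follows. This is a hedged sketch: the function name `neighbour_search`, the in-place update of `sources` and the caller-supplied `fitness` callable are assumptions for illustration.

```python
import random


def neighbour_search(sources, i, fitness, lower, upper, rng):
    """One neighborhood move for source i; returns True if it improved.

    Perturbs one randomly chosen dimension j of source i using a randomly
    chosen neighbour k != i, clamps the result to the bounds, and keeps
    the candidate only if its fitness is strictly higher (greedy choice).
    """
    sn, dim = len(sources), len(sources[i])
    k = rng.choice([s for s in range(sn) if s != i])  # neighbour, k != i
    j = rng.randrange(dim)                            # dimension to update
    r = rng.uniform(-1.0, 1.0)                        # random step factor
    candidate = list(sources[i])
    candidate[j] = sources[i][j] + r * (sources[i][j] - sources[k][j])
    candidate[j] = min(max(candidate[j], lower[j]), upper[j])  # bound check
    if fitness(candidate) > fitness(sources[i]):
        sources[i] = candidate
        return True
    return False
```

The Boolean return value lets the caller maintain a per-source counter of unsuccessful searches, which the scout bee stage needs.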
(3) Onlooker bee stage
After searching the nectar sources, the employed bees return to the hive, and the proportion of each nectar source's fitness value in the sum of the fitness values of all nectar sources is computed by formula (2.3). An onlooker bee uses a random number generated by the system to decide whether to select an employed bee's nectar source for searching: if the proportion of a nectar source's fitness value is greater than the random number, the onlooker bee selects that nectar source. This selection strategy is called the roulette selection strategy.
$$p_i=\frac{fit_i}{\sum_{n=1}^{SN}fit_n} \tag{2.3}$$
where $fit_i$ denotes the fitness value of the i-th nectar source. In this stage the onlooker bee selects a nectar source and performs a neighborhood search; as in the employed bee stage, a new nectar source is generated by formula (2.2). The new nectar source is retained if its fitness value is higher; otherwise the old nectar source is retained.
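A small sketch of fitness-proportional selection is given below. Note that it uses the common cumulative-sum form of the roulette wheel, which is an assumption; the text above describes comparing each proportion with a random number, and the exact variant is not specified. The function names are illustrative.

```python
import random


def selection_probabilities(fits):
    """Proportion of each fitness value in the total, as in formula (2.3)."""
    total = sum(fits)
    return [f / total for f in fits]


def roulette_select(probs, rng):
    """Pick an index with probability probs[i] (cumulative roulette wheel)."""
    u = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if u < acc:
            return i
    return len(probs) - 1  # guard against floating-point rounding
```

With fitness values 1, 1 and 8, the third source is chosen in roughly 80% of the draws.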
(4) Scout bee stage
If the fitness value of a nectar source has still not improved after limit neighborhood searches, the current nectar source is regarded as a local optimum. The employed bee corresponding to this nectar source abandons it and becomes a scout bee. The scout bee finds a new nectar source by random search according to formula (2.1), starts searching this new nectar source, and turns back into an employed bee.
Judge whether the number of cycles of the algorithm has reached the maximum number of cycles MaxCycle. If so, terminate the program; if not, return to the second stage, where the employed bees continue the neighborhood search to update the nectar sources.
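The abandonment rule of the scout bee stage can be sketched as follows. The per-source counter list `trials`, the function name `scout_phase` and the choice to reset every exhausted source in the same cycle are assumptions made for illustration.

```python
import random


def scout_phase(sources, trials, limit, lower, upper, rng):
    """Replace every source whose trial counter has reached `limit`.

    trials[i] counts consecutive neighborhood searches of source i that
    brought no improvement. An exhausted source is re-drawn uniformly
    inside the bounds and its counter is reset to zero.
    Returns the list of replaced indices.
    """
    replaced = []
    for i, t in enumerate(trials):
        if t >= limit:
            sources[i] = [lo + rng.random() * (hi - lo)
                          for lo, hi in zip(lower, upper)]
            trials[i] = 0
            replaced.append(i)
    return replaced
```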
Summary of the invention
The purpose of the invention is to provide a hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm that accelerates the convergence of the algorithm.
To achieve the above technical purpose, the invention adopts the following technical solution:
A hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm, comprising an initialization stage, an employed bee (leading bee) stage, an onlooker bee (following bee) stage and a scout bee stage, characterized in that the following steps are further included between the onlooker bee stage and the scout bee stage:
Step 1: after the onlooker bee stage, judge whether the algorithm is in its first cycle; if it is the first cycle, execute step 2; if it is not the first cycle, execute step 3;
Step 2: use the current optimal solution as the initial cluster centers of the fuzzy C-means clustering algorithm and optimize it; if the quality of the optimized solution is higher than that of the current optimal solution, replace the current optimal solution with the optimized solution, otherwise discard the optimized solution; at the same time, increase the iteration count of the corresponding nectar source by 1, then enter the scout bee stage;
Step 3: judge whether the optimal solution changed after the onlooker bee stage; if it changed, execute step 2; if it did not change, enter the scout bee stage.
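The decision logic of steps 1 to 3 can be sketched as two small functions. The names `should_run_fcm` and `refine_best`, the caller-supplied `fcm_refine` and `fitness` callables, and the reading that the iteration count is increased whether or not the refined solution is accepted are all assumptions; the text leaves the last point open.

```python
def should_run_fcm(cycle, best_before, best_after):
    """Steps 1 and 3: refine in the first cycle, or whenever the best
    solution changed during the onlooker bee stage."""
    return cycle == 1 or best_after != best_before


def refine_best(best, best_fit, trials_best, fcm_refine, fitness):
    """Step 2: run the FCM refinement on the best solution.

    The refined solution replaces the current best only if its fitness
    is strictly higher. The trial counter is increased by 1 in both
    cases (an assumption about the ambiguous wording).
    """
    refined = fcm_refine(best)
    refined_fit = fitness(refined)
    if refined_fit > best_fit:
        best, best_fit = refined, refined_fit
    return best, best_fit, trials_best + 1
```

Skipping the refinement when the best solution is unchanged avoids re-running fuzzy C-means from a starting point it has already been given.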
Further, the formula for generating a new nectar source in the employed bee stage and/or the onlooker bee stage is:
$$v_{ij}=x_{ij}+\theta\times(x_{ij}-x_{kj})$$
where $\theta$ is the nonlinear variation factor, $v_{ij}$ denotes the new nectar source, $x_{ij}$ denotes the current nectar source, $x_{kj}$ denotes a neighboring nectar source, and $k\in\{1,2,\ldots,SN\}$ and $j\in\{1,2,\ldots,D\}$ are both randomly selected, with $k\neq i$; $j$ is the dimension being updated.
Further, the nonlinear variation factor $\theta$ is given by a formula in m, n, Cycle and MaxCycle [formula image not reproduced in this text],
where m and n are coefficients, Cycle denotes the current cycle number, MaxCycle denotes the maximum number of cycles, and rand is a random function.
Further, the value ranges of m and n are respectively $m\in[1,1.5]$ and $n\in[0,0.2]$.
Further, the onlooker bee stage comprises the following steps:
Sort the nectar sources of the employed bees from low to high by fitness value, and assign a weight to each nectar source;
According to the weighted fitness values, the onlooker bees select nectar sources by roulette selection and perform a neighborhood search to generate new nectar sources.
Further, the weight of a nectar source is calculated by a formula in i and SN [formula image not reproduced in this text],
where w(i) denotes the weight of the nectar source, with value range [0,1], and SN denotes the number of employed bees.
Further, after a new nectar source is generated in the employed bee stage and/or the onlooker bee stage, if the fitness value of the new nectar source is greater than that of the old nectar source, the old nectar source is replaced with the new one; otherwise the old nectar source is retained.
Further, the following step is included after the scout bee stage: judge whether the number of cycles of the algorithm has reached the maximum number of cycles MaxCycle; if so, terminate the program; if not, return to the employed bee stage and continue the neighborhood search to update the nectar sources.
The invention has the following beneficial effects:
1. The invention adds to the original clustering algorithm a step that judges whether the optimal solution has improved, which further accelerates the convergence of the algorithm.
2. The invention replaces the random number r in the nectar source update formula with the nonlinear variation factor θ, which varies nonlinearly as the algorithm runs. In the initial stage of the algorithm, θ is relatively large, so the nectar source update step is large, the search range of the bees is wide, and the diversity of the population is good. In the later stage of the algorithm, the colony gradually approaches the optimal nectar source and a narrower search is needed; θ slowly decreases and the update step slowly shrinks, which favors a careful search for better nectar sources near the current one. Combining the improved nectar source update formula with the improved hybrid clustering algorithm further improves the optimization precision of the algorithm.
3. The invention assigns a weight to each nectar source in order to avoid the drawbacks of plain roulette selection: high randomness, low efficiency, and the possibility that higher-quality nectar sources are missed. When nectar sources are selected in the onlooker bee stage, the nectar sources of the employed bees are sorted by quality from low to high and each is assigned a weight. The higher the quality of a nectar source, the larger its rank i, the higher its weight, and the higher its probability of being selected. In the later stage of the algorithm, although the fitness values of all nectar sources converge, the weight of a high-quality nectar source is still higher than that of a poor-quality one, so high-quality nectar sources can still stand out and obtain more optimization opportunities.
Brief description of the drawings
Fig. 1 is the flow chart of the hybrid clustering algorithm;
Fig. 2 is the run chart for clustering the IRIS data set.
Detailed description of the embodiments
The invention is described in detail below through specific embodiments with reference to the drawings. It should be noted that, provided they do not conflict, the embodiments of the invention and the features in them may be combined with one another, and the scope of protection of the invention is not limited thereto.
Embodiment 1
The foraging behavior of honeybees in the artificial bee colony algorithm corresponds one-to-one with the search for optimal cluster centers in a clustering algorithm; Table 1 lists this correspondence. In the artificial bee colony algorithm, a nectar source position corresponds to a possible set of cluster centers in the clustering process, nectar source quality corresponds to the value of the evaluation function, the speed at which the colony explores and gathers nectar corresponds to the speed of finding the optimal cluster centers, and the best-quality nectar source corresponds to the optimal cluster centers.
Table 1. Correspondence between the search for optimal cluster centers and honeybee foraging behavior
Let the sample space be $X=\{x_1,x_2,\ldots,x_n\}$, where each $x_i$ is a d-dimensional vector. Each nectar source in the artificial bee colony algorithm corresponds to a set of cluster centers $V=\{v_1,v_2,\ldots,v_c\}$, where $v_j$ is a vector of the same dimension as $x_i$; the higher the quality of a nectar source, the better the cluster centers. To evaluate the quality of each nectar source (each set of cluster centers), the fitness function of the artificial bee colony algorithm is defined as:
$$fit_i=\frac{1}{1+F(X,U,C)} \tag{3.1}$$
where $F(X,U,C)$ is the objective function defined in formula (1.1), that is, the objective function of the fuzzy C-means clustering algorithm. The higher the quality of a nectar source, the better the cluster center set, the smaller the value of $F(X,U,C)$, the better the clustering, and the higher the value of $fit_i$. The artificial bee colony algorithm is extended below into the artificial bee colony clustering algorithm (Artificial Bee Colony Clustering algorithm, CABC).
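The fitness evaluation of formula (3.1) might be sketched as follows, assuming a nectar source is stored as one flat list of c*d numbers and that memberships are computed from the decoded centers by the standard FCM rule. The function names `decode`, `fcm_objective` and `fitness` are hypothetical.

```python
import math


def decode(source, c):
    """Split a flat nectar source of length c*d into c centers of length d."""
    d = len(source) // c
    return [source[i * d:(i + 1) * d] for i in range(c)]


def fcm_objective(data, centers, m=2.0):
    """Weighted within-class sum of squared errors for the given centers."""
    power = 2.0 / (m - 1.0)
    total = 0.0
    for x in data:
        d = [math.dist(x, v) for v in centers]
        if min(d) == 0.0:
            continue  # object lies on a center and contributes zero error
        for i in range(len(centers)):
            u = 1.0 / sum((d[i] / d[k]) ** power for k in range(len(centers)))
            total += u ** m * d[i] ** 2
    return total


def fitness(source, data, c, m=2.0):
    """fit = 1 / (1 + F); a smaller objective gives a fitness closer to 1."""
    return 1.0 / (1.0 + fcm_objective(data, decode(source, c), m))
```

Because the objective is non-negative, the fitness always lies in (0, 1], which keeps roulette proportions well defined.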
As shown in Fig. 1, the specific steps of the artificial bee colony clustering algorithm are as follows:
Step 1: set the number of clusters c; the number of nectar sources, the number of employed bees and the number of onlooker bees are all SN. If the sample attribute dimension is d, set the dimension of a nectar source to D = c*d, the maximum number of iterations of each nectar source to Limit = SN*D, the maximum number of cycles of the algorithm to MaxCycle, and the current cycle number Cycle to 0.
Step 2: randomly generate SN initial nectar sources as initial cluster centers according to the initial nectar source formula, then compute the partition (membership) matrix U and the fitness value of each nectar source, and record the nectar source with the highest fitness value.
The initial nectar source formula is:
$$x_{ij}=x_j^{\min}+\mathrm{rand}(0,1)\,(x_j^{\max}-x_j^{\min})$$
where $i\in\{1,2,\ldots,SN\}$ denotes the i-th nectar source; $j\in\{1,2,\ldots,D\}$ denotes the dimension of the nectar source; $x_{ij}$ is the value of the j-th dimension of solution $x_i$; $[x_j^{\min},x_j^{\max}]$ is the value range of the j-th dimension variable; and rand(0,1) is a random function with range (0,1).
Step 3: increase the current cycle number Cycle by 1.
Step 4: the employed bees perform a neighborhood search to generate a new nectar source $v_{ij}$; the algorithm then updates the membership matrix U according to formula (1.5) and computes the fitness value of the new nectar source. If the fitness value of the new nectar source is greater than that of the old nectar source, the old nectar source is replaced with the new one; otherwise the old nectar source is retained.
The formula for generating a new nectar source is:
$$v_{ij}=x_{ij}+r\times(x_{ij}-x_{kj}) \tag{2.2}$$
where $v_{ij}$ denotes the new nectar source, $x_{ij}$ denotes the current nectar source, $x_{kj}$ denotes a neighboring nectar source, $r$ is a random number in [-1,1], and $k\in\{1,2,\ldots,SN\}$ and $j\in\{1,2,\ldots,D\}$ are both randomly selected, with $k\neq i$. $j$ is the dimension being updated: in the employed bee stage, the artificial bee colony algorithm randomly selects one dimension to update and thereby obtains a new nectar source. For the new nectar source $v_{ij}$, if $v_{ij}>x_j^{\max}$, set $v_{ij}=x_j^{\max}$; if $v_{ij}<x_j^{\min}$, set $v_{ij}=x_j^{\min}$.
Step 5: compute the proportion of each nectar source's fitness value in the sum of the fitness values of all nectar sources.
Step 6: the onlooker bees select nectar sources by roulette selection and then perform a neighborhood search to generate new nectar sources; the algorithm then updates the membership matrix U and computes the fitness value of each new nectar source. If the fitness value of the new nectar source is greater than that of the old nectar source, the old nectar source is replaced with the new one; otherwise the old nectar source is retained.
The formula by which the onlooker bees generate a new nectar source is the same as the one used by the employed bees, namely formula (2.2).
The membership matrix U is updated according to the following formula; U is an $n\times c$ partition matrix, the set of all $u_{ij}$:
$$u_{ij}=\frac{1}{\sum_{k=1}^{c}\left(\dfrac{\lVert x_j-v_i\rVert}{\lVert x_j-v_k\rVert}\right)^{\frac{2}{m-1}}} \tag{1.5}$$
where c is the set number of clusters, $k=1,2,\ldots,c$ indexes the clusters, $x_j$ denotes an object, $V$ denotes the cluster center set, $v_i$ denotes the cluster center of the i-th subclass, $v_k$ denotes the cluster center of the k-th subclass, and m is the fuzziness parameter. $m\in(1,+\infty)$ controls the degree of fuzziness of the algorithm: when $m\to 1^{+}$, $u_{ij}\to 1$ or 0; when $m\to+\infty$, $u_{ij}\to 1/c$; the fuzziness of the algorithm increases as m increases. The value of m is usually 2.
Step 7: judge whether the current cycle of the algorithm is the first cycle. If it is the first cycle, then after the onlooker bee stage the current optimal solution is used as the initial cluster centers of the fuzzy C-means clustering algorithm and optimized; if the quality of the optimized solution is higher than that of the current optimal solution, the current optimal solution is replaced with the optimized solution, otherwise the optimized solution is discarded; at the same time, the iteration count of the corresponding nectar source is increased by 1. If it is not the first cycle, there are two cases: (1) if the optimal solution changed after the onlooker bee stage, the current optimal solution is used as the initial cluster centers of the fuzzy C-means clustering algorithm and optimized; if the quality of the optimized solution is higher than that of the current optimal solution, the current optimal solution is replaced with the optimized solution, otherwise the optimized solution is discarded; at the same time, the iteration count of the corresponding nectar source is increased by 1; (2) if the optimal solution did not change after the onlooker bee stage, the fuzzy C-means algorithm is not executed.
Step 8: if the fitness value of a nectar source has still not improved after the maximum number of iterations Limit, the employed bee corresponding to that nectar source becomes a scout bee, the algorithm randomly generates a new nectar source according to the initial nectar source formula, and the scout bee turns back into an employed bee.
Step 9: increase the current cycle number Cycle by 1 and judge whether Cycle is greater than the maximum number of cycles MaxCycle. If it is, the algorithm has reached the maximum number of cycles: stop iterating, terminate the algorithm, and output the optimal cluster centers, the membership matrix and the maximum fitness value. Otherwise, go to Step 4 and continue the loop.
Embodiment 2
This embodiment differs from embodiment 1 in that, in Step 6, before the onlooker bees select nectar sources by roulette selection, the following steps are further included:
After the employed bees have searched the nectar sources and returned to the hive, the nectar sources are first sorted from low to high by fitness value, and each nectar source is then assigned a weight according to formula (2.4); the onlooker bees select a nectar source and then update it according to formula (2.2).
The weight of a nectar source is calculated by formula (2.4) [formula image not reproduced in this text],
where SN denotes the number of employed bees and w(i) denotes the weight of the i-th nectar source.
As the formula shows, the value range of w(i) is [0,1]. The higher the quality of a nectar source, the larger its rank i, the higher the weight assigned to it, and the higher its probability of being selected. In the later stage of the algorithm, although the fitness values of all nectar sources converge, the weight of a high-quality nectar source is still higher than that of a poor-quality one, so high-quality nectar sources can still stand out and obtain more optimization opportunities.
According to the weighted fitness values, the onlooker bees select nectar sources by roulette selection and then perform a neighborhood search according to formula (2.2) to generate new nectar sources. If the fitness value of the new nectar source is greater than that of the old nectar source, the onlooker bee keeps the new nectar source; otherwise it discards the new nectar source and keeps the old one.
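The rank-weighted selection can be illustrated with the sketch below. Formula (2.4) is not available in this text, so `placeholder_weight` (w = rank / SN) is purely a hypothetical stand-in that merely satisfies the stated properties (values in [0,1], increasing with rank). Multiplying weight and fitness to obtain the roulette score is likewise an assumption about what "weighted fitness value" means.

```python
import random


def placeholder_weight(rank, sn):
    """HYPOTHETICAL stand-in for formula (2.4): weight grows with rank."""
    return rank / sn


def rank_weights(fits, weight_fn=placeholder_weight):
    """Sort sources by fitness from low to high and weight them by rank."""
    order = sorted(range(len(fits)), key=lambda i: fits[i])
    weights = [0.0] * len(fits)
    for rank, i in enumerate(order, start=1):
        weights[i] = weight_fn(rank, len(fits))
    return weights


def weighted_roulette(fits, rng, weight_fn=placeholder_weight):
    """Roulette selection on weight * fitness (assumed combination)."""
    weights = rank_weights(fits, weight_fn)
    scores = [w * f for w, f in zip(weights, fits)]
    u = rng.random() * sum(scores)
    acc = 0.0
    for i, s in enumerate(scores):
        acc += s
        if u < acc:
            return i
    return len(scores) - 1  # guard against floating-point rounding
```

With nearly equal fitness values such as 0.50, 0.51 and 0.52, plain roulette selection would pick each source about equally often, whereas the rank weights still clearly favor the best one.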
Embodiment 3
This embodiment differs from embodiment 1 in that the formula for the new nectar source $v_{ij}$ generated by the neighborhood search of the employed bees in Step 4 and/or of the onlooker bees in Step 6 is further improved, as follows:
$$v_{ij}=x_{ij}+\theta\times(x_{ij}-x_{kj}) \tag{2.5}$$
where $v_{ij}$ denotes the new nectar source, $x_{ij}$ denotes the current nectar source, $x_{kj}$ denotes a neighboring nectar source, $\theta$ denotes the nonlinear variation factor, and $k\in\{1,2,\ldots,SN\}$ and $j\in\{1,2,\ldots,D\}$ are both randomly selected, with $k\neq i$. $j$ is the dimension being updated.
The new nectar source $v_{ij}$ is obtained from the current nectar source $x_{ij}$ and the neighboring nectar source $x_{kj}$ by changing the value of the j-th dimension of the current nectar source $x_{ij}$. In the employed bee stage, the artificial bee colony algorithm randomly selects one dimension to update and thereby obtains a new nectar source.
For the new nectar source $v_{ij}$, if $v_{ij}>x_j^{\max}$, set $v_{ij}=x_j^{\max}$; if $v_{ij}<x_j^{\min}$, set $v_{ij}=x_j^{\min}$. That is, if the new value exceeds the maximum, the maximum is used as the updated value; if it is below the minimum, the minimum is used. If the fitness value of the new nectar source is greater than that of the old nectar source, the old nectar source is replaced with the new one; otherwise the employed bee keeps the old nectar source.
The nonlinear variation factor $\theta$ in formula (2.5) is given by a formula in m, n, Cycle, MaxCycle and α [formula image not reproduced in this text],
where m and n are dimensionless coefficients, Cycle is the current cycle number, and MaxCycle is the maximum number of cycles.
The value of the parameter α is:
$$\alpha=\begin{cases}-1, & \mathrm{rand}<0.5\\ 1, & \mathrm{rand}\ge 0.5\end{cases}$$
As in the initial nectar source formula, the random function rand has range (0,1); when rand is less than 0.5, α takes the value -1; when rand is greater than or equal to 0.5, α takes the value 1.
The standard artificial bee colony algorithm updates nectar sources in the onlooker bee stage according to formula (2.2), which uses a random factor r with range [-1,1] to control the update step. This search is too random and cannot effectively ensure that the search range changes appropriately as the algorithm progresses.
The invention therefore proposes, based on the characteristics of the artificial bee colony algorithm, the nonlinear variation factor θ, which changes the nectar source update step nonlinearly with the progress of the iteration (Cycle). In the improved artificial bee colony algorithm, nectar sources in the onlooker bee stage are updated according to formula (2.5), and θ varies nonlinearly as the algorithm runs. In the initial stage of the algorithm, θ is relatively large, so the update step is large, the search range of the bees is wide, and the diversity of the population is good. In the later stage of the algorithm, the colony gradually approaches the optimal nectar source and a narrower search is needed; θ slowly decreases and the update step slowly shrinks, which favors a careful search for better new nectar sources near the current one and improves the optimization precision of the algorithm.
Preferably, the effect is better when m takes a value in [1,1.5] and n takes a value in [0,0.2].
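The update of formula (2.5) and the sign parameter α can be sketched as follows. The formula for θ itself is not available in this text, so `placeholder_theta` is a hypothetical stand-in: it only mimics the described behavior (a large step magnitude early, decaying nonlinearly towards a small one, with a random sign from α) and should not be read as the patent's formula. The default values m = 1.2 and n = 0.1 are merely picked from the preferred ranges.

```python
import random


def alpha(rng):
    """Sign parameter: -1 when rand < 0.5, otherwise 1."""
    return -1 if rng.random() < 0.5 else 1


def placeholder_theta(cycle, max_cycle, rng, m=1.2, n=0.1):
    """HYPOTHETICAL stand-in for theta, NOT the patent's formula.

    The magnitude decays quadratically from about m at the first cycle
    to n at the last cycle; the sign is drawn with alpha().
    """
    progress = cycle / max_cycle
    return alpha(rng) * (n + (m - n) * (1.0 - progress) ** 2)


def theta_update(x_ij, x_kj, theta):
    """New value of dimension j according to formula (2.5)."""
    return x_ij + theta * (x_ij - x_kj)
```

Any schedule with the same qualitative shape could be substituted for `placeholder_theta` without changing `theta_update`.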
Embodiment 4
The difference is that, bee is followed to select honey according to the selection mode of roulette in step Step6 with embodiment 3
Before source, further include the steps that described in embodiment 2:
Honeycomb is returned to after leading bee to search nectar source, and nectar source is arranged from low to high first, in accordance with the height of nectar source fitness value
Then sequence is that each nectar source assigns weight according to formula (2.4), follows bee to select nectar source, then update honey according to formula (2.5)
Source.
The weight computing formula in new nectar source is as follows:
Wherein, SN indicates to lead the quantity of bee, and w (i) indicates the weight in i-th of new nectar source
As can be seen from the above equation, the value range of w (i) is between [0,1].The higher nectar source i value of quality is bigger, distribution
Weight is higher, and the selected probability in nectar source is also higher.The algorithm later period is arrived, although the fitness value in all nectar sources tends to one
It causes, but the ropy nectar source weight of the weight ratio in high-quality nectar source is high, high-quality nectar source still is able to show one's talent, obtain
To more optimization chances.
Based on the weighted fitness values, the following bees select nectar sources by roulette-wheel selection and then perform a neighborhood search according to formula (2.5) to generate new nectar sources. If the fitness value of the new nectar source is greater than that of the old nectar source, the following bee keeps the new nectar source; otherwise it discards the new nectar source and retains the old one.
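The ranking, weighting, roulette-wheel selection and greedy replacement described above can be sketched as follows. This is only an illustrative sketch: formula (2.4) is not reproduced in this text, so the stand-in weight rank/SN used here is an assumption chosen to match the stated properties (values in [0, 1], better nectar sources receive larger weights). Multiplying fitness by weight to obtain the "weighted fitness" is likewise an assumption, and the function names are hypothetical.

```python
import random


def rank_weights(fitness):
    """Sort sources from low to high fitness and give each a weight in (0, 1].

    ASSUMPTION: weight = rank / SN is a stand-in, not the patent's formula (2.4).
    """
    sn = len(fitness)
    order = sorted(range(sn), key=lambda i: fitness[i])  # low to high
    weights = [0.0] * sn
    for rank, idx in enumerate(order, start=1):
        weights[idx] = rank / sn
    return weights


def roulette_select(fitness, weights, rng=random):
    """Pick a source index with probability proportional to fitness * weight.

    ASSUMPTION: fitness values are non-negative, and the weighted fitness
    is the simple product fitness * weight.
    """
    scores = [f * w for f, w in zip(fitness, weights)]
    pick = rng.random() * sum(scores)
    acc = 0.0
    for i, score in enumerate(scores):
        acc += score
        if pick < acc:
            return i
    return len(scores) - 1  # guard against floating-point round-off


def greedy_keep(old, new, fitness_fn):
    """Keep the new source only if its fitness is strictly greater."""
    return new if fitness_fn(new) > fitness_fn(old) else old
```

Because the weights depend on rank rather than on raw fitness, the selection pressure stays roughly constant even when the fitness values of all sources become nearly equal late in the run, which is the behavior the text attributes to the weighting step.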
Experimental results and analysis
(1) Experimental results:
To compare the performance of the artificial bee colony clustering algorithm (CABC algorithm) with that of the hybrid clustering algorithm of this embodiment (ICABC_FCM algorithm), experiments were run on five commonly used data sets from the UCI database: the IRIS, BUPA, WDBC, Wine and Thyroid data sets. The composition of the experimental samples is shown in Table 4:
Table 4: Composition of the experimental data sets
Dataset name | Number of samples | Number of classes | Dimension |
---|---|---|---|
IRIS | 150 | 3 | 4 |
BUPA | 345 | 2 | 6 |
WDBC | 569 | 2 | 30 |
Wine | 178 | 3 | 13 |
Thyroid | 215 | 3 | 5 |
IRIS data set: consists of attribute data for three species of iris plants. The data set contains 150 samples in total, 50 per class, and each sample has four attributes: sepal length, sepal width, petal length and petal width.
BUPA data set: records of male patients with liver disorders. The data set contains 345 samples in total, each with 6 attributes, of which the first 5 are blood test results. The samples are divided into two classes: the first class has 114 samples and the second class has 231.
WDBC data set: contains 569 samples, each with 30 attributes. The samples are divided into two classes, Malignant and Benign: the Benign class has 357 samples and the Malignant class has 212.
Wine data set: contains 178 samples, each with 13 attributes representing 13 chemical characteristics of wines from a single region of origin. The samples are divided into three classes: the first class has 59 samples, the second class has 71, and the third class has 48.
Thyroid data set: a thyroid data set of 215 samples, each with 5 attributes. The samples are divided into three classes: the first class has 150 samples, the second class has 35, and the third class has 30.
The CABC, ICABC and ICABC_FCM algorithms were each used to cluster the IRIS, BUPA, WDBC, Wine and Thyroid data sets.
The fuzzy weighting exponent of each algorithm is m = 2, and the minimum allowed error of the FCM stage in the ICABC_FCM algorithm is $\varepsilon = 10^{-3}$. The number of bees is 20, the maximum number of cycles MaxCycle is 2000, and Limit is dimension × (SN/2), where the dimension of a nectar source equals the number of sample attributes multiplied by the number of clusters. Each algorithm was run 20 times and the results were averaged to give the final result.
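As a worked illustration of the parameter rules above, the following sketch derives the nectar-source dimension and the Limit value for each data set in Table 4. The class and function names are hypothetical, and SN is left as a parameter because the text gives the colony size (20 bees) but does not state how SN is derived from it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    samples: int
    classes: int
    attributes: int


# Values copied from Table 4.
DATASETS = [
    Dataset("IRIS", 150, 3, 4),
    Dataset("BUPA", 345, 2, 6),
    Dataset("WDBC", 569, 2, 30),
    Dataset("Wine", 178, 3, 13),
    Dataset("Thyroid", 215, 3, 5),
]


def source_dimension(ds):
    """Nectar-source dimension = sample attributes x number of clusters."""
    return ds.attributes * ds.classes


def abandonment_limit(ds, sn):
    """Limit = dimension x (SN / 2); sn is supplied by the caller (assumption)."""
    return source_dimension(ds) * (sn / 2)


if __name__ == "__main__":
    for ds in DATASETS:
        # sn=20 is used here purely for illustration.
        print(ds.name, source_dimension(ds), abandonment_limit(ds, sn=20))
```

For IRIS, for example, a nectar source encodes 3 cluster centers of 4 attributes each, so it has 12 dimensions.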
Fig. 2 shows the run plot for clustering the IRIS data.
The clustering accuracy on each data set is shown in Table 5:
Table 5: Average clustering accuracy on each data set
The clustering results on each data set are shown in Table 6 below:
Table 6: Objective function values for each data set
Analysis of results:
Fig. 2 is the run plot for clustering the IRIS data. As the figure shows, the objective function value of the CABC algorithm only begins to stabilize after about 180 cycles, whereas the hybrid algorithm already has a good solution after a single cycle and its objective function value stabilizes after a few cycles, so the hybrid algorithm converges very quickly.
As can be seen from Table 5, the clustering accuracy of the hybrid algorithm is higher than that of the CABC algorithm, which shows that the ICABC_FCM algorithm is better than the CABC algorithm in terms of clustering accuracy.
As can be seen from Table 6, both the mean and the standard deviation of the clustering objective are smaller for the ICABC_FCM algorithm than for the CABC algorithm, which shows that the ICABC_FCM algorithm is better than the CABC algorithm in overall optimization accuracy and stability. Comparing the clustering results of the ICABC_FCM and ICABC algorithms, the ICABC_FCM algorithm is also better than the ICABC algorithm in overall optimization accuracy and stability.
The above analysis shows that the ICABC_FCM algorithm is better than the CABC algorithm in clustering accuracy, convergence speed, optimization accuracy and algorithm stability.
Finally, it should be noted that the above embodiments merely illustrate the technical solutions of the present invention and do not limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (8)
1. A hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm, comprising an initialization stage, a leading-bee stage, a following-bee stage and a scout-bee stage, characterized in that it further comprises the following steps:
Step 1: after the following-bee stage, determine whether the algorithm is in its first cycle; if it is the first cycle, execute Step 2; if it is not the first cycle, execute Step 3;
Step 2: use the current optimal solution as the initial cluster centers of the fuzzy C-means clustering algorithm and optimize it; if the quality of the optimized solution is higher than that of the current optimal solution, replace the current optimal solution with the optimized solution, otherwise discard the optimized solution; at the same time, increase the iteration count of the corresponding nectar source by 1, and then enter the scout-bee stage;
Step 3: determine whether the optimal solution has changed after the following-bee stage; if it has changed, execute Step 2; if it has not changed, enter the scout-bee stage.
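Steps 1 to 3 can be sketched in code as below. This is a minimal pure-Python illustration under several assumptions that the text does not confirm: "quality" is measured by the standard FCM objective (lower is better), the FCM stage stops when no center moves by more than ε, and the names `fcm_refine`, `after_follow_phase`, `trials` and `best_idx` are hypothetical.

```python
import math


def memberships(data, centers, m=2.0):
    """Standard FCM membership update u[k][c] for sample k and center c."""
    p = 2.0 / (m - 1.0)
    u = []
    for x in data:
        d = [math.dist(x, c) for c in centers]
        if any(v == 0.0 for v in d):
            # Sample coincides with a center: give it full membership there.
            hits = [1.0 if v == 0.0 else 0.0 for v in d]
            u.append([h / sum(hits) for h in hits])
        else:
            u.append([1.0 / sum((di / dj) ** p for dj in d) for di in d])
    return u


def objective(data, centers, m=2.0):
    """FCM objective J for the given centers with their optimal memberships."""
    u = memberships(data, centers, m)
    return sum(u[k][c] ** m * math.dist(x, centers[c]) ** 2
               for k, x in enumerate(data) for c in range(len(centers)))


def fcm_refine(data, centers, m=2.0, eps=1e-3, max_iter=100):
    """Run FCM from the given centers (stopping rule is an assumption)."""
    for _ in range(max_iter):
        u = memberships(data, centers, m)
        new_centers = []
        for c in range(len(centers)):
            w = [u[k][c] ** m for k in range(len(data))]
            total = sum(w)
            new_centers.append(tuple(
                sum(wk * x[j] for wk, x in zip(w, data)) / total
                for j in range(len(data[0]))))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < eps:
            break
    return centers


def after_follow_phase(data, best, prev_best, cycle, trials, best_idx):
    """Steps 1-3: run FCM on the first cycle or when the best solution changed."""
    if cycle == 1 or best != prev_best:
        candidate = fcm_refine(data, best)
        if objective(data, candidate) < objective(data, best):
            best = candidate  # optimized solution is better: keep it
        trials[best_idx] += 1  # iteration count of the corresponding source
        return best, True
    return best, False  # unchanged best: go straight to the scout-bee stage
```

Running FCM only when the best solution has changed avoids repeating a local refinement whose starting point, and therefore result, would be the same as in the previous cycle.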
2. The hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm according to claim 1, characterized in that the formula for generating a new nectar source in the leading-bee stage and/or the following-bee stage is:
$v_{ij} = x_{ij} + \theta \times (x_{ij} - x_{kj})$
where θ is the nonlinear variation factor, $v_{ij}$ denotes the new nectar source, $x_{ij}$ denotes the current nectar source, and $x_{kj}$ denotes a neighboring nectar source; i, k ∈ {1, 2, ..., SN} index the nectar sources, SN being the number of nectar sources, with k ≠ i; j ∈ {1, 2, ..., D} denotes the dimension of the nectar source that is updated.
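The update formula of claim 2 can be illustrated with the short sketch below. It assumes, as in the standard artificial bee colony algorithm, that the neighbor k and the dimension j are drawn uniformly at random; the text only states that k ≠ i and that j is the updated dimension. The value of θ is passed in by the caller because its defining formula is not reproduced in this text, and the function name is hypothetical.

```python
import random


def neighbor_source(sources, i, theta, rng=random):
    """Return (new_source, j, k) with v_ij = x_ij + theta * (x_ij - x_kj).

    ASSUMPTION: k (k != i) and j are chosen uniformly at random.
    Only dimension j of source i is changed; all other dimensions are copied.
    """
    sn, dim = len(sources), len(sources[i])
    k = rng.choice([s for s in range(sn) if s != i])
    j = rng.randrange(dim)
    new = list(sources[i])
    new[j] = sources[i][j] + theta * (sources[i][j] - sources[k][j])
    return new, j, k
```

A smaller θ yields a smaller step away from the current nectar source, which is how a decreasing θ narrows the search in later cycles.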
3. The hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm according to claim 2, characterized in that the nonlinear variation factor θ is:
where m and n are coefficients, Cycle denotes the current cycle (iteration) number, MaxCycle denotes the maximum number of cycles, and rand is a random function.
4. The hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm according to claim 3, characterized in that the value ranges of m and n are respectively m ∈ [1, 1.5] and n ∈ [0, 0.2].
5. The hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm according to claim 1, characterized in that the following-bee stage comprises the following steps:
sorting the nectar sources of the leading bees from low to high by fitness value, and assigning a weight to each nectar source;
based on the weighted fitness values, the following bees selecting nectar sources by roulette-wheel selection and performing a neighborhood search to generate new nectar sources.
6. The hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm according to claim 5, characterized in that the weight of a nectar source is calculated by the formula:
where w(i) denotes the weight of the nectar source, with values in [0, 1], and SN denotes the number of leading bees.
7. The hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm according to claim 1, characterized in that, after a new nectar source is generated in the leading-bee stage and/or the following-bee stage, if the fitness value of the new nectar source is greater than that of the old nectar source, the old nectar source is replaced with the new nectar source; otherwise the old nectar source is retained.
8. The hybrid clustering algorithm based on the fuzzy C-means algorithm and the artificial bee colony clustering algorithm according to claim 1, characterized in that the following step is further included after the scout-bee stage: determining whether the number of cycles of the algorithm has reached the maximum number of cycles; if it has, the program terminates; if it has not, the algorithm returns to the leading-bee stage and continues the neighborhood search to update the nectar sources.
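The overall control flow implied by claims 1 and 8 can be sketched as a loop over the stages. The stage functions are hypothetical placeholders supplied by the caller, the shared `state` dictionary is an illustrative device, and comparing the best solution against its value at the start of the cycle is one assumed reading of "whether the optimal solution has changed".

```python
def run_hybrid(state, max_cycle, lead, follow, fcm_step, scout):
    """Run the stages in order until the maximum number of cycles is reached.

    ASSUMPTION: state["best"] holds the current optimal solution, and each
    stage callable takes the shared state and updates it in place.
    """
    for cycle in range(1, max_cycle + 1):
        prev_best = state["best"]
        lead(state)      # leading-bee stage
        follow(state)    # following-bee stage
        # FCM refinement on the first cycle or when the best solution changed.
        if cycle == 1 or state["best"] != prev_best:
            fcm_step(state)
        scout(state)     # scout-bee stage
    return state
```

The loop ends purely on the cycle count, matching claim 8, which names reaching the maximum number of cycles as the termination condition.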
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810935647.1A CN109086831A (en) | 2018-08-16 | 2018-08-16 | Hybrid Clustering Algorithm based on Fuzzy C-Means Algorithm and artificial bee colony clustering algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109086831A true CN109086831A (en) | 2018-12-25 |
Family
ID=64793549
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810935647.1A Pending CN109086831A (en) | 2018-08-16 | 2018-08-16 | Hybrid Clustering Algorithm based on Fuzzy C-Means Algorithm and artificial bee colony clustering algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109086831A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111123324A (en) * | 2019-12-31 | 2020-05-08 | 杭州电子科技大学 | DGPS integer ambiguity searching method based on improved ant colony algorithm |
CN112530529A (en) * | 2020-12-09 | 2021-03-19 | 合肥工业大学 | Gas concentration prediction method, system, equipment and storage medium thereof |
CN113177583A (en) * | 2021-04-16 | 2021-07-27 | 中国人民解放军空军工程大学 | Aerial target clustering method |
US11769033B2 (en) | 2021-06-02 | 2023-09-26 | Imam Abdulrahman Bin Faisal University | System, computer readable storage medium, and method for segmentation and enhancement of brain MRI images |
CN117251755A (en) * | 2023-11-17 | 2023-12-19 | 核工业北京地质研究院 | Clustering method of seismic attributes |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111123324A (en) * | 2019-12-31 | 2020-05-08 | 杭州电子科技大学 | DGPS integer ambiguity searching method based on improved ant colony algorithm |
CN112530529A (en) * | 2020-12-09 | 2021-03-19 | 合肥工业大学 | Gas concentration prediction method, system, equipment and storage medium thereof |
CN112530529B (en) * | 2020-12-09 | 2024-01-26 | 合肥工业大学 | Gas concentration prediction method, system, equipment and storage medium thereof |
CN113177583A (en) * | 2021-04-16 | 2021-07-27 | 中国人民解放军空军工程大学 | Aerial target clustering method |
CN113177583B (en) * | 2021-04-16 | 2022-10-18 | 中国人民解放军空军工程大学 | Aerial target clustering method |
US11769033B2 (en) | 2021-06-02 | 2023-09-26 | Imam Abdulrahman Bin Faisal University | System, computer readable storage medium, and method for segmentation and enhancement of brain MRI images |
CN117251755A (en) * | 2023-11-17 | 2023-12-19 | 核工业北京地质研究院 | Clustering method of seismic attributes |
CN117251755B (en) * | 2023-11-17 | 2024-02-27 | 核工业北京地质研究院 | Clustering method of seismic attributes |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109086831A (en) | Hybrid Clustering Algorithm based on Fuzzy C-Means Algorithm and artificial bee colony clustering algorithm | |
CN106709037B (en) | A kind of film recommended method based on Heterogeneous Information network | |
CN107203785A (en) | Multipath Gaussian kernel Fuzzy c-Means Clustering Algorithm | |
CN107169983B (en) | Multi-threshold image segmentation method based on cross variation artificial fish swarm algorithm | |
CN109034231A (en) | The deficiency of data fuzzy clustering method of information feedback RBF network valuation | |
CN110909787A (en) | Method and system for multi-objective batch scheduling optimization based on clustering evolutionary algorithm | |
CN111599406B (en) | Global multi-network comparison method combined with network clustering method | |
Kurada et al. | A preliminary survey on optimized multiobjective metaheuristic methods for data clustering using evolutionary approaches | |
Sheng et al. | Multilocal search and adaptive niching based memetic algorithm with a consensus criterion for data clustering | |
CN109583480A (en) | One kind being used for aero-engine anti-asthma control system bathtub curve estimation method | |
CN109326328A (en) | A kind of extinct plants and animal pedigree evolution analysis method based on pedigree cluster | |
CN109919458B (en) | Collaborative cost task allocation method and system based on concept lattice in social network | |
CN117611974B (en) | Image recognition method and system based on searching of multiple group alternative evolutionary neural structures | |
Tiwari et al. | Improving ant colony optimization algorithm for data clustering | |
CN107169594A (en) | A kind of optimization method and device of Vehicle Routing Problems | |
CN110378402A (en) | A kind of K-means clustering method of self study attribute weight | |
Suresh et al. | Data clustering using multi-objective differential evolution algorithms | |
CN104732522A (en) | Image segmentation method based on polymorphic ant colony algorithm | |
CN111353525A (en) | Modeling and missing value filling method for unbalanced incomplete data set | |
CN112488773A (en) | Smart television user classification method, computer equipment and storage medium | |
CN114093426A (en) | Marker screening method based on gene regulation network construction | |
CN105912887B (en) | A kind of modified gene expression programming-fuzzy C-mean algorithm crop data sorting technique | |
Qin et al. | Non-myopic knowledge gradient policy for ranking and selection | |
CN109408728A (en) | A kind of difference secret protection recommended method based on covering algorithm | |
Zhao et al. | A bi-layer decomposition algorithm for many-objective optimization problems |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20181225 |