CN106951910A - Method and device for data clustering - Google Patents

Method and device for data clustering

Info

Publication number
CN106951910A
Authority
CN
China
Prior art keywords
population
individual
current
generation
current population
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710071827.5A
Other languages
Chinese (zh)
Inventor
陈剑勇
麦伟杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen University
Original Assignee
Shenzhen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen University
Priority to CN201710071827.5A
Publication of CN106951910A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/23: Clustering techniques
    • G06F18/232: Non-hierarchical techniques
    • G06F18/2321: Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213: Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/004: Artificial life, i.e. computing arrangements simulating life
    • G06N3/006: Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the field of computer technology and provides a method and device for data clustering. The method includes: receiving an input data set to be clustered and generating a corresponding current population for the data set; generating a selection probability for each individual according to the fitness value of each individual in the current population and the adaptive index of the current population; evolving the current population according to the selection probabilities to generate a next-generation population; when the current evolutionary generation does not exceed the maximum evolutionary generation, calculating the adaptive index of the next-generation population, setting the next-generation population as the current population, and jumping back to the operation of calculating the fitness value of each individual in the current population; otherwise, generating and outputting the clusters of the data set according to the optimal individual in the next-generation population. The fitness values and the adaptive index thus adjust the probability with which individuals are selected during evolution, which reduces the dependence of the clustering result on the initial cluster centres, helps the clustering escape local optima, and improves clustering quality.

Description

Method and device for data clustering
Technical field
The invention belongs to the field of computer technology, and in particular relates to a method and device for data clustering.
Background
With the close integration of computers and information technology, massive amounts of data are rapidly produced and propagated on the Internet. Industries such as finance and telecommunications need to extract potentially significant information from this data in order to seize opportunities in a fast-changing economy. As science and technology develop, the amount of information that must be processed and classified grows daily, information is acquired ever faster, and its types become increasingly complex. How to effectively classify large amounts of information that is diverse in type, ill-defined in object, and incomplete, and to mine from it the useful information that is needed, is an important research topic in the industry.
At present, this problem can be addressed with clustering algorithms. The commonly used K-means clustering algorithm has advantages such as fast convergence and good scalability, and it can usually produce a reasonably good result. However, its clustering quality depends heavily on the initial values of the cluster centres, and it is prone to falling into local optima and to interference from "noise" points during clustering, so the information obtained from massive data is not accurate enough and the clustering quality is poor.
Summary of the invention
The object of the invention is to provide a method and device for data clustering, intended to solve the problem that the quality of the clustering result depends heavily on the initial cluster centres and that the clustering process easily falls into local optima and is easily disturbed by "noise", resulting in poor clustering quality.
In one aspect, the invention provides a method for data clustering, the method comprising the following steps:
receiving an input data set to be clustered, and generating a corresponding current population for the data set, each individual in the current population comprising a preset number of cluster centres;
calculating the fitness value of each individual in the current population, and generating a selection probability for each individual according to all the fitness values and the adaptive index of the current population;
dividing the samples in the data set into corresponding clusters according to all the cluster centres in each individual in the current population, and evolving the current population according to all the selection probabilities to generate a next-generation population;
when the current evolutionary generation does not exceed a preset maximum evolutionary generation, obtaining the number of excellent individuals generated when the current population evolves into the next-generation population, calculating the adaptive index of the next-generation population according to the number of excellent individuals, setting the next-generation population as the current population, and jumping back to the operation of calculating the fitness value of each individual in the current population;
when the current evolutionary generation exceeds the maximum evolutionary generation, generating and outputting the clusters of the data set according to the optimal individual in the next-generation population.
In another aspect, the invention provides a device for data clustering, the device comprising:
an initialization module, configured to receive an input data set to be clustered and generate a corresponding current population for the data set, each individual in the current population comprising a preset number of cluster centres;
a calculation module, configured to calculate the fitness value of each individual in the current population and to generate a selection probability for each individual according to the fitness values and the adaptive index of the current population;
an evolution module, configured to divide the samples in the data set into corresponding clusters according to all the cluster centres in each individual in the current population, and to evolve the current population according to all the selection probabilities to generate a next-generation population;
a loop module, configured to, when the current evolutionary generation does not exceed a preset maximum evolutionary generation, obtain the number of excellent individuals generated when the current population evolves into the next-generation population, calculate the adaptive index of the next-generation population according to the number of excellent individuals, set the next-generation population as the current population, and jump back to the operation of calculating the fitness value of each individual in the current population; and
a cluster generation module, configured to, when the current evolutionary generation exceeds the maximum evolutionary generation, generate and output the clusters of the data set according to the optimal individual in the next-generation population.
The invention evolves the current population of the data set for the maximum number of generations, obtains the cluster centres of the data set from the optimal individual produced by the final evolution, and divides the samples in the data set according to these cluster centres, thereby clustering the samples. During the evolution of the current population into the next-generation population, the invention determines the selection probability of each individual according to the fitness value of each individual in the current population and the adaptive index of the current population; after the current population has been evolved into the next-generation population according to the selection probabilities, the adaptive index of the next-generation population is calculated from the number of excellent individuals produced by the evolution. Evolution thus improves the quality of the individuals with each clustering round, which mitigates the sensitivity to the initial cluster centres; the mutation operation in the evolutionary process helps avoid interference from "noise" points; and updating the adaptive index adjusts the probability with which individuals are selected, which helps the search jump out of local optima and so improves clustering quality.
Brief description of the drawings
Fig. 1 is a flow chart of the implementation of the data clustering method provided by Embodiment 1 of the invention;
Fig. 2 is a schematic structural diagram of the data clustering device provided by Embodiment 2 of the invention; and
Fig. 3 is a schematic structural diagram of a preferred form of the data clustering device provided by Embodiment 2 of the invention.
Detailed description of the embodiments
To make the purpose, technical solution, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
The implementation of the invention is described in detail below with reference to specific embodiments:
Embodiment 1:
Fig. 1 shows the implementation flow of the data clustering method provided by Embodiment 1 of the invention. For ease of description, only the parts relevant to the embodiment are shown, as detailed below:
In step S101, an input data set to be clustered is received and a corresponding current population is generated for the data set, each individual in the current population comprising a preset number of cluster centres.
In this embodiment, the data set to be clustered consists of multiple sample points. A preset number of sample points are randomly selected from the data set and each is set as a cluster centre; these cluster centres together form one individual in the current population. Repeating this step generates all the individuals in the current population.
As an example, when each sample point in the data set has attribute dimension d and each individual in the current population contains k cluster centres, the length of each individual is d × k. When the population size is N, the i-th individual in the initialized current population can be written as $X_{i,0} = (c_{i,1}, c_{i,2}, \ldots, c_{i,k})$, where $c_{i,k}$ is the k-th cluster centre of the i-th individual. This encoding is simple and the individuals are short, which makes it easy to decode the best cluster centres from them.
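The initialization in step S101 can be sketched as follows. This is a minimal illustration rather than the patent's implementation; the function name `init_population` and the parameter names `k` and `pop_size` are assumptions introduced here, and each individual is stored as a list of k centres of dimension d instead of a flat vector of length d × k.

```python
import random

def init_population(data, k, pop_size, rng=random):
    """Build pop_size individuals; each is a list of k centres drawn from data.

    Hypothetical helper: every individual is formed by randomly picking
    k distinct sample points and using them as its cluster centres.
    """
    population = []
    for _ in range(pop_size):
        centres = rng.sample(data, k)                   # k distinct sample points
        population.append([list(c) for c in centres])   # copy so data stays untouched
    return population

# Toy data set: five 2-dimensional sample points (d = 2).
data = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.1), (4.8, 5.3), (9.0, 0.5)]
population = init_population(data, k=2, pop_size=4, rng=random.Random(0))
```

Passing a seeded `random.Random` instance keeps the sketch reproducible; any source of randomness with a `sample` method would do.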
In step S102, the fitness value of each individual in the current population is calculated, and a selection probability is generated for each individual according to all the fitness values and the adaptive index of the current population.
In this embodiment, the fitness value of each individual in the current population can be calculated with a preset objective function (for example, a squared-error function); the fitness value is a number that measures the quality of an individual. After all fitness values have been calculated, a quality rank is generated for every individual in the current population, and the selection probability of each individual is then calculated from the quality ranks and the adaptive index.
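As an illustration of the squared-error objective mentioned above, the sketch below scores an individual by the sum of squared distances from each sample to its nearest cluster centre. The name `sse_fitness` is an assumption, and so is the convention that a lower value means a better individual; the text only says that the fitness value measures quality.

```python
def sse_fitness(individual, data):
    """Sum of squared distances from every sample to its nearest centre.

    Assumed convention: smaller is better.
    """
    total = 0.0
    for x in data:
        total += min(
            sum((a - b) ** 2 for a, b in zip(x, centre))
            for centre in individual
        )
    return total

samples = [(0.0, 0.0), (2.0, 0.0)]
perfect = sse_fitness([[0.0, 0.0], [2.0, 0.0]], samples)  # centres sit on the samples
single = sse_fitness([[1.0, 0.0]], samples)               # one centre halfway between
```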
Specifically, after all fitness values have been calculated, the individuals in the current population are sorted by fitness value and assigned quality ranks according to their position in the ordering. For example, the individual in first position has quality rank 1, the individual in second position has quality rank 2, and so on; when the population size is N, the individual in last position has quality rank N.
Specifically, when the individuals are sorted by fitness value from worst to best, an individual with a higher quality rank is better. The selection probability of each individual in the current population is then calculated from the quality ranks and the adaptive index of the current population by the following formula:
[formula not reproduced in this text]
where $\lambda(g)$ is the adaptive index of the current population when the current evolutionary generation is g, $R_i$ is the quality rank of the i-th individual in the current population, NP is the total number of individuals in the current population, and $P_i^g$ is the selection probability of the i-th individual in the current population when the current evolutionary generation is g.
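The ranking step can be sketched as below. Because the selection-probability formula itself is not available in this text, the function `selection_probabilities` uses a purely hypothetical form, $(R_i / NP)^{\lambda}$, chosen only so that a higher rank gives a higher probability and the adaptive (self-adaptation) index λ controls how strongly rank matters. It should not be read as the patent's formula. Lower fitness is assumed to be better, as with a squared-error objective.

```python
def quality_ranks(fitness):
    """Rank 1 = worst individual (largest error), rank NP = best (smallest error)."""
    order = sorted(range(len(fitness)), key=lambda i: fitness[i], reverse=True)
    ranks = [0] * len(fitness)
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return ranks

def selection_probabilities(ranks, lam):
    """ASSUMED form (R_i / NP) ** lam; the source formula is not reproduced."""
    n = len(ranks)
    return [(r / n) ** lam for r in ranks]

fitness = [3.0, 1.0, 2.0]                   # individual 1 has the smallest error
ranks = quality_ranks(fitness)
probs = selection_probabilities(ranks, lam=2.0)
```

Under this assumed form, a larger λ lowers the probability of every individual except the best one, which concentrates selection on high-ranked individuals.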
In step S103, the samples in the data set are divided into corresponding clusters according to all the cluster centres in each individual in the current population, and the current population is evolved according to all the selection probabilities to generate a next-generation population.
In this embodiment, the samples in the data set can be divided according to the cluster centres in an individual of the current population; specifically, each sample can be assigned to the corresponding cluster according to the distance between the sample and the cluster centres.
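A minimal sketch of the distance-based division described above, assuming squared Euclidean distance (the text does not name a specific distance measure) and a hypothetical helper name `assign_clusters`:

```python
def assign_clusters(data, centres):
    """Return, for each sample, the index of its nearest cluster centre."""
    labels = []
    for x in data:
        distances = [
            sum((a - b) ** 2 for a, b in zip(x, centre))
            for centre in centres
        ]
        labels.append(distances.index(min(distances)))  # ties go to the first centre
    return labels

points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
labels = assign_clusters(points, centres=[(0.0, 0.0), (10.0, 0.0)])
```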
Evolving the current population according to the selection probabilities to generate the next-generation population can be implemented by the following steps:
(1) According to the selection probabilities, a target individual is selected from the current population and undergoes mutation and crossover to generate a new individual.
Specifically, according to the selection probabilities, different individuals are chosen from the current population as the base vector and as the terminal vector of the difference vector. Individuals with a higher quality rank are chosen as the terminal vector, so that the terminal vector steers the direction of the whole vector and the evolutionary process is guided by excellent individuals, which improves the convergence efficiency of the evolutionary process.
(2) The fitness value of the new individual is compared with that of the target individual. When the new individual is better than the target individual, the new individual is set as an individual in the next-generation population and the number of excellent individuals is incremented by one; otherwise, the target individual is set as an individual in the next-generation population.
Specifically, after a new individual is obtained by evolution, its fitness value is calculated and compared with the fitness value of the target individual. At the start of each round of evolution, the number of excellent individuals is initialized to zero; it counts how many new individuals generated in the current round are better than the individuals they replace.
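Steps (1) and (2) resemble one generation of differential evolution with rank-biased parent selection. The sketch below is one possible reading, not the patent's procedure: the scale factor `F`, the crossover rate `CR`, the rejection-sampling helper `pick`, and the uniform choice of the difference vector's start point are all assumptions. It needs at least four individuals, strictly positive selection probabilities, and a fitness function where lower is better.

```python
import random

def pick(probs, exclude, rng):
    """Draw a random index and accept it with probability probs[index]."""
    while True:
        idx = rng.randrange(len(probs))
        if idx not in exclude and rng.random() < probs[idx]:
            return idx

def evolve_one_generation(population, fitness, probs, fitness_fn,
                          F=0.5, CR=0.9, rng=random):
    k, d = len(population[0]), len(population[0][0])
    next_pop, next_fit, successes = [], [], 0      # success counter starts at zero
    for i, target in enumerate(population):
        base = pick(probs, {i}, rng)               # base vector, rank-biased
        end = pick(probs, {i, base}, rng)          # terminal vector, rank-biased
        start = rng.choice([j for j in range(len(population))
                            if j not in (i, base, end)])
        trial = [list(c) for c in target]
        forced = rng.randrange(k * d)              # at least one gene is mutated
        for pos in range(k * d):
            c, a = divmod(pos, d)
            if pos == forced or rng.random() < CR:
                trial[c][a] = population[base][c][a] + F * (
                    population[end][c][a] - population[start][c][a])
        trial_fit = fitness_fn(trial)
        if trial_fit < fitness[i]:                 # new individual beats its target
            next_pop.append(trial)
            next_fit.append(trial_fit)
            successes += 1
        else:                                      # target survives unchanged
            next_pop.append(target)
            next_fit.append(fitness[i])
    return next_pop, next_fit, successes
```

The returned `successes` value plays the role of the number of excellent individuals that the later adaptive-index update consumes.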
In step S104, it is determined whether the current evolutionary generation exceeds the preset maximum evolutionary generation.
In this embodiment, the current evolutionary generation records how many times the population has been evolved so far, and the maximum evolutionary generation limits the total number of times the population is evolved. When the current evolutionary generation exceeds the maximum evolutionary generation, the cluster centres in the individuals of the current population are considered to have reached the optimum. When the current evolutionary generation does not exceed the maximum evolutionary generation, step S105 is performed; otherwise, step S106 is performed.
Preferably, when the fitness values of all individuals in the current population have been calculated, the best fitness value among them is obtained, and whether the current cluster centres are good enough can be determined by checking whether this best fitness value satisfies a preset threshold (for example, does not exceed the threshold), which improves the efficiency of the evolutionary process and the clustering result.
In step S105, the number of excellent individuals generated when the current population evolves into the next-generation population is obtained, the adaptive index of the next-generation population is calculated according to the number of excellent individuals, and the next-generation population is set as the current population.
In this embodiment, when the current evolutionary generation does not exceed the preset maximum evolutionary generation, the adaptive index of the next-generation population is calculated from the counted number of excellent individuals, and the process jumps back to the operation of calculating the fitness value of each individual in the current population so that the next-generation population is evolved in turn.
Specifically, to calculate the adaptive index of the next-generation population, the excellent-individual ratio is first calculated from the number of excellent individuals generated when the current population evolves into the next-generation population. The excellent-individual ratio is then compared with a preset expected value, and the adaptive index of the next-generation population is calculated according to the result of the comparison.
Specifically, the excellent-individual ratio is calculated as:
$$SR(g+1) = \frac{c(g+1)}{NP}$$
where SR(g+1) is the excellent-individual ratio produced by the evolution of the current population when the current evolutionary generation is g, c(g+1) is the number of excellent individuals produced by the evolution of the current population when the current evolutionary generation is g, and NP is the total number of individuals in the current population.
Specifically, the expected value is $u \cdot SR(g)$, where u is a preset parameter and SR(g) is the excellent-individual ratio produced by the evolution of the current population when the evolutionary generation is g-1.
Specifically, when the excellent-individual ratio is not less than the expected value, the adaptive index of the next-generation population is calculated as $\lambda(g+1) = \min(\lambda(g) + \Delta \cdot SR(g+1), \lambda_{max})$; otherwise it is calculated as $\lambda(g+1) = \max(\lambda_{min}, \lambda(g) - \Delta \cdot (1 - SR(g+1)))$, where $\lambda_{min}$, $\lambda_{max}$, $\Delta$, and u are preset parameters.
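The update rule above can be written as a small function. This sketch reads Δ SR(g+1) as the product Δ · SR(g+1), which is an interpretation of the flattened formula; the function and argument names are assumptions, and the parameter values in the example are arbitrary.

```python
def update_adaptive_index(lam, sr_next, sr_prev, u, delta, lam_min, lam_max):
    """Return lambda(g+1) from lambda(g), SR(g+1) and SR(g).

    If the success ratio meets the expected value u * SR(g), the index grows
    by delta * SR(g+1), capped at lam_max; otherwise it shrinks by
    delta * (1 - SR(g+1)), floored at lam_min.
    """
    if sr_next >= u * sr_prev:
        return min(lam + delta * sr_next, lam_max)
    return max(lam_min, lam - delta * (1.0 - sr_next))

grown = update_adaptive_index(1.0, 0.5, 0.4, u=1.0, delta=0.2, lam_min=0.5, lam_max=3.0)
shrunk = update_adaptive_index(1.0, 0.1, 0.4, u=1.0, delta=0.2, lam_min=0.5, lam_max=3.0)
capped = update_adaptive_index(2.95, 1.0, 0.4, u=1.0, delta=0.2, lam_min=0.5, lam_max=3.0)
```

Here `grown` increases because 0.5 is at least 1.0 × 0.4, `shrunk` decreases because 0.1 falls short of that expected value, and `capped` is limited by `lam_max`.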
In step S106, the clusters of the data set are generated and output according to the optimal individual in the next-generation population.
In this embodiment, when the current evolutionary generation exceeds the maximum evolutionary generation, the population is no longer evolved. The individual with the best fitness value, i.e. the optimal individual, is obtained from the next-generation population and decoded to obtain all of its cluster centres, and the samples in the data set are divided into the corresponding clusters according to these cluster centres.
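The final decoding step can be sketched as follows, assuming that a lower fitness value is better (as with a squared-error objective) and that samples are assigned by squared Euclidean distance. The name `decode_best` and the returned structure are illustrative choices, not taken from the source.

```python
def decode_best(population, fitness, data):
    """Pick the individual with the smallest fitness and group data by its centres."""
    best = min(range(len(population)), key=lambda i: fitness[i])
    centres = population[best]
    clusters = [[] for _ in centres]
    for x in data:
        distances = [
            sum((a - b) ** 2 for a, b in zip(x, centre))
            for centre in centres
        ]
        clusters[distances.index(min(distances))].append(x)
    return centres, clusters

final_pop = [[[5.0, 0.0], [6.0, 0.0]], [[0.0, 0.0], [10.0, 0.0]]]
final_fit = [50.0, 1.0]                      # second individual is the optimum
pts = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
centres, clusters = decode_best(final_pop, final_fit, pts)
```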
In this embodiment, the current population is initialized according to the data set to be clustered; individuals are assigned quality ranks according to their fitness values, and different selection probabilities are generated for different individuals according to the adaptive index of the current population, with better individuals receiving higher selection probabilities. The current population is evolved according to the generated selection probabilities, so that good individuals guide the convergence direction of the whole evolution. When the current evolutionary generation does not exceed the maximum evolutionary generation, the adaptive index of the next-generation population is adjusted according to the number of excellent individuals generated when the current population evolved into the next-generation population, the selection probabilities of the individuals in the next-generation population are updated according to the adjusted adaptive index, and the next-generation population is evolved, until the current evolutionary generation exceeds the maximum evolutionary generation. Finally, the clustering result of the data set is obtained from the population produced by the final evolution. Repeated evolution thus improves the quality of the generated cluster centres, and adjusting the selection probabilities according to the number of excellent individuals generated during evolution improves the efficiency with which the evolutionary process converges to good cluster centres and helps avoid falling into local optima.
A person of ordinary skill in the art will understand that all or part of the steps of the method in the above embodiment can be completed by a program instructing the relevant hardware; the program can be stored in a computer-readable storage medium such as ROM/RAM, a magnetic disk, or an optical disc.
Embodiment 2:
Fig. 2 shows the structure of the data clustering device provided by Embodiment 2 of the invention. For ease of description, only the parts relevant to the embodiment are shown. The device includes:
an initialization module 21, configured to receive an input data set to be clustered and generate a corresponding current population for the data set, each individual in the current population comprising a preset number of cluster centres;
a calculation module 22, configured to calculate the fitness value of each individual in the current population and to generate a selection probability for each individual according to the fitness values and the adaptive index of the current population;
an evolution module 23, configured to divide the samples in the data set into corresponding clusters according to all the cluster centres in each individual in the current population, and to evolve the current population according to all the selection probabilities to generate a next-generation population;
a loop module 24, configured to, when the current evolutionary generation does not exceed the preset maximum evolutionary generation, obtain the number of excellent individuals generated when the current population evolves into the next-generation population, calculate the adaptive index of the next-generation population according to the number of excellent individuals, set the next-generation population as the current population, and jump back to the operation of calculating the fitness value of each individual in the current population; and
a cluster generation module 25, configured to, when the current evolutionary generation exceeds the maximum evolutionary generation, generate and output the clusters of the data set according to the optimal individual in the next-generation population.
Preferably, as shown in Fig. 3, the initialization module 21 includes a cluster centre selection module 311 and a population generation module 312, wherein:
the cluster centre selection module 311 is configured to randomly select a preset number of sample points from the data set and set each of them as a cluster centre; and
the population generation module 312 is configured to combine the preset number of cluster centres into an individual in the current population, and to repeat the random selection operation to generate all the individuals in the current population.
Preferably, as shown in Fig. 3, the calculation module 22 includes a fitness value calculation module 321 and a selection probability calculation module 322, wherein:
the fitness value calculation module 321 is configured to calculate the fitness value of each individual in the current population according to a preset objective function; and
the selection probability calculation module 322 is configured to generate the quality rank of each individual in the current population according to all the fitness values, and to calculate the selection probability of each individual in the current population according to all the quality ranks and the adaptive index of the current population.
Preferably, as shown in Fig. 3, the evolution module 23 includes an individual evolution module 331 and a new population generation module 332, wherein:
the individual evolution module 331 is configured to select, according to the selection probabilities, a target individual from the current population to undergo mutation and crossover, generating a new individual; and
the new population generation module 332 is configured to compare the fitness value of the new individual with that of the target individual; when the new individual is better than the target individual, to set the new individual as an individual in the next-generation population and increment the number of excellent individuals by one; and otherwise to set the target individual as an individual in the next-generation population.
Preferably, as shown in Fig. 3, the loop module 24 includes a ratio calculation module 341 and an adaptive index update module 342, wherein:
the ratio calculation module 341 is configured to calculate, according to the number of excellent individuals, the excellent-individual ratio for the evolution of the current population into the next-generation population; and
the adaptive index update module 342 is configured to compare the excellent-individual ratio with a preset expected value and to calculate the adaptive index of the next-generation population according to the result of the comparison.
In this embodiment, the current population is initialized according to the data set to be clustered; individuals are assigned quality ranks according to their fitness values, and different selection probabilities are generated for different individuals according to the adaptive index of the current population, with better individuals receiving higher selection probabilities. The current population is evolved according to the generated selection probabilities, so that good individuals guide the convergence direction of the whole evolution. When the current evolutionary generation does not exceed the maximum evolutionary generation, the adaptive index of the next-generation population is adjusted according to the number of excellent individuals generated when the current population evolved into the next-generation population, the selection probabilities of the individuals in the next-generation population are updated according to the adjusted adaptive index, and the next-generation population is evolved, until the current evolutionary generation exceeds the maximum evolutionary generation. Finally, the clustering result of the data set is obtained from the population produced by the final evolution. Repeated evolution thus improves the quality of the generated cluster centres, and adjusting the selection probabilities according to the number of excellent individuals generated during evolution improves the efficiency with which the evolutionary process converges to good cluster centres and helps avoid falling into local optima.
In this embodiment, each unit of the data clustering device can be implemented by corresponding hardware or software units; the units may be independent software or hardware units or may be integrated into a single software or hardware unit, which is not intended to limit the invention. For the specific implementation of each module in this embodiment, refer to the description of the corresponding steps in Embodiment 1, which is not repeated here.
The above are only preferred embodiments of the invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims (10)

1. A method of data clustering, characterized in that the method comprises the following steps:
receiving an input data set to be clustered and generating a corresponding current population for the data set, each individual in the current population comprising a preset number of cluster centers;
calculating the fitness value of each individual in the current population, and generating a selection probability for each individual according to all the fitness values and the self-adaptive index of the current population;
dividing the samples in the data set into corresponding clusters according to all the cluster centers in each individual of the current population, and evolving the current population according to all the selection probabilities to generate a next-generation population;
when the current evolutionary generation does not exceed a preset maximum evolutionary generation, obtaining the number of excellent individuals generated when the current population evolved into the next-generation population, calculating the self-adaptive index of the next-generation population according to the number of excellent individuals, setting the next-generation population as the current population, and returning to the operation of calculating the fitness value of each individual in the current population;
when the current evolutionary generation exceeds the maximum evolutionary generation, generating and outputting the clusters of the data set according to the optimal individual in the next-generation population.
2. The method according to claim 1, characterized in that the step of generating a corresponding current population for the data set comprises:
randomly selecting the preset number of sample points from the data set, and setting each of the selected sample points as a cluster center;
combining the preset number of cluster centers into one individual of the current population, and repeating the random selection operation to generate all individuals in the current population.
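The initialization in claim 2 can be sketched as follows. This is a minimal illustration: the function name `init_population`, the population size, and the choice to draw the centers of one individual without replacement are assumptions, since the claim only states that the sample points are selected at random.

```python
import random

def init_population(data, k, pop_size, rng):
    """Build pop_size individuals, each holding k cluster centers.

    Each center is a sample point drawn at random from the data set.
    Drawing without replacement within one individual is an assumption.
    """
    return [[tuple(p) for p in rng.sample(data, k)] for _ in range(pop_size)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
data = [(float(i), float(i % 3)) for i in range(20)]  # toy data set
population = init_population(data, k=3, pop_size=5, rng=rng)
```

Because every center is an actual sample point, no initial center can lie outside the region occupied by the data.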
3. The method according to claim 1, characterized in that the step of calculating the fitness value of each individual in the current population and generating a selection probability for each individual according to all the fitness values and the self-adaptive index of the current population comprises:
calculating the fitness value of each individual in the current population according to a preset objective function;
generating a quality rank for each individual in the current population according to all the fitness values, and calculating the selection probability of each individual in the current population according to all the quality ranks and the self-adaptive index of the current population, the formula for the selection probability being:
[formula missing from the source text]
where $\lambda(g)$ is the self-adaptive index of the current population when the current evolutionary generation is g, $R_i$ is the quality rank of the i-th individual in the current population, NP is the total number of individuals in the current population, and $P_i^g$ is the selection probability of the i-th individual in the current population when the current evolutionary generation is g.
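The selection-probability formula of claim 3 is not reproduced in this text, so the sketch below substitutes a hypothetical rank-based rule purely to show how a quality rank $R_i$, the population size NP, and the index $\lambda(g)$ could interact. The rule `(R_i / NP) ** lam`, normalized over the population, is an assumption and should not be read as the patent's formula; the sketch also assumes that a lower fitness value is better.

```python
def rank_individuals(fitness_values):
    """Assign rank NP to the best (lowest) fitness and rank 1 to the worst."""
    n = len(fitness_values)
    order = sorted(range(n), key=lambda i: fitness_values[i])
    ranks = [0] * n
    for position, i in enumerate(order):
        ranks[i] = n - position
    return ranks

def selection_probabilities(ranks, lam):
    """Hypothetical rule: weight (R_i / NP) ** lam, normalized to sum to 1."""
    n = len(ranks)
    weights = [(r / n) ** lam for r in ranks]
    total = sum(weights)
    return [w / total for w in weights]

fitness_values = [3.0, 1.0, 2.0]
ranks = rank_individuals(fitness_values)
probs = selection_probabilities(ranks, lam=2.0)
uniform = selection_probabilities(ranks, lam=0.0)
```

Under this assumed rule, a larger index concentrates selection on the better-ranked individuals, while an index of zero makes every individual equally likely to be chosen.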
4. The method according to claim 1, characterized in that the step of evolving the current population according to all the selection probabilities to generate a next-generation population comprises:
selecting a target individual in the current population according to the selection probabilities, and performing mutation and crossover on it to generate a new individual;
comparing the fitness value of the new individual with the fitness value of the target individual; when the new individual is better than the target individual, setting the new individual as an individual in the next-generation population and incrementing the number of excellent individuals by one; otherwise, setting the target individual as an individual in the next-generation population.
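One generation of the evolution in claim 4 might look like the following sketch. The claim names only mutation and crossover, so the specific operators used here (a DE/rand/1 style mutation and binomial crossover), the roulette-wheel choice of the target, the parameters `f` and `cr`, and the flat list encoding of an individual are all assumptions. A toy objective stands in for the clustering fitness.

```python
import random

def evolve(population, fitness, probs, rng, f=0.5, cr=0.9):
    """Run one generation; return (next_population, excellent_count)."""
    n, dim = len(population), len(population[0])
    next_population, excellent_count = [], 0
    for _ in range(n):
        # Choose the target by roulette wheel over the selection probabilities.
        t = rng.choices(range(n), weights=probs, k=1)[0]
        target = population[t]
        # Mutation (assumed DE/rand/1): combine three other individuals.
        a, b, c = rng.sample([i for i in range(n) if i != t], 3)
        mutant = [population[a][j] + f * (population[b][j] - population[c][j])
                  for j in range(dim)]
        # Binomial crossover; j_rand guarantees at least one mutant gene.
        j_rand = rng.randrange(dim)
        trial = [mutant[j] if (rng.random() < cr or j == j_rand) else target[j]
                 for j in range(dim)]
        # Keep the new individual only if it is strictly better (lower fitness).
        if fitness(trial) < fitness(target):
            next_population.append(trial)
            excellent_count += 1
        else:
            next_population.append(target)
    return next_population, excellent_count

def sphere(x):
    """Toy objective used only for this demonstration."""
    return sum(v * v for v in x)

rng = random.Random(1)
population = [[rng.uniform(-5, 5) for _ in range(2)] for _ in range(6)]
probs = [1 / 6] * 6
next_population, excellent_count = evolve(population, sphere, probs, rng)
```

Because a new individual replaces its target only when it is better, no member of the next generation is worse than the target it was compared with. The sketch needs at least four individuals so that three distinct donors exist besides the target.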
5. The method according to claim 1, characterized in that the step of calculating the self-adaptive index of the next-generation population according to the number of excellent individuals comprises:
calculating, according to the number of excellent individuals, the excellent individual ratio obtained when the current population evolves into the next-generation population, the formula for the excellent individual ratio being:
[formula missing from the source text]
where SR(g+1) is the excellent individual ratio produced by the evolution of the current population when the current evolutionary generation is g, and c(g+1) is the number of excellent individuals produced by the evolution of the current population when the current evolutionary generation is g;
comparing the excellent individual ratio with a preset target value, and calculating the self-adaptive index of the next-generation population according to the comparison result;
the target value being $u \cdot SR(g)$, where u is a preset parameter and SR(g) is the excellent individual ratio produced by the evolution of the current population when the evolutionary generation is g-1;
when the excellent individual ratio is not less than the target value, the self-adaptive index of the next-generation population is calculated as $\lambda(g+1) = \min(\lambda(g) + \Delta \cdot SR(g+1), \lambda_{\max})$; otherwise it is calculated as $\lambda(g+1) = \max(\lambda_{\min}, \lambda(g) - \Delta \cdot (1 - SR(g+1)))$, where $\lambda_{\min}$, $\lambda_{\max}$, $\Delta$ and u are preset parameters.
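The update rule of claim 5 can be sketched in a few lines of Python. The two branches follow the formulas stated in the claim, reading Δ SR(g+1) and Δ(1-SR(g+1)) as products. The ratio SR = c / NP is an assumption, because the ratio formula itself is not reproduced in this text, and all numeric parameter values below are hypothetical.

```python
def excellent_ratio(excellent_count, pop_size):
    """Assumed definition: share of the population that was improved."""
    return excellent_count / pop_size

def update_adaptive_index(lam, sr_next, sr_prev, u, delta, lam_min, lam_max):
    """Return lambda(g+1) from lambda(g), SR(g+1) and SR(g)."""
    if sr_next >= u * sr_prev:
        # Ratio met the target u * SR(g): raise the index, capped at lam_max.
        return min(lam + delta * sr_next, lam_max)
    # Ratio fell short of the target: lower the index, floored at lam_min.
    return max(lam_min, lam - delta * (1.0 - sr_next))

sr_next = excellent_ratio(6, 10)  # 0.6
raised = update_adaptive_index(1.0, sr_next, 0.5, 1.0, 0.5, 0.0, 3.0)
lowered = update_adaptive_index(1.0, 0.2, 0.5, 1.0, 0.5, 0.0, 3.0)
capped = update_adaptive_index(2.9, sr_next, 0.5, 1.0, 0.5, 0.0, 3.0)
```

With these values the index rises from 1.0 to about 1.3 when 60% of the population improved, and falls to about 0.6 when only 20% improved; the third call shows the cap at $\lambda_{\max}$ taking effect.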
6. A device for data clustering, characterized in that the device comprises:
an initialization module, configured to receive an input data set to be clustered and generate a corresponding current population for the data set, each individual in the current population comprising a preset number of cluster centers;
a computing module, configured to calculate the fitness value of each individual in the current population, and to generate a selection probability for each individual according to all the fitness values and the self-adaptive index of the current population;
an evolution module, configured to divide the samples in the data set into corresponding clusters according to all the cluster centers in each individual of the current population, and to evolve the current population according to all the selection probabilities to generate a next-generation population;
a loop module, configured to, when the current evolutionary generation does not exceed a preset maximum evolutionary generation, obtain the number of excellent individuals generated when the current population evolved into the next-generation population, calculate the self-adaptive index of the next-generation population according to the number of excellent individuals, set the next-generation population as the current population, and return to the operation of calculating the fitness value of each individual in the current population; and
a cluster generation module, configured to, when the current evolutionary generation exceeds the maximum evolutionary generation, generate and output the clusters of the data set according to the optimal individual in the next-generation population.
7. The device according to claim 6, characterized in that the initialization module comprises:
a cluster center selection module, configured to randomly select the preset number of sample points from the data set and set each of the selected sample points as a cluster center; and
a population generation module, configured to combine the preset number of cluster centers into one individual of the current population and repeat the random selection operation to generate all individuals in the current population.
8. The device according to claim 6, characterized in that the computing module comprises:
a fitness value calculation module, configured to calculate the fitness value of each individual in the current population according to a preset objective function; and
a selection probability calculation module, configured to generate a quality rank for each individual in the current population according to all the fitness values, and to calculate the selection probability of each individual in the current population according to all the quality ranks and the self-adaptive index of the current population, the formula for the selection probability being:
[formula missing from the source text]
where $\lambda(g)$ is the self-adaptive index of the current population when the current evolutionary generation is g, $R_i$ is the quality rank of the i-th individual in the current population, NP is the total number of individuals in the current population, and $P_i^g$ is the selection probability of the i-th individual in the current population when the current evolutionary generation is g.
9. The device according to claim 6, characterized in that the evolution module comprises:
an individual evolution module, configured to select a target individual in the current population according to the selection probabilities and perform mutation and crossover on it to generate a new individual; and
a new population generation module, configured to compare the fitness value of the new individual with the fitness value of the target individual; when the new individual is better than the target individual, to set the new individual as an individual in the next-generation population and increment the number of excellent individuals by one; and otherwise to set the target individual as an individual in the next-generation population.
10. The device according to claim 6, characterized in that the loop module comprises:
a ratio calculation module, configured to calculate, according to the number of excellent individuals, the excellent individual ratio obtained when the current population evolves into the next-generation population, the formula for the excellent individual ratio being:
[formula missing from the source text]
where SR(g+1) is the excellent individual ratio produced by the evolution of the current population when the current evolutionary generation is g, and c(g+1) is the number of excellent individuals produced by the evolution of the current population when the current evolutionary generation is g; and
a self-adaptive index update module, configured to compare the excellent individual ratio with a preset target value and calculate the self-adaptive index of the next-generation population according to the comparison result;
the target value being $u \cdot SR(g)$, where u is a preset parameter and SR(g) is the excellent individual ratio produced by the evolution of the current population when the evolutionary generation is g-1;
when the excellent individual ratio is not less than the target value, the self-adaptive index of the next-generation population is calculated as $\lambda(g+1) = \min(\lambda(g) + \Delta \cdot SR(g+1), \lambda_{\max})$; otherwise it is calculated as $\lambda(g+1) = \max(\lambda_{\min}, \lambda(g) - \Delta \cdot (1 - SR(g+1)))$, where $\lambda_{\min}$, $\lambda_{\max}$, $\Delta$ and u are preset parameters.
CN201710071827.5A 2017-02-09 2017-02-09 A kind of method and device of data clusters Pending CN106951910A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710071827.5A CN106951910A (en) 2017-02-09 2017-02-09 A kind of method and device of data clusters

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710071827.5A CN106951910A (en) 2017-02-09 2017-02-09 A kind of method and device of data clusters

Publications (1)

Publication Number Publication Date
CN106951910A true CN106951910A (en) 2017-07-14

Family

ID=59465824

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710071827.5A Pending CN106951910A (en) 2017-02-09 2017-02-09 A kind of method and device of data clusters

Country Status (1)

Country Link
CN (1) CN106951910A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116165542A (en) * 2023-03-01 2023-05-26 上海玫克生储能科技有限公司 Battery parameter identification method, device, equipment and storage medium
CN116165542B (en) * 2023-03-01 2023-10-20 上海玫克生储能科技有限公司 Battery parameter identification method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
Hall et al. Clustering with a genetically optimized approach
CN112801281A (en) Countermeasure generation network construction method based on quantization generation model and neural network
Ali et al. A modified cultural algorithm with a balanced performance for the differential evolution frameworks
CN105718943A (en) Character selection method based on particle swarm optimization algorithm
CN110287985B (en) Depth neural network image identification method based on variable topology structure with variation particle swarm optimization
US20060179018A1 (en) Method and system for training a hearing aid using a self-organising map
CN110705640A (en) Method for constructing prediction model based on slime mold algorithm
CN110210529A (en) A kind of feature selection approach based on binary quanta particle swarm optimization
CN117290721A (en) Digital twin modeling method, device, equipment and medium
CN106951910A (en) A kind of method and device of data clusters
CN110110447A (en) It is a kind of to mix the feedback limit learning machine steel strip thickness prediction technique that leapfrogs
CN111343115B (en) 5G communication modulation signal identification method and system
Chattopadhyay et al. Feature selection using differential evolution with binary mutation scheme
CN115906959A (en) Parameter training method of neural network model based on DE-BP algorithm
CN114550283A (en) Wireless network signal action recognition system and method based on improved support vector machine
Alshare et al. Increasing Accuracy of Random Forest Algorithm by Decreasing Variance
CN114971243A (en) FNN (false negative number) countermeasure generation-based dioxin emission risk early warning model construction method
Silva et al. Sparse least squares support vector machines via genetic algorithms
Khotimah et al. Adaptive SOMMI (Self Organizing Map Multiple Imputation) base on Variation Weight for Incomplete Data
CN114581058B (en) Personnel organization structure optimization method based on business process
Liu et al. Improved competitive swarm optimization algorithms for feature selection
Chen et al. A hybrid ensemble method based on double disturbance for classifying microarray data
Bin et al. A Genetic Clustering Method Based on Variable Length String
Sun et al. Introduce randomness into AdaBoost for robust performance on noisy data
Indira et al. Rule acquisition in data mining using a self adaptive genetic algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20170714)