WO2003105080A2 - Metapopulation genetic algorithm for combinatorial optimisation problems - Google Patents
- Publication number
- WO2003105080A2 (application PCT/BE2003/000105)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- population
- populations
- individuals
- consensus
- solutions
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/12—Computing arrangements based on biological models using genetic models
- G06N3/126—Evolutionary algorithms, e.g. genetic algorithms or genetic programming
Definitions
- the present invention is related to a new method for selecting efficient solutions of combinatorial optimisation problems in different industrial processes such as transport and telecommunication networks, designing optimal production processes, designing optimal management, logistic and production decisions, DNA-array technology, neuronal systems, etc.
- Optimality-criterion-based phylogeny inference is a perfect example of an "unsolvable" problem. Indeed, the number of solutions increases explosively with the number of taxa (taxa are groups of individuals, i.e., populations, and can be species, genera, families, etc., depending on the question addressed by the phylogeny investigation). The total number of possible unrooted, bifurcating tree topologies (i.e., solutions) among T terminal taxa is:
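The formula itself did not survive in this extracted text. As a hedged illustration only, the standard combinatorial result for the number of unrooted, bifurcating topologies on T labelled taxa is the double factorial (2T - 5)!!, i.e. the product of (2i - 5) for i = 3, ..., T. The function name `num_unrooted_trees` below is ours, not the patent's.

```python
def num_unrooted_trees(T: int) -> int:
    """Count unrooted bifurcating topologies on T labelled taxa.

    Uses the standard result (2T - 5)!! = 3 * 5 * ... * (2T - 5),
    which is an assumption about the formula missing from the text.
    """
    if T < 3:
        raise ValueError("at least 3 taxa are needed for an unrooted tree")
    count = 1
    for i in range(3, T + 1):
        count *= 2 * i - 5  # adding the i-th taxon can attach to 2i - 5 branches
    return count


if __name__ == "__main__":
    for taxa in (4, 5, 10, 20, 50):
        print(taxa, num_unrooted_trees(taxa))
```

Even for a few dozen taxa the count exceeds what any exhaustive search could visit, which is the "explosive" growth the passage refers to.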
- (ML) criterion (4) as one of the best for phylogeny inference.
- Reasons for this include (i) statistical consistency: ML estimators tend to converge to the true parameter values when the number of characters is increased; (ii) robustness: violations of the ML model assumptions have only a moderate impact on the tree inference accuracy; (iii) the ability to compare different trees within a statistical framework; and (iv) the ability of ML to make full use of the original character matrix.
- the analytical power of ML has a cost: computation time.
- application of this model-based criterion makes the use of exact phylogeny inference methods impractical for more than a trivial number of taxa.
- the first of these classes includes solutions that partition the large problem into many small sub-problems whose solutions are then combined into a consensus global solution.
- the "quartet puzzling" method (8) works by reconstructing the best ML tree of each possible quartet of taxa, then combining
- T-taxon tree. The advantage of quartet puzzling is that it avoids numerous optimisations on a large number of elements, optimising instead numerous small sets of elements, which is computationally much simpler. Despite its initial appeal, the method has proven to be only moderately more accurate (8) than a simple clustering method that requires significantly less computation time.
- the second class is comprised of stochastic heuristics that avoid optimisation of numerous solutions entirely. Instead, they incorporate methods that allow model parameters to be optimised as the search proceeds - taking an inter-step optimisation strategy.
- Stochastic Simulated Annealing (SSA) (9)
- SSA is based on the simple perturbation algorithm described above, but incorporates a method that perturbs model parameters at each iteration instead of requiring optimisation of each potential solution.
- SSA avoids local optima by accepting changes that decrease the likelihood of the solution with a probability inversely proportional to the reduction in likelihood.
- Another increasingly popular approach is the
- MCMC-based methods also benefit from avoiding intra-step optimisation, although they have a slightly different aim: sampling the distribution of the space of solutions instead of only finding optimal solutions.
- GA, a type of evolutionary computation method (14), to phylogeny inference.
- GAs implement a set of operators that mimic processes of biological evolution such as mutation, recombination, selection, and reproduction. After an initial step of generating a population, the individuals
- the present invention aims at providing a simple and efficient method, as well as an algorithm used in said method, able to find excellent solutions to combinatorial optimisation problems in a fraction of the time required by existing state-of-the-art methods (approximately 100 to 1000 times faster).
- the present invention relates to a method for obtaining an efficient solution to a combinatorial optimisation problem, comprising the following steps:
  a) selecting different groups (A, B, C) of population able to interact with each other;
  b) applying specific selection parameters to each group of population (necessary for performing a consensus selection of different solutions);
  c) obtaining, for each group of population, a consensus selection of optimal solution portions and portions of solution to be optimised, said (tested) solutions satisfying the requirements of said parameters;
  d) comparing, by consensus pruning (the communication tool among the populations), the solutions obtained from the different groups of population, the optimal solution portions present in the majority of the different groups not being modified;
  e) repeating steps c) and d) for the other solution portions to be optimised until said portions are optimised;
  f) selecting from the preceding steps one or more complete optimal solutions (when the portions of solution cannot be further optimised) for said combinatorial optimisation problem.
- the method can be applied to populations subjected to random modifications of their structure (for instance different parts of a machine, different groups of a chemical molecule, different parameters of a process, different possible interactions between elements, etc.) before consensus selection (step b)).
- the populations are different genetic sequences (structure of the population) of a group of individuals (such as species, genera, families, etc.), wherein the random modifications of structure are mutations in the genetic sequence and wherein the solutions are phylogenetic bifurcating trees representing the interactions between said populations.
- one of the parameters of the consensus selection may require that the bifurcating trees exist in most of or all groups of population.
- the present invention relates also to the algorithm for performing the method according to the invention.
- Fig. 1 represents the principle of Consensus Pruning (CP) applied in the method according to the invention.
- Fig. 2 represents the times required for single rounds of perturbation algorithms (NNI, SPR and TBR branch swapping).
- Fig. 3 represents:
- Part a: the score vs. time for the genetic algorithm (one population) and the metapopulation genetic algorithm;
- Part b: the score of the best tree (320 taxa) for each of four populations run in parallel (no Consensus Pruning);
- Fig. 4 represents:
- Part a: run times for a one-population genetic algorithm search as well as for 2-, 4-, 6-, 8- and 16-population metapopulation genetic algorithm searches;
- the present invention relates to a new genetic algorithm named the "metapopulation genetic algorithm" (metaGA) that vastly improves the speed and efficiency with which solutions are found (such that nucleotide sequence data sets incorporating hundreds or thousands of taxa can be analysed in practical computing times) and yields a probability index for each branch.
- the metaGA procedure has been incorporated into a computer program for phylogeny inference, METAPIGA (Phylogeny Inference using the metapopulation Genetic Algorithm). Analyses on simulated and real data sets demonstrated (15) the efficiency of the metaGA procedure.
- each tree is evaluated (without optimisation of branch lengths or other model parameters) and its log likelihood (hereafter called score) is recorded.
- Minimal updating (MU)
- the pruning algorithm takes advantage of the fact that likelihood calculation at a particular node depends only on the values at the two nodes connected to it in the direction of the terminal nodes.
- MU extends this by taking advantage of the fact that perturbation of the tree topology and branch lengths during the GA search (see below) forces recomputation of the likelihood only at nodes along the path from the changed part(s) of the tree to the centre.
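The idea behind minimal updating can be sketched as follows. This is a toy illustration under stated assumptions, not METAPIGA's code: the `Node` class, the counter argument, and the placeholder `combine` function (which stands in for the real conditional-likelihood computation at an internal node) are all hypothetical.

```python
class Node:
    """Binary tree node caching a partial value (leaf data or combined result)."""

    def __init__(self, value=None, left=None, right=None):
        self.value = value
        self.left, self.right, self.parent = left, right, None
        for child in (left, right):
            if child is not None:
                child.parent = self


def combine(a, b):
    # Placeholder for the per-node likelihood calculation, which depends
    # only on the cached values of the two child nodes.
    return a * b + 1.0


def full_update(node, counter):
    """Recompute every internal node (post-order); counter[0] counts the work."""
    if node.left is not None:
        full_update(node.left, counter)
        full_update(node.right, counter)
        node.value = combine(node.left.value, node.right.value)
        counter[0] += 1


def minimal_update(leaf, new_value, counter):
    """After changing one leaf, recompute only its ancestors up to the root."""
    leaf.value = new_value
    node = leaf.parent
    while node is not None:
        node.value = combine(node.left.value, node.right.value)
        counter[0] += 1
        node = node.parent
```

On a balanced tree the path from a changed node to the root grows roughly with the logarithm of the number of leaves, whereas a full recomputation touches every internal node.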
- rank selection: individuals are assigned a probability of leaving an offspring (i.e., a copy of themselves) as a function of their position in a list in which they are ranked by their score.
- in METAPIGA, a rank selection identical to that described in Lewis (12) is implemented, i.e., in a population of n individuals ranked by their lnL, the probability for the ith individual of leaving an offspring to the next generation is equal to $\frac{n - i + 1}{n(n + 1)}$.
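A minimal sketch of rank selection, assuming weights proportional to n - i + 1 for the individual of rank i (rank 1 being the best score). The weights are normalised here so that they sum to one; that normalisation is our assumption, since the fraction in this text is damaged by extraction. Function names are hypothetical.

```python
import random


def rank_selection_probabilities(n):
    """Probability of leaving an offspring for ranks 1..n (rank 1 = best)."""
    weights = [n - i + 1 for i in range(1, n + 1)]
    total = sum(weights)  # equals n * (n + 1) / 2
    return [w / total for w in weights]


def rank_select(scores, rng):
    """Return the indices of n parents drawn by rank; higher score = better."""
    n = len(scores)
    ranked = sorted(range(n), key=lambda k: scores[k], reverse=True)
    probs = rank_selection_probabilities(n)
    return rng.choices(ranked, weights=probs, k=n)


if __name__ == "__main__":
    log_likelihoods = [-1520.3, -1498.7, -1511.0, -1530.9]
    print(rank_select(log_likelihoods, random.Random(1)))
```

Because only ranks matter, the selection pressure does not depend on how far apart the log-likelihood scores are.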
- tournament selection: two individuals are drawn randomly from the population of n individuals, and one offspring is produced from the individual with the higher score. Both trees are then placed back into the mating population, and the whole process is repeated until n offspring have been generated.
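The tournament described above can be sketched in a few lines. We assume here that the two individuals in each draw are distinct and that ties go to the first one drawn; neither detail is specified in the text, and the function name is hypothetical.

```python
import random


def tournament_selection(scores, rng):
    """Return indices of n offspring; each comes from the better of two random individuals."""
    n = len(scores)
    offspring = []
    while len(offspring) < n:
        a, b = rng.sample(range(n), 2)  # draw two distinct individuals
        winner = a if scores[a] >= scores[b] else b
        offspring.append(winner)        # both go back into the mating pool
    return offspring


if __name__ == "__main__":
    print(tournament_selection([-10.0, -5.0, -20.0, -7.5], random.Random(3)))
```

Under these assumptions the worst individual never reproduces, because it loses every tournament it enters.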
- the improve selection method avoids this problem by allowing only those individuals that have scores better than that of the best tree from the previous generation to produce an offspring. Each individual that fails this test is discarded and replaced by a copy of the current best individual.
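As a hedged sketch, improve selection reduces to a single comparison per individual. The function name and the index-based return value are our choices for illustration.

```python
def improve_selection(scores, previous_best):
    """Return, for each slot, the index of the individual that fills it.

    An individual keeps its slot only if its score beats the best score of
    the previous generation; otherwise the slot receives a copy of the
    current best individual.
    """
    best_idx = max(range(len(scores)), key=lambda k: scores[k])
    return [k if scores[k] > previous_best else best_idx
            for k in range(len(scores))]


if __name__ == "__main__":
    # Individual 0 fails the test and is replaced by a copy of individual 1.
    print(improve_selection([-100.0, -90.0, -95.0], previous_best=-96.0))
```

When no individual improves on the previous best, the whole population collapses to copies of the current best, which is the loss of variability the text attributes to this scheme.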
- the latter selection scheme greatly reduces the intra-population variability after each selection step. Local optima are avoided, however, through the metapopulation procedures described hereafter.
- RCM: recombination operator
- the probabilities were allowed to be assigned dynamically.
- the probability assigned to each operator is relative to the average contribution that operator has made to improving the population, within the last G generations.
- a lower probability bound (specified by the user) is placed on each operator.
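One way such dynamic probabilities could be maintained is sketched below. This is an illustration under our own assumptions: one improvement record per operator per generation, a sliding window of the last G records, and a user-supplied lower bound `floor` that is reserved for every operator before the remainder is shared in proportion to average contribution. The class and operator names are hypothetical.

```python
from collections import deque


class OperatorScheduler:
    """Track recent score improvements per operator and derive probabilities."""

    def __init__(self, operators, window, floor):
        if floor * len(operators) > 1.0:
            raise ValueError("lower bound too large for this many operators")
        self.floor = floor
        # Each deque keeps only the last `window` (G) recorded improvements.
        self.history = {op: deque(maxlen=window) for op in operators}

    def record(self, op, improvement):
        """Store the improvement credited to `op`; worsening counts as zero."""
        self.history[op].append(max(0.0, improvement))

    def probabilities(self):
        means = {op: (sum(h) / len(h) if h else 0.0)
                 for op, h in self.history.items()}
        total = sum(means.values())
        k = len(means)
        if total == 0.0:
            return {op: 1.0 / k for op in means}  # no information yet
        free = 1.0 - self.floor * k               # mass left after the bounds
        return {op: self.floor + free * m / total for op, m in means.items()}
```

Reserving the lower bound first guarantees that an operator which has not helped recently can still be tried later in the search.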
- JNJ: "jack-knifed" NJ
- NNJ: noisy NJ
- JNJ consists of generating all starting trees with the NJ algorithm, but on a different subset of the data set for each population.
- Two variations of JNJ were implemented: nonoverlapping and overlapping JNJ.
- the former consists of randomly assigning each character to one of the P populations, such that the original data set is eventually divided into P nonoverlapping sets of characters.
- the overlapping JNJ consists of randomly assigning a proportion p of the characters to each of the populations. Because the process is independent (and performed from the original data set) for each population, the P sets of characters are typically overlapping and each character may be assigned to 0, 1, ..., or P of the populations.
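The two character-assignment schemes can be sketched as follows. Characters are represented by their column indices; the function names are hypothetical, and for the overlapping variant we assume that "a proportion p" means a fixed-size sample of round(p * number of characters) columns drawn without replacement for each population.

```python
import random


def nonoverlapping_jnj(num_chars, P, rng):
    """Assign every character to exactly one of P populations."""
    subsets = [[] for _ in range(P)]
    for c in range(num_chars):
        subsets[rng.randrange(P)].append(c)
    return subsets


def overlapping_jnj(num_chars, P, p, rng):
    """Independently sample a proportion p of the characters for each population."""
    k = round(p * num_chars)
    return [sorted(rng.sample(range(num_chars), k)) for _ in range(P)]


if __name__ == "__main__":
    rng = random.Random(7)
    print(nonoverlapping_jnj(12, 3, rng))
    print(overlapping_jnj(12, 3, 0.5, rng))
```

Each subset would then be passed to the NJ algorithm to build that population's starting tree.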
- the second procedure, NNJ, uses the full data set but with a modified method for joining nodes: at each step, suboptimal nodes (i.e., nodes that are never joined in classical NJ because they exhibit a non-minimal value in the current distance matrix) are joined with a probability proportional to the inverse of their pairwise distance.
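The pair-picking step of a "noisy" join might look like the sketch below. Several details are assumptions made for illustration: the text does not say how often a suboptimal pair is chosen, so a hypothetical `noise` parameter controls that; and the sketch works on raw pairwise distances, whereas classical NJ selects on a rate-corrected criterion derived from the distance matrix.

```python
import random


def pick_pair(dist, noise, rng):
    """Pick the pair of nodes to join from a symmetric matrix of positive distances.

    With probability 1 - noise the minimal pair is returned (classical
    behaviour); otherwise a suboptimal pair is drawn with weight 1 / distance.
    """
    n = len(dist)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    best = min(pairs, key=lambda p: dist[p[0]][p[1]])
    others = [p for p in pairs if p != best]
    if not others or rng.random() >= noise:
        return best
    weights = [1.0 / dist[i][j] for i, j in others]
    return rng.choices(others, weights=weights, k=1)[0]


if __name__ == "__main__":
    d = [[0.0, 2.0, 5.0],
         [2.0, 0.0, 4.0],
         [5.0, 4.0, 0.0]]
    print(pick_pair(d, noise=0.3, rng=random.Random(5)))
```

Weighting by inverse distance keeps the perturbed trees close to the NJ tree, since nearly optimal pairs are the most likely alternatives.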
- JNJ and NNJ procedures allow the search to start with trees whose scores are much better than those of random trees while still keeping enough variation among the initial populations for the metaGA to be effective in finding the ML topology.
- Initial internal branch lengths are set either to those specified by the NJ algorithm or to an arbitrary value specified by the user.
- METAPIGA is written in the Java programming language and supports Windows, Unix/Linux, and Macintosh operating systems. Java was chosen to allow for cross-platform compatibility and easy implementation of a user-friendly interface. Preliminary tests suggested that coding the software in Java instead of C++ would not have an appreciable (i.e., >15%) effect on the speed of the software.
- the inventors have designed a family of heuristic search strategies named the "metapopulation genetic algorithm" (metaGA).
- This new approach relies on the coexistence of two or more populations interacting in a "metapopulation" setting. Both the number of populations and the number of individuals per population can be specified by the user.
- CP allows the elaboration of many specific inter-population communication procedures. For example, under "random" consensus pruning, each of the P populations is picked in turn and randomly paired with one of the remaining P-1 populations.
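The pairing step of "random" consensus pruning can be sketched directly from this description. Populations are identified by index and the function name is hypothetical.

```python
import random


def random_pairing(P, rng):
    """Pair each population, in turn, with one of the other P - 1 populations."""
    if P < 2:
        raise ValueError("consensus pruning needs at least two populations")
    return [(i, rng.choice([j for j in range(P) if j != i]))
            for i in range(P)]


if __name__ == "__main__":
    print(random_pairing(4, random.Random(11)))
```

Each resulting pair (i, j) would then supply the two solutions whose consensus constrains the mutations applied in population i.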
- the consensus between the two solutions defines the partitions that cannot be affected by topological mutations. For example, in the case of phylogeny inference, if a consensus branch defines a partition between groups I and II, no swap between a taxon in group I and a taxon in group II is allowed, while topological changes within group I and within group II are allowed (Fig. 1).
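A minimal sketch of this constraint, assuming each tree is summarised by the set of bipartitions induced by its internal branches. Each bipartition is stored as the frozenset of taxa on the side that excludes a fixed reference taxon (here "F"), so that identical branches compare equal. The representation and function names are our assumptions, not the patent's data structures.

```python
def consensus(parts_a, parts_b):
    """Partitions shared by two trees (strict consensus of two solutions)."""
    return parts_a & parts_b


def swap_allowed(x, y, consensus_parts):
    """A swap is allowed only if no consensus partition separates x from y."""
    return all((x in side) == (y in side) for side in consensus_parts)


if __name__ == "__main__":
    # Tree 1: ((((A,B),C),D),(E,F));  Tree 2: ((((A,B),D),C),(E,F))
    tree1 = {frozenset("AB"), frozenset("ABC"), frozenset("ABCD")}
    tree2 = {frozenset("AB"), frozenset("ABD"), frozenset("ABCD")}
    shared = consensus(tree1, tree2)
    print(swap_allowed("C", "D", shared))  # same side of every shared branch
    print(swap_allowed("A", "C", shared))  # separated by the shared AB branch
```

In this example taxa C and D may still be rearranged relative to each other, while A and C may not, because the two trees agree on the branch that groups A with B.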
- the metaGA procedure provides a convenient stopping rule: the search stops when (i) the best solutions of all populations are identical, or (ii) when all mutational changes allowed by the latest consensus information have been attempted on the best solutions of all populations and these changes did not improve their scores. Given the high percentage of partitions that are fixed towards the end of the search, mutating to completion is swift even on very large sets of elements (here, large trees).
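The two-part stopping rule can be expressed as a small predicate. The inputs are assumptions made for illustration: `best_solutions` holds one hashable, canonical representation of the best solution per population, and `exhausted_without_gain` holds one flag per population that is true once every mutation allowed by the latest consensus has been tried on that population's best solution without improving its score.

```python
def should_stop(best_solutions, exhausted_without_gain):
    """Return True when condition (i) or condition (ii) of the stopping rule holds."""
    all_identical = len(set(best_solutions)) == 1      # condition (i)
    nothing_left = all(exhausted_without_gain)         # condition (ii)
    return all_identical or nothing_left


if __name__ == "__main__":
    print(should_stop(["((A,B),C,D)", "((A,B),C,D)"], [False, False]))
    print(should_stop(["((A,B),C,D)", "((A,C),B,D)"], [True, False]))
```

Condition (ii) becomes cheap to check late in the search because the consensus leaves few mutations to attempt.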
- Fig. 3 indicates that the time required by a metaGA search (i.e., Consensus Pruning with strict group-consensus and 4 populations) is much less than that required by stepwise addition (StepAdd) or by the classical
- the metaGA is up to 800 times faster than stepwise addition.
- the StepAdd algorithm yields a tree which is typically used as the starting point of a hill-climbing search; Fig. 2 therefore also shows the times required for single rounds of perturbation algorithms (NNI, SPR, and TBR branch swapping (7)), i.e., the time required to swap to completion the single best ML tree.
- Part a of Fig. 3 shows the relative run times vs.
- the efficiency of the metaGA can be attributed to two factors. First, it allows the stopping rule to be reached very quickly: near the end of the search, consensus-pruning-constrained mutations affect a greatly reduced number of subsolutions, hence swapping to completion is much faster than it would be on an unconstrained solution. Second, the consensus information shared among parallel populations allows them to increase their scores and the number of consensus partitions faster than if they were each searching in isolation (Parts b and c of Fig. 3). Hence, despite the fact that a 4-population metaGA search requires evaluation of four times more solutions each generation than a single GA search, the former completes the search much faster than the latter (Part a of Fig. 3).
- Part a of Fig. 3 shows the score vs. time for GA (1 population) and metaGA (strict CP with 4 populations of 4 individuals each) runs (80 taxa). The asterisks indicate when the stopping rule has been reached.
- Part b of Fig. 3 shows the score of best tree
- Fig. 4 indicates that computing time increases with the number of populations involved in "Consensus Pruning". As many as 10 populations are required to slow down the metaGA to computing times similar to those of single-population GA runs.
- Part a of Fig. 4 shows the run times for a one-population GA search (dotted circle; ± SE indicated) as well as for 2-, 4-, 6-, 8-, and 16-population metaGA searches (10 runs, SE too small to be visible).
- the vertical arrow indicates the number of populations for which a metaGA run takes the same time as a one-population GA search. Run time increases polynomially with the number of populations.
- Part b of Fig. 4 indicates the run time (160 taxa) vs. percent error for 2-, 4-, 6-, 8-, and 16-population metaGA searches under probability Consensus Pruning. Coordinates of the one-population GA run are indicated by the blue cross. The difference in speed between the GA and the metaGA is much larger with more complex ML models.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biophysics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- Genetics & Genomics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Physiology (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2003240321A AU2003240321A1 (en) | 2002-06-10 | 2003-06-10 | Metapopulation genetic algorithm for combinatorial optimisation problems |
EP03729736A EP1520254A2 (en) | 2002-06-10 | 2003-06-10 | Metapopulation genetic algorithm for combinatorial optimisation problems |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US38733402P | 2002-06-10 | 2002-06-10 | |
US60/387,334 | 2002-06-10 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2003105080A2 (en) | 2003-12-18 |
WO2003105080A3 (en) | 2004-06-03 |
Family
ID=29736297
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/BE2003/000105 WO2003105080A2 (en) | 2002-06-10 | 2003-06-10 | Metapopulation genetic algorithm for combinatorial optimisation problems |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP1520254A2 (en) |
AU (1) | AU2003240321A1 (en) |
WO (1) | WO2003105080A2 (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5222192A (en) * | 1988-02-17 | 1993-06-22 | The Rowland Institute For Science, Inc. | Optimization techniques using genetic algorithms |
US5774690A (en) * | 1995-09-14 | 1998-06-30 | The United States Of America As Represented By The Secetary Of The Navy | Method for optimization of element placement in a thinned array |
2003
- 2003-06-10 WO PCT/BE2003/000105 patent/WO2003105080A2/en not_active Application Discontinuation
- 2003-06-10 EP EP03729736A patent/EP1520254A2/en not_active Withdrawn
- 2003-06-10 AU AU2003240321A patent/AU2003240321A1/en not_active Abandoned
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2402498A (en) * | 2003-06-06 | 2004-12-08 | Visteon Global Tech Inc | Method for optimising the configuration of a pick-and-place machine |
GB2402498B (en) * | 2003-06-06 | 2005-08-24 | Visteon Global Tech Inc | Method for optimizing configuration of pick-and-place machine |
US7076313B2 (en) | 2003-06-06 | 2006-07-11 | Visteon Global Technologies, Inc. | Method for optimizing configuration of pick-and-place machine |
CN112287564A (en) * | 2020-11-20 | 2021-01-29 | 国网湖南省电力有限公司 | Electrode array optimization method based on goblet sea squirt group algorithm |
CN112287564B (en) * | 2020-11-20 | 2023-04-07 | 国网湖南省电力有限公司 | Electrode array optimization method based on goblet sea squirt group algorithm |
CN112580865A (en) * | 2020-12-15 | 2021-03-30 | 北京工商大学 | Mixed genetic algorithm-based takeout delivery path optimization method |
Also Published As
Publication number | Publication date |
---|---|
EP1520254A2 (en) | 2005-04-06 |
WO2003105080A3 (en) | 2004-06-03 |
AU2003240321A1 (en) | 2003-12-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Sipper | Co-evolving non-uniform cellular automata to perform computations | |
Marin et al. | Macroevolutionary algorithms: a new optimization method on fitness landscapes | |
Buriol et al. | A new memetic algorithm for the asymmetric traveling salesman problem | |
Matsuda | Protein phylogenetic inference using maximum likelihood with a genetic algorithm | |
Koza | A hierarchical approach to learning the Boolean multiplexer function | |
Sekanina et al. | Evolutionary design of arbitrarily large sorting networks using development | |
Poladian et al. | Multi-objective evolutionary algorithms and phylogenetic inference with multiple data sets | |
Anbarasu et al. | Multiple molecular sequence alignment by island parallel genetic algorithm | |
Du et al. | Species tree and reconciliation estimation under a duplication-loss-coalescence model | |
Koza et al. | Evolving computer programs using rapidly reconfigurable field-programmable gate arrays and genetic programming | |
Zaritsky et al. | The preservation of favored building blocks in the struggle for fitness: The puzzle algorithm | |
WO2003105080A2 (en) | Metapopulation genetic algorithm for combinatorial optimisation problems | |
Muhlenbein | Asynchronous parallel search by the parallel genetic algorithm | |
Nakaya et al. | RNA secondary structure prediction using highly parallel computers | |
Jin et al. | Parsimony score of phylogenetic networks: hardness results and a linear-time heuristic | |
Poladian | A GA for maximum likelihood phylogenetic inference using neighbour-joining as a genotype to phenotype mapping | |
Skourikhine | Phylogenetic tree reconstruction using self-adaptive genetic algorithm | |
Nayeem et al. | A multi-objective metaheuristic approach for accurate species tree estimation | |
CN113554144A (en) | Self-adaptive population initialization method and storage device for multi-target evolutionary feature selection algorithm | |
Drennan et al. | Evolution of repressilators using a biologically-motivated model of gene expression | |
Isaacs et al. | Evolving ant colony systems in hardware for random number generation | |
CN116401037B (en) | Genetic algorithm-based multi-task scheduling method and system | |
Fatumo et al. | Aligning multiple sequences with genetic algorithm | |
Çalışkan et al. | Self-Adaptive Genetic Algorithm For Permutation Flow Shop Scheduling Problems | |
Hill et al. | Examining the use of a non-trivial fixed genotype-phenotype mapping in genetic algorithms to induce phenotypic variability over deceptive uncertain landscapes |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AK | Designated states | Kind code of ref document: A2 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NI NO NZ OM PH PL PT RO RU SC SD SE SG SK SL TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
 | AL | Designated countries for regional patents | Kind code of ref document: A2 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
 | DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
 | WWE | Wipo information: entry into national phase | Ref document number: 2003729736 Country of ref document: EP |
 | WWP | Wipo information: published in national office | Ref document number: 2003729736 Country of ref document: EP |
 | WWW | Wipo information: withdrawn in national office | Ref document number: 2003729736 Country of ref document: EP |
 | NENP | Non-entry into the national phase in: | Ref country code: JP |
 | WWW | Wipo information: withdrawn in national office | Country of ref document: JP |