CN110097169A - A high-dimensional feature selection method mixing ABC and CRO - Google Patents
A high-dimensional feature selection method mixing ABC and CRO
- Publication number: CN110097169A (application CN201910381688.5A)
- Authority: CN (China)
- Prior art keywords: population, food source, algorithm, CRO, new
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
Abstract
The present invention relates to the field of bioinformatics and discloses a high-dimensional feature selection method that hybridizes the artificial bee colony algorithm (ABC) and the chemical reaction optimization algorithm (CRO). The method comprises: initializing a population of individuals with an ABC-based strategy that searches for the best food source, i.e., the best fitness-function value; updating the initialized population with CRO, computing each individual's fitness with a predefined fitness function, and obtaining the global optimum of the population; forming an elite molecule population with an elite retention strategy, updating it after each iteration, and merging the resulting elite molecules back into the population for the next iteration until the preset number of iterations is reached; and verifying the performance of the selected high-dimensional features with 10-fold cross-validation and a KNN classifier. This hybrid ABC-CRO method improves the global search ability of the algorithm, enhances population diversity, and to some extent avoids falling into local optima.
Description
Technical field
The present invention relates to the field of bioinformatics, and in particular to a high-dimensional feature selection method that mixes ABC and CRO.
Background art
High-dimensional data sets are common in practical applications. Although high-dimensional data may represent objects more accurately, as the number of descriptive features grows the dimensionality rises, and a considerable fraction of the features may be irrelevant to the mining task or mutually redundant. Irrelevant features degrade the performance of data analysis because of the high dimensionality, and redundant features can also reduce the accuracy of the analysis. Given the challenges of extracting valuable information and identifying the important features of large data sets, feature selection (also called variable selection or attribute selection) has attracted interest in many fields. Feature selection is applied to a data set with known features; it attempts to identify the important features of the data and discard irrelevant or redundant ones from the original feature set. With the rapid development of information technology, traditional pattern recognition techniques can no longer handle the large number of irrelevant features in high-dimensional, small-sample data, so improving the performance of feature selection algorithms becomes increasingly important.
A typical feature selection process comprises the following stages: subset generation, subset evaluation, and result verification. Its aim is to remove irrelevant or redundant features and produce a smaller feasible subset. Common feature selection methods fall into three classes: Filter, Wrapper, and Embedded. Filter models use the intrinsic distribution properties of the data as the selection criterion, without relying on any mining algorithm; the T-test is an example. Wrapper models rely on a classification method to evaluate feature subsets, which gives them the highest classification accuracy among the three classes; simulated annealing (SA) is an example. Embedded models fold the feature selection process directly into the label-learning algorithm, as in decision trees. By comparison, Wrapper models use the performance of a machine learning algorithm as the evaluation criterion, which makes them more flexible and more efficient on high-dimensional data. In recent years, Wrapper models that solve the feature selection problem by searching for a global optimum with meta-heuristics have attracted much attention.
Vieira et al. proposed a modified binary particle swarm optimization (MBPSO) for feature selection that simultaneously tunes SVM kernel parameters to predict the mortality of sepsis patients.
Subanya et al. used a binary artificial bee colony algorithm (BABC) to find the best feature subset for heart disease identification and then evaluated the selected features with a KNN model.
Hu et al. proposed an improved shuffled frog leaping algorithm (ISFLA) for feature selection that raises accuracy and performance by introducing a chaotic memory weight factor, an absolute balance group strategy, and an adaptive transfer factor.
Babatunde et al. combined a genetic algorithm (GA) with a K-nearest-neighbor (KNN) classifier for feature selection, obtaining better results than several earlier methods on indices such as classification accuracy.
Li et al. combined an improved grey wolf optimization algorithm (IGWO) with a kernel extreme learning machine (KELM) for feature selection: a genetic mechanism first generates diverse initial positions, and grey wolf optimization then updates the current positions of the population in the discrete search space to obtain the optimal feature subset.
Although each of the above heuristic approaches to feature selection has its own advantages, no single meta-heuristic algorithm can solve all feature selection problems. New meta-heuristic or hybrid search algorithms therefore need to be explored for biometric feature selection. The present invention proposes a new meta-heuristic hybrid, the AB-CRO algorithm, which combines the characteristics of the artificial bee colony algorithm (ABC) and the chemical reaction optimization algorithm (CRO) to optimize feature selection.
Summary of the invention
The present invention provides a high-dimensional feature selection method mixing ABC and CRO that can solve the above problems in the prior art.
The method comprises the following steps:
Step 1: initialize a population of individuals using the artificial bee colony algorithm (ABC), whose strategy is to search for the best food source, i.e., the best fitness-function value;
Step 2: update the initialized population with the chemical reaction optimization algorithm (CRO), compute the fitness of each individual in the population with the predefined fitness function, and obtain the global optimum of the population;
Step 3: form an elite molecule population with an elite retention strategy, update it after each iteration, and merge the resulting elite molecules back into the population for the next iteration;
Step 4: verify the performance of high-dimensional feature selection using 10-fold cross-validation (10-fold Cross Validation Technique) with a KNN classifier, to assess classification quality;
Step 5: treat Steps 2 through 4 as one iteration and repeat them until the current iteration count reaches the preset number of iterations.
Step 1 is specifically:
Step 1.1: initialize the population with the ABC algorithm to form a new initial population and its parameters. The numbers of employed bees and onlooker bees both equal the number of food sources M; the colony size is NP = M and the number of employed bees is SN; NP = M food sources are generated at random; the maximum number of ABC iterations is itermax and the maximum stagnation count is limit.
Step 1.2: form the initial solutions by randomly initializing the population: for the j-th feature X_ij of individual i in the group, draw a uniform random number r, r ∈ [0, 1]; if r is less than the preset initialization probability P, feature X_ij is selected, otherwise X_ij is not selected. For each individual, selected features are set to 1 and unselected features to 0. The solutions formed by the initialized group are taken as the initial solutions.
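The random binary initialization of Step 1.2 can be sketched as follows; the default value of the initialization probability P is an assumption, since the patent leaves it open.

```python
import random

def init_population(pop_size, n_features, p_select=0.5, seed=0):
    # Step 1.2 sketch: bit X_ij is set to 1 when a uniform random
    # number r falls below the initialization probability P
    # (p_select=0.5 is an assumed default).
    rng = random.Random(seed)
    return [[1 if rng.random() < p_select else 0 for _ in range(n_features)]
            for _ in range(pop_size)]
```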
Step 1.3: employed bees perform a neighborhood search to generate a new food source as a new solution and compute its fitness; if the new solution's fitness exceeds that of the originally initialized solution, the original solution is replaced by the new one, otherwise the original solution is kept unchanged. A greedy rule selects the better food sources: the selection probability of each food source is computed, a new food source is generated around a selected one, its fitness is evaluated, and the higher-fitness source replaces the lower one, updating the food sources. The best food source found so far is recorded.
If a food source has been abandoned, its employed bee is converted into a scout bee, which searches randomly for a new food source.
Step 1.4: check whether the current food source has improved within the scheduled number of iterations; if not, the food source must be reinitialized, otherwise return to Step 1.2. The result is the initial population's optimum, i.e., the best food source.
In Step 1.1, employed bees generate the SN food sources at random according to formula (1):

X_ij = X_j^min + rand(0,1) * (X_j^max - X_j^min)    (1)

where rand(0,1) is a uniform random number in (0,1) and N is the dimension of the search space (j = 1, ..., N).
Employed bee search phase: an employed bee searches around food source X_i and finds a candidate new food source V_i according to formula (2):

V_ij = X_ij + Φ_ij * (X_ij - X_kj)    (2)

The location of the new food source is updated by the greedy rule of formula (3):

X_i = V_i if fit(V_i) > fit(X_i), otherwise X_i unchanged    (3)

In formula (2), Φ_ij is a random number in [-1, 1], k ∈ {1, 2, ..., SN} and X_ij ≠ X_kj. The new and old food sources are compared by the greedy rule of formula (3), i.e., by the fitness values of the new and old food sources: if the fitness of the new food source V_ij is better than that of the old food source X_ij, the position of V_ij replaces the position of X_ij; otherwise the old food source X_ij is unchanged and remains the current food source position, while the employed bee increments that food source's stagnation counter by 1.
The probability that an onlooker bee follows food source i is:

p_i = fit_i / Σ_{k=1}^{SN} fit_k,  k ∈ {1, 2, ..., SN}    (4)

In this phase, an onlooker bee selects a food source by roulette wheel according to the food source information gathered by the employed bees and searches around it; if it chooses to follow an employed bee, it searches for a new food source using the same method as the employed bee search phase.
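The neighborhood search of formula (2), the greedy rule of formula (3), and the follow probabilities of formula (4) can be sketched as below. This is an illustrative continuous-valued version; the fitness transform and the function names are assumptions, and the follow probabilities assume non-negative fitness values.

```python
import random

def employed_bee_step(foods, fit, rng):
    # Formulas (2)/(3) sketch: v_ij = x_ij + phi * (x_ij - x_kj),
    # phi ~ U[-1, 1]; the new source replaces the old one only if its
    # fitness is better (greedy rule).
    sn, n = len(foods), len(foods[0])
    for i in range(sn):
        k = rng.choice([idx for idx in range(sn) if idx != i])
        j = rng.randrange(n)
        v = foods[i][:]
        phi = rng.uniform(-1.0, 1.0)
        v[j] = foods[i][j] + phi * (foods[i][j] - foods[k][j])
        if fit(v) > fit(foods[i]):
            foods[i] = v
    return foods

def follow_probabilities(foods, fit):
    # Formula (4): p_i = fit_i / sum_k fit_k -- the roulette-wheel
    # weights used by the onlooker bees.
    fits = [fit(x) for x in foods]
    total = sum(fits)
    return [f / total for f in fits]
```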
In Step 2, updating the population with the chemical reaction optimization algorithm CRO specifically comprises:
Taking the initial population formed by the ABC algorithm, reactions are applied through the four elementary operators of the CRO algorithm, and the update yields the global optimum.
In Step 2, the CRO population update specifically comprises:
Step 2.1: set the initialization parameters: the initial molecular kinetic energy InitialKE; the central energy buffer Buffer; the decomposition threshold α; the synthesis threshold β; the kinetic energy loss rate KELossRate for on-wall ineffective collisions; and the molecular kinetic energy KE and potential energy PE.
Step 2.2: apply the four CRO operators: on-wall ineffective collision; decomposition (a single-molecule reaction corresponding to the global search process); inter-molecular ineffective collision (a reaction between two molecules, a local search performed to obtain new solutions); and synthesis (a reaction between two molecules).
Step 2.3: evaluate each individual's fitness with the KNN classifier; if a new individual's fitness exceeds the fitness before the update, the new individual replaces the individual before the update, otherwise the new individual is discarded.
Step 2.4: repeat Steps 2.1 to 2.3 until every individual in the population has been updated.
In Step 2.3, the function used to evaluate each individual's (i.e., each selected feature subset's) fitness with the KNN classifier is formula (5):

fitness = δ * Acc + θ * (1 - n/N),  Acc = num_c / (num_c + num_i)    (5)

where Acc is the classification accuracy of the sample, num_c is the number of correctly classified samples, num_i is the number of misclassified samples, n is the number of features selected by the individual being evaluated, N is the total number of features of that individual, δ is the weight of classification accuracy, θ is the weight of feature selection, and δ + θ = 1.
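Formula (5) translates directly into code. The default weight δ = 0.9 below is an assumption; the patent only requires δ + θ = 1.

```python
def fitness(num_correct, num_wrong, n_selected, n_total, delta=0.9):
    # Formula (5): fitness = delta * Acc + theta * (1 - n/N),
    # Acc = num_c / (num_c + num_i), theta = 1 - delta.
    # delta=0.9 is an assumed weight, not a value from the patent.
    acc = num_correct / (num_correct + num_wrong)
    theta = 1.0 - delta
    return delta * acc + theta * (1.0 - n_selected / n_total)
```

A subset with fewer selected features and the same accuracy scores strictly higher, which is exactly the trade-off the two weights encode.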
Step 3 is specifically:
The 5 molecules with the smallest PE are selected and stored in the elite molecule population; the elite molecule population is updated after each iteration, and the resulting elite molecules are merged back into the population for the next iteration.
Note that, because energy is conserved across a reaction, the energy of the elite molecules must be deducted from the energy buffer when the elite population is added:

buffer = buffer - PE(ω_elite) - KE(ω_elite)

where PE(ω_elite) is the elite molecule's potential energy and KE(ω_elite) is the elite molecule's kinetic energy.
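The elite selection and the buffer deduction can be sketched together. The dictionary layout of a molecule (`'pe'`/`'ke'` keys) is an assumption for illustration.

```python
def retain_elites(molecules, buffer, elite_size=5):
    # Step 3 sketch: keep the elite_size molecules with the lowest PE;
    # by energy conservation, the central buffer pays for each elite's
    # PE and KE when it is re-injected into the population.
    # Each molecule is a dict with 'pe' and 'ke' keys (assumed layout).
    elites = sorted(molecules, key=lambda m: m['pe'])[:elite_size]
    for m in elites:
        buffer -= m['pe'] + m['ke']
    return elites, buffer
```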
Step 4 is specifically:
To verify the validity of the results, the samples are evaluated with 10-fold cross-validation and a KNN classifier: the original sample set is randomly divided into 10 parts; each part in turn serves as the test set with the remaining parts as the training set; the average of the 10 results is then computed to assess classification quality. KNN (K-Nearest Neighbor), a statistical classification method, is particularly effective for screening the characteristic variables of the data.
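The 10-fold procedure can be sketched in pure Python with a minimal nearest-neighbor classifier standing in for KNN (a library implementation would do equally well); the shuffling and fold-splitting details below are illustrative choices.

```python
import random

def knn_predict(train, labels, x, k=1):
    # Minimal k-NN by squared Euclidean distance.
    order = sorted(range(len(train)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train[i], x)))
    votes = [labels[i] for i in order[:k]]
    return max(set(votes), key=votes.count)

def ten_fold_accuracy(X, y, k=1, seed=0):
    # Step 4 sketch: shuffle, split into 10 folds, use each fold once
    # as the test set and the rest as training, and average accuracy.
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::10] for i in range(10)]
    accs = []
    for fold in folds:
        train_idx = [i for i in idx if i not in fold]
        Xtr = [X[i] for i in train_idx]
        ytr = [y[i] for i in train_idx]
        correct = sum(knn_predict(Xtr, ytr, X[i], k) == y[i] for i in fold)
        accs.append(correct / len(fold))
    return sum(accs) / len(accs)
```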
Compared with the prior art, the beneficial effects of the present invention are:
The invention combines the convergence speed of the ABC algorithm with the global search ability of the CRO algorithm in the proposed hybrid AB-CRO algorithm. The elitism strategy improves the convergence speed of CRO, CRO improves the global search ability of ABC, and randomness introduced into the molecular reaction process prevents molecules from falling into local optima during the reactions. The optimal feature subset found by the search is fed into the classification algorithm with 10-fold cross-validation for classification verification. Experiments on eight public biomedical data sets show that the algorithm effectively reduces the number of selected features and achieves higher classification accuracy than other feature selection methods.
Brief description of the drawings
Fig. 1 is a flow diagram of the high-dimensional feature selection method mixing ABC and CRO provided by the invention.
Fig. 2 shows the initialization and molecular reactions of the AB-CRO algorithm of the invention.
Fig. 2(a) shows the binary form of an initialization vector solution X_i of the invention.
Fig. 2(b) shows an on-wall ineffective collision of the invention.
Fig. 2(c) shows a single-molecule decomposition reaction of the invention.
Fig. 2(d) shows an inter-molecular ineffective collision of the invention.
Fig. 2(e) shows an inter-molecular synthesis reaction of the invention.
Fig. 3 compares the average fitness values of the different algorithms of the invention.
Fig. 4 shows the proportion of all features accounted for by the average number of features chosen by the different algorithms of the invention.
Fig. 5 compares the average running times of the different classifiers of the invention on the given data sets.
Specific embodiments
Specific embodiments of the present invention are described in detail below with reference to accompanying Figs. 1-5. It should be understood that the protection scope of the present invention is not limited by these specific implementations.
As shown in Fig. 1, the present invention provides a high-dimensional feature selection method mixing ABC and CRO, characterized by comprising the following steps:
Step 1: initialize a population of individuals using the artificial bee colony algorithm (ABC), whose strategy is to search for the best food source, i.e., the best fitness-function value;
Step 2: update the initialized population with the chemical reaction optimization algorithm (CRO), compute the fitness of each individual in the population with the predefined fitness function, and obtain the global optimum of the population;
Step 3: form an elite molecule population with an elite retention strategy, update it after each iteration, and merge the resulting elite molecules back into the population for the next iteration;
Step 4: verify the performance of high-dimensional feature selection with 10-fold cross-validation and a KNN classifier;
Step 5: treat Steps 2 through 4 as one iteration and repeat them until the current iteration count reaches the preset number of iterations.
The specific implementation process of the present invention is as follows:
S101: initialize with ABC to form a new initial population and its parameters, setting the numbers of employed bees and onlooker bees equal to the population size M.
Each solution is expressed as a binary character string of N dimensions, as shown in Fig. 2(a).
S1011: form the initial solutions by randomly initializing the population: for the j-th feature X_ij of individual i in the group, draw a uniform random number r, r ∈ [0, 1]; if r is less than the preset initialization probability P, feature X_ij is selected, otherwise X_ij is not selected. For each individual, selected features are set to 1 and unselected features to 0; the solutions formed by the initialized group are taken as the initial solutions.
S1012: employed bees perform a neighborhood search to generate new solutions and compute their fitness; if a new solution's fitness exceeds the original solution's, the solution is updated, otherwise it is unchanged. A greedy rule selects the better food sources: the selection probability of each food source is computed, a new food source is generated around a selected one, its fitness is evaluated, and the higher-fitness source replaces the lower one, updating the food sources. Scout bees are generated and each scout bee is assigned a feature subset. If a food source has been abandoned, its employed bee is converted into a scout bee, which searches randomly for a new food source. Employed bees generate the SN food source positions at random according to formula (1):

X_ij = X_j^min + rand(0,1) * (X_j^max - X_j^min)    (1)

where rand(0,1) is a uniform random number in (0,1) and N is the dimension of the search space.
Employed bee search phase: an employed bee searches around a food source and finds a candidate new food source according to formula (2):

V_ij = X_ij + Φ_ij * (X_ij - X_kj)    (2)

The location of the new food source is updated by the greedy rule of formula (3):

X_i = V_i if fit(V_i) > fit(X_i), otherwise X_i unchanged    (3)

In formula (2), Φ_ij is a random number in [-1, 1], k ∈ {1, 2, ..., SN} and X_ij ≠ X_kj. If the fitness of the new food source V_ij is better than that of the old food source X_ij, the position of V_ij replaces the position of X_ij; otherwise the old food source X_ij is unchanged and remains the current food source position, while the employed bee increments that food source's stagnation counter by 1.
The probability that an onlooker bee follows food source i is:

p_i = fit_i / Σ_{k=1}^{SN} fit_k,  k ∈ {1, 2, ..., SN}    (4)

In this phase, an onlooker bee selects a food source by roulette wheel according to the food source information gathered by the employed bees and searches around it; if it chooses to follow an employed bee, it searches for a new food source using the same method as the employed bee search phase.
S1013: feature selection can be regarded as a multi-objective optimization problem that needs a suitable objective function (called the fitness function in this invention) as the optimization target of the algorithm. Two conflicting goals must be achieved: choosing the smallest number of features and raising classification accuracy to the greatest extent. The fewer the features chosen each time and the higher the classification accuracy, the better the classification quality of the proposed model.
Each solution is assessed by the proposed fitness function, which depends on the search algorithm and the classifier, weighing the classification accuracy of the solution against the number of features it selects. To balance the number of selected features (to be minimized) against the classification accuracy (to be maximized) in each solution, we use the fitness function of formula (5):

fitness = δ * Acc + θ * (1 - n/N),  Acc = num_c / (num_c + num_i)    (5)

where Acc is the classification accuracy of the sample, num_c is the number of correctly classified samples, num_i is the number of misclassified samples, n is the number of features selected by the individual being evaluated, N is the total number of features of that individual, δ is the weight of classification accuracy, θ is the weight of feature selection, and δ + θ = 1.
S1014: if a food source (feature subset) has been abandoned, its employed bee is converted into a scout bee and generates a new feature subset. Check whether the current food source has improved within the scheduled number of iterations; if not, reinitialize the food source, otherwise return to step S1012. The result is the initial population's optimum, i.e., the best food source.
S1015: repeat steps S1012 to S1014 until all individuals in the population have been updated.
S102: the best food sources formed by the ABC initialization are selected as the initial population of the CRO algorithm. The population is updated with the chemical reaction optimization algorithm CRO, the fitness of each individual in the population is computed with the predefined fitness function, and the global optimum of the population is obtained.
S1021: set the initialization parameters: the initial molecular kinetic energy InitialKE; the central energy buffer Buffer; the decomposition threshold α; the synthesis threshold β; the kinetic energy loss rate KELossRate for on-wall ineffective collisions; and the molecular kinetic energy KE and potential energy PE.
S1022: apply the four CRO operators: on-wall ineffective collision; decomposition (a single-molecule reaction corresponding to the global search process); inter-molecular ineffective collision (a reaction between two molecules, a local search performed to obtain new solutions); and synthesis (a reaction between two molecules).
While the current iteration count i is less than the maximum number of iterations itermax, a CRO search is applied to the population. A random number r is generated. If r > MoleColl, a single-molecule reaction occurs: if the decomposition condition NumHit - MinHit > α is met, decomposition takes place, otherwise an on-wall ineffective collision is performed. If r ≤ MoleColl, a reaction between two molecules occurs: if the synthesis condition KE < β is met, synthesis takes place, otherwise an inter-molecular ineffective collision is performed.
The four elementary reactions of the CRO algorithm:
(1) On-wall ineffective collision
An on-wall ineffective collision takes a given molecular structure ω, randomly picks one position of ω, and randomly sets it to 0 or 1 to produce a new molecule ω'. If PE_ω' ≤ PE_ω, then ω' replaces ω in the population and takes part in the subsequent chemical reaction process; otherwise ω' is discarded, as shown in Fig. 2(b).
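The on-wall collision operator, on a binary molecule, reduces to a single bit flip with an energy check; the sketch below treats lower potential energy as better, as the operator's acceptance rule describes.

```python
import random

def on_wall_collision(mol, pe, rng):
    # (1) sketch: flip one random bit of molecule omega to get omega';
    # omega' replaces omega only when its potential energy PE is not
    # higher (lower PE = better solution here).
    cand = mol[:]
    cand[rng.randrange(len(cand))] ^= 1
    return cand if pe(cand) <= pe(mol) else mol
```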
(2) Decomposition
Decomposition is a single-molecule reaction corresponding to the global search process; as shown in Fig. 2(c), one molecule ω produces two new molecular structures ω'_1 and ω'_2. The operator first copies ω into both ω'_1 and ω'_2, then changes half of the molecular structure of ω'_1 and half of ω'_2 with random binary values, and decides whether to keep the newly generated molecules according to the difference in molecular energy before and after the reaction.
The energy difference is denoted tempBuff: if tempBuff ≥ 0, or tempBuff plus the energy energyBuff stored in the energy buffer is ≥ 0, then ω'_1 and ω'_2 are added to the population and ω is removed; otherwise ω'_1 and ω'_2 are deleted. In feature selection, the purpose of decomposition is to increase molecular diversity.
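A sketch of the decomposition operator follows. The exact energy bookkeeping (how leftover energy is redistributed) is simplified here to the tempBuff acceptance test described above, so treat the return values as assumptions rather than the patent's exact accounting.

```python
import random

def decomposition(mol, pe, ke, buffer, rng):
    # (2) sketch: copy omega into two children, re-randomize the first
    # half of one child and the second half of the other, and keep the
    # split only when the parent's PE + KE plus the central buffer can
    # cover both children's PE (tempBuff >= 0).
    n = len(mol)
    c1, c2 = mol[:], mol[:]
    for j in range(n // 2):            # re-randomize first half of c1
        c1[j] = rng.randint(0, 1)
    for j in range(n // 2, n):         # re-randomize second half of c2
        c2[j] = rng.randint(0, 1)
    temp_buff = pe(mol) + ke + buffer - pe(c1) - pe(c2)
    if temp_buff >= 0:
        return [c1, c2], temp_buff     # children replace the parent
    return [mol], buffer               # reaction rejected
```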
(3) Inter-molecular ineffective collision
An inter-molecular ineffective collision is a reaction between two molecules, a local search performed to obtain new solutions. The operator randomly picks two molecules ω_1 and ω_2 from the population and mutates one randomly chosen position in each, producing new molecular structures ω'_1 and ω'_2. If the energy change across the reaction satisfies the conservation condition, the two original molecules are deleted and the two new molecules are added to the population; otherwise the two newly generated molecules are discarded.
Like the on-wall ineffective collision operator, the inter-molecular ineffective collision uses single-bit mutation; the new molecular structures it generates are shown in Fig. 2(d).
(4) Synthesis
Synthesis is an intermolecular chemical reaction process that combines two existing molecules ω1 and ω2 to generate a new molecule ω'. Half of the structure of ω' comes from ω1 and the other half from the corresponding positions of ω2. The difference in molecular energy before and after the reaction determines whether the synthesis actually takes place: if the condition of formula (5) is met, ω' is added to the population and ω1 and ω2 are removed; otherwise ω' is deleted.
The new molecule ω' generated by the synthesis operation is shown in Fig. 2(e).
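The recombination part of synthesis can be sketched as follows. Drawing each position from ω1 or ω2 with equal probability (so roughly half comes from each parent) is an assumption about the split rule; the energy condition of formula (5) is omitted.

```python
import random

def synthesize(omega1, omega2, rng=random):
    """Build the child: each position is copied from omega1 or omega2."""
    return tuple(a if rng.random() < 0.5 else b
                 for a, b in zip(omega1, omega2))
```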
S1023: evaluate the fitness value of each individual using the KNN classifier; if the fitness value of the new individual is greater than the fitness value before the update, replace the old individual with the new one, otherwise discard the new individual;
S103: select the 5 molecules with the smallest potential energy PE and store them in the elite molecule population; after each iteration, update the elite molecule population and merge the resulting elite molecules back into the population for the next iteration.
Note that, because energy is conserved before and after a reaction, the energy of the elite molecules must be deducted from the energy buffer when the elite population is added:
Buffer = buffer - PE(ωelite) - KE(ωelite)
In the AB-CRO algorithm provided by this embodiment of the present invention, new individuals are constantly generated during the search. On the one hand this maintains population diversity and gives the algorithm better global search ability; on the other hand it slows convergence and reduces accuracy within a limited number of evaluations. To improve the convergence speed, an elitism strategy is introduced after each iteration. To keep the population size constant, the elite individuals replace the worst solutions: when the elite individuals are added to the new generation, the individuals with the smallest fitness values in that generation are eliminated.
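A minimal sketch of this elitism step, assuming pe() maps a molecule to its potential energy (lower = better) and that the stored elites replace the worst members so the population size is unchanged:

```python
def merge_elites(population, elites, pe):
    """Replace the len(elites) worst members of the population by the elites."""
    ranked = sorted(population, key=pe)             # best (lowest PE) first
    survivors = ranked[:len(ranked) - len(elites)]  # drop the worst members
    return survivors + list(elites)
```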
S104: verify the search performance of the algorithm using 10-fold cross-validation (10-fold Cross Validation Technique) combined with the KNN classifier, in order to assess classification quality. The original sample set is randomly divided into 10 parts; each part in turn serves as the test set with the remaining parts as the training set, and the average accuracy of the 10 classifiers is computed to assess the classification effect. KNN (K-Nearest Neighbor, the K-nearest-neighbour method) is a statistical classifier that is particularly effective for screening the feature variables of data. First, the distances between the feature vector to be classified and the training feature vectors are computed and sorted, and the K nearest training samples are taken. The class of the new sample is then determined from the classes of these K nearest training samples: if they all belong to one class, the new sample also belongs to that class; otherwise, each candidate class is scored and the class of the new sample is decided according to a given rule.
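The procedure of S104 can be sketched in pure Python as below. In practice a library such as scikit-learn would be used; this standalone version, with majority vote as the scoring rule, only illustrates the KNN-plus-10-fold scheme described above.

```python
import random
from collections import Counter

def knn_predict(train_x, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    order = sorted(range(len(train_x)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def cross_val_accuracy(data, labels, folds=10, k=3, seed=0):
    """Average accuracy over `folds` splits; each fold is the test set once."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)        # random partition of the samples
    accs = []
    for f in range(folds):
        test = idx[f::folds]                # one part is the test set
        test_set = set(test)
        train = [i for i in idx if i not in test_set]
        correct = sum(
            knn_predict([data[i] for i in train], [labels[i] for i in train],
                        data[j], k) == labels[j]
            for j in test)
        accs.append(correct / len(test))
    return sum(accs) / folds                # average accuracy of the 10 runs
```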
S105: taking S101 to S104 as one iteration, repeat S101 to S104 until the current iteration number reaches the set number of iterations.
As the above embodiment shows, the search process of the present invention is an effective hybrid of the artificial bee colony algorithm ABC and the chemical reaction optimization algorithm CRO. The initialization procedure, based on ABC feature ranking, aims to select an important feature subset, while the elitism strategy improves the convergence speed of CRO. Besides pruning redundant features at initialization, the method also accounts for the weak local search ability of the ABC algorithm and its tendency to fall into local optima; the CRO algorithm is therefore used for the search process, increasing population diversity and improving global search performance.
Four: experimental setup and result analysis
All data sets in the experiments of the present invention are binary classification problems; each sample is mapped to either the positive-sample or the negative-sample set. TP (True Positive), FP (False Positive), TN (True Negative) and FN (False Negative) are used to assess the performance of the model, and the classifier is the KNN classifier.
(1) Data description
The algorithm is tested on eight publicly available biomedical data sets, obtained from: http://csse.szu.edu.cn/staff/zhuzx/Datasets.htm and http://leo.ugr.es/elvira/DBCRepository/. The basic description of the data sets is given in Table 1.
Table 1: Description of the data sets
Parameter settings
We construct an orthogonal experiment to choose the optimal parameter combination. The orthogonal array of this 4-level, 5-factor design is L16(4^5): the 5 factors represent the 5 selected parameters, and with 4 candidate values per parameter there are 16 configurations in total. Table 2 records the parameter values of all compared algorithms.
Table 2: Parameter settings
(2) Evaluation method: ten-fold cross-validation
To verify the validity of the method, this experiment uses 10-fold cross-validation (10-fold Cross Validation Technique) combined with the KNN classifier to verify the search performance of the algorithm. The original sample set is randomly divided into 10 parts; each part in turn serves as the test set with the remaining parts as the training set, and the average accuracy of the 10 classifiers is computed to assess the classification effect. For fairness, the experiment for every algorithm is also repeated 10 times, and the average values of all indices are taken as the final results, as shown in Tables 3 and 4.
In medical diagnosis, a larger sensitivity for a disease means a relatively lower missed-diagnosis rate, while a larger specificity means a relatively lower misdiagnosis rate.
(3) Evaluation indices
The experiments are assessed by mean accuracy (Acc%), sensitivity (Sensitivity), specificity (Specificity), average feature subset size (AvgN), standard deviation (std), average fitness value (Avgf%) and running time (Time).
Accuracy (Acc) is the number of correctly classified samples divided by the total number of samples; it measures overall classification quality and is described as follows:
Sensitivity is the proportion of positive examples that are classified correctly; it measures the classifier's ability to recognize positive examples.
Specificity is the proportion of negative examples that are classified correctly; it measures the classifier's ability to recognize negative examples.
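The three indices follow directly from the TP/FP/TN/FN counts introduced in the experimental setup; a minimal sketch:

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    acc = (tp + tn) / (tp + fp + tn + fn)   # fraction classified correctly
    sensitivity = tp / (tp + fn)            # recall on positive examples
    specificity = tn / (tn + fp)            # recall on negative examples
    return acc, sensitivity, specificity
```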
(4) Analysis of results
To verify the validity of the algorithm, the inventive algorithm AB-CRO is compared with the modified particle swarm optimization algorithm (Modified Particle Swarm Algorithm), the improved shuffled frog leaping algorithm ISFLA (Improved Shuffled Frog Leaping Algorithm) and the genetic algorithm GA (Genetic Algorithm).
Average feature subset size (AvgN)
On the eight biomedical data sets, the number of selected features allows the feature-subset-selection ability of the different algorithms to be judged on the same data set. The analysis results are shown in Tables 3 and 4. Selecting fewer features means eliminating redundant features and reducing the search space. In terms of the average feature count, the feature subset chosen by the AB-CRO algorithm is the smallest on all data sets except NervousSystem and DLBCL-Stanford.
Mean accuracy (Acc%)
Mean accuracy is another important indicator. As Tables 3 and 4 show, the AB-CRO algorithm achieves the best mean accuracy (Acc) compared with the other algorithms on most data sets.
Standard deviation (std)
To verify the robustness of the algorithm, this experiment runs each algorithm 10 times and records the standard deviation of the corresponding mean accuracy and of the average number of selected features. The standard deviation measures the spread of a group of values: evidently, the smaller it is, the more stable the experimental results. On the data sets ALL-AML_train, ColonTumor, DLBCLOutcome, lungCancer_train and DLBCL-NIH-train, the accuracy of the AB-CRO algorithm has a smaller standard deviation than the other two algorithms, again showing that the inventive algorithm is more stable.
Average fitness value (Avgf%)
The average fitness value balances the two objectives of feature selection: maximum classification accuracy and optimal subset length. Fig. 3 shows the fitness values of AB-CRO compared with all algorithms. It is clear from the figure that for the data sets DLBCL-Stanford, lungCancer_train and LungCancer-Ontario, the fitness values of AB-CRO and CRO are essentially the same; on the other five data sets, the fitness value of AB-CRO is slightly higher than that of the other five algorithms. Regarding the number of selected features, Fig. 3 reports the finally chosen optimal feature subset as a percentage of the total number of features; the proposed AB-CRO algorithm shows the sharpest feature reduction on most data sets (except DLBCL-Stanford and NervousSystem). Although in Fig. 3 the fitness value of AB-CRO is roughly similar to that of CRO on most data sets, Fig. 4 shows that the optimal feature subset selected by AB-CRO is significantly smaller than that of CRO. These two indices further demonstrate the superiority of the inventive algorithm in high-dimensional biomedical feature selection.
Running time (Time)
Feature selection aims to reduce the dimensionality of the original data and to improve the efficiency of the search mechanism. Here we consider the time cost of feature selection on high-dimensional biological data sets; the running time of an algorithm depends on its convergence ability and on the scale of the data set. Fig. 5 compares the running times of ABC, CRO and the inventive AB-CRO on the 8 data sets. As shown in Fig. 5, the inventive AB-CRO algorithm takes 846 seconds on the DLBCL-NIH-train data set, whereas the ABC algorithm needs 864 seconds; on this large data set (DLBCL-NIH-train: 160 instances, 7400 features) AB-CRO is therefore better than the ABC algorithm. Although its running time on the other 7 data sets is slightly higher than that of the other two algorithms, it stays within 100 seconds, so the fused AB-CRO algorithm can satisfy real-time requirements.
Table 3 clearly shows that the mean accuracy Acc, Sensitivity and Specificity of the AB-CRO algorithm are higher than those of the other meta-heuristic algorithms. For the sensitivity and specificity indices, the data sets ALL-AML_train, DLBCL-Stanford and lungCancer_train reach 97% or more, i.e. the missed-diagnosis and misdiagnosis rates of these three disease data sets are relatively low; they remain relatively low on the other disease data sets as well.
Table 3: Experimental results compared with other algorithms
We also compare the inventive algorithm AB-CRO with the original ABC and CRO algorithms, as shown in Table 4. On most data sets, the AB-CRO algorithm not only improves precision over the original algorithms but also selects fewer features, which further shows that initialization with the ABC algorithm is better than random initialization. Although AB-CRO selects fewer features, the stability of the feature count is still insufficient on some data sets, for example ColonTumor and NervousSystem. For the Sensitivity and Specificity indices, the proposed algorithm is clearly higher than the original algorithms on most data sets, and the standard deviations of these two indices are also lower than those of ABC and CRO (except on the data sets ColonTumor, lungCancer_train and DLBCL-NIH-train). In summary, the search performance of AB-CRO is higher than that of the original ABC and CRO.
Table 4: Experimental results compared with the original algorithms
(5) Influence of different classifiers on the algorithm
Compared with the other algorithms, the AB-CRO algorithm achieves good classification performance on the disease data. In addition to the KNN classifier used to evaluate AB-CRO, two other popular classifiers, SVM and NB, are used to assess the performance of the algorithm; the experimental results are shown in Table 5.
Table 5 makes clear that, based on the classifiers KNN and SVM, the accuracy index and the average-selected-feature-count index of algorithm AB-CRO are essentially consistent on the six data sets ALL-AML_train, ColonTumor, NervousSystem, DLBCLOutcome, lungCancer_train and LungCancer-Ontario, illustrating the validity of the inventive algorithm. Regarding the stability of the two classifiers, KNN is more stable: the standard deviations of its accuracy are the smallest on all data sets. For the sensitivity and specificity indices, the KNN classifier also dominates on most data sets. Across all data sets, the NB classifier has the worst experimental results, and the stability of its results is poorer than that of the KNN and SVM classifiers; choosing a suitable classifier therefore improves the classification performance of the algorithm.
Table 5: Influence of different classifiers on the algorithm
The advantages of the technical solution provided by the present invention are:
1. The global search ability of the algorithm is improved.
2. Population diversity is enhanced.
3. Falling into local optima is avoided to a certain degree.
The above discloses only several specific embodiments of the present invention; however, the embodiments of the present invention are not limited thereto, and any variation conceivable to those skilled in the art shall fall within the protection scope of the present invention.
Claims (8)
1. A high-dimensional feature selection method mixing ABC and CRO, characterized by comprising the following steps:
Step 1: initialize the population formed by individuals using a strategy based on the artificial bee colony algorithm ABC of finding the best food source, i.e., the fitness function;
Step 2: update the initialized population using the chemical reaction optimization algorithm CRO, and compute the fitness value of each individual in said population using the set fitness function, obtaining the global optimum of the population;
Step 3: form an elite molecule population using an elite retention strategy; update the elite population after each iteration and merge the resulting elite molecules back into the population for the next iteration;
Step 4: verify the performance of the high-dimensional feature selection using 10-fold cross-validation combined with the KNN classifier;
Step 5: taking step 2 to step 4 as one iteration, repeat step 2 to step 4 until the current iteration number reaches the set number of iterations.
2. The high-dimensional feature selection method mixing ABC and CRO according to claim 1, characterized in that said step 1 is specifically:
Step 1.1: initialize the population using the artificial bee colony algorithm ABC, forming a new initial population and its parameters; the numbers of employed bees and onlooker bees equal the number of food sources M, the colony size is NP = M, and the number of employed bees is SN; NP = M food sources are generated at random; the maximum number of iterations of the ABC algorithm is itermax and the maximum stagnation count is limit;
Step 1.2: initialize the population by a randomized procedure to form the initial solutions: for the j-th feature Xij of individual i in the population, a random number r, r ∈ [0,1], is generated; if r is less than the set initialization probability P, the feature Xij is selected, otherwise Xij is not selected; for each individual, selected features are set to 1 and unselected features to 0; the solutions formed by the initialized population serve as the initial solutions;
Step 1.3: the employed bees perform a neighborhood search to generate new food sources, i.e. new solutions, and the fitness value of each new solution is computed; if the fitness value of the new solution is greater than that of the original initialized solution, the original solution is replaced by the new one, otherwise the original solution is kept unchanged; better food sources are selected by the greedy algorithm, the selection probability of each food source is computed, a new food source is generated around the selected one and its fitness value computed, and the food source with the higher fitness value replaces the one with the lower value, updating the food source; the best food source so far is recorded; whether any food source has been abandoned is checked, and if so, the corresponding employed bee is converted into a scout bee that searches for a new food source at random;
Step 1.4: judge whether the current food source has improved within the predetermined number of iterations; if not, the food source is re-initialized, otherwise return to step 1.2, obtaining the optimal solution of the initial population, i.e., the optimal food source.
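The probabilistic bit initialization of step 1.2 can be sketched as below; the function names are illustrative, not from the patent.

```python
import random

def init_individual(n_features, p, rng=random):
    """Feature j is selected (bit 1) when a uniform draw r falls below P."""
    return tuple(1 if rng.random() < p else 0 for _ in range(n_features))

def init_population(np_size, n_features, p, rng=random):
    """Generate NP = M individuals, one bit string per food source."""
    return [init_individual(n_features, p, rng) for _ in range(np_size)]
```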
3. The high-dimensional feature selection method mixing ABC and CRO according to claim 2, characterized in that: in said step 1.1 the employed bees generate SN food sources at random according to formula (1);
in formula (1), rand(0,1) denotes a uniform random number in (0,1), and N is the dimension of the optimization space;
employed-bee search phase: the employed bees search the food sources and find candidate new food sources by formula (2):
the position update formula (3) of the new food source is:
in formula (2), φij is a random number in [-1,1], k ∈ (1,2,...,SN) and Xij ≠ Xkj; the new and old food sources are selected by the greedy rule of formula (3), i.e., by comparing their fitness values: if the fitness value of the new food source Vij is better than that of the old food source Xij, the position of Vij replaces the position of Xij; otherwise the old food source Xij remains the current food-source position, and the employed bee adds 1 to the stagnation count of that food source;
the follow probability of the onlooker bees is:
in formula (4), k ∈ (1,2,...,SN); in this phase the onlooker bees select food sources for search by roulette-wheel selection according to the food-source information obtained from the employed bees; if an onlooker chooses to follow an employed bee, it searches for a new food source using the method of the employed-bee search phase.
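A sketch of the two standard ABC update rules referenced as formulas (2) and (4): the continuous employed-bee update v_ij = x_ij + φ_ij (x_ij - x_kj) with φ_ij uniform in [-1,1] and a random partner k ≠ i, and the roulette follow probabilities p_i = fit_i / Σ fit. The binary thresholding used in the actual feature-selection encoding is omitted here.

```python
import random

def neighborhood_candidate(population, i, j, rng=random):
    """Formula (2): perturb dimension j of individual i toward/away from a partner."""
    k = rng.choice([m for m in range(len(population)) if m != i])
    phi = rng.uniform(-1.0, 1.0)
    v = list(population[i])
    v[j] = population[i][j] + phi * (population[i][j] - population[k][j])
    return v

def follow_probabilities(fitness):
    """Formula (4): onlooker roulette probabilities p_i = fit_i / sum(fit)."""
    total = sum(fitness)
    return [f / total for f in fitness]
```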
4. The high-dimensional feature selection method mixing ABC and CRO according to claim 1, characterized in that, in step 2, updating said population with the chemical reaction optimization algorithm CRO specifically comprises:
taking the initial population formed by the ABC algorithm and reacting it through the four basic operators of the CRO algorithm, obtaining the updated global optimum.
5. The high-dimensional feature selection method mixing ABC and CRO according to claim 4, characterized in that, in step 2, said updating of the population by the CRO algorithm specifically comprises:
Step 2.1: set the initialization parameters: the initial kinetic energy InitialKE of the molecules; the central energy buffer Buffer; the decomposition threshold α; the synthesis threshold β; the on-wall ineffective-collision KE loss rate KELossRate; the molecular kinetic energy KE and potential energy PE;
Step 2.2: apply the four CRO operators: on-wall ineffective collision; decomposition, a unimolecular reaction corresponding to the global search process; intermolecular ineffective collision, a chemical reaction between two molecules constituting a local search for new solutions; and synthesis, an intermolecular chemical reaction process;
Step 2.3: evaluate the fitness value of each individual using the KNN classifier; if the fitness value of the new individual is greater than the fitness value before the update, replace the old individual with the new one, otherwise discard the new individual;
Step 2.4: repeat steps 2.1 to 2.3 until the individuals in the population have been updated.
6. The high-dimensional feature selection method mixing ABC and CRO according to claim 5, characterized in that, in step 2.3, the function used to evaluate the fitness value of each individual, i.e., of the selected feature subset, with the KNN classifier is formula (5):
where Acc denotes the classification accuracy of the samples, numc the number of correctly classified samples, numi the number of misclassified samples, n the number of selected features of the sample whose fitness value is to be computed, N the total number of features of that sample, δ the weight of classification accuracy and θ the weight of feature selection, with δ + θ = 1.
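Formula (5) itself appears only as an image in the source, so the sketch below reconstructs a common wrapper-style fitness from the variable list above: a weighted sum of the accuracy (weight δ) and the fraction of features discarded (weight θ = 1 - δ). The exact functional form and the default δ are assumptions.

```python
def fitness(num_c, num_i, n_selected, n_total, delta=0.9):
    """Reconstructed formula (5): delta*Acc + theta*(N - n)/N, delta + theta = 1."""
    acc = num_c / (num_c + num_i)      # classification accuracy of the samples
    theta = 1.0 - delta                # weight of feature selection
    return delta * acc + theta * (n_total - n_selected) / n_total
```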
7. The high-dimensional feature selection method mixing ABC and CRO according to claim 1, characterized in that said step 3 is specifically:
select the 5 molecules with the smallest PE and store them in the elite molecule population; after each iteration, update the elite molecule population and merge the obtained elite molecules back into the population for the next iteration;
note that, because energy is conserved before and after a reaction, after the elite population is added, the energy of the elite molecules must be deducted from the energy buffer:
Buffer = buffer - PE(ωelite) - KE(ωelite)
where PE(ωelite) is the potential energy of an elite molecule and KE(ωelite) its kinetic energy.
8. The high-dimensional feature selection method mixing ABC and CRO according to claim 7, characterized in that said step 4 is specifically:
to verify the validity of the results, the samples are validated using ten-fold cross-validation and the KNN classifier: the original sample set is randomly divided into 10 parts, each part in turn serves as the test set with the remaining parts as the training set, and the average of the 10 results is computed to assess classification quality; KNN is a statistical classifier that is particularly effective for screening the feature variables of the data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910381688.5A CN110097169A (en) | 2019-05-08 | 2019-05-08 | A kind of high dimensional feature selection method mixing ABC and CRO |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910381688.5A CN110097169A (en) | 2019-05-08 | 2019-05-08 | A kind of high dimensional feature selection method mixing ABC and CRO |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110097169A true CN110097169A (en) | 2019-08-06 |
Family
ID=67447425
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910381688.5A Pending CN110097169A (en) | 2019-05-08 | 2019-05-08 | A kind of high dimensional feature selection method mixing ABC and CRO |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110097169A (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110837884A (en) * | 2019-10-30 | 2020-02-25 | 河南大学 | Efficient mixed feature selection method based on improved binary krill swarm algorithm and information gain algorithm |
CN110930772A (en) * | 2019-12-05 | 2020-03-27 | 中国航空工业集团公司沈阳飞机设计研究所 | Multi-aircraft collaborative route planning method |
CN110956641A (en) * | 2019-11-20 | 2020-04-03 | 南京拓控信息科技股份有限公司 | Train wheel tread image segmentation method based on chemical reaction optimization |
CN112085712A (en) * | 2020-08-25 | 2020-12-15 | 山东科技大学 | Analysis processing method of mammary gland tumor needle aspiration image |
CN112908416A (en) * | 2021-04-13 | 2021-06-04 | 湖北工业大学 | Biomedical data feature selection method and device, computing equipment and storage medium |
CN113780334A (en) * | 2021-07-09 | 2021-12-10 | 浙江理工大学 | High-dimensional data classification method based on two-stage mixed feature selection |
CN113987945A (en) * | 2021-11-01 | 2022-01-28 | 河北工业大学 | Novel degraded product health index selection method |
CN116646568A (en) * | 2023-06-02 | 2023-08-25 | 陕西旭氢时代科技有限公司 | Fuel cell stack parameter optimizing method based on meta heuristic |
CN117911799A (en) * | 2024-03-19 | 2024-04-19 | 贵州师范大学 | Feature classification method for improving shrimp algorithm based on multiple strategies |
CN118331290A (en) * | 2024-03-15 | 2024-07-12 | 国网甘肃省电力公司陇南供电公司 | Multi-machine collaborative inspection path planning method, terminal and readable storage medium |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106650917A (en) * | 2017-01-03 | 2017-05-10 | 华南理工大学 | Mechanical arm inverse kinematics solving method based on chaotic and parallelized artificial bee colony algorithm |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106650917A (en) * | 2017-01-03 | 2017-05-10 | 华南理工大学 | Mechanical arm inverse kinematics solving method based on chaotic and parallelized artificial bee colony algorithm |
Non-Patent Citations (1)
Title |
---|
张戈等: "基于混合ABC和CRO的高维特征选择方法", 《计算机工程与应用》 * |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110837884B (en) * | 2019-10-30 | 2023-08-29 | 河南大学 | Effective mixed characteristic selection method based on improved binary krill swarm algorithm and information gain algorithm |
CN110837884A (en) * | 2019-10-30 | 2020-02-25 | 河南大学 | Efficient mixed feature selection method based on improved binary krill swarm algorithm and information gain algorithm |
CN110956641A (en) * | 2019-11-20 | 2020-04-03 | 南京拓控信息科技股份有限公司 | Train wheel tread image segmentation method based on chemical reaction optimization |
CN110930772A (en) * | 2019-12-05 | 2020-03-27 | 中国航空工业集团公司沈阳飞机设计研究所 | Multi-aircraft collaborative route planning method |
CN112085712A (en) * | 2020-08-25 | 2020-12-15 | 山东科技大学 | Analysis processing method of mammary gland tumor needle aspiration image |
CN112085712B (en) * | 2020-08-25 | 2022-04-29 | 山东科技大学 | Analysis processing method of mammary gland tumor needle aspiration image |
CN112908416A (en) * | 2021-04-13 | 2021-06-04 | 湖北工业大学 | Biomedical data feature selection method and device, computing equipment and storage medium |
CN112908416B (en) * | 2021-04-13 | 2024-02-02 | 湖北工业大学 | Biomedical data feature selection method and device, computing equipment and storage medium |
CN113780334A (en) * | 2021-07-09 | 2021-12-10 | 浙江理工大学 | High-dimensional data classification method based on two-stage mixed feature selection |
CN113987945A (en) * | 2021-11-01 | 2022-01-28 | 河北工业大学 | Novel degraded product health index selection method |
CN116646568A (en) * | 2023-06-02 | 2023-08-25 | 陕西旭氢时代科技有限公司 | Fuel cell stack parameter optimizing method based on meta heuristic |
CN116646568B (en) * | 2023-06-02 | 2024-02-02 | 陕西旭氢时代科技有限公司 | Fuel cell stack parameter optimizing method based on meta heuristic |
CN118331290A (en) * | 2024-03-15 | 2024-07-12 | 国网甘肃省电力公司陇南供电公司 | Multi-machine collaborative inspection path planning method, terminal and readable storage medium |
CN117911799A (en) * | 2024-03-19 | 2024-04-19 | 贵州师范大学 | Feature classification method for improving shrimp algorithm based on multiple strategies |
CN117911799B (en) * | 2024-03-19 | 2024-05-17 | 贵州师范大学 | Feature classification method for improving shrimp algorithm based on multiple strategies |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110097169A (en) | A kind of high dimensional feature selection method mixing ABC and CRO | |
Karaboga et al. | Fuzzy clustering with artificial bee colony algorithm | |
Hong et al. | Efficient huge-scale feature selection with speciated genetic algorithm | |
Zeng et al. | Accurately clustering single-cell RNA-seq data by capturing structural relations between cells through graph convolutional network | |
Hassanien et al. | Computational intelligence techniques in bioinformatics | |
Nguyen et al. | Learning graph representation via frequent subgraphs | |
JP2018181290A (en) | Filter type feature selection algorithm based on improved information measurement and ga | |
Yan et al. | A hybrid algorithm based on binary chemical reaction optimization and tabu search for feature selection of high-dimensional biomedical data | |
Du et al. | Improving the performance of feature selection and data clustering with novel global search and elite-guided artificial bee colony algorithm | |
Duan et al. | Gradient-based elephant herding optimization for cluster analysis | |
Yang et al. | Feature selection using memetic algorithms | |
Evans | Population-based ensemble learning with tree structures for classification | |
Bouaguel | A new approach for wrapper feature selection using genetic algorithm for big data | |
Vignolo et al. | Evolutionary local improvement on genetic algorithms for feature selection | |
CN114334168A (en) | Feature selection algorithm of particle swarm hybrid optimization combined with collaborative learning strategy | |
Amaratunga et al. | Ensemble classifiers | |
Alzubaidi et al. | A multivariate feature selection framework for high dimensional biomedical data classification | |
Hengpraprohm et al. | A genetic programming ensemble approach to cancer microarray data classification | |
Anaraki et al. | A Fuzzy-Rough Feature Selection Based on Binary Shuffled Frog Leaping Algorithm | |
Vinmalar¹ et al. | Prediction of lung cancer using data mining techniques | |
CN111414935A (en) | Effective mixed feature selection method based on chi-square detection algorithm and improved fruit fly optimization algorithm | |
Liu et al. | SeqMIA: Membership Inference Attacks Against Machine Learning Classifiers Using Sequential Information | |
Koziarski | Imbalanced data preprocessing techniques utilizing local data characteristics | |
Anand et al. | Building an intelligent integrated method of gene selection for facioscapulohumeral muscular dystrophy diagnosis | |
Masera | Multi-target Prediction Methods for Bioinformatics: Approaches for Protein Function Prediction and Candidate Discovery for Gene Regulatory Network Expansion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20190806 |