CN109871934A - Feature selection method based on a Spark distributed parallel binary moth-flame optimization algorithm - Google Patents

Info

Publication number
CN109871934A
CN109871934A CN201910040185.1A CN201910040185A CN109871934A CN 109871934 A CN109871934 A CN 109871934A CN 201910040185 A CN201910040185 A CN 201910040185A CN 109871934 A CN109871934 A CN 109871934A
Authority
CN
China
Prior art keywords
moth
flame
spark
fitness value
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910040185.1A
Other languages
Chinese (zh)
Inventor
陈宏伟
符恒
曹倩倩
韩麟
胡周
候乔
常鹏阳
严灵毓
宗欣露
徐慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hubei University of Technology
Original Assignee
Hubei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hubei University of Technology
Priority to CN201910040185.1A
Publication of CN109871934A
Legal status: Pending

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a feature selection method based on a Spark distributed parallel binary moth-flame optimization algorithm. First, the raw data set D is read and stored in HDFS. Then an RDD data set and the moth population M are initialized, and a map transformation is applied to the RDD. The fitness values OM are calculated from M. The number of flames is updated and the distance between each moth and its corresponding flame is computed. The moth positions are updated on the Spark distributed platform according to the spiral formula, the new fitness values OM are calculated, and the best new value is compared with the current best moth and replaces it if it is better. Finally, if the termination condition is met, the position of the best moth is output; otherwise the procedure returns to step 4. The invention performs feature selection with the binary moth-flame optimization algorithm and optimizes it on the Spark distributed platform, improving classification performance and operational efficiency.

Description

Feature selection method based on a Spark distributed parallel binary moth-flame optimization algorithm
Technical field
The invention relates to several fields, including machine learning, data mining, image processing and distributed computing. It concerns a feature selection method based on a Spark distributed parallel binary moth-flame optimization algorithm, and in particular such a method for massive data.
Background
Feature selection (FS) is widely used in fields such as machine learning and data mining. While retaining the information in the original data, it reduces the feature dimension and improves classification performance by removing redundant or irrelevant features. Because the moth-flame optimization algorithm performs well at reducing feature redundancy, the invention uses it for feature selection. However, the algorithm easily falls into local optima and has limited search capability, which seriously restricts its classification performance and its ability to reduce dimensionality. The invention therefore exploits the in-memory, parallel computing characteristics of the Spark distributed platform, combines the moth-flame optimization algorithm with distributed parallel computing, and proposes a feature selection method based on a Spark distributed parallel binary moth-flame optimization algorithm (SPBMFO). The method keeps the algorithm from falling into local optima, improves its classification performance, and minimizes the number of features while maximizing classification performance.
The development of information technology has caused the volume of information to grow explosively, and big data poses difficult research problems in areas such as data mining, image processing and big data analysis. For most classification algorithms, the steps that precede the classification of massive data, such as data preprocessing, feature selection and vectorization, require a large amount of computation, take a long time and consume a lot of memory; the classification performance of the algorithm also degrades.
Summary of the invention
The invention mainly optimizes the feature selection process for massive data. It proposes a feature selection method based on a Spark distributed parallel binary moth-flame optimization algorithm that improves the classification performance of the algorithm and the operational efficiency of feature selection.
The technical solution adopted by the invention is a feature selection method based on a Spark distributed parallel binary moth-flame optimization algorithm, comprising the following steps:
Step 1: Read the raw data set and preprocess it; then use RDD partitioning in Spark to divide the training set into multiple subsets, which are stored in HDFS, a distributed file system suitable for use with Spark;
Step 2: Initialize an RDD data set that represents the moth population OM, apply a Map transformation to the RDD (resilient distributed dataset) in Spark, and initialize the fitness values; the initial fitness values are either initialized randomly or set to 0;
Step 3: Calculate the distance $D_i$ between each moth and its corresponding flame, calculate the updated value of each moth, and update the position of the current moth; then convert the updated moth value to binary;
Step 4: Use the calculated updated moth values as input variables in Spark, and use the Spark distributed platform to calculate the specific positions of the moths after the iteration;
Step 5: Apply the update operation to each moth and calculate the local fitness value M of each moth after processing;
Step 6: Calculate the fitness value of each moth according to the fitness function and determine the maximum fitness value; if the new fitness value is better than the previous one, the new fitness value replaces the previous one;
Step 7: Determine whether the maximum number of iterations has been reached;
If not, return to step 3 and continue searching for the optimal fitness value;
If so, execute step 8;
Step 8: Output the maximum current fitness value of the updated moths and the position of the corresponding moth.
In the feature selection method based on a Spark distributed parallel binary moth-flame optimization algorithm of the invention, each feature subset is encoded as a binary string of 1s and 0s, so all solutions are expressed as binary vectors. By exploiting Spark's distributed parallel computing, both the operational efficiency of the algorithm and the dimensionality reduction achieved by feature selection are markedly improved.
Description of the drawings
To explain the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. The drawings described below are only some embodiments of the invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is the general flowchart of the specific embodiment of the invention;
Fig. 2 is the detailed flowchart of the specific embodiment of the invention;
Fig. 3 is the Spark distributed framework diagram of the specific embodiment of the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the invention.
The purpose of the invention is to optimize the feature selection process for massive data. The invention proposes a feature selection method based on a Spark distributed parallel binary moth-flame optimization algorithm, which improves classification performance and operational efficiency.
The moth-flame optimization (MFO) algorithm was proposed in 2015 by Mirjalili of Griffith University, Australia, by simulating the flight behavior of moths. In the MFO algorithm, moths are the candidate solutions: matrix M denotes the set of moths and array OM stores their fitness values. The other core component of the algorithm is the flames: the flame matrix is denoted F, M and F have the same dimensions, and array OF stores the fitness values of the flames. Moths and flames differ in how they are treated and updated in each iteration. Moths are the actual search agents that move through the search space, whereas flames are the best positions the moths have found so far. Each moth searches around a flame, which acts as a marker, and updates it when it finds a better solution. With this mechanism a moth never loses its best solution.
(1) Randomly generate the initial population;
The MFO algorithm is a triple that approximates the global optimum of an optimization problem:
MFO = (I, P, T);
I is the function that generates a random moth population and the corresponding fitness values. Its system model is as follows:
$I: \varnothing \rightarrow \{M, OM\}$;
P is the main function, which moves the moths through the search space. P receives the matrix M and returns the updated M.
$P: M \rightarrow M$;
The T function returns true if the termination criterion is met and false otherwise.
$T: M \rightarrow \{true, false\}$;
(2) Update the positions of the moths;
The MFO algorithm uses three functions: I initializes the moth population and the corresponding fitness values, P is the main function that moves the moths through the search space, and T is the function that terminates the search:
MFO = (I, P, T);
Using I, P and T, the general framework of the MFO algorithm is defined as follows:
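The framework itself is not reproduced in this text. A minimal sketch, assuming the usual MFO control flow (initialize with I, then apply P until T returns true), is given below; the names `run_mfo`, `init`, `step` and `done` are hypothetical and stand in for I, P and T.

```python
from typing import Callable, List

Population = List[List[float]]


def run_mfo(init: Callable[[], Population],
            step: Callable[[Population], Population],
            done: Callable[[Population], bool]) -> Population:
    """Generic driver: M = I(); while T(M) is false: M = P(M)."""
    moths = init()              # I: create the initial moth population
    while not done(moths):      # T: termination test
        moths = step(moths)     # P: move the moths in the search space
    return moths
```

Passing the three functions as arguments keeps the driver independent of how the population is stored, so the same loop could wrap either a local list or a distributed collection.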
Any random distribution may be used to initialize the positions of the moths in the solution space. The I function can be implemented in the following form:
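The implementation is not reproduced in this text. The sketch below assumes a uniform distribution within per-dimension lower and upper bounds, which is one common choice; `init_moths`, `lb` and `ub` are hypothetical names.

```python
import random
from typing import List, Sequence


def init_moths(n: int, lb: Sequence[float], ub: Sequence[float],
               rng: random.Random) -> List[List[float]]:
    """Draw n moths uniformly inside the bounds [lb[j], ub[j]] of each dimension j."""
    if len(lb) != len(ub):
        raise ValueError("lb and ub must have the same length")
    return [[lb[j] + (ub[j] - lb[j]) * rng.random() for j in range(len(lb))]
            for _ in range(n)]
```

A seeded `random.Random` is passed in so that a run can be repeated; the fitness values OM would then be computed from the returned positions.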
After initialization by the I function, the P function runs iteratively until the T function returns true. To simulate the behavior of moths accurately, the position of each moth relative to a flame can be updated with the following formula:
$M_i = S(M_i, F_j)$;
Here $M_i$ denotes the i-th moth, $F_j$ denotes the j-th flame, and S is the spiral function. A logarithmic spiral is chosen as the main update mechanism of the moths and is defined as follows:
$S(M_i, F_j) = D_i \cdot e^{bt} \cdot \cos(2\pi t) + F_j$
Using the randomly selected flameno-th flame, global exploration is carried out by the spiral approach method, specifically with the following formula:
$S(M_i, F_j) = D_i \cdot e^{bt} \cdot \cos(2\pi t) + F_{flameno}$
Here $D_i$ denotes the distance between the i-th moth and the j-th flame, b is a constant that defines the shape of the logarithmic spiral, and t is a random number in [-1, 1]. $D_i$ is calculated by the following formula:
$D_i = |F_j - M_i|$;
Here $M_i$ denotes the i-th moth, $F_j$ denotes the j-th flame, and $D_i$ denotes the distance between the i-th moth and the j-th flame.
$S(M_i, F_j) = D_i \cdot e^{bt} \cdot \cos(2\pi t) + F_j$ simulates the movement path of a moth and determines the next position of the moth relative to the flame. In this equation, the parameter t defines how close the next position of the moth should be to the flame (t = -1 is the position closest to the flame, and t = 1 is the position farthest from it). The formula only requires the moth to move toward a flame, but this causes the MFO algorithm to fall into local optima quickly. To avoid this, each moth is required to update its position using only one of the flames in $S(M_i, F_j) = D_i \cdot e^{bt} \cdot \cos(2\pi t) + F_{flameno}$. After the flame list is updated in each iteration, the flames are sorted by their fitness values. The moths then update their positions relative to their corresponding flames. The first moth always updates its position relative to the best flame, and the last moth updates its position relative to the worst flame in the list.
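The spiral update can be illustrated with a short sketch that applies $D_i = |F_j - M_i|$ and $S = D_i \cdot e^{bt} \cdot \cos(2\pi t) + F_j$ to each dimension. The function name `spiral_update` and the default b = 1 are assumptions made for illustration; the text only states that b is a constant.

```python
import math
from typing import List, Sequence


def spiral_update(moth: Sequence[float], flame: Sequence[float],
                  t: float, b: float = 1.0) -> List[float]:
    """Per dimension: D = |F - M|, then S = D * e^(b*t) * cos(2*pi*t) + F."""
    return [abs(f - m) * math.exp(b * t) * math.cos(2.0 * math.pi * t) + f
            for m, f in zip(moth, flame)]
```

For example, with t = 0 the factor $e^{bt} \cos(2\pi t)$ equals 1, so a moth at 0.0 with a flame at 2.0 lands at 2.0 + 2.0 = 4.0. A moth that already sits on its flame has $D_i = 0$ and stays there for every t.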
(3) Update the number of flames;
The number of flames can be reduced adaptively during the iterations using the following formula:
$flameno = \mathrm{round}\left(N - l \cdot \frac{N - 1}{T}\right)$
Here l is the current iteration number, N is the maximum number of flames, and T is the maximum number of iterations. The formula shows that there are N flames at the start of the iterations, while at the end of the iterations the moths update their positions using only the best flame. The gradual reduction in the number of flames balances exploration and exploitation of the search space.
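A small numerical sketch of the flame schedule follows. It assumes the standard MFO rule $flameno = \mathrm{round}(N - l \cdot (N - 1)/T)$, which matches the behavior described above (N flames at the start, one flame at the end); the function and parameter names are hypothetical.

```python
def flame_count(l: int, n_max: int, t_max: int) -> int:
    """Number of flames kept at iteration l: round(N - l * (N - 1) / T)."""
    if t_max <= 0 or not 0 <= l <= t_max:
        raise ValueError("expected 0 <= l <= t_max and t_max > 0")
    return round(n_max - l * (n_max - 1) / t_max)
```

With N = 30 and T = 100 the schedule starts at 30 flames for l = 0 and reaches a single flame at l = 100, decreasing monotonically in between.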
Based on the above analysis, and referring to Fig. 1 and Fig. 2, the feature selection method based on a Spark distributed parallel binary moth-flame optimization algorithm provided by the invention comprises the following steps:
Step 1: Read the raw data set and preprocess it; then use RDD partitioning in Spark to divide the training set into multiple subsets, which are stored in HDFS, a distributed file system suitable for use with Spark;
In this embodiment, the data set, training set and test set all come from the intrusion detection data set KDDCUP99. 10% of the data is selected as the training and test sets, and the data is then standardized and normalized to form the required data set; this constitutes the data preprocessing;
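The text does not state which normalization is used. As an illustration only, the sketch below assumes min-max scaling of each numeric column to [0, 1]; `min_max_normalize` is a hypothetical helper, and categorical KDDCUP99 fields would first need to be encoded as numbers.

```python
from typing import List


def min_max_normalize(rows: List[List[float]]) -> List[List[float]]:
    """Scale every column to [0, 1]; constant columns are mapped to 0.0."""
    if not rows:
        return []
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [max(col) - min(col) for col in columns]
    return [[(value - low) / span if span else 0.0
             for value, low, span in zip(row, lows, spans)]
            for row in rows]
```

Scaling all features to a common range keeps large-valued fields from dominating a distance-based classifier such as an SVM.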
Step 2: Initialize an RDD data set that represents the moth population OM, apply a Map transformation to the RDD (resilient distributed dataset) in Spark, and initialize the fitness values; the initial fitness values are either initialized randomly or set to 0;
In feature selection, classification accuracy is an important objective. The ultimate goal of feature selection is to obtain better classification results with the fewest features.
In this embodiment, initializing an RDD data set means initializing Spark and performing simple map, filter and reduce operations on the RDD data.
In this embodiment, the fitness value is calculated by the following formula:
where F(i) is the fitness value of the i-th moth, n(i) is the number of selected features, Accuracy(i) is the classification accuracy obtained by training an SVM classifier, and λ is a weighting parameter, usually set to λ = 0.01.
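The fitness formula itself is not reproduced in this text, so the sketch below is not the patent's formula. It only illustrates, under an assumed weighted-sum form, how a fitness to be maximized can reward accuracy and penalize the number of selected features with a small weight λ; the function name, the parameter `n_total` and the exact expression are all assumptions.

```python
def fitness(accuracy: float, n_selected: int, n_total: int,
            lam: float = 0.01) -> float:
    """Assumed form: (1 - lam) * accuracy + lam * (1 - n_selected / n_total)."""
    if n_total <= 0 or not 0 <= n_selected <= n_total:
        raise ValueError("expected 0 <= n_selected <= n_total and n_total > 0")
    return (1.0 - lam) * accuracy + lam * (1.0 - n_selected / n_total)
```

With λ = 0.01 the accuracy term dominates, so the feature count mainly breaks ties between subsets of similar accuracy.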
Step 3: Calculate the distance $D_i$ between each moth and its corresponding flame, calculate the updated value of each moth, and update the position of the current moth; then convert the updated moth value to binary;
In this embodiment, updating the position of the current moth means judging whether the fitness value of the newly generated moth is greater than the fitness value of the moth before the update; if so, the generated moth with the maximum fitness value replaces the current best moth; if not, the current moth is retained; the position of the current moth is then updated.
Because the feature selection problem is whether or not to select each specific feature, each feature subset is encoded as a binary string of 1s and 0s, so all solutions are expressed as binary vectors, where 1 means that the feature is selected to form the new data set and 0 means that it is not selected.
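The binary encoding can be illustrated as a mask applied to one data row: positions holding 1 are kept and positions holding 0 are dropped. The helper name `select_features` is hypothetical.

```python
from typing import List, Sequence


def select_features(row: Sequence[float], mask: Sequence[int]) -> List[float]:
    """Keep the values of row whose mask bit is 1."""
    if len(row) != len(mask):
        raise ValueError("row and mask must have the same length")
    return [value for value, bit in zip(row, mask) if bit == 1]
```

For example, the mask [1, 0, 0, 1] applied to the row [0.2, 0.7, 0.1, 0.9] yields [0.2, 0.9], a data set with two of the four original features.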
In this embodiment, converting the updated moth value to binary means constructing this binary vector with the Sigmoid function:
$S(M_i, F_j) = D_i \cdot e^{bt} \cdot \cos(2\pi t) + F_{flameno}$ (4);
$D_i = |F_j - M_i|$ (5);
$flameno = \mathrm{round}\left(N - l \cdot \frac{N - 1}{T}\right)$ (6);
where $\sigma \sim U(0,1)$ and S(M, F) represents the position of the new binary moth relative to the flame; $D_i$ denotes the distance between the i-th moth and the j-th flame, b is a constant that defines the shape of the logarithmic spiral, t is a random number in [-1, 1], $M_i$ denotes the i-th moth, and $F_j$ denotes the j-th flame; flameno denotes the number of flames, and $F_{flameno}$ denotes the flameno-th flame; the two update modes differ, so the number of flames changes accordingly; l is the current iteration number, N is the maximum number of flames, and T is the maximum number of iterations.
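The Sigmoid construction is not reproduced in this text. A minimal sketch of one common binarization rule is given below: each continuous component s is mapped to 1 when a uniform random number σ drawn from U(0,1) is smaller than sigmoid(s), and to 0 otherwise. The direction of the comparison and the names `sigmoid` and `binarize` are assumptions.

```python
import math
import random
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function 1 / (1 + e^(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def binarize(position: Sequence[float], rng: random.Random) -> List[int]:
    """Map each component s to 1 if sigma < sigmoid(s), with sigma ~ U(0,1)."""
    return [1 if rng.random() < sigmoid(s) else 0 for s in position]
```

Large positive components are therefore almost always selected and large negative components almost never, while components near 0 are selected with probability close to 0.5.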
Step 4: Use the calculated updated moth values as input variables in Spark, and use the Spark distributed platform to calculate the specific positions of the moths after the iteration;
Step 5: Apply the update operation to each moth and calculate the local fitness value M of each moth after processing;
The local fitness calculation works as follows: in the update process below, the fitness values of the moths are updated using the two update modes. In each iteration, the calculation of the positions and fitness values of all moths is called a local process, so the fitness value obtained in each iteration is also called the local fitness value. The local fitness value is calculated with the fitness function, and the initial fitness value can be initialized randomly or set to 0.
The population is updated iteratively: the position of each moth is determined from the number of flames, and every dimension of every moth is updated. After the flame list is updated in each iteration, the flames are sorted by their fitness values. The moths then update their positions relative to their corresponding flames. The first moth always updates its position relative to the best flame, and the last moth updates its position relative to the worst flame in the list.
In this embodiment, if the moth population is less than the number of flames, the moth is updated by the spiral flight method; otherwise, the moth is updated by the spiral approach method.
The formula for updating a moth by the spiral flight method is as follows:
$S(M_i, F_j) = D_i \cdot e^{bt} \cdot \cos(2\pi t) + F_j$ (7);
where $M_i$ denotes the i-th moth, $F_j$ denotes the j-th flame, S is the spiral function, $D_i$ denotes the distance between the i-th moth and the j-th flame, b is a constant that defines the shape of the logarithmic spiral, and t is a random number in [-1, 1];
Using the randomly selected flameno-th flame, global exploration is carried out by the spiral approach method, specifically with the following formula:
$S(M_i, F_j) = D_i \cdot e^{bt} \cdot \cos(2\pi t) + F_{flameno}$ (8);
where $flameno = \mathrm{round}\left(N - l \cdot \frac{N - 1}{T}\right)$; flameno denotes the number of flames, and $F_{flameno}$ denotes the flameno-th flame; l is the current iteration number, N is the maximum number of flames, and T is the maximum number of iterations.
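The choice between the two update formulas can be sketched as a flame-assignment rule. The sketch below follows one reading of the text, taken from the standard MFO algorithm: flames are sorted from best to worst, a moth whose index is below the current flame count uses its own flame (formula (7)), and every other moth uses the last surviving flame (formula (8)). The 0-based indexing, the descending sort (fitness is maximized) and both function names are assumptions.

```python
from typing import List, Sequence, Tuple


def sort_flames(positions: Sequence[Sequence[int]],
                fitnesses: Sequence[float]
                ) -> Tuple[List[Sequence[int]], List[float]]:
    """Order flames from best to worst fitness (higher is better)."""
    order = sorted(range(len(fitnesses)),
                   key=lambda k: fitnesses[k], reverse=True)
    return [positions[k] for k in order], [fitnesses[k] for k in order]


def flame_index(moth_index: int, flame_no: int) -> int:
    """0-based flame used by a moth: its own flame, or the last surviving one."""
    if flame_no < 1:
        raise ValueError("flame_no must be at least 1")
    return moth_index if moth_index < flame_no else flame_no - 1
```

As the flame count shrinks toward 1, more and more moths share the same flame, which shifts the search from exploration to exploitation.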
Step 6: Calculate the fitness value of each moth according to the fitness function and determine the maximum fitness value; if the new fitness value is better than the previous one, the new fitness value replaces the previous one;
Step 7: Determine whether the maximum number of iterations has been reached;
If not, return to step 3 and continue searching for the optimal fitness value;
If so, execute step 8;
Step 8: Output the maximum current fitness value of the updated moths and the position of the corresponding moth.
The moth-flame algorithm uses a transverse orientation navigation mechanism, selects a spiral function as the update operator for the spatial positions of the moths, and adaptively adjusts the path coefficient and the number of flames to balance global search capability and local exploitation capability. The MFO algorithm converges faster and with higher precision, and has advantages when handling complex constraints and searching unknown spaces. As a new heuristic method, MFO has been successfully applied to optimization problems such as those in power systems.
As the scale of data grows, distributed storage and computation of high-dimensional data on cloud platforms has become a trend, with Hadoop and Spark the most widely used. The MapReduce computation component needs to access the disk repeatedly during iterative processing, which slows training. Spark's greatest feature is that it keeps computed data and intermediate results in memory, which greatly reduces I/O overhead and makes it better suited to improving the operational efficiency of swarm optimization algorithms with many iterations.
The feature selection method based on a Spark distributed parallel binary moth-flame optimization algorithm of the invention takes into account that, for most classification algorithms, a high-dimensional feature space strongly affects classification accuracy and dimensionality reduction. Features useful for classification therefore need to be extracted from the high-dimensional original features, reducing the dimension of the feature space and thereby improving classification accuracy. The invention can thus address many practical problems in machine learning, data mining and data dimensionality reduction.
This embodiment parallelizes the process in which the moths iteratively search for the optimal solution: the position of each moth, together with its search for the optimal solution, is treated as an independent parallel unit. n moths therefore form n independent parallel units, which Spark then processes in parallel. The feature selection method based on a Spark distributed parallel binary moth-flame optimization algorithm of the invention uses a Mapper-Chainer model that consists of two parts. The first part is the initialization of the moth population; the second part is the Mapper-Reducer iterative process that determines the optimal solution. The Mapper-Chainer model is shown in Fig. 3:
Part 1, SPBMFO initialization: the initial population is generated randomly as an integer-valued vector. After the population parameters are initialized, the initial fitness value of each moth is calculated, and the current best moth position is obtained.
Part 2, iterative process: the iterative map tasks start after the population initialization is complete. Each Map function reads the data of its partition sequentially from local input in (key, value) format; multiple Map functions run in parallel, and each calculates the fitness value of every moth on its given data partition.
Next, the fitness values of the n moths, calculated by the different Map functions on the m data partitions, are summed in the Combiner. The average fitness value of each moth over the entire data set is then obtained.
Finally, the values emitted by the Combiner are obtained in (composite key, values) form. The next position of each moth is updated, generating a new solution. The best solution is stored as the best position, and the updated moth positions and fitness values are passed to the next iteration.
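The map and combine steps described above can be sketched without Spark, using plain Python lists as stand-ins for data partitions. This is only an illustration of the data flow (per-partition fitness per moth, then a sum and an average per moth); it is not Spark code, and the names `map_partition`, `combine_average` and the `score` callback are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Sequence[float]
Mask = Sequence[int]


def map_partition(partition: Sequence[Row], moths: Sequence[Mask],
                  score: Callable[[Mask, Sequence[Row]], float]
                  ) -> List[Tuple[int, float]]:
    """Map step: emit (moth id, fitness on this partition) for every moth."""
    return [(moth_id, score(mask, partition))
            for moth_id, mask in enumerate(moths)]


def combine_average(pairs: Sequence[Tuple[int, float]],
                    n_partitions: int) -> Dict[int, float]:
    """Combiner step: sum the values of each moth id, then average."""
    totals: Dict[int, float] = {}
    for moth_id, value in pairs:
        totals[moth_id] = totals.get(moth_id, 0.0) + value
    return {moth_id: total / n_partitions
            for moth_id, total in totals.items()}
```

In an actual Spark job the same pattern would typically be expressed with a per-partition map followed by a keyed aggregation, with the moth masks shared with every worker.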
The embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be referred to one another. Because the system disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, refer to the description of the method.
A specific example is used herein to explain the principle and implementation of the invention. The description of the above embodiments is only intended to help in understanding the method of the invention and its core idea. A person skilled in the art may, following the idea of the invention, change the specific implementation and the scope of application. In summary, the content of this specification should not be construed as limiting the invention.

Claims (9)

1. a kind of feature selection approach based on the distributed parallel binary of Spark a flying moth darts into the fire algorithm, which is characterized in that packet Include following steps:
Step 1: reading raw data set, then raw data set is pre-processed, then utilize the RDD subregion in Spark Training set is divided into multiple subsets, is stored in and is suitble to operate in the distributed file system HDFS on Spark;
Step 2: one RDD data set of initialization, the RDD data set represent moth population OM, will be in Spark distribution RDD elasticity distribution formula data set makees Map conversion process, initial fitness value;Wherein, initial fitness value be it is random initial or Setting is initially 0;
Step 3: calculating moth and corresponding flame distance Di, the value after every moth updates is calculated, the position of current moth is updated It sets;Then the value that moth updates is switched into binary system;
Step 4: the value after the update for the moth being calculated is utilized as the input variable in Spark distribution Spark distributed platform calculates the specific position of the moth after iteration;
Step 5: operation being updated to each moth, and the fitness value M of each moth part after calculation processing;
Step 6: calculating the fitness value of each moth according to fitness function, determine the maximum value of fitness value;If new Fitness value is more preferable than previous fitness value, then new fitness value replaces previous fitness value;
Step 7: judging whether to reach maximum number of iterations;
If not, revolution executes step 3, adaptive optimal control angle value is continually looked for;
If so, thening follow the steps 8;
Step 8: exporting the current fitness value maximum value of updated moth and its position of corresponding moth.
2. the feature selecting side according to claim 1 based on the distributed parallel binary of Spark a flying moth darts into the fire algorithm Method, it is characterised in that: in step 1, data set, training set and test set are all to utilize intrusion detection data set KDDCUP99 number According to collection, 10% is then chosen as training set and test set, then carry out the standardization of data, after normalization, required for formation Data set, here it is the pretreatments of data.
3. the feature selecting side according to claim 1 based on the distributed parallel binary of Spark a flying moth darts into the fire algorithm Method, it is characterised in that: in step 2, initialize a RDD data set, be initialization Spark, and carry out simple RDD data Map, filter, reduce operation.
4. the feature selecting side according to claim 1 based on the distributed parallel binary of Spark a flying moth darts into the fire algorithm Method, it is characterised in that: in step 3, the position for updating current moth is that fitness value corresponding to judgement generation moth is The fitness value of the no moth being greater than before generating seed;If so, the maximum moth of the fitness value of generation is replaced in update most Big moth;If it is not, retaining current moth;Update the position of current moth.
5. the feature selecting side according to claim 1 based on the distributed parallel binary of Spark a flying moth darts into the fire algorithm Method, it is characterised in that: in step 3, by moth update value switch to binary system, be constructed with Sigmoid function this two into Vector processed:
Wherein (0,1) σ~U, S (M, F) represent position of the new binary moth relative to flame;S(Mi,Fj)=Di·ebt·cos(2πt)+Fflameno, Di=| Fj-Mi|,Flameno represents the quantity of flame, FflamenoIn the text be construed to expression Flameno flame;DiIndicate that the distance between i-th moth and j-th of flame, b are the normal of definition logarithmic spiral wire shaped Number, a random number of the t between [- 1,1], MiIndicate i-th moth, FjIndicate j-th of flame;L is current iteration number, N For the maximum value of flame quantity, T is maximum the number of iterations.
6. the feature selecting side according to claim 1 based on the distributed parallel binary of Spark a flying moth darts into the fire algorithm Method, it is characterised in that: in step 5, if moth population is less than the number of flame, then moth is updated by the method that spiral is circled in the air, Otherwise, the method for taking spiral to approach updates moth.
7. the feature selecting side according to claim 6 based on the distributed parallel binary of Spark a flying moth darts into the fire algorithm Method, it is characterised in that: according to flame come quantity determine the position of its corresponding moth, and update all moths per one-dimensional Degree;After iteration updates flame list each time, flame sorts according to their fitness value;Then it is opposite to update them for moth In the position of corresponding flame;First moth always updates the position relative to optimal flame, and that last moth updates column Position in table relative to worst flame;
The formula for updating moth using the method that spiral is circled in the air is as follows:
S(Mi,Fj)=Di·ebt·cos(2πt)+Fj
Wherein, MiIndicate i-th moth, FjIndicate that j-th of flame, S are Spirallike function, DiIndicate i-th moth and j-th The distance between flame, b are the constant for defining logarithmic spiral wire shaped, a random number of the t between [- 1,1];
Using the flameno flame randomly selected, global exploration is carried out using the method that spiral is approached, is specially used as follows Formula calculates:
S(Mi,Fj)=Di·ebt·cos(2πt)+Fflameno
Wherein,Flameno represents the quantity of flame, FflamenoExplanation in the text To indicate the flameno flame;L is current iteration number, and N is the maximum value of flame quantity, and T is maximum the number of iterations.
8. the feature selecting side according to claim 1 based on the distributed parallel binary of Spark a flying moth darts into the fire algorithm Method, it is characterised in that: in step 6, the calculation formula of fitness value are as follows:
Wherein F (i) is the fitness value of i-th of moth, and n (i) is selected Characteristic Number, and Accuary (i) is classification accuracy, Accuracy rate is that SVM classifier training obtains, and λ is weighting parameters.
9. The feature selection method based on the Spark distributed parallel binary moth-flame optimization algorithm according to any one of claims 1 to 8, characterized in that: the method is implemented using the Mapper-Chainer model.
CN201910040185.1A 2019-01-16 2019-01-16 Feature selection approach based on the distributed parallel binary of Spark a flying moth darts into the fire algorithm Pending CN109871934A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910040185.1A CN109871934A (en) 2019-01-16 2019-01-16 Feature selection approach based on the distributed parallel binary of Spark a flying moth darts into the fire algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910040185.1A CN109871934A (en) 2019-01-16 2019-01-16 Feature selection approach based on the distributed parallel binary of Spark a flying moth darts into the fire algorithm

Publications (1)

Publication Number Publication Date
CN109871934A true CN109871934A (en) 2019-06-11

Family

ID=66917703

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910040185.1A Pending CN109871934A (en) 2019-01-16 2019-01-16 Feature selection approach based on the distributed parallel binary of Spark a flying moth darts into the fire algorithm

Country Status (1)

Country Link
CN (1) CN109871934A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104866904A (en) * 2015-06-16 2015-08-26 中电科软件信息服务有限公司 Parallelization method of BP neural network optimized by genetic algorithm based on spark
CN106802822A (en) * 2016-12-30 2017-06-06 南京邮电大学 A kind of cloud data center cognitive resources dispatching method based on moth algorithm
CN107766927A (en) * 2017-11-03 2018-03-06 西南交通大学 Universal parallel method of the intelligent optimization algorithm based on individual population on Spark
CN108197708A (en) * 2017-12-14 2018-06-22 河海大学 A kind of parallel time genetic algorithm based on Spark
CN108228819A (en) * 2017-12-29 2018-06-29 武汉长江仪器自动化研究所有限公司 Methods For The Prediction Ofthe Deformation of A Large Dam based on big data platform
CN108288074A (en) * 2018-01-31 2018-07-17 湖北工业大学 A kind of selection method and system of data characteristics
DE102018107831A1 (en) * 2017-04-05 2018-10-11 GM Global Technology Operations LLC METHOD FOR CLASSIFYING THE SYSTEM PERFORMANCE AND FOR RECORDING ENVIRONMENT INFORMATION

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
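HOSSAM M. ZAWBAA et al.: "Feature Selection Approach based on Moth-Flame Optimization Algorithm", 《2016 IEEE CONGRESS ON EVOLUTIONARY COMPUTATIO》 *
XU HUI et al.: "Application of an improved moth-flame optimization algorithm in network intrusion detection systems", 《Journal of Computer Applications》 *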

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110619144A (en) * 2019-08-08 2019-12-27 杭州电子科技大学 Microstrip antenna design method based on improved moth fire-fighting algorithm
CN110619144B (en) * 2019-08-08 2024-03-05 杭州电子科技大学 Microstrip antenna design method based on improved moth fire suppression algorithm
CN111880402A (en) * 2020-07-30 2020-11-03 广州大学 Method and device for controlling product parameters of fluorescent powder layer and storage medium
CN112580198A (en) * 2020-12-03 2021-03-30 国网山西省电力公司晋城供电公司 Improved optimization classification method for transformer state evaluation
CN114282130A (en) * 2021-12-03 2022-04-05 重庆邮电大学 Fraud website identification method based on selection of mutant moth flame optimization algorithm
CN114745394A (en) * 2022-04-07 2022-07-12 山东理工大学 Mobile service selection method based on moth fire suppression optimization algorithm in cloud and edge environment
CN114745394B (en) * 2022-04-07 2023-07-07 山东理工大学 Mobile service selection method based on moth fire suppression optimization algorithm in cloud and edge environments

Similar Documents

Publication Publication Date Title
CN109871934A (en) Feature selection approach based on the distributed parallel binary of Spark a flying moth darts into the fire algorithm
CN110070117B (en) Data processing method and device
Chen et al. Particle swarm optimization algorithm and its application to clustering analysis
Bayati et al. MLPSO: a filter multi-label feature selection based on particle swarm optimization
CN109948149B (en) Text classification method and device
CN109460793A (en) A kind of method of node-classification, the method and device of model training
CN109063355A (en) Near-optimal method based on particle group optimizing Yu Kriging model
CN114841257B (en) Small sample target detection method based on self-supervision comparison constraint
Das et al. A harmony search based wrapper feature selection method for holistic bangla word recognition
CN110490298A (en) Lightweight depth convolutional neural networks model based on expansion convolution
CN110020435B (en) Method for optimizing text feature selection by adopting parallel binary bat algorithm
CN108983180A (en) A kind of high-precision radar sea clutter forecast system of colony intelligence
CN110443428A (en) A kind of air compressor group load forecasting method and its control equipment
El-Tarabily et al. A PSO-based subtractive data clustering algorithm
CN113435108A (en) Battlefield target grouping method based on improved whale optimization algorithm
CN114556364A (en) Neural architecture search based on similarity operator ordering
CN110795736B (en) Malicious android software detection method based on SVM decision tree
KR102144010B1 (en) Methods and apparatuses for processing data based on representation model for unbalanced data
Babu et al. A simplex method-based bacterial colony optimization algorithm for data clustering analysis
CN111782904B (en) Unbalanced data set processing method and system based on improved SMOTE algorithm
CN112881869A (en) Cable joint partial discharge ultrasonic sequence prediction method
CN115344693B (en) Clustering method based on fusion of traditional algorithm and neural network algorithm
Zhang et al. Data clustering using multivariant optimization algorithm
Chen et al. Feature selection of parallel binary moth-flame optimization algorithm based on spark
CN115936773A (en) Internet financial black product identification method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190611
