CN113076695B - Ionosphere high-dimensional data feature selection method based on improved BBA algorithm - Google Patents

Ionosphere high-dimensional data feature selection method based on improved BBA algorithm

Info

Publication number
CN113076695B
Authority
CN
China
Prior art keywords
algorithm
dimension
improved
bba
individual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110390672.8A
Other languages
Chinese (zh)
Other versions
CN113076695A (en
Inventor
梁会军
钟建伟
杨永超
秦勉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Xinzhuoya Technology Development Co ltd
Original Assignee
Hubei University for Nationalities
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hubei University for Nationalities filed Critical Hubei University for Nationalities
Priority to CN202110390672.8A priority Critical patent/CN113076695B/en
Publication of CN113076695A publication Critical patent/CN113076695A/en
Application granted granted Critical
Publication of CN113076695B publication Critical patent/CN113076695B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00 Details relating to CAD techniques
    • G06F2111/06 Multi-objective optimisation, e.g. Pareto optimisation using simulated annealing [SA], ant colony algorithms or genetic algorithms [GA]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses an ionospheric high-dimensional data feature selection method based on an improved BBA algorithm, comprising: acquiring ionospheric data; taking a dimension classification loss function as the objective function; solving the objective function with an improved BBA algorithm, in which individual velocities are updated in a single dimension and the updated velocities are mapped from continuous space to discrete space by a time-varying V-shaped transfer function; and, after solving, determining the target dimensions and reducing the ionospheric data to those dimensions to obtain the corresponding ionospheric features. A random black hole model is introduced and a time-varying V-shaped transfer function is proposed to improve the BBA algorithm. Reducing the dimensionality of the high-dimensional ionospheric data with the improved discrete binary bat algorithm yields a minimized feature subset, lowers the data error rate, improves classification accuracy, and selects more accurate ionospheric data features.

Description

Ionosphere high-dimensional data feature selection method based on improved BBA algorithm
Technical Field
The invention relates to the technical field of high-dimensional data feature selection, in particular to an ionospheric high-dimensional data feature selection method based on an improved BBA algorithm.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Data mining is a branch that has grown out of the rapid development of information technology. It uses algorithms to analyze mass data and extract the useful information hidden in it, and generally comprises three stages: data preparation, data mining, and the expression and interpretation of results. Feature selection is an important preprocessing step whose primary role is to remove irrelevant or redundant attributes from a given data set.
Commonly used feature selection methods fall roughly into filter methods, wrapper methods, and hybrid methods that combine the two. A filter method relies directly on the characteristics of the data set, without the help of a learning algorithm: it assigns each feature a weight that represents its importance, ranks the features by importance, and selects the required features. A wrapper method decides whether a feature is selected by means of a learning algorithm and the classification accuracy obtained with the selected features; it has high computational complexity and can be solved with an optimization algorithm. A hybrid method combines the two approaches. Studies have shown that wrapper methods generally achieve better feature selection results than filter methods.
In recent years, meta-heuristic algorithms have shown outstanding capability in machine learning, data mining, engineering design, feature selection and similar areas. In feature selection in particular, a meta-heuristic algorithm does not need to search the whole data space and can still obtain satisfactory results. An exact search method can guarantee that the optimal feature subset is found, but it is time-consuming and limited by computer storage; for a high-dimensional complex system it is inefficient or even unusable. If a data set has n features, there are 2^n candidate feature subsets, so the complexity of feature selection grows exponentially. Although a meta-heuristic method cannot guarantee that the exact optimal feature subset is found, it can produce a feature subset very close to the optimum and is not limited by the dimension of the problem. Such methods have therefore been widely studied; typical examples include the discrete binary particle swarm optimization algorithm (BPSO), the discrete binary grey wolf algorithm (BGWO), the genetic algorithm (GA) and the discrete binary bat algorithm (BBA).
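To make the exponential growth concrete, the following minimal Python sketch counts the candidate subsets for a given number of features and enumerates them for a very small case. The function names are hypothetical and serve only as an illustration; the value 34 is the number of ionospheric features discussed in this document.

```python
from itertools import product

def count_subsets(n: int) -> int:
    """Number of distinct feature subsets (including the empty one) of n features."""
    return 2 ** n

def enumerate_masks(n: int):
    """Yield every binary selection mask of length n (only feasible for small n)."""
    return product((0, 1), repeat=n)

small = list(enumerate_masks(3))
print(len(small))          # 8 masks for 3 features
print(count_subsets(34))   # 17179869184 candidate subsets for 34 features
```

Exhaustive enumeration is already impractical at 34 features when every candidate requires training and testing a classifier, which is why a heuristic search is attractive here.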
Although meta-heuristic methods have been successful in feature selection, they have limitations, such as insufficient exploration and exploitation capability and a tendency toward premature convergence. For example, in the binary particle swarm optimization algorithm the weight coefficient of a particle is a constant, which limits the global search capability of the algorithm and makes it prone to premature convergence. In the genetic algorithm, the initial values of the individuals strongly influence the final optimization result, and the algorithm usually has to be combined with other methods to improve its performance.
The existing discrete binary bat algorithm only maps the individual velocity from continuous space to binary discrete space; it does not substantially change the exploration and exploitation capability of the algorithm, and it does not address the parameters that govern convergence. How to maximize the accuracy of high-dimensional data feature selection while also improving the performance and convergence rate of the optimization algorithm is therefore a technical problem that those skilled in the art urgently need to solve.
Some objects are inconvenient to model mathematically but can be characterized by high-dimensional data, that is, described by specific data. An example is the ionosphere data used in the simulations of this application, in which the ionosphere is characterized by 34 dimensions. The current problems for objects of this type are as follows. Such ionospheric objects have many features, that is, high dimensionality, and the challenge of feature selection is how to generate a minimized feature subset from high-dimensional data without affecting classification accuracy, or while keeping the effect to a minimum. The number of dimensions left after dimensionality reduction by existing algorithms is still relatively large, which means there is room for further reduction. In addition, for ionospheric objects the error rate of the data obtained after dimensionality reduction by existing algorithms is high, because those algorithms do not select dimensions accurately enough.
Disclosure of Invention
To solve the above problems, the invention provides an ionospheric high-dimensional data feature selection method based on an improved BBA algorithm. A random black hole model is introduced and a time-varying V-shaped transfer function is proposed to improve the BBA algorithm. Reducing the dimensionality of the high-dimensional ionospheric data with the improved discrete binary bat algorithm yields a minimized feature subset, lowers the data error rate, improves classification accuracy, and selects more accurate ionospheric data features.
In order to achieve the purpose, the invention adopts the following technical scheme:
In a first aspect, the present invention provides an ionospheric high-dimensional data feature selection method based on an improved BBA algorithm, including:
acquiring ionospheric data;
taking a dimension classification loss function as the objective function;
solving an objective function by adopting an improved BBA algorithm, wherein the improved BBA algorithm comprises the steps of updating individual speeds on a single dimension, and mapping the updated individual speeds from a continuous space to a discrete space according to a time-varying V-shaped conversion function;
and determining a target dimension after solving, and obtaining the ionospheric characteristics corresponding to the target dimension after performing dimension reduction processing on the ionospheric data according to the target dimension.
In a second aspect, the present invention provides an ionospheric high-dimensional data feature selection system based on an improved BBA algorithm, including:
a data acquisition module configured to acquire ionospheric data;
an objective function determination module configured to take the dimension classification loss function as an objective function;
an objective function solving module configured to solve an objective function using an improved BBA algorithm, the improved BBA algorithm including updating individual speeds in a single dimension, and mapping the updated individual speeds from a continuous space to a discrete space according to a time-varying V-type conversion function;
and the feature selection module is configured to determine a target dimension after solving, and obtain the ionospheric feature corresponding to the target dimension after performing dimension reduction processing on the ionospheric data according to the target dimension.
In a third aspect, the present invention provides an electronic device comprising a memory and a processor, and computer instructions stored on the memory and executed on the processor, wherein when the computer instructions are executed by the processor, the method of the first aspect is performed.
In a fourth aspect, the present invention provides a computer readable storage medium for storing computer instructions which, when executed by a processor, perform the method of the first aspect.
Compared with the prior art, the invention has the beneficial effects that:
Because ionosphere data characterize the ionosphere through 34 dimensions, they have many features and high dimensionality. The invention provides an ionospheric high-dimensional data feature selection method based on an improved discrete binary bat algorithm, RBBA (a binary bat algorithm with a modified random black hole model). After the dimensionality of the high-dimensional ionospheric data is reduced, a minimized feature subset is generated, the data error rate is lowered, classification accuracy is improved, and more accurate ionospheric data features are selected.
The high-dimensional data feature selection method based on the improved discrete binary bat algorithm RBBA builds on the discrete binary bat algorithm and starts from two aims: improving the exploration and exploitation capability of the algorithm and avoiding premature convergence.
In this method, a classification algorithm is selected, an optimization objective function is determined, and an optimization model for high-dimensional data feature selection is constructed and solved with the improved discrete binary optimization algorithm. An improved random black hole model, adapted to the discrete algorithm, is introduced in the solving process. It lets the algorithm search a larger space at the start of the iteration and concentrate on the neighborhood of the global optimal solution in the later stages, which greatly improves the global search capability of the population.
The invention proposes a time-varying V-shaped transfer function. Compared with the fixed V-shaped transfer function, it provides a faster switching speed and also enhances the exploration and exploitation capability of the algorithm.
A chaotic map replaces the monotonically changing parameter of the discrete binary bat algorithm BBA, namely the pulse emission rate, which increases the diversity of solutions and avoids premature convergence as far as possible.
Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, provide a further understanding of the invention; they illustrate exemplary embodiments of the invention and, together with the description, serve to explain the invention and not to limit it.
Fig. 1 is a flowchart of an ionospheric high-dimensional data feature selection method based on an improved BBA algorithm according to embodiment 1 of the present invention;
Fig. 2 is a schematic diagram of the correspondence between the random initial values of the population and the data dimensions provided in embodiment 1 of the present invention;
Fig. 3 is a schematic diagram of the time-varying V-shaped transfer function provided in embodiment 1 of the present invention;
Fig. 4 is a flowchart of solving the objective function provided in embodiment 1 of the present invention.
Detailed Description
The invention is further described with reference to the following figures and examples.
It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the invention. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, and it should be understood that the terms "comprises" and "comprising", and any variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiments and features of the embodiments of the present invention may be combined with each other without conflict.
Example 1
As shown in fig. 1, the present embodiment provides an ionospheric high-dimensional data feature selection method based on an improved BBA algorithm, including:
S1: acquiring ionospheric data;
S2: taking a dimension classification loss function as the objective function;
S3: solving the objective function with an improved BBA algorithm, in which individual velocities are updated in a single dimension and the updated velocities are mapped from continuous space to discrete space by a time-varying V-shaped transfer function;
S4: after solving, determining the target dimensions and reducing the ionospheric data to those dimensions to obtain the corresponding ionospheric features.
In step S1, this embodiment first preprocesses the acquired high-dimensional ionospheric data. Preprocessing mainly comprises checking data consistency, handling invalid and missing values, and normalizing the data.
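The patent does not say how missing values are handled or which normalization is used. As a minimal sketch, assuming column-mean imputation and min-max scaling (both choices are assumptions made for illustration, as is the function name `preprocess`), the preprocessing step could look like this:

```python
import numpy as np

def preprocess(X) -> np.ndarray:
    """Impute missing values with column means, then min-max scale each column to [0, 1]."""
    X = np.asarray(X, dtype=float).copy()
    X[~np.isfinite(X)] = np.nan                    # treat infinities as missing values too
    col_mean = np.nanmean(X, axis=0)               # per-feature mean, ignoring missing entries
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_mean[cols]                 # fill each gap with its column mean
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)         # constant columns map to 0, no division by zero
    return (X - lo) / span

raw = [[1.0, 10.0], [float("nan"), 20.0], [3.0, 30.0]]
clean = preprocess(raw)
```

A column that consists only of missing values would remain undefined under this scheme and would have to be dropped beforehand.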
In step S2, this embodiment selects a feature classification method, determines the dimension classification loss function, and uses that loss function as the optimization objective of the improved discrete binary bat algorithm RBBA, thereby determining the optimization model.
Specifically, kNN is used as the classifier for feature selection, with its parameters set appropriately, and k-fold cross validation is applied; the loss of the k-fold cross validation is used as the optimization objective function, defined as follows:
[Equation (1): given as an image in the original; not reproduced]
where $MSE_i$ is the mean square error of the i-th fold of the k-fold cross validation.
The parameter k of the kNN algorithm directly affects the classification performance of the algorithm and is its most important hyperparameter; in this embodiment, k is set to 5 according to the dimensionality of the experimental data. To improve classification accuracy, k-fold cross validation is used for training and testing: in each round, every sample point is assigned exactly once, to either the training set or the test set.
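The following sketch illustrates a k-fold cross-validation loss of the kind described above. It assumes that the objective is the average of the per-fold errors, which the text suggests but the missing equation would have to confirm. For 0/1 class labels the per-fold mean squared error equals the misclassification rate, so the sketch uses the latter. The names `kfold_loss`, `predict` and `k_folds` are hypothetical.

```python
import numpy as np

def kfold_loss(X, y, predict, k_folds=5, seed=0):
    """Mean misclassification error over k_folds disjoint test folds.

    predict(X_train, y_train, X_test) must return predicted labels,
    so any classifier (for example kNN) can be plugged in.
    """
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k_folds)  # each sample lands in one fold
    errors = []
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False               # everything outside the fold is training data
        y_hat = predict(X[train_mask], y[train_mask], X[test_idx])
        errors.append(np.mean(y_hat != y[test_idx]))
    return float(np.mean(errors))
```

In a wrapper-style search, `X` would be restricted to the columns selected by a candidate binary mask before this function is called, and the returned loss would serve as that candidate's fitness.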
In the kNN algorithm, for an input vector to be classified, the k vectors closest to it are found in the training data set according to the Euclidean distance between the input vector and each vector in the training data set, and the input vector is then assigned to the class that occurs most often among those k samples. The Euclidean distance is calculated as follows:
$d(a, b) = \sqrt{\sum_{k=1}^{n} (a_k - b_k)^2}$ (2)
where $a_k$ and $b_k$ are the k-th components of any two vectors a and b, and n is the number of dimensions. The subscript k here indexes vector components; separately, the kNN parameter k is empirically set to 5.
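The kNN rule and the Euclidean distance described above can be sketched in a few lines of Python. This is an illustrative implementation only; the function names are hypothetical, and ties between classes are resolved in favor of the smallest label, which is an assumption the patent does not address.

```python
import numpy as np

def euclidean(a, b) -> float:
    """Euclidean distance between two feature vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))

def knn_predict(X_train, y_train, x, k=5):
    """Label x with the most common class among its k nearest training vectors."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    dists = np.sqrt(np.sum((X_train - np.asarray(x, dtype=float)) ** 2, axis=1))
    nearest = np.argsort(dists)[:k]                # indices of the k closest samples
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]               # ties resolve to the smallest label

X_demo = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y_demo = [0, 0, 0, 1, 1, 1]
print(euclidean([0, 0], [3, 4]))                   # 5.0
print(knn_predict(X_demo, y_demo, [0.2, 0.2], k=3))
```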
In step S3, the parameters of the improved RBBA algorithm are set appropriately: the effective radius and the attraction threshold of the improved random black hole model are determined, the number of jumps of the time-varying V-shaped transfer function is determined, and the chaotic mapping function to be used is chosen.
Specifically, the main parameters to be set in the improved discrete binary bat algorithm RBBA are the pulse emission rate, the pulse loudness, and the initial values of the individuals in the population.
The pulse emission rate is defined by a chaotic map. Because of its inherent randomness, a chaotic system is non-repeating and ergodic, which makes it suitable for replacing random or monotonically changing variables in a meta-heuristic algorithm. This embodiment therefore replaces the monotonically changing pulse emission rate r of the BBA algorithm with a chaotic map, specifically:
r(i+1) = 2.3 * r(i) * r(i) * sin(π * r(i)). (3)
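Equation (3) can be iterated directly. The short sketch below generates a sequence of pulse emission rates from it; the starting value 0.7 and the function name `sinusoidal_map` are assumptions for illustration, since the patent does not state an initial value.

```python
import math

def sinusoidal_map(r0: float, steps: int, a: float = 2.3):
    """Iterate r(i+1) = a * r(i)^2 * sin(pi * r(i)) and return all visited values."""
    values = [r0]
    for _ in range(steps):
        r = values[-1]
        values.append(a * r * r * math.sin(math.pi * r))
    return values

rates = sinusoidal_map(r0=0.7, steps=50)   # r0 = 0.7 is an assumed starting value
print(min(rates), max(rates))              # the sequence stays strictly inside (0, 1)
```

Note that very small starting values are drawn toward 0 by this map, so the initial value should be chosen away from 0.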
The pulse loudness A does not decrease monotonically as in existing algorithms; it is held at a constant value, which may increase the probability that a newly generated solution is accepted.
For the random initial values of the population, the dimension of each individual is set equal to the dimension of the data under study. The correspondence is shown in Fig. 2: individual dimensions and data dimensions correspond one to one, and each initial value is chosen at random from 0 and 1, where 0 means the corresponding dimension is not selected and 1 means it is selected.
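The one-to-one correspondence between individual bits and data dimensions can be illustrated as follows. The population size of 4, the placeholder data and the function names are hypothetical; only the 34 dimensions come from the document.

```python
import numpy as np

def init_population(n_individuals: int, n_features: int, seed: int = 0) -> np.ndarray:
    """Random binary population: one row per individual, one bit per data dimension."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 2, size=(n_individuals, n_features))

def apply_mask(X: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Keep only the columns of X whose mask bit is 1."""
    return X[:, mask.astype(bool)]

population = init_population(n_individuals=4, n_features=34)
X_demo = np.arange(2 * 34).reshape(2, 34)          # placeholder data, not real measurements
reduced = apply_mask(X_demo, population[0])
print(population[0].sum(), reduced.shape)          # number of selected dimensions, reduced shape
```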
In this embodiment, a random black hole model is introduced and improved so that it suits discrete optimization problems, and individual velocities are updated in a single dimension in discrete space. The existing random black hole model was proposed for continuous optimization problems and acts on the solutions in the iterative process of a continuous optimization algorithm. Feature selection is a discrete optimization problem, so the original approach for continuous variables does not apply in discrete space. In discrete space, the update of the solution therefore no longer depends on a position update but on the velocity update of the individual. Based on this analysis, the RBBA algorithm updates the solution as follows:
[Equation (4): given as an image in the original; not reproduced]
where $v_i^k(t+1)$ is the velocity value (a continuous real-valued variable) of the i-th individual in the k-th dimension at time t+1; the population-best term is the current population optimum value in the k-th dimension at time t; $r_e$ is the effective radius of the improved random black hole model; and $\tau$ is a uniformly distributed random number in [-1, 1].
The improved random black hole model is described as follows:
(1) traverse each dimension of the individual velocity $v_i$;
(2) generate a random number u in [0, 1];
(3) if u ≤ p, update that dimension of the velocity v according to equation (4);
(4) end the traversal.
In the above process, p is a predefined threshold (p ∈ [0, 1]) that determines the probability that step (3) is executed. For the improved random black hole model, this embodiment sets the effective radius $r_e$ to 0.1 and the threshold p to 0.5.
This velocity update has the following advantages. (1) Unlike existing algorithms, which must update all dimensions simultaneously with the same parameter, the algorithm of this embodiment can update the velocity in any single dimension; this improves the global search capability of an individual and yields more diverse solutions. (2) The exploration and exploitation capability of the algorithm is greatly improved: the effective radius $r_e$ is set slightly larger at the start of the iteration and shrinks slowly as the iteration proceeds, so that the algorithm searches a larger space at the beginning and concentrates on the neighborhood of the optimal solution in the later stages.
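The per-dimension procedure above can be sketched as follows. Because equation (4) is not reproduced in this text, the sketch assumes the usual random black hole form, in which an absorbed dimension is set to the population-best value plus $r_e \cdot \tau$; that form, the function name and the argument names are assumptions, while the defaults 0.1 and 0.5 are the values given in this embodiment.

```python
import numpy as np

def black_hole_update(v, best, r_e=0.1, p=0.5, rng=None):
    """Per-dimension update: with probability p, set v[k] to best[k] + r_e * tau."""
    rng = np.random.default_rng() if rng is None else rng
    v = np.array(v, dtype=float)                   # copy so the caller's array is untouched
    best = np.asarray(best, dtype=float)
    for k in range(v.size):                        # (1) traverse every dimension
        u = rng.random()                           # (2) u drawn uniformly from [0, 1)
        if u <= p:                                 # (3) this dimension is updated
            tau = rng.uniform(-1.0, 1.0)
            v[k] = best[k] + r_e * tau             # assumed form of equation (4)
    return v                                       # (4) traversal finished

rng = np.random.default_rng(1)
v_new = black_hole_update(np.full(34, 5.0), np.ones(34), rng=rng)
```

Each dimension draws its own u and its own tau, so some dimensions keep their previous velocity while others are pulled toward the population best, which is the single-dimension behavior the text describes.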
In order to further improve the exploration and development capabilities of the algorithm and overcome the disadvantages caused by the use of the V-type conversion function in the BBA algorithm, the embodiment provides a time-varying V-type conversion function, and the updated individual speed is mapped from the continuous space to the discrete space according to the time-varying V-type conversion function, as shown in fig. 3, the specific mathematical description is as follows:
[Equations (5) and (6): given as images in the original; not reproduced]
where t is the iteration number of the algorithm, i is the index of the individual in the population, k is the dimension of the solution x, and $\theta_t$ is a time-varying parameter that changes with the iteration number t. The complement term denotes the inverse of $x_i^k(t)$, which is a discrete binary value that can only be 0 or 1; rand is a uniformly distributed random number in [0, 1]; and $v_i^k(t)$ and $v_i^k(t+1)$ are the velocities of the k-th dimension of individual i at the t-th and (t+1)-th iterations, respectively.
The time-varying V-shaped transfer function proposed in this embodiment changes 6 times over the course of the iterations, a number chosen empirically. Compared with the fixed V-shaped transfer function, it provides a faster switching speed and also helps the algorithm obtain stronger exploration and exploitation capability.
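Since equations (5) and (6) are not reproduced in this text, the following is only a sketch of how a time-varying V-shaped mapping of this kind might work. It assumes a common V-shaped function, |tanh(v / θ)|, a piecewise-constant θ that changes 6 times during the run, and the usual rule of complementing a bit when a uniform random number falls below the transfer value. The function form, the θ range from 4.0 to 1.0 and all names are assumptions; only the 6 changes and the complement rule are taken from the description.

```python
import numpy as np

def theta_schedule(t, max_iter, n_jumps=6, theta_max=4.0, theta_min=1.0):
    """Piecewise-constant theta that changes n_jumps times over the run (assumed schedule)."""
    stage = min(n_jumps, (t * (n_jumps + 1)) // max_iter)   # stage index 0..n_jumps
    return theta_max - (theta_max - theta_min) * stage / n_jumps

def v_transfer(v, theta):
    """Assumed V-shaped function |tanh(v / theta)|: maps a velocity to a flip probability."""
    return np.abs(np.tanh(np.asarray(v, dtype=float) / theta))

def flip_bits(x, v, theta, rng):
    """Complement x[k] wherever a uniform random number is below the transfer value."""
    x = np.asarray(x)
    flip = rng.random(x.shape) < v_transfer(v, theta)
    return np.where(flip, 1 - x, x)

thetas = [theta_schedule(t, max_iter=70) for t in range(70)]
```

Under these assumptions a smaller θ makes the curve steeper, so the same velocity produces a higher flip probability late in the run than early on.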
In step S4, this embodiment uses the proposed RBBA to solve the optimization problem defined in step S2 and generate the optimal high-dimensional data feature selection result; as shown in Fig. 4, the specific steps are:
(1) initialize the RBBA parameters in the discrete binary space: randomly initialize the population $x_i$, set all individual velocity values $v_i$ to 0, and initialize the pulse emission rate r; initialize the parameters of the kNN algorithm, the random black hole model and the time-varying V-shaped transfer function;
(2) compute the fitness values of all individuals from the initialized population and determine the current optimal solution of the population;
(3) set the maximum number of iterations and start iterating;
(4-1) traverse all individuals in the population and find the individual optimal solution and the global optimal solution;
(4-2) update the individual frequency $f_i$ and the velocity value $v_i$;
(4-3) obtain a new solution with the improved random black hole model, that is, traverse each dimension of the individual velocity $v_i$;
(4-4) generate a random number u in [0, 1];
(4-5) if u ≤ p, update that dimension of the velocity v according to equation (4);
(4-6) end the traversal of the individual's dimensions;
(4-7) compute the time-varying V-shaped transfer function, equation (5), map the velocity values from continuous space to discrete space using equation (6), and update the solution;
(4-8) check whether the solution is a new global optimal solution and, if so, accept it conditionally;
(4-9) update the individual pulse emission rate $r_i$ and loudness $A_i$ using the chaotic map;
(5) end the traversal of the population;
(6) if the maximum number of iterations has not been reached, return to steps (4-1) to (4-9) and continue;
(7) end the iteration and output the optimal dimension selection result.
In step (4-2), when the individual frequency and velocity values are updated, the velocity $v_i$ is first optimized over all dimensions and then refined dimension by dimension with the improved random black hole model, which provides higher-quality solutions for the algorithm. Specifically:
The frequency is updated as follows:
$f_i = f_{\min} + (f_{\max} - f_{\min}) \cdot \delta$, (7)
where δ is a constant in [0, 1];
The velocity is updated as follows:
[Equation (8): given as an image in the original; not reproduced]
where u is a random number in [0, 1].
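Equation (7) translates directly into code. Equation (8) is not reproduced in this text, so the velocity step in the sketch below falls back on the standard bat-algorithm update, in which the velocity changes by the distance to the current best solution scaled by the frequency. That substitution is an assumption and may differ from the patented formula; the function names are hypothetical.

```python
import numpy as np

def update_frequency(f_min: float, f_max: float, delta: float) -> float:
    """Equation (7): interpolate between f_min and f_max with delta in [0, 1]."""
    return f_min + (f_max - f_min) * delta

def update_velocity(v, x, x_best, f):
    """Standard bat-algorithm velocity step (assumed; equation (8) is not reproduced)."""
    v, x, x_best = (np.asarray(a, dtype=float) for a in (v, x, x_best))
    return v + (x - x_best) * f

f_i = update_frequency(f_min=0.0, f_max=2.0, delta=0.25)
v_i = update_velocity(v=[0.0, 0.0], x=[1, 0], x_best=[0, 0], f=f_i)
```

The result of this all-dimension step would then be passed to the per-dimension random black hole update described earlier.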
The following experiment demonstrates the advantage of the method proposed in this embodiment. The experiment uses the public Ionosphere data set from the UCI repository, which consists of 34 features. To verify the effectiveness and advantage of the method, the result of the algorithm proposed in this embodiment is compared with that of the BPSO algorithm; the results are shown in Table 1. As Table 1 shows, the RBBA algorithm selects fewer features and also achieves a lower error rate, which demonstrates the effectiveness and advantage of the proposed method for high-dimensional data feature selection.
Table 1. Comparison of execution results
Algorithm | Error rate | Selected features
RBBA | 0.028571 | 1, 5, 7, 13, 14, 17, 19, 24, 25, 27, 28, 29
BPSO | 0.04286 | 1, 5, 6, 7, 8, 12, 13, 15, 18, 25, 29, 30, 32, 33, 34
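The comparison in Table 1 can be checked with a few lines of arithmetic. The sketch below only restates the table values; the variable names and the derived quantities (subset size, shared features, relative error reduction) are illustrative additions, not figures reported in the patent.

```python
TOTAL_FEATURES = 34  # size of the Ionosphere feature set, as stated in the text

rbba = {"error": 0.028571,
        "features": [1, 5, 7, 13, 14, 17, 19, 24, 25, 27, 28, 29]}
bpso = {"error": 0.04286,
        "features": [1, 5, 6, 7, 8, 12, 13, 15, 18, 25, 29, 30, 32, 33, 34]}

def subset_size(result):
    """Number of selected features and their share of all 34 features."""
    n = len(result["features"])
    return n, n / TOTAL_FEATURES

# Features chosen by both algorithms.
shared = sorted(set(rbba["features"]) & set(bpso["features"]))

# Error-rate reduction of RBBA relative to BPSO.
relative_error_drop = (bpso["error"] - rbba["error"]) / bpso["error"]

print(subset_size(rbba), subset_size(bpso))
print(shared, round(relative_error_drop, 3))
```

On these numbers RBBA keeps 12 of 34 features against 15 for BPSO, six features are common to both subsets, and the RBBA error rate is about one third lower.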
Example 2
The embodiment provides an ionospheric high-dimensional data feature selection system based on an improved BBA algorithm, which includes:
a data acquisition module configured to acquire ionospheric data;
an objective function determination module configured to take the dimension classification loss function as an objective function;
an objective function solving module configured to solve an objective function using an improved BBA algorithm, the improved BBA algorithm including updating individual speeds in a single dimension, and mapping the updated individual speeds from a continuous space to a discrete space according to a time-varying V-type conversion function;
a feature selection module configured to determine the target dimension after solving, and to obtain the ionospheric features corresponding to the target dimension after performing dimension reduction on the ionospheric data according to the target dimension.
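The objective function handled by the objective function determination module can be sketched as below, assuming, as in claim 4, a kNN classifier scored by k-fold cross-validation. The sketch is pure Python and illustrative only: the function names, the default neighbour count, the fold count, the fold assignment, and the penalty for an empty feature subset are all assumptions.

```python
import random

def knn_predict(train_X, train_y, row, k):
    """Majority vote among the k nearest training rows (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: sq_dist(pair[0], row))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

def cv_loss(X, y, mask, k=3, folds=5, seed=0):
    """Pooled misclassification rate over all held-out folds, using only
    the columns j for which mask[j] == 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 1.0  # assumed penalty for selecting no feature at all
    Xs = [[row[j] for j in cols] for row in X]
    idx = list(range(len(Xs)))
    random.Random(seed).shuffle(idx)
    errors = 0
    for f in range(folds):
        test_idx = idx[f::folds]          # every folds-th shuffled index
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        tX = [Xs[i] for i in train_idx]
        ty = [y[i] for i in train_idx]
        errors += sum(knn_predict(tX, ty, Xs[i], k) != y[i] for i in test_idx)
    return errors / len(Xs)

# Usage: column 0 separates the two classes, column 1 is noise.
X = [[0.0, 5.0], [0.1, -3.0], [0.2, 4.0], [0.3, -5.0], [0.15, 0.0],
     [5.0, 4.5], [5.1, -3.5], [5.2, 5.0], [5.3, -4.0], [5.15, 0.5]]
y = [0] * 5 + [1] * 5
loss_feature_0 = cv_loss(X, y, mask=[1, 0])
```

A binary mask is the natural interface here because each individual in the binary bat population is itself a 0/1 vector over the feature dimensions, so the solver can pass a candidate solution to the loss directly.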
It should be noted that the above modules correspond to the steps described in embodiment 1; the examples and application scenarios implemented by the modules are the same as those of the corresponding steps, but are not limited to what is disclosed in embodiment 1. It should also be noted that the modules, as part of a system, may be executed in a computer system, for example as a set of computer-executable instructions.
In further embodiments, there is also provided:
an electronic device comprising a memory and a processor and computer instructions stored on the memory and executed on the processor, the computer instructions when executed by the processor performing the method of embodiment 1. For brevity, no further description is provided herein.
It should be understood that in this embodiment the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. A general-purpose processor may be a microprocessor or any conventional processor.
The memory may include both read-only memory and random access memory and may provide instructions and data to the processor, and a portion of the memory may also include non-volatile random access memory. For example, the memory may also store device type information.
A computer readable storage medium storing computer instructions which, when executed by a processor, perform the method described in embodiment 1.
The method in embodiment 1 may be carried out directly by a hardware processor, or by a combination of hardware and software modules in the processor. The software modules may reside in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory; the processor reads the information in the memory and completes the steps of the method in combination with its hardware. To avoid repetition, this is not described in detail here.
Those of ordinary skill in the art will appreciate that the various illustrative elements, i.e., algorithm steps, described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes will occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, they do not limit the scope of protection of the present invention. Those skilled in the art should understand that various modifications and variations that can be made without inventive effort on the basis of the technical solution of the present invention still fall within its scope of protection.

Claims (7)

1. An ionospheric high-dimensional data feature selection method based on an improved BBA algorithm is characterized by comprising the following steps:
acquiring ionospheric data;
taking a dimension classification loss function as the objective function;
solving an objective function by adopting an improved BBA algorithm, wherein the improved BBA algorithm comprises the steps of updating individual speeds on a single dimension, and mapping the updated individual speeds from a continuous space to a discrete space according to a time-varying V-shaped conversion function;
determining a target dimension after solving, and obtaining an ionospheric characteristic corresponding to the target dimension after performing dimension reduction processing on the ionospheric data according to the target dimension;
the BBA is a discrete binary bat algorithm;
initializing improved BBA algorithm parameters including pulse sending rate, pulse loudness and population initialization value in the process of solving the objective function by adopting the improved BBA algorithm;
the updating of the individual speed on a single dimension is realized by adopting a random black hole model, and the method specifically comprises the following steps: traversing each dimension of the individual speed, generating a random number u between [0,1], and updating one dimension of the individual speed if the random number u is less than or equal to a predefined threshold value p;
the process of solving the objective function by adopting the improved BBA algorithm specifically comprises the following steps:
calculating an individual fitness value by using the population initialization value, and determining the current optimal solution of the population;
traversing all individuals in the population, and updating individual speeds on a single dimension by adopting an improved random black hole model to obtain new individual speeds;
mapping the individual speed in a discrete space according to a time-varying V-type conversion function, and updating the individual speed;
and judging whether the updated individual speed is a new global optimal solution, if so, updating the individual pulse frequency and loudness by adopting chaotic mapping until all population individuals are traversed, and obtaining an optimal dimension selection result after iteration is finished.
2. An ionospheric high-dimensional data characteristic selection method based on an improved BBA algorithm according to claim 1, wherein the initialization of the pulse transmission rate is to define the pulse transmission rate using a chaotic map;
or, the pulse loudness is initialized to be set to a constant value;
or, the population initialization value is initialized and set according to the dimension of the ionospheric data.
3. The ionospheric high-dimensional data feature selection method based on an improved BBA algorithm according to claim 1, wherein the time-varying V-shaped transfer function is:
Figure FDA0003621532240000021
Figure FDA0003621532240000022
wherein t denotes the iteration number, i denotes the index of an individual in the population, k denotes the dimension of the solution x, and θ_t is a time-varying parameter that changes with the iteration number t;
Figure FDA0003621532240000023
all denote discrete binary values, and the random number is uniformly distributed on [0, 1];
Figure FDA0003621532240000024
denote the velocity of the k-th dimension of individual i at the t-th and (t+1)-th iterations, respectively.
4. The improved BBA algorithm-based ionospheric high-dimensional data feature selection method of claim 1, wherein the constructing of the objective function comprises: adopting a kNN classifier as the feature selection classification model, and taking the loss function of k-fold cross-validation as the objective function.
5. An ionospheric high-dimensional data feature selection system based on an improved BBA algorithm, comprising:
a data acquisition module configured to acquire ionospheric data;
an objective function determination module configured to take the dimension classification loss function as an objective function;
an objective function solving module configured to solve an objective function using an improved BBA algorithm, the improved BBA algorithm including updating individual speeds in a single dimension, and mapping the updated individual speeds from a continuous space to a discrete space according to a time-varying V-type conversion function;
a feature selection module configured to determine a target dimension after solving, and obtain an ionospheric characteristic corresponding to the target dimension after performing dimension reduction processing on ionospheric data according to the target dimension;
the BBA is a discrete binary bat algorithm;
initializing improved BBA algorithm parameters including pulse sending rate, pulse loudness and population initialization value in the process of solving the objective function by adopting the improved BBA algorithm;
the updating of the individual speed on a single dimension is realized by adopting a random black hole model, and the method specifically comprises the following steps: traversing each dimension of the individual speed, generating a random number u between [0,1], and updating one dimension of the individual speed if the random number u is less than or equal to a predefined threshold value p;
the process of solving the objective function by adopting the improved BBA algorithm specifically comprises the following steps:
calculating an individual fitness value by using the population initialization value, and determining the current optimal solution of the population;
traversing all individuals in the population, and updating individual speeds on a single dimension by adopting an improved random black hole model to obtain new individual speeds;
mapping the individual speed in a discrete space according to a time-varying V-type conversion function, and updating the individual speed;
and judging whether the updated individual speed is a new global optimal solution, if so, updating the individual pulse frequency and loudness by adopting chaotic mapping until all population individuals are traversed, and obtaining an optimal dimension selection result after iteration is finished.
6. An electronic device comprising a memory and a processor and computer instructions stored on the memory and executed on the processor, the computer instructions when executed by the processor performing the method of any of claims 1-4.
7. A computer-readable storage medium storing computer instructions which, when executed by a processor, perform the method of any one of claims 1 to 4.
CN202110390672.8A 2021-04-12 2021-04-12 Ionosphere high-dimensional data feature selection method based on improved BBA algorithm Active CN113076695B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110390672.8A CN113076695B (en) 2021-04-12 2021-04-12 Ionosphere high-dimensional data feature selection method based on improved BBA algorithm

Publications (2)

Publication Number Publication Date
CN113076695A CN113076695A (en) 2021-07-06
CN113076695B true CN113076695B (en) 2022-06-17

Family

ID=76617405

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110390672.8A Active CN113076695B (en) 2021-04-12 2021-04-12 Ionosphere high-dimensional data feature selection method based on improved BBA algorithm

Country Status (1)

Country Link
CN (1) CN113076695B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012083371A1 (en) * 2010-12-23 2012-06-28 Crc Care Pty Ltd Analyte ion detection method and device
CN107579518A (en) * 2017-09-15 2018-01-12 山东大学 Power system environment economic load dispatching method and apparatus based on MHBA
CN108694077A (en) * 2017-04-10 2018-10-23 中国科学院声学研究所 Based on the distributed system method for scheduling task for improving binary system bat algorithm
CN109711373A (en) * 2018-12-29 2019-05-03 浙江大学 A kind of big data feature selection approach based on improvement bat algorithm

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
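"Feature selection based on chaotic binary black hole algorithm for data classification"; Omar Saber Qasim et al.; "Chemometrics and Intelligent Laboratory Systems"; 20200915; pp. 1-6 *
"Intelligent optimization algorithms and their application in economic/emission dispatch of power systems"; 梁会军; "China Doctoral and Master's Dissertations Full-text Database (Doctoral)"; 20210115; pp. 1-153 *
Capacity optimization configuration of a stand-alone microgrid based on an improved binary bat algorithm; 盛四清 et al.; "Electric Power Construction"; 20171101 (No. 11); pp. 130-137 *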

Also Published As

Publication number Publication date
CN113076695A (en) 2021-07-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230504

Address after: 430000 room 04, 13 / F, unit 1, building g, St. emilen scientific research section, No. 27, Nanli Road, Hongshan District, Wuhan City, Hubei Province

Patentee after: Wuhan xinzhuoya Technology Development Co.,Ltd.

Address before: No. 39 College Road, Enshi City, Enshi Tujia and Miao Autonomous Prefecture, Hubei Province

Patentee before: HUBEI MINZU University