CN115296837A - SSA optimization-based sustainable integrated intrusion detection method - Google Patents
- Publication number
- CN115296837A (application number CN202210721435.XA)
- Authority
- CN
- China
- Prior art keywords
- individual
- ssa
- intrusion detection
- population
- detection method
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Computer Security & Cryptography (AREA)
- Mathematical Physics (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computational Linguistics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Computer Hardware Design (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Image Analysis (AREA)
Abstract
A sustainable integrated intrusion detection method based on SSA optimization, relating to network intrusion detection. The method comprises the following steps: a standard intrusion detection data set is selected and divided into a training set and a test set; the data are preprocessed, and the preprocessed data are searched by SSA to obtain, for each model, the feature subset that maximizes its classification performance; the different models are then trained on the training set restricted to their corresponding feature subsets, and their predictions are combined by an adaptive integrated decision process; finally, the method is evaluated on the test set. The invention addresses two problems of current machine-learning-based network intrusion detection methods: difficulty in classifying complex multi-class traffic data, and difficulty in obtaining the feature subset that optimizes a model. The invention can effectively detect complex multi-class traffic data, achieves higher detection accuracy than traditional intrusion detection methods, and is sustainably integrable: new ML models can be continuously integrated to optimize the existing model.
Description
Technical Field
The invention relates to the field of network intrusion detection, in particular to a sustainable integrated intrusion detection method based on SSA optimization.
Background
With the increasing intelligence and digitization of society, network intrusion incidents occur frequently and seriously affect the property safety of enterprises and individuals. Intrusion detection is an active defense technology used in intrusion detection systems; by continuously monitoring the traffic on key network links, it can effectively detect network intrusion behaviors.
Traditional intrusion detection techniques fall into two types: signature-based (misuse) intrusion detection and anomaly-based intrusion detection. Signature-based intrusion detection performs pattern matching on the distinctive features carried by attack behaviors to judge whether traffic is abnormal. However, it can only detect currently known attack types and is therefore prone to a high false negative rate. Anomaly-based intrusion detection builds a model of normal behavior and judges whether traffic behavior deviates from it; its advantage is that unknown attacks can be discovered, but it is prone to a high false alarm rate.
To address these problems of traditional intrusion detection, data-driven Machine Learning (ML) techniques have been introduced into the intrusion detection field. An ML model can directly mine the behavior patterns of normal and abnormal traffic, which alleviates the problems of traditional intrusion detection to a certain extent.
However, a single ML classification model often cannot effectively detect all classes in a multi-class problem, and Ensemble Learning (EL), which combines the classification advantages of multiple ML models, can effectively alleviate this. The idea of EL is to learn multiple models from data, explicitly or implicitly, and combine them efficiently to obtain more reliable and accurate predictions. Training a more reliable and accurate EL model requires two preconditions: base classifiers that are accurate yet diverse, and an efficient integration strategy.
Feature selection removes redundant and irrelevant features, thereby improving the performance of the base classifiers. The Salp Swarm Algorithm (SSA) is a swarm intelligence optimization algorithm that is widely applied in feature selection and engineering optimization. Weighted hard voting is a simple and effective strategy for integrating heterogeneous classifiers, and with carefully calibrated weights it is often more competitive than other integration strategies.
Disclosure of Invention
The invention provides a sustainable integrated intrusion detection method based on SSA optimization. The method first uses SSA to search for the optimal feature subset of each machine learning model, then trains the corresponding machine learning models with the different optimal feature subsets, and finally integrates the predictions of the multiple machine learning models by multi-class weighted hard voting, with the voting weights optimized by SSA (Salp Swarm Algorithm), so that the classification advantages of different ML (machine learning) models are effectively combined and more accurate and reliable predictions are obtained. In addition, the method is sustainably integrable: new ML models can be continuously integrated to optimize the existing model.
The technical scheme of the invention is as follows:
A sustainable integrated intrusion detection method based on SSA optimization, the method comprising the following steps:
step (1): inputting a reference data set; taking the NSL-KDD data set as an example, the data set comprises normal traffic and four types of attack traffic, namely DoS, Probe, U2R and R2L;
step (2): preprocessing the data set, in three parts: data cleaning, feature encoding and data normalization; data cleaning removes duplicate samples and samples containing missing or abnormal values from the reference data set; feature encoding converts the character-type discrete features in the reference data set into numerical features so that they can be fed to the subsequent machine learning models; data normalization eliminates scale differences between features;
step (3): feature selection; searching for the optimal feature subset of each ML model, namely the feature subset with the best fitness value, through an SSA-based wrapper feature selection algorithm;
step (4): model classification; training a plurality of heterogeneous machine learning classification models on the reference data set after feature selection;
step (5): adaptive integrated decision making; the predictions of the multiple ML models are integrated by multi-class weighted hard voting, with the corresponding voting weights determined and optimized by an SSA-based weight optimization algorithm.
In the above sustainable integrated intrusion detection method based on SSA optimization, the reference data set used in step (1) is as follows: the original NSL-KDD data set contains 148517 samples in total; 30% of the samples are drawn for testing by stratified sampling and the remaining 70% are used for training, so that the proportion of each class in the training set is consistent with that in the test set.
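The stratified 70/30 split described above can be illustrated with a minimal standard-library sketch. The function name `stratified_split`, the seed and the toy label list are illustrative assumptions, not part of the patent; the idea is simply to sample 30% within each class separately.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_ratio=0.3, seed=0):
    """Return (train_idx, test_idx) so every class keeps about the same share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                      # random draw inside one class
        n_test = round(len(indices) * test_ratio)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels standing in for NSL-KDD classes (hypothetical counts).
labels = ["normal"] * 50 + ["DoS"] * 30 + ["Probe"] * 20
train_idx, test_idx = stratified_split(labels)
```

Because the draw is made per class, the class proportions of the test part match those of the full label list up to rounding.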
In the above sustainable integrated intrusion detection method based on SSA optimization, in the feature encoding part of step (2): the original NSL-KDD data set has three discrete character-type features, namely 'protocol-type', 'service' and 'flag'; 'protocol-type' has 3 states, 'service' has 70 states, and 'flag' has 11 states; one-hot encoding is applied to the 'protocol-type' feature, expanding it into three features; for the 'service' and 'flag' features, which have more states, each state is replaced by its frequency count; the encoded data set contains 43 features in total.
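A small sketch of the two encodings may help. It assumes that the frequency count of a state is the number of samples in which that state occurs within the data being encoded; the source does not say over which set the counts are taken. The helper names and the four toy records are hypothetical.

```python
from collections import Counter


def one_hot(values, categories):
    """One column per category; 1 marks the category of the sample."""
    return [[1 if v == c else 0 for c in categories] for v in values]


def frequency_encode(values):
    """Replace each state by how often it occurs in `values` (assumed counting scope)."""
    counts = Counter(values)
    return [counts[v] for v in values]


# Four toy records with the three character-type features.
protocol = ["tcp", "udp", "tcp", "icmp"]
service = ["http", "http", "ftp", "http"]
flag = ["SF", "S0", "SF", "SF"]

protocol_cols = one_hot(protocol, ["tcp", "udp", "icmp"])   # 1 feature -> 3 columns
service_col = frequency_encode(service)                      # stays 1 column
flag_col = frequency_encode(flag)                            # stays 1 column
rows = [p + [s, f] for p, s, f in zip(protocol_cols, service_col, flag_col)]
```

Only the one-hot step changes the dimensionality (one column becomes three, a net gain of two), while frequency encoding keeps 'service' and 'flag' at one column each despite their 70 and 11 states.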
In the above sustainable integrated intrusion detection method based on SSA optimization, in the data normalization part of step (2): the data are scaled to the interval [0,1] using a min-max function; the specific normalization formula is:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x$ denotes the value of a feature for a sample, $x_{\max}$ and $x_{\min}$ respectively denote the maximum and minimum values of that feature, and $x'$ denotes the normalized feature value.
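As a quick worked example of min-max scaling, the sketch below normalizes one feature column. The function name and the handling of a constant column (mapped to 0.0 to avoid division by zero) are assumptions; the source does not discuss that edge case.

```python
def min_max_scale(column):
    """Scale a list of numbers to [0, 1] with (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumed convention for a constant feature: map every value to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


scaled = min_max_scale([2.0, 4.0, 10.0])   # (4 - 2) / (10 - 2) = 0.25
```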
In the above sustainable integrated intrusion detection method based on SSA optimization, the modeling process of the SSA-based wrapper feature selection algorithm in step (3) is as follows:
(1) Set the fitness function:
where acc and F1 respectively denote the mean overall accuracy and the mean weighted F1 score of the model under 5-fold cross validation on the training set;
(2) Set the parameters; the population size is set to 30, the maximum number of iterations to 200, the upper search bound to 1 and the lower search bound to 0;
(3) Initialize the population; the positions of the salp individuals in the population are randomly initialized within the search bounds;
(4) Position encoding; the position of each individual in the salp population is binary-coded to suit the feature selection problem, where 1 indicates that a feature is selected and 0 indicates that it is not selected. The specific coding formula is as follows:
note that the encoding here is used only to calculate the fitness value; the positions of the salp individuals in the population are not changed;
(5) Determine the food location; the fitness value of each salp individual is calculated, the individual with the maximum fitness value is identified, and its position is set as the food position;
(6) Population search; the positions of the leader and the followers are updated according to the population update formulas; in the salp population, the first individual acts as the leader, and its position update formula is as follows:
where $x_j^1$ denotes the position of the leader in the $j$-th dimension, $F_j$ denotes the position of the food in the $j$-th dimension, and $ub_j$ and $lb_j$ are respectively the upper and lower bounds of the $j$-th dimension decision variable; $c_2$ and $c_3$ are random numbers in $[0,1]$; $c_1$ is the convergence factor of the algorithm, which balances global exploration and local exploitation, with the expression $c_1 = 2e^{-(4l/L)^2}$, where $l$ and $L$ respectively denote the current iteration number and the maximum number of iterations;
the other individuals act as followers, and their position update formula is as follows:
where $x_j^{i'}$ denotes the updated position of the individual, $x_j^{i}$ denotes the current position of the individual, and $x_j^{i-1}$ denotes the position of the preceding individual;
(7) Repeat steps (4) to (6) until the maximum number of iterations is reached.
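Steps (1) to (7) can be sketched in plain Python as follows. Several parts are assumptions rather than the patent's definitions: the binarization threshold of 0.5, the 0.5 cut-off on the random number that chooses the sign of the leader's move, the clipping of positions to the search bounds, and `toy_fitness`, which stands in for the cross-validated acc/F1 fitness so that the example runs without a classifier.

```python
import math
import random


def encode(position, threshold=0.5):
    # Assumed binarization rule; the source's coding formula is not reproduced.
    return [1 if v > threshold else 0 for v in position]


def ssa_feature_selection(fitness, dim, pop_size=30, max_iter=200,
                          lb=0.0, ub=1.0, seed=0):
    rng = random.Random(seed)
    # Step (3): random continuous positions inside the search bounds.
    pop = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(pop_size)]
    food, food_fit = None, -math.inf
    for it in range(1, max_iter + 1):
        # Steps (4)-(5): encode only for scoring, keep the best position as food.
        for pos in pop:
            fit = fitness(encode(pos))
            if fit > food_fit:
                food, food_fit = pos[:], fit
        # Convergence factor: large early (exploration), small late (exploitation).
        c1 = 2.0 * math.exp(-((4.0 * it / max_iter) ** 2))
        # Step (6): leader moves around the food, followers average with predecessor.
        for i in range(pop_size):
            for j in range(dim):
                if i == 0:
                    step = c1 * ((ub - lb) * rng.random() + lb)
                    moved = food[j] + step if rng.random() >= 0.5 else food[j] - step
                else:
                    moved = (pop[i][j] + pop[i - 1][j]) / 2.0
                pop[i][j] = min(ub, max(lb, moved))   # assumed bound handling
    return encode(food), food_fit


INFORMATIVE = {0, 2, 5}   # hypothetical "useful" feature indices


def toy_fitness(mask):
    """Stand-in fitness: reward informative features, penalize the others."""
    hits = sum(mask[i] for i in INFORMATIVE)
    noise = sum(bit for i, bit in enumerate(mask) if i not in INFORMATIVE)
    return hits - 0.5 * noise


mask, score = ssa_feature_selection(toy_fitness, dim=8, max_iter=50)
```

In a real run, `fitness` would train the chosen ML model on the columns selected by `mask` and return the cross-validated score, which is what makes this a wrapper method: the search is driven by the model's own performance.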
In the above sustainable integrated intrusion detection method based on SSA optimization, the model classification part of step (4) is linked to feature selection: the SSA-based feature selection algorithm selects a corresponding optimal feature subset for each machine learning model.
In the above sustainable integrated intrusion detection method based on SSA optimization, the model classification part of step (4) can integrate multiple different ML models at the same time, and a new ML model can be added to the original models to improve the classification performance of the existing model; this only requires selecting the corresponding optimal feature subset for the new ML model and re-optimizing the voting weights, which gives the method a degree of generality and extensibility.
In the above sustainable integrated intrusion detection method based on SSA optimization, in step (5): the adaptive integrated decision process combines the predictions of multiple ML models by multi-class weighted hard voting; the specific decision process is as follows:
suppose there are $n$ different base classifiers $\{C_1, C_2, \dots, C_n\}$ and the reference data set has $m$ class labels $\{y_1, y_2, \dots, y_m\}$; the weight matrix can then be represented as $W = (w_{ij})_{n \times m}$.
For a given sample $x$, the weighted probability output for class $y_j$ is $P_j(x) = \sum_{i=1}^{n} w_{ij}\, p_{ij}(x)$, where $P_j(x)$ denotes the weighted sum probability of the $j$-th class, $p_{ij}(x)$ denotes the prediction of the $i$-th base classifier for class $y_j$, and $w_{ij}$ is its voting weight; the integrated probability prediction of all base classifiers can be represented as an $m$-dimensional vector $P(x) = (P_1(x), P_2(x), \dots, P_m(x))$; the final decision can be expressed as $\hat{y} = y_{j^*}$ with $j^* = \arg\max_{j} P_j(x)$.
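The decision rule can be illustrated with a short sketch. It assumes that each base classifier emits a single hard label, so its prediction for a class is 1 for the predicted class and 0 otherwise; the weight values, the classifier order in the comments and the function name are made up for illustration.

```python
def weighted_hard_vote(predictions, weights, classes):
    """predictions[i]: label chosen by classifier i; weights[i][j]: weight of classifier i for class j."""
    scores = [0.0] * len(classes)
    for i, label in enumerate(predictions):
        j = classes.index(label)
        scores[j] += weights[i][j]          # only the predicted class receives the vote
    best = max(range(len(classes)), key=lambda j: scores[j])
    return classes[best], scores


classes = ["normal", "DoS", "Probe"]
weights = [
    [0.9, 0.2, 0.3],   # hypothetical weights for classifier 1 (e.g. DT)
    [0.4, 0.8, 0.5],   # hypothetical weights for classifier 2 (e.g. RF)
    [0.5, 0.7, 0.6],   # hypothetical weights for classifier 3 (e.g. XGBoost)
]
label, scores = weighted_hard_vote(["normal", "DoS", "DoS"], weights, classes)
```

Here "DoS" collects 0.8 + 0.7 and beats the single "normal" vote of 0.9. Because weights are per classifier and per class, a classifier that is reliable on one attack type can be trusted for that class without dominating the others.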
In the above sustainable integrated intrusion detection method based on SSA optimization, the modeling process of the SSA-based weight optimization algorithm in step (5) is as follows:
a. set the fitness function:
where acc denotes the mean overall accuracy of the model under 5-fold cross validation on the training set;
b. set the parameters; the population size is set to 30, the maximum number of iterations to 200, the upper search bound to 1 and the lower search bound to 0;
c. initialize the population; the positions of the salp individuals in the population are randomly initialized within the search bounds, and the number of elements in the position vector of each salp individual equals the number of elements of the one-dimensional vector obtained by expanding the weight matrix column by column;
d. determine the food location; the fitness values of all salp individuals are calculated, and the position of the individual with the maximum fitness value is taken as the food position;
e. population search; the positions of the leader and the followers are updated according to the population update formulas; in the salp population, the first individual acts as the leader, and its position update formula is as follows:
where $x_j^1$ denotes the position of the leader in the $j$-th dimension, $F_j$ denotes the position of the food in the $j$-th dimension, and $ub_j$ and $lb_j$ are respectively the upper and lower bounds of the $j$-th dimension decision variable; $c_2$ and $c_3$ are random numbers in $[0,1]$; $c_1$ is the convergence factor of the algorithm, which balances global exploration and local exploitation, with the expression $c_1 = 2e^{-(4l/L)^2}$, where $l$ and $L$ respectively denote the current iteration number and the maximum number of iterations;
the other individuals act as followers, and their position update formula is as follows:
where $x_j^{i'}$ denotes the updated position of the individual, $x_j^{i}$ denotes the current position of the individual, and $x_j^{i-1}$ denotes the position of the preceding individual;
f. repeat steps d to e until the maximum number of iterations is reached.
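To make step c concrete, the sketch below shows how one salp position (a flat vector) could be mapped back to a weight matrix and scored. It is a simplification under stated assumptions: the matrix is taken as classifiers by classes with columns stacked one after another, and the fitness is plain accuracy on one set of stored base-classifier predictions rather than the 5-fold cross-validated mean. All names and data are hypothetical.

```python
def unflatten_by_columns(position, n_classifiers, n_classes):
    """Rebuild the n x m weight matrix from a vector that stacks its columns."""
    return [[position[j * n_classifiers + i] for j in range(n_classes)]
            for i in range(n_classifiers)]


def voting_accuracy(position, base_predictions, true_labels, classes):
    """Fitness of one salp: accuracy of weighted hard voting with its weights."""
    n, m = len(base_predictions), len(classes)
    weights = unflatten_by_columns(position, n, m)
    correct = 0
    for s, truth in enumerate(true_labels):
        scores = [0.0] * m
        for i in range(n):
            j = classes.index(base_predictions[i][s])
            scores[j] += weights[i][j]
        predicted = classes[max(range(m), key=scores.__getitem__)]
        correct += predicted == truth
    return correct / len(true_labels)


classes = ["normal", "DoS"]
true_labels = ["normal", "DoS", "normal", "DoS"]
base_predictions = [                      # one row of labels per base classifier
    ["normal", "DoS", "normal", "DoS"],   # always right in this toy data
    ["DoS", "DoS", "normal", "normal"],
    ["DoS", "normal", "normal", "DoS"],
]
uniform = [0.5] * 6                            # equal weights: plain majority vote
tuned = [1.0, 0.2, 0.2, 1.0, 0.2, 0.2]         # favors the first classifier
acc_uniform = voting_accuracy(uniform, base_predictions, true_labels, classes)
acc_tuned = voting_accuracy(tuned, base_predictions, true_labels, classes)
```

With equal weights the ensemble gets 3 of 4 samples right, while the weight vector that trusts the first classifier gets all 4. This is the kind of improvement the SSA search over the position vector is meant to find.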
The invention has the following beneficial effects:
In the sustainable integrated intrusion detection method based on SSA optimization, redundant and irrelevant features in the original data are removed by wrapper feature selection based on the SSA algorithm, which enhances the classification performance of each single ML model; the decisions of the multiple ML models are then integrated by multi-class weighted hard voting, and the voting weights are continuously optimized by the SSA-based weight optimization algorithm, so that the classification advantages of the different models are fully combined and the overall classification performance of the intrusion detection model is effectively improved. The method also provides an effective way to incorporate different ML models: new ML models can be continuously integrated into the intrusion detection model, so that it is continuously optimized and its overall detection performance improves.
Drawings
FIG. 1 is a block diagram of the overall modeling flow of an embodiment of the present invention;
FIG. 2 is a flow chart of the SSA-based wrapper feature selection process according to an embodiment of the present invention;
FIG. 3 is a pseudo-code diagram of the SSA-based wrapper feature selection algorithm according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating multi-class weighted voting modeling according to an embodiment of the present invention;
FIG. 5 is a pseudo-code diagram of the SSA-based weight optimization algorithm according to an embodiment of the present invention.
Detailed Description
The present invention will be further described with reference to the accompanying drawings so that those skilled in the art can refer thereto and implement the same.
The invention provides a sustainable integrated intrusion detection method based on SSA optimization, which comprises the following steps:
In step (1): the public intrusion detection data set NSL-KDD is selected as the evaluation data. The data set contains 148517 samples in total; 30% of the samples are drawn for testing by stratified sampling and the remaining 70% are used for training, which ensures that the proportion of each class in the training set is consistent with that in the test set.
In step (2): the original NSL-KDD data set has three discrete character-type features, namely 'protocol-type', 'service' and 'flag'. One-hot encoding is applied to the 'protocol-type' feature, expanding it into three features. For the 'service' and 'flag' features, which have more states, each state is replaced by its frequency count. The encoded data set contains 43 features in total. Next, to eliminate scale differences between features, the data are normalized so that the values of all features are scaled to the interval [0,1]. The specific normalization formula is:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x$ denotes the value of a feature for a sample, $x_{\max}$ and $x_{\min}$ respectively denote the maximum and minimum values of that feature, and $x'$ denotes the normalized feature value.
In step (3), the SSA-based wrapper feature selection algorithm is used to search for the optimal feature subset of each machine learning model. The specific steps of the algorithm are as follows:
(1) Set the fitness function:
where acc and F1 respectively denote the mean overall accuracy and the mean F1 score of the model under 5-fold cross validation on the training set;
(2) Set the parameters. The population size is set to 30, the maximum number of iterations to 200, the upper search bound to 1 and the lower search bound to 0;
(3) Initialize the population. The positions of the salp individuals in the population are randomly initialized within the search bounds.
(4) Position encoding. The position of each individual in the salp population is binary-coded to suit the feature selection problem, where 1 indicates that a feature is selected and 0 indicates that it is not selected. The specific coding formula is as follows:
Note that the encoding here is used only to calculate the fitness value; the positions of the salp individuals in the population are not changed.
(5) Determine the food location. The fitness value of each salp individual is calculated, the individual with the maximum fitness value is identified, and its position is set as the food position.
(6) Population search. The positions of the leader and the followers are updated according to the population update formulas. In the salp population, the first individual acts as the leader, and its position update formula is as follows:
where $x_j^1$ denotes the position of the leader in the $j$-th dimension, $F_j$ denotes the position of the food in the $j$-th dimension, and $ub_j$ and $lb_j$ are respectively the upper and lower bounds of the $j$-th dimension decision variable. $c_2$ and $c_3$ are random numbers in $[0,1]$; $c_1$ is the convergence factor of the algorithm, which balances global exploration and local exploitation, with the expression $c_1 = 2e^{-(4l/L)^2}$, where $l$ and $L$ respectively denote the current iteration number and the maximum number of iterations.
The other individuals act as followers, and their position update formula is as follows:
where $x_j^{i'}$ denotes the updated position of the individual, $x_j^{i}$ denotes the current position of the individual, and $x_j^{i-1}$ denotes the position of the preceding salp individual.
(7) Repeat steps (4) to (6) until the maximum number of iterations is reached.
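For reference, the update rules of the standard Salp Swarm Algorithm, which match the symbol definitions given in step (6), are usually written as below. This is a reconstruction from the commonly published form of SSA, not a copy of the patent's own formulas: in particular, the 0.5 threshold on $c_3$ is an assumption (a common implementation choice), and $x_j^{i'}$ is used here for the updated follower position. $F_j$ is the food position, $ub_j$ and $lb_j$ are the bounds of dimension $j$, $c_2, c_3 \in [0,1]$ are random numbers, and $l$, $L$ are the current and maximum iteration numbers.

```latex
% Leader (first individual), dimension j; threshold 0.5 on c_3 is assumed
x_j^{1} =
\begin{cases}
F_j + c_1\bigl((ub_j - lb_j)\,c_2 + lb_j\bigr), & c_3 \ge 0.5 \\
F_j - c_1\bigl((ub_j - lb_j)\,c_2 + lb_j\bigr), & c_3 < 0.5
\end{cases}

% Convergence factor, decreasing with the iteration number l
c_1 = 2\,e^{-\left(\frac{4l}{L}\right)^{2}}

% Followers (i >= 2): move to the midpoint with the preceding individual
x_j^{i'} = \tfrac{1}{2}\left(x_j^{i} + x_j^{i-1}\right)
```

At $l = L$ the factor is $c_1 = 2e^{-16}$, roughly $2.3 \times 10^{-7}$, so late iterations only make very small moves around the food position.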
In step (4): the SSA-based wrapper feature selection algorithm is first used to search for the optimal feature subset of each ML model, and each ML model is then trained and evaluated on a training set that contains only its optimal feature subset.
In step (5): the adaptive integrated decision process combines the predictions of multiple ML models by multi-class weighted hard voting, with the corresponding voting weights determined and optimized by the SSA-based weight optimization algorithm. The specific decision process is as follows:
suppose there are $n$ different base classifiers $\{C_1, C_2, \dots, C_n\}$ and the reference data set has $m$ class labels $\{y_1, y_2, \dots, y_m\}$; the weight matrix can then be represented as $W = (w_{ij})_{n \times m}$.
For a given sample $x$, the weighted probability output for class $y_j$ is $P_j(x) = \sum_{i=1}^{n} w_{ij}\, p_{ij}(x)$, where $P_j(x)$ denotes the weighted sum probability of the $j$-th class, $p_{ij}(x)$ denotes the prediction of the $i$-th base classifier for class $y_j$, and $w_{ij}$ is its voting weight. The integrated probability prediction of all base classifiers can be represented as an $m$-dimensional vector $P(x) = (P_1(x), P_2(x), \dots, P_m(x))$. The final decision can be expressed as $\hat{y} = y_{j^*}$ with $j^* = \arg\max_{j} P_j(x)$.
The weight matrix used in weighted hard voting is determined and optimized by the SSA-based weight optimization algorithm. The specific modeling process is as follows:
a. Set the fitness function:
where acc denotes the mean overall accuracy of the model under 5-fold cross validation on the training set;
b. Set the parameters. The population size is set to 30, the maximum number of iterations to 200, the upper search bound to 1 and the lower search bound to 0;
c. Initialize the population. The positions of the salp individuals in the population are randomly initialized within the search bounds, and the number of elements in the position vector of each salp individual equals the number of elements of the one-dimensional vector obtained by expanding the weight matrix column by column.
d. Determine the food location. The fitness values of all salp individuals are calculated, and the position of the individual with the maximum fitness value is taken as the food position.
e. Population search. The positions of the leader and the followers are updated according to the population update formulas. In the salp population, the first individual acts as the leader, and its position update formula is as follows:
where $x_j^1$ denotes the position of the leader in the $j$-th dimension, $F_j$ denotes the position of the food in the $j$-th dimension, and $ub_j$ and $lb_j$ are respectively the upper and lower bounds of the $j$-th dimension decision variable. $c_2$ and $c_3$ are random numbers in $[0,1]$; $c_1$ is the convergence factor of the algorithm, which balances global exploration and local exploitation, with the expression $c_1 = 2e^{-(4l/L)^2}$, where $l$ and $L$ respectively denote the current iteration number and the maximum number of iterations.
The other individuals act as followers, and their position update formula is as follows:
where $x_j^{i'}$ denotes the updated position of the individual, $x_j^{i}$ denotes the current position of the individual, and $x_j^{i-1}$ denotes the position of the preceding salp individual.
f. Repeat steps d to e until the maximum number of iterations is reached.
To verify the beneficial effects of the method, three machine learning models with default parameters, namely Decision Tree (DT), Random Forest (RF) and eXtreme Gradient Boosting (XGBoost), were selected to implement the method. The method was then evaluated using accuracy, F1 score, detection time and other metrics, and finally compared with the Particle Swarm Optimization (PSO) algorithm and the Grey Wolf Optimization (GWO) algorithm.
TABLE 1 comparison of Performance of different optimization algorithms on the NSL-KDD test set
As shown in Table 1, applying a swarm optimization algorithm to the feature selection process effectively improves the accuracy and F1 score of the ML models; among the compared algorithms, the SSA used in the present invention obtains the highest accuracy and F1 score, outperforming PSO and GWO. After adaptive voting, the method effectively combines the classification advantages of the different ML models and obtains still higher accuracy and F1 score. In terms of detection time, the method is also more than 30% faster than the other two methods.
In the sustainable integrated intrusion detection method based on SSA optimization, SSA is first used to select the optimal feature subset independently for each ML model, which enhances the classification performance of the base classifiers. The classification advantages of the different ML models are then combined through adaptive decision making, which effectively improves the classification performance of the intrusion detection model.
The foregoing description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way, and modifications and variations of the present invention are possible for those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the technical scheme and the conception of the invention shall be included in the protection scope of the invention.
Claims (9)
1. A sustainable integrated intrusion detection method based on SSA optimization, characterized by comprising the following steps:
step (1): inputting a reference data set; taking the NSL-KDD data set as an example, the data set comprises normal traffic and four types of attack traffic, namely DoS, Probe, U2R and R2L;
step (2): preprocessing the data set, in three parts: data cleaning, feature encoding and data normalization; data cleaning removes duplicate samples and samples containing missing or abnormal values from the reference data set; feature encoding converts the character-type discrete features in the reference data set into numerical features so that they can be fed to the subsequent machine learning models; data normalization eliminates scale differences between features;
step (3): feature selection; searching for the optimal feature subset of each ML model, namely the feature subset with the best fitness value, through an SSA-based wrapper feature selection algorithm;
step (4): model classification; training a plurality of heterogeneous machine learning classification models on the reference data set after feature selection;
step (5): adaptive integrated decision making; the predictions of the multiple ML models are integrated by multi-class weighted hard voting, with the corresponding voting weights determined and optimized by an SSA-based weight optimization algorithm.
2. The SSA-optimization-based sustainable integrated intrusion detection method according to claim 1, wherein the step (1) uses a benchmark dataset comprising: the original NSL-KDD data set contains 148517 samples in total, 30% of the samples are extracted for testing according to the layering idea, and the rest 70% of the samples are used for training, so that the proportion of the samples of different classes in the training set is consistent with that in the testing set.
3. The SSA-optimized sustainable integrated intrusion detection method according to claim 1, wherein the signature coding part of step (2): three discrete character type characteristics exist in an original NSL-KDD data set, wherein the three discrete character type characteristics are respectively 'protocol-type', 'service' and 'flag', the 'protocol-type' has 3 states, the 'service' has 70 states, and the 'flag' has 11 states; adopting single hot coding for the 'protocol-type' characteristic, and expanding the characteristic into a three-dimensional characteristic; for the 'service' and 'flag' features with more states, replacing the corresponding states by the frequency counts of the states; the encoded data set contains 43-dimensional features in total.
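4. The SSA optimization-based sustainable integrated intrusion detection method according to claim 1, wherein in the data normalization part of step (2): the data are scaled to the interval [0,1] using min-max scaling, with the specific normalization:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the feature.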
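A minimal sketch of min-max scaling for a single feature column is given below. The function name `min_max_scale` and the sample values are illustrative, and mapping a constant column to 0.0 is an assumption of this sketch rather than something the claim specifies.

```python
def min_max_scale(column):
    """Scale a list of numbers to the interval [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant feature carries no information, so map it to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

# Hypothetical feature values.
duration = [0, 5, 10, 20]
scaled = min_max_scale(duration)
```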
5. The SSA optimization-based sustainable integrated intrusion detection method according to claim 1, wherein the modeling process of the SSA-based wrapper feature selection algorithm in step (3) is:
(1) Setting a fitness function:
wherein acc and F1 respectively denote the mean overall accuracy and the mean weighted F1 score of the model under 5-fold cross-validation on the training set;
(2) Setting parameters; the population size is set to 30, the maximum number of iterations to 200, the upper search bound to 1, and the lower search bound to 0;
(3) Initializing the population; the positions of the individual salps in the population are randomly initialized within the search bounds;
(4) Position encoding; the position of each individual in the salp population is binary encoded to suit the feature selection problem, wherein 1 indicates that the feature is selected and 0 indicates that it is not selected; the specific encoding formula is as follows:
note that the encoding here is used only to calculate fitness values, and the positions of the individual salps in the population are not changed;
(5) Determining the food position; the fitness value of each individual salp is calculated, the salp with the maximum fitness value is identified, and its position is set as the food position;
(6) Searching with the population; the positions of the leader and the followers are updated according to the population update formulas; in the salp population, the first individual is taken as the leader, and its position update formula is as follows:
wherein $x_j^1$ denotes the position of the leader in the $j$-th dimension, $F_j$ denotes the position of the food in the $j$-th dimension, and $ub_j$ and $lb_j$ are respectively the upper and lower bounds of the $j$-th dimension decision variable; $c_2$ and $c_3$ are random numbers in $[0,1]$, and $c_1$ is the convergence factor of the algorithm, which balances global exploration and local exploitation and is given by $c_1 = 2e^{-(4t/T)^2}$, in which $t$ and $T$ respectively denote the current iteration number and the maximum number of iterations;
the other individuals are taken as followers, and their position update formula is as follows:
wherein $x_j^{i\prime}$ denotes the updated position of the individual, $x_j^i$ denotes the current position of the individual, and $x_j^{i-1}$ denotes the position of the preceding individual;
(7) Repeating (4) - (6) until the maximum number of iterations is reached.
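Steps (2) to (7) can be sketched as a small loop. The sketch follows the commonly published salp swarm algorithm rather than the exact formulas of this claim, which are not legible in this text: the 0.5 binarization threshold, the 0.5 switch on the third random number, keeping the best position found so far as the food, and clipping to the bounds are all assumptions. `toy_fitness` is a hypothetical stand-in for the cross-validated fitness of a real classifier.

```python
import math
import random

def binarize(position, threshold=0.5):
    """Map a continuous position to a 0/1 feature mask (threshold is an assumption)."""
    return [1 if x > threshold else 0 for x in position]

def ssa_feature_selection(fitness, dim, pop_size=30, max_iter=200, lb=0.0, ub=1.0, seed=0):
    rng = random.Random(seed)
    # (3) Random initial positions inside the search bounds.
    pop = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(pop_size)]
    food, food_fit = None, -math.inf
    for t in range(1, max_iter + 1):
        # (4)-(5) Encode each position, score it, keep the best one as the food position.
        for salp in pop:
            fit = fitness(binarize(salp))
            if fit > food_fit:
                food, food_fit = list(salp), fit
        # Convergence factor: large early (exploration), small late (exploitation).
        c1 = 2.0 * math.exp(-((4.0 * t / max_iter) ** 2))
        # (6) The leader moves around the food position.
        for j in range(dim):
            c2, c3 = rng.random(), rng.random()
            step = c1 * ((ub - lb) * c2 + lb)
            pop[0][j] = food[j] + step if c3 >= 0.5 else food[j] - step
        # Each follower moves to the midpoint between itself and its predecessor.
        for i in range(1, pop_size):
            pop[i] = [(pop[i][j] + pop[i - 1][j]) / 2.0 for j in range(dim)]
        # Keep every salp inside the bounds.
        pop = [[min(ub, max(lb, x)) for x in salp] for salp in pop]
    return binarize(food), food_fit

# Hypothetical fitness: reward three "informative" features, penalize extra ones.
INFORMATIVE = {0, 3, 5}

def toy_fitness(mask):
    hits = sum(mask[j] for j in INFORMATIVE)
    extras = sum(mask) - hits
    return hits - 0.1 * extras

mask, best = ssa_feature_selection(toy_fitness, dim=8, max_iter=50)
```

In the wrapper setting of the claim, `toy_fitness` would be replaced by a function that trains the chosen ML model on the selected columns and returns its cross-validated score, so one run yields one feature subset per model.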
6. The SSA optimization-based sustainable integrated intrusion detection method according to claim 1, wherein the model classification part of step (4) is coupled with feature selection: the SSA-based feature selection algorithm selects a corresponding optimal feature subset for each different machine learning model.
7. The SSA optimization-based sustainable integrated intrusion detection method according to claim 1, wherein the model classification part of step (4) can integrate a plurality of different ML models simultaneously, and a new ML model can also be added to the existing models to improve the existing classification performance; this only requires selecting the corresponding optimal feature subset for the new ML model and re-optimizing the voting weights, so the method has a degree of generality and extensibility.
8. The SSA optimization-based sustainable integrated intrusion detection method according to claim 1, wherein in step (5): the adaptive ensemble decision making process combines the predictions of multiple ML models by multi-class weighted hard voting; the specific decision making process is as follows:
suppose there are $n$ different base classifiers $C_1, C_2, \dots, C_n$ and the benchmark data set has $m$ class labels $y_1, y_2, \dots, y_m$; then the weight matrix can be represented as $W = (w_{ij})_{n \times m}$.
For a given sample $x$, the weighted probability output for class $y_j$ is $P_j(x) = \sum_{i=1}^{n} w_{ij}\, p_{ij}(x)$, where $P_j(x)$ denotes the weighted sum of the probabilities of the $j$-th class and $p_{ij}(x)$ denotes the prediction of the $i$-th base classifier for class $y_j$; the ensemble probability prediction of all base classifiers can be represented as an $m$-dimensional vector $(P_1(x), P_2(x), \dots, P_m(x))$; the final decision can be expressed as $\hat{y} = \arg\max_{j} P_j(x)$.
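The multi-class weighted hard voting of claim 8 can be sketched as below. The sketch assumes that each base classifier casts a hard vote for exactly one class and that the weight matrix holds one weight per classifier and class; the function name `weighted_hard_vote` and the example weights are hypothetical.

```python
def weighted_hard_vote(predictions, weights, n_classes):
    """predictions[i] is the class index chosen by classifier i;
    weights[i][j] is the voting weight of classifier i for class j."""
    scores = [0.0] * n_classes
    for i, predicted_class in enumerate(predictions):
        # A hard vote adds weight only to the class the classifier picked.
        scores[predicted_class] += weights[i][predicted_class]
    decision = max(range(n_classes), key=lambda j: scores[j])
    return decision, scores

# Hypothetical weights for three classifiers and three classes.
weights = [
    [0.9, 0.2, 0.5],
    [0.4, 0.8, 0.5],
    [0.3, 0.7, 0.6],
]
decision, scores = weighted_hard_vote([0, 1, 1], weights, n_classes=3)
```

Here the first classifier votes for class 0 with weight 0.9, while the other two vote for class 1 with weights 0.8 and 0.7, so class 1 wins. Class-specific weights let a classifier that is reliable on one attack class count for more on that class only.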
9. The SSA optimization-based sustainable integrated intrusion detection method according to claim 1, wherein the modeling process of the SSA-based weight optimization algorithm in step (5) is:
a. setting a fitness function:
wherein acc denotes the mean overall accuracy of the model under 5-fold cross-validation on the training set;
b. setting parameters; the population size is set to 30, the maximum number of iterations to 200, the upper search bound to 1, and the lower search bound to 0;
c. initializing the population; the positions of the individual salps in the population are randomly initialized within the search bounds, wherein the number of elements in the position vector of each salp equals the number of elements in the one-dimensional vector obtained by flattening the weight matrix column by column;
d. determining the food position; the fitness values of all individual salps are calculated, and the position of the salp with the maximum fitness value is set as the food position;
e. searching with the population; the positions of the leader and the followers are updated according to the population update formulas; in the salp population, the first individual is taken as the leader, and its position update formula is as follows:
wherein $x_j^1$ denotes the position of the leader in the $j$-th dimension, $F_j$ denotes the position of the food in the $j$-th dimension, and $ub_j$ and $lb_j$ are respectively the upper and lower bounds of the $j$-th dimension decision variable; $c_2$ and $c_3$ are random numbers in $[0,1]$, and $c_1$ is the convergence factor of the algorithm, which balances global exploration and local exploitation and is given by $c_1 = 2e^{-(4t/T)^2}$, in which $t$ and $T$ respectively denote the current iteration number and the maximum number of iterations;
the other individuals are taken as followers, and their position update formula is as follows:
wherein $x_j^{i\prime}$ denotes the updated position of the individual, $x_j^i$ denotes the current position of the individual, and $x_j^{i-1}$ denotes the position of the preceding individual;
f. repeating d-e until the maximum number of iterations is reached.
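The link between a salp position vector and the voting weights in steps c and d can be sketched as follows. The sketch covers only the column-wise flattening and the accuracy fitness, not the search loop itself. As a simplification it scores a fixed set of cached base-classifier predictions instead of running 5-fold cross-validation, and all function names and data are hypothetical.

```python
def flatten_by_columns(matrix):
    """Stack the columns of a row-major matrix into one vector."""
    rows, cols = len(matrix), len(matrix[0])
    return [matrix[i][j] for j in range(cols) for i in range(rows)]

def unflatten_by_columns(vector, rows, cols):
    """Inverse of flatten_by_columns."""
    return [[vector[j * rows + i] for j in range(cols)] for i in range(rows)]

def vote(predictions, weights, n_classes):
    """Weighted hard vote: each classifier adds its weight to the class it picked."""
    scores = [0.0] * n_classes
    for i, c in enumerate(predictions):
        scores[c] += weights[i][c]
    return max(range(n_classes), key=lambda j: scores[j])

def accuracy_fitness(position, all_predictions, true_labels, n_classifiers, n_classes):
    """Fitness of one salp: accuracy of the ensemble under the encoded weights."""
    weights = unflatten_by_columns(position, n_classifiers, n_classes)
    correct = sum(
        vote(p, weights, n_classes) == y
        for p, y in zip(all_predictions, true_labels)
    )
    return correct / len(true_labels)

# Hypothetical cached votes of 2 classifiers on 4 samples, with the true labels.
all_predictions = [[0, 1], [1, 0], [0, 0], [1, 1]]
true_labels = [0, 0, 0, 1]

good = accuracy_fitness([0.9, 0.8, 0.1, 0.2], all_predictions, true_labels, 2, 2)
bad = accuracy_fitness([0.1, 0.2, 0.9, 0.8], all_predictions, true_labels, 2, 2)
```

With 2 classifiers and 2 classes each salp has 4 position elements; the first position favours votes for class 0 and resolves both disagreements correctly, while the second does the opposite, so the search is pulled toward the first.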
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210721435.XA CN115296837B (en) | 2022-06-24 | 2022-06-24 | Sustainable integrated intrusion detection method based on SSA optimization |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115296837A (en) | 2022-11-04 |
CN115296837B (en) | 2023-09-15 |
Family
ID=83821233
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210721435.XA Active CN115296837B (en) | 2022-06-24 | 2022-06-24 | Sustainable integrated intrusion detection method based on SSA optimization |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115296837B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105512686A (en) * | 2015-12-14 | 2016-04-20 | 深圳大学 | Integrated feature selection method and system |
CN112511519A (en) * | 2020-11-20 | 2021-03-16 | 华北电力大学 | Network intrusion detection method based on feature selection algorithm |
CN113839926A (en) * | 2021-08-31 | 2021-12-24 | 哈尔滨工业大学 | Intrusion detection system modeling method, system and device based on gray wolf algorithm feature selection |
CN114244549A (en) * | 2021-08-10 | 2022-03-25 | 和安科技创新有限公司 | GSSK-means abnormal flow detection method, memory and processor for industrial internet |
Non-Patent Citations (1)
Title |
---|
ALANOUD ALSALEH等: "The Influence of Salp Swarm Algorithm-Based Feature Selection on Network Anomaly Intrusion Detection", 《IEEE ACCESS》, vol. 9 * |
Also Published As
Publication number | Publication date |
---|---|
CN115296837B (en) | 2023-09-15 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||