CN114065933A - Unknown threat detection method based on artificial immunity thought - Google Patents


Publication number
CN114065933A
Authority
CN
China
Prior art keywords
detector
module
classification
neural network
convolutional neural
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111420523.8A
Other languages
Chinese (zh)
Other versions
CN114065933B (en)
Inventor
彭海朋
陈冠华
李丽香
黄京泽
孙婧瑜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Application filed by Beijing University of Posts and Telecommunications
Priority to CN202111420523.8A
Publication of CN114065933A
Application granted
Publication of CN114065933B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features
    • G06F18/2113 Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/231 Hierarchical techniques, i.e. dividing or merging pattern sets so as to obtain a dendrogram
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physiology (AREA)
  • Genetics & Genomics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an unknown threat detection method based on the artificial immunity thought, which can effectively detect both known and unknown threats. The pre-classification module introduces a convolutional neural network to handle the pre-collection of the self set and the pre-classification of network traffic during detection; the negative selection module introduces a gene library to address the random generation of initial detectors and specific immunity against unknown threats, and introduces hierarchical clustering to improve detector training efficiency; the clonal variation module addresses the overlapping detection ranges of high-affinity detectors by introducing a detector optimization algorithm based on a genetic algorithm; in addition, an LRU-based memory detector extinction mechanism is introduced, which frees storage space and improves detection efficiency; the mRNA vaccination module introduces an mRNA vaccine algorithm based on feature importance ranking, decomposes each detected unknown threat according to gene importance, injects the result into the gene library, and generates corresponding detectors to achieve specific immunity to the unknown threat and its variants.

Description

Unknown threat detection method based on artificial immunity thought
Technical Field
The invention relates to the technical field of intrusion detection, in particular to an unknown threat detection method based on an artificial immunity thought.
Background
The biological immune system is characterized by diversity, tolerance, self-organization, self-adaptation and so on. Artificial immunity is a mathematical model that borrows concepts from the biological immune system: a cell model is constructed by defining the shape space, self and nonself, and an affinity calculation; the maturation of immune cells in the bone marrow is simulated by a negative selection algorithm, in which immune cells that match self are eliminated and the remaining immune cells mature; threat detection is then carried out by the mature immune cells.
Traditional artificial immunity-based models work as follows: first, a Self set is constructed manually; then immature detectors are randomly generated and enter a negative selection process for tolerance. The tolerance process matches each detector against Self: when the affinity reaches the matching threshold, the immature detector dies; otherwise it evolves into a mature detector and participates in detection.
Traditional intrusion detection models based on artificial immunity have the following disadvantages:
First, the adaptivity problem of the Self set. The Self set is usually very large; the collected normal samples are only a small subset of it and do not necessarily represent the real Self set. If the sizes of the two sets differ greatly, the error is scaled up and the final result deviates significantly. In addition, in practical applications the elements of the Self set often change over time. The traditional Self/nonself model is usually partitioned manually, and manually partitioning the elements of the Self set is risky, because an element of today's Self set may become an element of tomorrow's nonself set; such a Self set lacks dynamic coverage, produces more false positives, and makes the accuracy of the antibody set difficult to guarantee. Finally, most conventional Self sets are constructed from static behavior information, whereas in a real environment the elements of the Self set may change over time, so manually partitioning a fixed Self set is undesirable. The inability to update the Self set adaptively is therefore one of the disadvantages of the conventional model.
Second, the detector generation efficiency problem. In the negative selection algorithm proposed by Forrest et al., detector generation is very inefficient: candidate detectors mature through the negative selection process. Let $N$ be the size of the self set used for training, $P$ the probability that an antigen and an antibody match, and $P_f$ the failure rate (the probability that an antigen is not matched by any antibody). The number of candidate detectors required is then $N_c = -\ln(P_f) / (P(1-P)^N)$, and the time complexity of the algorithm is $O(N_c \cdot N)$. The number of candidate detectors grows exponentially with the size of the training set, and the time cost of the detection phase is also higher. In the real-valued negative selection algorithm, when the detector radius is constant, the volume of the hypersphere decreases as the dimension increases, and once the dimension exceeds 20 the volume is close to 0, which is one reason detectors perform poorly in high-dimensional real-valued spaces. Furthermore, a higher dimension also increases both time complexity and space complexity.
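As a rough numerical illustration of the two effects described above, the following Python sketch evaluates the candidate-detector estimate for a few self-set sizes and the hypersphere volume for a few dimensions. The values P = 0.01, P_f = 0.1 and the radius of 1 are arbitrary assumptions chosen for illustration and do not come from the invention.

```python
import math

def candidate_detectors(p_match: float, p_fail: float, n_self: int) -> float:
    """Estimate N_c = -ln(P_f) / (P * (1 - P)**N) for a self set of size N."""
    return -math.log(p_fail) / (p_match * (1.0 - p_match) ** n_self)

def hypersphere_volume(dim: int, radius: float = 1.0) -> float:
    """Volume of a hypersphere with the given radius in `dim` dimensions."""
    return math.pi ** (dim / 2) / math.gamma(dim / 2 + 1) * radius ** dim

if __name__ == "__main__":
    # Assumed illustrative parameters: P = 0.01, P_f = 0.1.
    for n in (100, 200, 400):
        print("self-set size", n, "-> candidates", round(candidate_detectors(0.01, 0.1, n)))
    # Assumed radius of 1; the volume shrinks quickly as the dimension grows.
    for d in (2, 10, 20, 30):
        print("dimension", d, "-> volume", hypersphere_volume(d))
```

Doubling the self-set size multiplies the estimate by a factor of (1 - P)^(-N), which is the exponential growth the text refers to.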
Third, the problem of large-area overlap in the detectors' recognition space. When frequently evolved high-affinity detectors are selected, many of them may overlap one another over large areas of the recognition space in the next generation of the detector population, so the recognition space of the whole population becomes relatively smaller and may fall into a subset of the nonself set.
Fourth, the specific immunity problem for unknown threats. When a traditional model finds an unknown threat, it only records the threat information and generates a corresponding detector, without any countermeasure targeted at the threat; when the threat changes, the whole process has to be repeated, so the reaction is slow and the time cost is high.
Disclosure of Invention
In view of the above problems, the invention provides an unknown threat detection method based on the artificial immunity thought. It solves the adaptivity problem of the Self set by introducing a pre-classification module, the detector generation efficiency problem by hierarchically clustering the Self set in advance, the large-area overlap of the detectors' recognition space by introducing a detector optimization algorithm based on a genetic algorithm, and the specific immunity problem for unknown threats by introducing an mRNA vaccination module.
In order to achieve the above purpose, the invention provides the following technical scheme:
An unknown threat detection method based on the artificial immunity thought, wherein the detection model comprises a pre-classification module, a negative selection module, a clonal variation module and an mRNA vaccination module. The pre-classification module introduces a convolutional neural network; the negative selection module introduces a gene library and hierarchical clustering; the clonal variation module introduces a detector optimization algorithm based on a genetic algorithm, which controls the affinity of the detectors through detector concentration, and also introduces an LRU-based memory detector extinction mechanism; the mRNA vaccination module introduces an mRNA vaccine algorithm based on feature importance ranking, decomposes each detected unknown threat according to gene importance, injects the result into the gene library, and generates corresponding detectors.
Further, the unknown threat detection method based on the artificial immunity thought comprises a training phase and a detection phase, wherein,
Training stage: a convolutional neural network is trained on a labeled data set so that it acquires an initial classification capability; a gene library is constructed from the initial nonself set and initial detectors are generated from it; the initial self set then participates in negative selection to generate the first-generation mature detector set;
Detection stage: network traffic is input, features are extracted and encoded, and the following steps are performed:
S1, the sample is matched against the memory detector set; if it is detected, the threat is an already recorded threat; if not, go to S2;
S2, the sample is classified by the convolutional neural network of the pre-classification module and then matched against the mature detector set; if it is not matched and the classification is positive, it enters the self set and participates in the next negative selection process; if it is matched and the classification is negative, it is judged to be an unknown threat and proceeds to S3; if the matching result is inconsistent with the classification result, a manual check is performed, and the data is given the correct label and added to the training set of the neural network for training;
S3, the discovered unknown threat enters the mRNA vaccination module; feature importance is extracted with a random forest algorithm, and the key features are decomposed and stored in the gene library; mature detectors and memory detectors capable of detecting similar attacks are generated through negative selection and clonal variation; the generated mature detectors are added to the mature detector set and the memory detectors to the memory detector set;
S4, the mature detectors and the memory detectors are updated using the LRU-based detector extinction algorithm.
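The routing logic of steps S1 and S2 can be sketched in Python as follows. This is a minimal illustration, not the invention's implementation: the function name `route_sample`, the three callables and the returned labels are hypothetical, and the sketch assumes that a "positive" classification means the network judges the sample to be self (normal traffic).

```python
from typing import Callable, Sequence

Sample = Sequence[float]

def route_sample(
    sample: Sample,
    memory_match: Callable[[Sample], bool],   # hypothetical: True if a memory detector matches
    mature_match: Callable[[Sample], bool],   # hypothetical: True if a mature detector matches
    cnn_is_self: Callable[[Sample], bool],    # hypothetical: True if the CNN classifies as self
) -> str:
    """Decide what happens to one encoded traffic sample in steps S1 and S2."""
    # S1: a memory detector hit means the threat has already been recorded.
    if memory_match(sample):
        return "known_threat"
    # S2: combine the CNN pre-classification with mature detector matching.
    is_self = cnn_is_self(sample)
    matched = mature_match(sample)
    if not matched and is_self:
        return "add_to_self_set"      # joins the next negative selection round
    if matched and not is_self:
        return "unknown_threat"       # continues to S3 (mRNA vaccination)
    return "manual_review"            # detector and classifier disagree

if __name__ == "__main__":
    no = lambda s: False
    yes = lambda s: True
    print(route_sample([0.1, 0.2], no, yes, no))
```

Keeping the three checks as injected callables makes the disagreement case explicit: both combinations in which the detector result and the classifier result conflict fall through to manual review.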
Further, in step S2, the convolutional neural network is trained in advance on a correctly labeled data set; it uses 2 convolutional layers, 2 pooling layers, a fully connected layer and a classification layer.
Further, in step S2, the convolutional neural network classification steps are:
S201, collecting data, making classification labels, and dividing the data into a training set and a data set;
S202, designing the convolutional neural network: the network comprises two 3×3 convolutional layers, 2 pooling layers, 1 fully connected layer and 1 classification layer, wherein the pooling layers use max pooling and the fully connected layer has 256 neurons;
S203, training the model of S202 with the training set of S201, adjusting the training parameters according to the test results, and obtaining an optimal model;
S204, dividing the data set into a self set and a nonself set using the optimal model of S203.
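To make the layer layout of S202 concrete, the following Python sketch traces tensor shapes through two 3×3 convolutions, two max-pooling layers, the 256-neuron fully connected layer and the classification layer. Everything not stated in the text is an assumption made for illustration: a single-channel 16×16 input, 16 and 32 output channels, stride 1, no padding, 2×2 pooling windows, convolution and pooling layers alternating, and 2 output classes.

```python
def conv_out(size: int, kernel: int = 3, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a square convolution."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size: int, window: int = 2) -> int:
    """Spatial output size of non-overlapping max pooling."""
    return size // window

def trace_shapes(side: int, channels=(16, 32), classes: int = 2):
    """Return (layer, shape...) tuples for the assumed conv-pool-conv-pool-fc layout."""
    shapes = []
    out_channels = 1
    for out_channels in channels:          # assumed channel counts per conv layer
        side = conv_out(side)
        shapes.append(("conv3x3", out_channels, side, side))
        side = pool_out(side)
        shapes.append(("maxpool2x2", out_channels, side, side))
    shapes.append(("flatten", out_channels * side * side))
    shapes.append(("fc", 256))             # 256 neurons, as stated in S202
    shapes.append(("classify", classes))   # assumed self / nonself output
    return shapes

if __name__ == "__main__":
    for layer in trace_shapes(16):
        print(layer)
```

With the assumed 16×16 input the feature map shrinks to 2×2 before flattening, so larger encoded inputs or padded convolutions would be needed if more spatial detail must reach the fully connected layer.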
Further, the negative selection in step S3 includes the steps of:
S301, constructing a self set from an existing data set or from data collected by the pre-classification module, and performing hierarchical clustering on the self set to obtain N cluster centers;
S302, randomly generating N initial detectors from the gene library, wherein each initial detector in a batch is defined as {character string, R, tag, fitness, age}, where the radius R is determined by the minimum distance from the previous generation of initial detectors to self,
[Formula image BDA0003377189690000041]
d is the Euclidean distance between two character strings; fitness is updated whenever the detector matches in the detection stage; tag is a label recording the state of the detector, with the values immature, mature and memory; age records the generation count of the detector, and an initial immature detector has age 0;
S303, for each detector: the detector is matched against the cluster centers in turn, the distance from the detector to each cluster center is calculated, and it is determined whether the cluster center lies within the detector's radius; if the distance is smaller than the radius, the detector dies;
S304, when the negative selection of all detectors is complete, recording the minimum distance r from this batch of detectors to the self set for use by the next generation, adding the surviving detectors to the mature detector set, and updating them with tag = mature and age = age + 1.
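Steps S302 to S304 can be sketched in Python as below. This is a simplified illustration under stated assumptions: detectors are encoded as numeric tuples rather than character strings, the gene library is a flat list of gene values, the helper names `random_candidates` and `negative_selection` are hypothetical, and the recorded minimum distance r is taken over the surviving detectors only, which the text does not specify.

```python
import math
import random
from dataclasses import dataclass

@dataclass
class Detector:
    genes: tuple            # encoded detector (stands in for the character string)
    radius: float           # R
    tag: str = "immature"   # immature, mature or memory
    fitness: int = 0        # updated on matches in the detection stage
    age: int = 0

def random_candidates(gene_library, count, length, rng):
    """S302 (assumed form): build candidates by sampling genes from the library."""
    return [tuple(rng.choice(gene_library) for _ in range(length)) for _ in range(count)]

def negative_selection(candidates, centers, radius):
    """S303-S304: discard candidates whose radius covers a self cluster center."""
    mature, min_dist = [], math.inf
    for genes in candidates:
        nearest = min(math.dist(genes, c) for c in centers)
        if nearest < radius:
            continue                      # matches self, so the detector dies
        min_dist = min(min_dist, nearest)
        mature.append(Detector(tuple(genes), radius, tag="mature", age=1))
    return mature, min_dist               # min_dist is r, kept for the next generation

if __name__ == "__main__":
    rng = random.Random(0)
    library = [0.0, 0.25, 0.5, 0.75, 1.0]
    batch = random_candidates(library, count=5, length=2, rng=rng)
    survivors, r = negative_selection(batch, centers=[(0.5, 0.5)], radius=0.3)
    print(len(survivors), r)
```

Because each candidate is compared only with the cluster centers, the cost per candidate depends on the number of centers rather than on the size of the full self set.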
Further, the clonal variation in step S3 includes the steps of:
S311, sending the set of parent detectors that have detected threats to the clonal variation module;
S312, selecting the detectors in descending order of affinity, with the initial concentration of each detector set to N-1;
S313, denoting the selected detector as A and the mutated detector as a, and defining a mutation count threshold x; a mutation operator is randomly selected and applied; if the distance between A and a is larger than the radius R_A of A and mutation(a) > fitness(A), the concentration becomes N+1,
[Formula image BDA0003377189690000042]
and detector a is generated, where R_a = R_A, fitness(a) = fitness(A), age(a) is initialized to 1, and age(A) = age(A) + 1;
S314, adding A and a to the memory detector set and the mature detector set.
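The concentration-controlled selection and mutation of S312 and S313 can be sketched in Python as follows. The sketch rests on several assumptions that are not confirmed by the text: the selection score is taken as fitness divided by concentration (the actual formula is in an image that is not reproduced here), concentration starts at 1 and is incremented by one on a successful mutation, the threshold x is read as a maximum number of mutation attempts, and the acceptance test compares a caller-supplied `affinity` score of the mutant with that of its parent. All function and parameter names are hypothetical.

```python
import math
import random
from dataclasses import dataclass

@dataclass
class Detector:
    genes: tuple
    radius: float
    fitness: float = 0.0
    age: int = 1
    concentration: int = 1   # assumed starting value

def selection_score(det: Detector) -> float:
    """Assumed form: affinity weakened by concentration."""
    return det.fitness / det.concentration

def clone_and_mutate(parents, operators, affinity, max_mutations, rng):
    """Pick the best-scoring parent A and try to derive a mutant a from it."""
    parent = max(parents, key=selection_score)
    for _ in range(max_mutations):            # assumed reading of threshold x
        mutate = rng.choice(operators)        # randomly selected mutation operator
        genes = tuple(mutate(parent.genes))
        outside = math.dist(parent.genes, genes) > parent.radius
        if outside and affinity(genes) > affinity(parent.genes):
            parent.concentration += 1         # repeated winners become less attractive
            parent.age += 1
            child = Detector(genes, parent.radius, parent.fitness, age=1)
            return parent, child
    return parent, None                       # no acceptable mutant was found

if __name__ == "__main__":
    rng = random.Random(0)
    jitter = lambda g: tuple(x + rng.gauss(0.0, 1.0) for x in g)
    pool = [Detector((0.0, 0.0), 0.5, fitness=4.0)]
    print(clone_and_mutate(pool, [jitter], lambda g: sum(g), 10, rng))
```

Under the assumed score, a detector that keeps producing accepted mutants sees its score fall generation after generation, so newer detectors with the same fitness are eventually preferred.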
Further, the mRNA vaccination in step S3 includes the steps of:
s321, when an unknown threat is found, sending the unknown threat to an mRNA vaccination module;
s322, constructing a data set and outputting feature importance by using a random forest algorithm;
S323, setting a threshold N, taking the features whose cumulative importance reaches the threshold N, and adding these features to the gene library;
S324, randomly generating detectors that carry the important genes;
s325, entering a negative selection step;
s326, cloning and mutating to generate a corresponding detector.
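The feature selection of S322 and S323 can be illustrated with the following Python sketch. It assumes the importance scores have already been produced elsewhere (for example by a random forest) and only shows the cumulative thresholding; the function name `select_key_features`, the feature names and the scores are invented for illustration.

```python
def select_key_features(importances: dict, threshold: float) -> list:
    """Return the top-ranked features whose cumulative normalized importance
    first reaches `threshold` (a fraction between 0 and 1)."""
    total = sum(importances.values())
    if total <= 0:
        return []
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, score in ranked:
        selected.append(name)
        cumulative += score / total
        if cumulative >= threshold:
            break
    return selected

if __name__ == "__main__":
    # Hypothetical importance scores for four encoded traffic features.
    scores = {"dst_port": 0.5, "duration": 0.3, "flags": 0.15, "ttl": 0.05}
    print(select_key_features(scores, 0.75))
```

The selected features play the role of the important genes that are injected into the gene library before new detectors are generated.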
Further, the update in step S4 proceeds as follows: with the match-count threshold set to N and the age threshold set to A, any detector whose age is greater than or equal to A and whose number of matches has not reached N dies.
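The extinction rule above can be written as a short Python sketch. The `Detector` fields and the function name `prune_detectors` are hypothetical; only the rule itself (age at least A and fewer than N matches) is taken from the text.

```python
from dataclasses import dataclass

@dataclass
class Detector:
    name: str
    age: int        # number of generations the detector has lived
    matches: int    # number of matches recorded in the detection stage

def prune_detectors(detectors, age_threshold: int, match_threshold: int):
    """Keep a detector unless it is old enough (age >= A) and rarely used (matches < N)."""
    return [
        d for d in detectors
        if not (d.age >= age_threshold and d.matches < match_threshold)
    ]

if __name__ == "__main__":
    pool = [
        Detector("old_idle", age=12, matches=1),
        Detector("old_busy", age=12, matches=40),
        Detector("young_idle", age=2, matches=0),
    ]
    print([d.name for d in prune_detectors(pool, age_threshold=10, match_threshold=5)])
```

Young detectors are spared regardless of their match count, which gives newly generated detectors time to prove useful before they can be removed.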
Compared with the prior art, the invention has the beneficial effects that:
The invention provides an unknown threat detection method based on the artificial immunity thought, in which the detection model is divided into four modules: a pre-classification module, a negative selection module, a clonal variation module and an mRNA vaccination module. The pre-classification module introduces a convolutional neural network to handle the pre-collection of the self set for the negative selection module and the pre-classification of network traffic during detection. The negative selection module introduces a gene library to address the random generation of initial detectors and the specific immunity of the mRNA vaccination module against unknown threats, and introduces hierarchical clustering to improve detector training efficiency. The clonal variation module addresses the overlapping detection ranges of high-affinity detectors by introducing a detector optimization algorithm based on a genetic algorithm: the affinity of each detector is controlled through its concentration, so that frequently mutated detectors are gradually no longer selected, which prevents the detection range of the detectors from falling into a subset of the nonself set. In addition, an LRU-based memory detector extinction mechanism is introduced, which frees storage space and improves detection efficiency. The mRNA vaccination module introduces an mRNA vaccine algorithm based on feature importance ranking, decomposes each detected unknown threat according to gene importance, injects the result into the gene library, and generates corresponding detectors to achieve specific immunity to the unknown threat and its variants. The four modules act together to detect both known and unknown threats effectively.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions of the prior art more clearly, the drawings used in the embodiments are briefly described below. The drawings in the following description are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them.
Fig. 1 is a flowchart of an unknown threat detection method based on the artificial immunity thought according to an embodiment of the present invention.
Fig. 2 is a flowchart of a pre-classification module according to an embodiment of the present invention.
FIG. 3 is a flow chart of a negative selection module according to an embodiment of the present invention.
FIG. 4 is a flow chart of a clonal variation module provided in an embodiment of the present invention.
FIG. 5 is a flow chart of an mRNA vaccination module provided by an embodiment of the present invention.
Detailed Description
For a better understanding of the present solution, the method of the present invention is described in detail below with reference to the accompanying drawings.
The overall model of the unknown threat detection method based on the artificial immunity thought provided by the embodiment of the invention is shown in Fig. 1. It is divided into four modules (a pre-classification module, a negative selection module, a clonal variation module and an mRNA vaccination module) and comprises a training stage and a detection stage, wherein,
Training stage: a convolutional neural network is trained on a labeled data set so that it acquires an initial classification capability; a gene library is constructed from the initial nonself set and initial detectors are generated from it; the initial self set then participates in negative selection to generate the first-generation mature detector set;
Detection stage: network traffic is input, features are extracted and encoded, and the following steps are performed:
S1, the sample is matched against the memory detector set; if it is detected, the threat is an already recorded threat; if not, go to S2;
S2, the sample is classified by the convolutional neural network of the pre-classification module and then matched against the mature detector set; if it is not matched and the classification is positive, it enters the self set and participates in the next negative selection process; if it is matched and the classification is negative, it is judged to be an unknown threat and proceeds to S3; if the matching result is inconsistent with the classification result, a manual check is performed, and the data is given the correct label and added to the training set of the neural network for training;
Further, in step S2, the convolutional neural network is trained in advance on a correctly labeled data set; it uses 2 convolutional layers, 2 pooling layers, a fully connected layer and a classification layer.
S3, the discovered unknown threat enters the mRNA vaccination module; feature importance is extracted with a random forest algorithm, and the key features are decomposed and stored in the gene library; mature detectors and memory detectors capable of detecting similar attacks are generated through negative selection and clonal variation; the generated mature detectors are added to the mature detector set and the memory detectors to the memory detector set;
S4, the mature detectors and the memory detectors are updated using the LRU-based detector extinction algorithm.
Further, the update in step S4 proceeds as follows: with the match-count threshold set to N and the age threshold set to A, any detector whose age is greater than or equal to A and whose number of matches has not reached N dies.
Each module is described separately below.
With respect to the pre-classification module:
In step S2, as shown in Fig. 2, the convolutional neural network classification steps are:
S201, collecting data, making classification labels, and dividing the data into a training set and a data set;
S202, designing the convolutional neural network: the network comprises two 3×3 convolutional layers, 2 pooling layers, 1 fully connected layer and 1 classification layer, wherein the pooling layers use max pooling and the fully connected layer has 256 neurons;
S203, training the model of S202 with the training set of S201, adjusting the training parameters according to the test results, and obtaining an optimal model;
S204, dividing the data set into a self set and a nonself set using the optimal model of S203.
Regarding the negative selection module:
In step S3, as shown in Fig. 3, the negative selection steps are:
S301, constructing a self set from an existing data set or from data collected by the pre-classification module, and performing hierarchical clustering on the self set to obtain N cluster centers;
S302, randomly generating N initial detectors from the gene library, wherein each initial detector in a batch is defined as {character string, R, tag, fitness, age}, where the radius R is determined by the minimum distance from the previous generation of initial detectors to self,
[Formula image BDA0003377189690000071]
d is the Euclidean distance between two character strings; fitness is updated whenever the detector matches in the detection stage; tag is a label recording the state of the detector, with the values immature, mature and memory; age records the generation count of the detector, and an initial immature detector has age 0;
S303, for each detector: the detector is matched against the cluster centers in turn, the distance from the detector to each cluster center is calculated, and it is determined whether the cluster center lies within the detector's radius; if the distance is smaller than the radius, the detector dies;
S304, when the negative selection of all detectors is complete, recording the minimum distance r from this batch of detectors to the self set for use by the next generation, adding the surviving detectors to the mature detector set, and updating them with tag = mature and age = age + 1.
Regarding clonal variation modules:
In step S3, as shown in Fig. 4, the clonal variation steps are:
S311, sending the set of parent detectors that have detected threats to the clonal variation module;
S312, selecting the detectors in descending order of affinity, with the initial concentration of each detector set to N-1;
S313, denoting the selected detector as A and the mutated detector as a, and defining a mutation count threshold x; a mutation operator is randomly selected and applied; if the distance between A and a is larger than the radius R_A of A and mutation(a) > fitness(A), the concentration becomes N+1,
[Formula image BDA0003377189690000081]
and detector a is generated, where R_a = R_A, fitness(a) = fitness(A), age(a) is initialized to 1, and age(A) = age(A) + 1;
S314, adding A and a to the memory detector set and the mature detector set.
For the mRNA vaccination module:
in step S3, as shown in fig. 5, the mRNA vaccination step is:
s321, when an unknown threat is found, sending the unknown threat to an mRNA vaccination module;
s322, constructing a data set and outputting feature importance by using a random forest algorithm;
S323, setting a threshold N, taking the features whose cumulative importance reaches the threshold N, and adding these features to the gene library;
S324, randomly generating detectors that carry the important genes;
s325, entering a negative selection step;
s326, cloning and mutating to generate a corresponding detector.
Compared with the prior art, the method of the invention has the following beneficial effects:
1. Pre-classification module
The pre-classification module effectively solves the adaptivity problem of the Self set and gives the Self set dynamic coverage. Specifically, it serves two functions: generating the initial Self set and nonself set, and classifying network traffic so as to expand the Self set continuously and dynamically.
2. Detector training optimization algorithm based on hierarchical clustering
In traditional artificial-immunity-based unknown threat detection models, strategies to reduce the cost of distance calculation are rarely adopted: the distance from each candidate detector to the self set must be calculated, which reduces efficiency. By introducing a detector training optimization algorithm based on hierarchical clustering, the invention performs cluster-based dynamic tolerance training of the detectors, in which only the cluster centers of the self set participate in immune tolerance training. Because the number of cluster centers is far smaller than the size of the self set and is relatively stable, detector training is far more efficient than in the traditional method.
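The reduction of the self set to a small number of cluster centers can be sketched in Python with a simple agglomerative procedure. This is an illustrative stand-in, not the clustering variant used by the invention: the text does not specify the linkage criterion, so centroid linkage is assumed, and the function names are hypothetical. The naive pairwise search is cubic in the number of points and is only suitable for small examples.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    count = len(points)
    return tuple(sum(p[i] for p in points) / count for i in range(len(points[0])))

def cluster_centers(points, k: int):
    """Merge the two closest clusters (by centroid distance) until k remain."""
    clusters = [[tuple(p)] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dist = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        clusters[i].extend(clusters.pop(j))   # j > i, so index i stays valid
    return [centroid(c) for c in clusters]

if __name__ == "__main__":
    self_set = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
    print(cluster_centers(self_set, 2))
```

Once the centers are available, each candidate detector is compared with k centers instead of every self sample, which is where the claimed saving in tolerance training comes from.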
3. Genetic algorithm-based detector model clonal variation optimization mechanism
When frequently evolved high-affinity detectors are selected, many of them may overlap one another over large areas of the recognition space in the next generation of the detector population, so the recognition space of the whole population becomes relatively smaller. By introducing a clonal variation optimization mechanism for the detector model based on a genetic algorithm, in which concentration is used to weaken the effect of affinity, the chance that a frequently evolved high-affinity detector is selected drops sharply after a certain number of generations, and eventually it is no longer selected. Likewise, a newly added high-affinity detector, because of its low concentration, has a much higher chance of evolving than a detector of equal affinity but high concentration. Controlling detector evolution through concentration thus preserves the diversity of the detector population to a certain extent and prevents its recognition space from evolving into a subset of the nonself set.
4. LRU-based detector extinction algorithm
As more network traffic is inspected, the mature detector set and the memory detector set keep growing. Unlimited growth not only requires a large amount of storage space but also degrades query efficiency. Introducing an LRU-based detector extinction algorithm keeps the sizes of the mature and memory detector sets under control, reducing storage cost and improving efficiency.
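A minimal sketch of such an extinction pass follows. The age and match-count rule is taken from claim 8 (a detector dies once its age has reached a threshold while its match count is still below a threshold). The optional capacity cap, which evicts the least recently matched survivors, is an added assumption, as are the `Detector` fields and the sample values.

```python
from dataclasses import dataclass


@dataclass
class Detector:
    name: str
    age: int = 0            # generations survived
    matches: int = 0        # number of successful matches
    last_matched: int = -1  # tick of the most recent match (assumed field)


def extinguish(detectors, age_threshold, match_threshold, capacity=None):
    """Drop old detectors that rarely match; optionally cap the set size
    by evicting the least recently matched survivors (an assumption)."""
    survivors = [d for d in detectors
                 if not (d.age >= age_threshold and d.matches < match_threshold)]
    if capacity is not None and len(survivors) > capacity:
        survivors.sort(key=lambda d: d.last_matched, reverse=True)
        survivors = survivors[:capacity]
    return survivors


pool = [
    Detector("old_idle", age=10, matches=1, last_matched=2),
    Detector("old_busy", age=10, matches=8, last_matched=9),
    Detector("young", age=2, matches=0),
    Detector("recent", age=4, matches=3, last_matched=7),
]
kept = extinguish(pool, age_threshold=5, match_threshold=3, capacity=2)
print([d.name for d in kept])  # ['old_busy', 'recent']
```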
5. mRNA vaccination mechanism based on feature importance ranking
When a known or unknown viral threat appears in the real world, the most effective countermeasure is targeted immunization with an mRNA vaccine. In the vaccination process described here, an inactivated antigen is injected into the human body to induce the immune system to produce the corresponding antibodies and memory cells. Accordingly, in immunity-based unknown threat detection, when the system discovers an unknown threat, the useful information carried by that threat is fully exploited: injecting an mRNA vaccine generates specific antibodies and thereby achieves specific immunity against the antigen.
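The claims describe this step as ranking features by importance with a random forest, keeping the features whose cumulative importance reaches a threshold, and adding them to the gene library. The sketch below shows only the selection and injection part. The importance scores are hypothetical stand-ins for whatever a random forest would report, and the feature names and the dictionary-of-sets gene library are likewise assumptions.

```python
def select_key_features(importances, threshold):
    """Return the highest-ranked features whose cumulative importance
    first reaches the threshold."""
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    selected, total = [], 0.0
    for name, score in ranked:
        selected.append(name)
        total += score
        if total >= threshold:
            break
    return selected


def inject_into_gene_library(gene_library, sample, features):
    """Add the threat's values for the key features as new gene fragments."""
    for f in features:
        gene_library.setdefault(f, set()).add(sample[f])
    return gene_library


# Hypothetical importances, e.g. as a random forest might report them.
importances = {"dst_port": 0.45, "pkt_len": 0.30, "flags": 0.15, "ttl": 0.10}
threat = {"dst_port": 4444, "pkt_len": 1337, "flags": "SYN", "ttl": 64}

key = select_key_features(importances, threshold=0.7)
library = inject_into_gene_library({}, threat, key)
print(key, library)
```

Detectors generated from the enriched gene library would then pass through negative selection and clonal variation as usual.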
In conclusion, the invention provides an unknown threat detection method based on the artificial immunity idea that effectively detects both known and unknown threats. The detection model comprises a pre-classification module, a negative selection module, a clonal variation module and an mRNA vaccination module. The pre-classification module introduces a convolutional neural network to solve the problems of collecting the self set in advance and of pre-classifying network traffic during detection. The negative selection module introduces a gene library to address the random generation of initial detectors and specific immunity against unknown threats, and introduces hierarchical clustering to improve detector training efficiency. The clonal variation module solves the overlapping detection problem of high-affinity detectors by introducing a detector optimization algorithm based on a genetic algorithm; it also introduces an LRU-based memory detector extinction mechanism, which effectively frees storage space and improves detection efficiency. The mRNA vaccination module introduces an mRNA vaccine algorithm based on feature importance ranking, decomposes each detected unknown threat according to gene importance, injects the result into the gene library, and generates corresponding detectors to achieve specific immunity against the unknown threat and its variants.
The above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may be modified, or some of their technical features replaced by equivalents, and that such modifications or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. An unknown threat detection method based on the artificial immunity idea, characterized in that the detection model comprises a pre-classification module, a negative selection module, a clonal variation module and an mRNA vaccination module, wherein the pre-classification module introduces a convolutional neural network; the negative selection module introduces a gene library and hierarchical clustering; the clonal variation module introduces a detector optimization algorithm based on a genetic algorithm, which uses detector concentration to control the effect of detector affinity, and also introduces an LRU-based memory detector extinction mechanism; and the mRNA vaccination module introduces an mRNA vaccine algorithm based on feature importance ranking, decomposes each detected unknown threat according to gene importance, injects the result into the gene library, and generates a corresponding detector.
2. The unknown threat detection method based on the artificial immunity idea according to claim 1, comprising a training stage and a detection stage, wherein:
training stage: a convolutional neural network is trained on a labeled data set so that it has an initial classification capability; a gene library is constructed from the initial nonself set and initial detectors are generated; and the initial self set participates in negative selection to generate the first-generation mature detector set;
detection stage: network traffic is input, features are extracted and encoded, and the following steps are performed:
S1, the sample is matched against the memory detector set; if it is detected, the threat is an already recorded threat; if not, proceed to S2;
S2, the sample is classified by the convolutional neural network of the pre-classification module and then matched against the mature detector set; if it is not matched and the classification is positive (normal), it enters the self set and participates in the next negative selection process; if it is matched and the classification is negative (abnormal), it is determined to be an unknown threat and the method proceeds to S3; if the matching result is inconsistent with the classification result, a manual review process is entered, in which the data are given correct labels and added to the training set of the neural network for training;
S3, the discovered unknown threat enters the mRNA vaccination module; feature importance is extracted with a random forest algorithm, and the key features are decomposed and injected into the gene library; mature detectors and memory detectors capable of detecting similar attacks are generated through negative selection and clonal variation; the generated mature detectors are added to the mature detector set, and the memory detectors are added to the memory detector set;
S4, the mature detectors and the memory detectors are updated using an LRU-based detector extinction algorithm.
3. The unknown threat detection method based on the artificial immunity idea according to claim 2, wherein in step S2 the convolutional neural network is trained in advance on a correctly labeled data set, and the convolutional neural network comprises 2 convolutional layers, 2 pooling layers, a fully connected layer and a classification layer.
4. The unknown threat detection method based on the artificial immunity idea according to claim 2, wherein in step S2 the classification by the convolutional neural network comprises the steps of:
S201, collecting data, making classification labels, and dividing the data into a training set and a test set;
S202, designing the convolutional neural network: the network comprises two 3×3 convolutional layers, 2 pooling layers, 1 fully connected layer and 1 classification layer, wherein the pooling layers use max pooling and the fully connected layer has 256 neurons;
S203, training the model of S202 with the training set of S201, adjusting the training parameters according to the test results, and obtaining an optimal model;
S204, dividing the data set into a self set and a nonself set using the optimal model of S203.
5. The unknown threat detection method based on the artificial immunity idea according to claim 2, wherein the negative selection in step S3 comprises the steps of:
S301, constructing a self set from an existing data set or from data collected by the pre-classification module, and performing hierarchical clustering on the self set to obtain N cluster centers;
S302, randomly generating N initial detectors from the gene library, wherein each initial detector in a batch is defined as {character string, R, tag, fitness, age}, where the radius R is determined by the minimum distance from the previous generation of initial detectors to the self set,
Figure FDA0003377189680000021
d is the Euclidean distance between two character strings; fitness is updated whenever a match occurs in the detection stage; tag is a label recording the state of the detector, taking the values immature, mature and memory; age records the generation count of the detector, and the age of an initial immature detector is 0;
S303, for each detector: the detector is matched against the cluster centers in turn, the distance from the detector to each cluster center is calculated, and whether the cluster center lies within the detector radius is determined; if the distance is smaller than the radius, the detector is eliminated;
S304, when negative selection has been completed for all detectors, the minimum distance r from the detectors of this batch to the self set is recorded for use by the next generation, the surviving detectors are added to the mature detector set, and their fields are updated: tag = mature, age = age + 1.
6. The unknown threat detection method based on the artificial immunity idea according to claim 5, wherein the clonal variation in step S3 comprises the steps of:
S311, sending the set of parent detectors that detected the threat to the clonal variation module;
S312, selecting detectors in descending order of affinity, and setting the initial concentration of each detector to N-1;
S313, denoting the selected detector as A and the mutated detector as a, defining a mutation count threshold x, randomly selecting a mutation operator and mutating; if the distance between A and a is larger than the radius R_A of A and fitness(a) > fitness(A), the concentration N = N + 1,
Figure FDA0003377189680000031
and the detector a is generated, where R_a = R_A, fitness(a) = fitness(A), age(a) is initialized to 1, and age(A) = age(A) + 1;
S314, adding A and a to the memory detector set and the mature detector set.
7. The unknown threat detection method based on the artificial immunity idea according to claim 6, wherein the mRNA vaccination in step S3 comprises the steps of:
S321, when an unknown threat is found, sending the unknown threat to the mRNA vaccination module;
S322, constructing a data set and outputting feature importance using a random forest algorithm;
S323, setting a threshold N, taking the features whose cumulative importance reaches the threshold N, and adding those features to the gene library;
S324, randomly generating detectors that carry the important genes;
S325, entering the negative selection step;
S326, performing clonal variation to generate the corresponding detectors.
8. The unknown threat detection method based on the artificial immunity idea according to claim 7, wherein the updating in step S4 comprises: setting a match count threshold N and an age threshold A, wherein detectors whose age is greater than or equal to A and whose match count has not reached N die.
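For orientation, the decision logic of detection steps S1 and S2 in claim 2 can be sketched as a small function. This is an illustrative reading only: detectors are modeled as hypothetical (center, radius) pairs matched by Euclidean distance, the CNN verdict is passed in as a boolean rather than computed, and the returned action names are invented for the example.

```python
import math


def matches_any(sample, detectors):
    """A detector (center, radius) matches if the sample lies inside it."""
    return any(math.dist(sample, c) < r for c, r in detectors)


def detect(sample, memory, mature, cnn_says_self):
    """Return the action for one encoded traffic sample."""
    if matches_any(sample, memory):
        return "known_threat"              # S1: recorded threat
    matched = matches_any(sample, mature)  # S2: mature detectors + CNN verdict
    if not matched and cnn_says_self:
        return "add_to_self_set"
    if matched and not cnn_says_self:
        return "unknown_threat"            # continue to S3
    return "manual_review"                 # detectors and CNN disagree


# Hypothetical detector sets in a 2-D feature space.
memory = [((5.0, 5.0), 0.5)]
mature = [((2.0, 2.0), 0.5)]

print(detect((5.1, 5.0), memory, mature, cnn_says_self=False))
print(detect((0.0, 0.0), memory, mature, cnn_says_self=True))
print(detect((2.1, 2.0), memory, mature, cnn_says_self=False))
print(detect((2.1, 2.0), memory, mature, cnn_says_self=True))
```

The four calls exercise the four branches in order: known threat, self set expansion, unknown threat, and manual review.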
CN202111420523.8A 2021-11-26 2021-11-26 Unknown threat detection method based on artificial immunity thought Active CN114065933B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111420523.8A CN114065933B (en) 2021-11-26 2021-11-26 Unknown threat detection method based on artificial immunity thought

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111420523.8A CN114065933B (en) 2021-11-26 2021-11-26 Unknown threat detection method based on artificial immunity thought

Publications (2)

Publication Number Publication Date
CN114065933A true CN114065933A (en) 2022-02-18
CN114065933B CN114065933B (en) 2024-07-12

Family

ID=80276664

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111420523.8A Active CN114065933B (en) 2021-11-26 2021-11-26 Unknown threat detection method based on artificial immunity thought

Country Status (1)

Country Link
CN (1) CN114065933B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115037558A (en) * 2022-08-10 2022-09-09 军事科学院系统工程研究院网络信息研究所 Anomaly detection and evolution method for antagonistic driving
CN115296856A (en) * 2022-07-12 2022-11-04 四川大学 Encrypted traffic network threat detector evolution learning method based on ResNet-AIS

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101299691A (en) * 2008-06-13 2008-11-05 南京邮电大学 Method for detecting dynamic gridding instruction based on artificial immunity
CN101478534A (en) * 2008-12-02 2009-07-08 广东海洋大学 Network exception detecting method based on artificial immunity principle
CN102750490A (en) * 2012-03-23 2012-10-24 南京邮电大学 Virus detection method based on collaborative immune network evolutionary algorithm
CN104168152A (en) * 2014-09-19 2014-11-26 西南大学 Network intrusion detection method based on multilayer immunization
CN109462578A (en) * 2018-10-22 2019-03-12 南开大学 Threat intelligence use and propagation method based on statistical learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨义先;李丽香;彭海朋;袁静;陈永刚;张浩;: "群体智能算法及其在信息安全中的应用探索", 信息安全学报, no. 01, 15 January 2016 (2016-01-15) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115296856A (en) * 2022-07-12 2022-11-04 四川大学 Encrypted traffic network threat detector evolution learning method based on ResNet-AIS
CN115296856B (en) * 2022-07-12 2024-04-19 四川大学 ResNet-AIS-based evolution learning method for encrypted traffic network threat detector
CN115037558A (en) * 2022-08-10 2022-09-09 军事科学院系统工程研究院网络信息研究所 Anomaly detection and evolution method for antagonistic driving
CN115037558B (en) * 2022-08-10 2022-10-21 军事科学院系统工程研究院网络信息研究所 Anomaly detection and evolution method for antagonistic driving

Also Published As

Publication number Publication date
CN114065933B (en) 2024-07-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant