CN114065933A - An Unknown Threat Detection Method Based on Artificial Immune Thought

Publication number
CN114065933A
Authority
CN
China
Prior art keywords
detector
module
classification
unknown
age
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111420523.8A
Other languages
Chinese (zh)
Other versions
CN114065933B (en)
Inventor
彭海朋
陈冠华
李丽香
黄京泽
孙婧瑜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN202111420523.8A
Publication of CN114065933A
Application granted
Publication of CN114065933B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features
    • G06F18/2113 Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/231 Hierarchical techniques, i.e. dividing or merging pattern sets so as to obtain a dendrogram
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physiology (AREA)
  • Genetics & Genomics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an unknown threat detection method based on artificial immune thought that effectively detects known and unknown threats. The pre-classification module introduces a convolutional neural network to solve the pre-collection of the self set and the pre-classification of network traffic during detection. The negative selection module introduces a gene library to solve the random generation problem of the initial detectors and the problem of specific immunity against unknown threats, and introduces hierarchical clustering to improve detector training efficiency. The clonal variation module introduces a detector optimization algorithm based on a genetic algorithm to solve the overlapping detection problem of high-affinity detectors; it also introduces an LRU-based memory detector extinction mechanism, which frees storage space and improves detection efficiency. The mRNA vaccination module introduces an mRNA vaccine algorithm based on feature importance ranking, which decomposes a detected unknown threat by gene importance, injects it into the gene library, and generates the corresponding detectors, achieving specific immunity to the unknown threat and its variants.

Figure 202111420523

Description

Unknown threat detection method based on artificial immunity thought
Technical Field
The invention relates to the technical field of intrusion detection, in particular to an unknown threat detection method based on an artificial immunity thought.
Background
The biological immune system is characterized by diversity, tolerance, self-organization, and self-adaptation. Artificial immunity is a mathematical model that borrows concepts from the biological immune system: a cell model is constructed by defining a shape space, self and nonself, and an affinity calculation; the maturation of immune cells in the bone marrow is simulated by a negative selection algorithm, which eliminates the immune cells that match self; the remaining immune cells become mature, and threat detection is carried out by these mature immune cells.
In traditional models based on artificial immunity, a self set is first constructed manually, and immature detectors are then randomly generated and put through a negative selection process to acquire tolerance. Tolerance is the process of matching a detector against the self set: when the affinity reaches the matching threshold, the immature detector dies; otherwise it evolves into a mature detector and takes part in detection.
Traditional intrusion detection models based on artificial immunity have the following disadvantages:
First, the self-set adaptation problem. The self set is usually very large; the collected normal samples are only a small subset of it and do not necessarily represent the real self set. If the sizes of the two sets differ greatly, the error is scaled up and the final result deviates considerably from reality. In addition, in practical applications the elements of the self set often change over time. Traditional self/nonself models usually rely on manual labeling, and manually labeled self elements are risky, because an element of today's self set may become an element of tomorrow's nonself set. Such a self set lacks dynamic coverage, produces more false positives, and makes the accuracy of the antibody set difficult to guarantee. Finally, most conventional self sets are built from static behavior information, whereas in a real environment the elements of the self set may change over time, so manually delimiting a fixed self set is undesirable. The inability of the self set to update adaptively is therefore one of the disadvantages of the conventional model.
Second, the detector generation efficiency problem. In the negative selection algorithm proposed by Forrest et al., detector generation is very inefficient. Candidate detectors mature through the negative selection process. Let $N$ be the size of the self set used for training, $P$ the probability that an antigen and an antibody match, and $P_f$ the failure rate (the probability that an antigen is not matched by any antibody). The number of candidate detectors required is then $N_c = -\ln(P_f) / \left(P(1-P)^N\right)$, and the time complexity of the algorithm is $O(N_c \cdot N)$. The number of candidate detectors grows exponentially with the size of the training set, and the time cost of the detection phase is also high. In the real-valued negative selection algorithm, when the detector radius is fixed, the volume of the hypersphere shrinks as the dimension increases, and once the dimension exceeds 20 the volume is close to 0, which is one reason detectors perform poorly in high-dimensional real-valued spaces. Furthermore, increasing the dimension also increases both time and space complexity.
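The two effects described above can be checked numerically. The following is a minimal sketch, assuming the candidate-count formula $N_c = -\ln(P_f) / (P(1-P)^N)$ and the standard volume formula for a d-dimensional ball; the function names and the sample values of P, P_f, and N are illustrative assumptions, not values from the patent.

```python
import math


def candidate_detectors(p_match: float, p_fail: float, n_self: int) -> float:
    """Number of candidate detectors N_c = -ln(P_f) / (P * (1 - P)**N)."""
    return -math.log(p_fail) / (p_match * (1.0 - p_match) ** n_self)


def ball_volume(dim: int, radius: float = 1.0) -> float:
    """Volume of a dim-dimensional hypersphere with the given radius."""
    return math.pi ** (dim / 2) / math.gamma(dim / 2 + 1) * radius ** dim


if __name__ == "__main__":
    # Illustrative values only: P = 0.01, P_f = 0.1.
    for n in (100, 200, 400):
        print("N =", n, "N_c ~", round(candidate_detectors(0.01, 0.1, n)))
    # Unit-radius volume for a few dimensions.
    for d in (2, 5, 10, 20, 30):
        print("dim =", d, "volume =", ball_volume(d))
```

Doubling N multiplies N_c by a constant factor greater than one, which is the exponential growth mentioned above, and the unit-radius volume falls well below 1 by dimension 20.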
Third, the problem of large-area overlap in the detectors' recognition space. When high-affinity detectors that evolve frequently keep being selected, many high-affinity detectors in the next generation of the detector population may overlap one another over large areas of the recognition space. The recognition space of the whole population then becomes relatively small and may collapse into a subset of the nonself set.
Fourth, the problem of specific immunity against unknown threats. When a traditional model discovers an unknown threat, it only records the threat information and generates a corresponding detector; it takes no countermeasure against the threat. When the threat mutates, the whole process must be repeated, so the response is slow and the time cost is high.
Disclosure of Invention
To address these problems, the invention provides an unknown threat detection method based on artificial immunity thought. It solves the self-set adaptation problem by introducing a pre-classification module, the detector generation efficiency problem by hierarchically clustering the self set in advance, the large-area overlap of the detectors' recognition space by introducing a detector optimization algorithm based on a genetic algorithm, and the specific immunity problem for unknown threats by introducing an mRNA vaccination module.
To achieve the above purpose, the invention provides the following technical solution:
An unknown threat detection method based on artificial immunity thought, in which the detection model comprises a pre-classification module, a negative selection module, a clonal variation module, and an mRNA vaccination module. The pre-classification module introduces a convolutional neural network. The negative selection module introduces a gene library and hierarchical clustering. The clonal variation module introduces a detector optimization algorithm based on a genetic algorithm, which uses detector concentration to control the effect of detector affinity, together with an LRU-based memory detector extinction mechanism. The mRNA vaccination module introduces an mRNA vaccine algorithm based on feature importance ranking, which decomposes a detected unknown threat by gene importance, injects it into the gene library, and generates the corresponding detectors.
Further, the unknown threat detection method based on the artificial immunity thought comprises a training phase and a detection phase, wherein,
Training stage: a convolutional neural network is trained on a labeled data set so that it acquires an initial classification capability; a gene library is constructed from the initial nonself set and initial detectors are generated from it; the initial self set then takes part in negative selection to produce the first generation of mature detectors;
Detection stage: network traffic is input, its features are extracted and encoded, and the following steps are carried out:
S1, matching the traffic against the memory detector set; if it is detected, the threat is an already recorded threat; if not, proceeding to S2;
S2, classifying the traffic with the convolutional neural network of the pre-classification module and then matching it against the mature detector set; if it is not matched and the classification is positive, the traffic enters the self set and takes part in the next negative selection process; if it is matched and the classification is negative, it is judged to be an unknown threat and the method proceeds to S3; if the matching result and the classification result are inconsistent, a manual review process is entered, in which the data is given the correct label and added to the training set of the neural network for training;
S3, sending the discovered unknown threat to the mRNA vaccination module, extracting feature importance with a random forest algorithm, decomposing the key features and storing them in the gene library, generating mature detectors and memory detectors capable of detecting similar attacks through negative selection and clonal variation, adding the generated mature detectors to the mature detector set, and adding the memory detectors to the memory detector set;
S4, updating the mature detectors and the memory detectors using an LRU-based detector extinction algorithm.
Further, in step S2, the convolutional neural network is trained in advance on a correctly labeled data set, and the network uses 2 convolutional layers, 2 pooling layers, a fully connected layer, and a classification layer.
Further, in step S2, the convolutional neural network classification comprises the following steps:
S201, collecting data, creating classification labels, and dividing the data into a training set and a data set;
S202, designing the convolutional neural network: the network comprises two 3×3 convolutional layers, 2 pooling layers, 1 fully connected layer, and 1 classification layer, wherein the pooling layers use max pooling and the fully connected layer has 256 neurons;
S203, training the model of S202 with the training set of S201, adjusting the training parameters according to the test results, and obtaining an optimal model;
S204, dividing the data set into a self set and a nonself set using the optimal model of S203.
Further, the negative selection in step S3 includes the steps of:
S301, constructing a self set from an existing data set or from data collected by the pre-classification module, and performing hierarchical clustering on the self set to obtain N cluster centers;
S302, randomly generating N initial detectors from the gene library, wherein each initial detector in a batch is defined as {character string, R, tag, fitness, age}, and the radius R is determined by the minimum distance of the previous generation of initial detectors from self,
Figure BDA0003377189690000041
where d is the Euclidean distance between two character strings; fitness is updated whenever the detector produces a match in the detection stage; tag is a label recording the state of the detector, with the values immature, mature, and memory; age records the generation count of the detector, and an initial immature detector has age 0;
S303, for each detector: matching the detector against the cluster centers in turn, calculating the distance from the detector to each cluster center, and judging whether the cluster center lies within the detector's radius; if the distance is smaller than the radius, the detector is eliminated;
S304, when negative selection has been completed for all detectors, recording the minimum distance r from this batch of detectors to the self set for use by the next generation, adding the surviving detectors to the mature detector set, and updating them with tag = mature and age = age + 1.
Further, the cloning variation in step S3 includes the steps of:
S311, sending the set of parent detectors that detected the threat to the clonal variation module;
S312, selecting the detectors in order of decreasing affinity, with the initial concentration of each detector set to N-1;
S313, denoting the selected detector as A and the mutated detector as a, defining a mutation count threshold x, and randomly selecting a mutation operator and applying it; if the distance between A and a is larger than the radius R_A of A and fitness(a) > fitness(A), the concentration becomes N+1,
Figure BDA0003377189690000042
and detector a is generated, where R_a = R_A, fitness(a) = fitness(A), age(a) is initialized to 1, and age(A) = age(A) + 1;
S314, adding A and a to the memory detector set and the mature detector set.
Further, the mRNA vaccination in step S3 includes the steps of:
S321, when an unknown threat is found, sending it to the mRNA vaccination module;
S322, constructing a data set and outputting feature importance using a random forest algorithm;
S323, setting a threshold N, taking the features whose cumulative importance contribution reaches the threshold N, and adding those features to the gene library;
S324, randomly generating detectors that carry the important genes;
S325, entering the negative selection step;
S326, performing clonal variation to generate the corresponding detectors.
Further, the update in step S4 proceeds as follows: with a match-count threshold N and an age threshold A, detectors whose age is greater than or equal to A and whose match count has not reached N die.
Compared with the prior art, the invention has the beneficial effects that:
The invention provides an unknown threat detection method based on artificial immunity thought, whose detection model is divided into four modules: a pre-classification module, a negative selection module, a clonal variation module, and an mRNA vaccination module. The pre-classification module introduces a convolutional neural network to solve the pre-collection of the self set for the negative selection module and the pre-classification of network traffic during detection. The negative selection module introduces a gene library to solve the random generation problem of the initial detectors and the mRNA vaccination module's problem of specific immunity against unknown threats, and introduces hierarchical clustering to improve detector training efficiency. The clonal variation module solves the overlapping detection problem of high-affinity detectors by introducing a detector optimization algorithm based on a genetic algorithm, which uses detector concentration to control the effect of detector affinity, so that frequently mutated detectors are gradually no longer selected and the detection range of the detector population does not collapse into a subset of the nonself set. An LRU-based memory detector extinction mechanism is also introduced, which frees storage space and improves detection efficiency. The mRNA vaccination module introduces an mRNA vaccine algorithm based on feature importance ranking, which decomposes a detected unknown threat by gene importance, injects it into the gene library, and generates the corresponding detectors, achieving specific immunity to the unknown threat and its variants. The four modules work together to detect known and unknown threats effectively.
Drawings
To illustrate the embodiments of the present application or the technical solutions of the prior art more clearly, the drawings used in the embodiments are briefly described below. The drawings described here are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them.
Fig. 1 is a flowchart of the unknown threat detection method based on artificial immunity thought according to an embodiment of the present invention.
Fig. 2 is a flowchart of the pre-classification module according to an embodiment of the present invention.
Fig. 3 is a flowchart of the negative selection module according to an embodiment of the present invention.
Fig. 4 is a flowchart of the clonal variation module according to an embodiment of the present invention.
Fig. 5 is a flowchart of the mRNA vaccination module according to an embodiment of the present invention.
Detailed Description
For a better understanding of the present solution, the method of the present invention is described in detail below with reference to the accompanying drawings.
The overall model of the unknown threat detection method based on artificial immunity thought provided by the embodiment of the invention is shown in Fig. 1. It is divided into four modules (a pre-classification module, a negative selection module, a clonal variation module, and an mRNA vaccination module) and comprises a training stage and a detection stage, wherein:
Training stage: a convolutional neural network is trained on a labeled data set so that it acquires an initial classification capability; a gene library is constructed from the initial nonself set and initial detectors are generated from it; the initial self set then takes part in negative selection to produce the first generation of mature detectors;
Detection stage: network traffic is input, its features are extracted and encoded, and the following steps are carried out:
S1, matching the traffic against the memory detector set; if it is detected, the threat is an already recorded threat; if not, proceeding to S2;
S2, classifying the traffic with the convolutional neural network of the pre-classification module and then matching it against the mature detector set; if it is not matched and the classification is positive, the traffic enters the self set and takes part in the next negative selection process; if it is matched and the classification is negative, it is judged to be an unknown threat and the method proceeds to S3; if the matching result and the classification result are inconsistent, a manual review process is entered, in which the data is given the correct label and added to the training set of the neural network for training;
Further, in step S2, the convolutional neural network is trained in advance on a correctly labeled data set, and the network uses 2 convolutional layers, 2 pooling layers, a fully connected layer, and a classification layer.
S3, sending the discovered unknown threat to the mRNA vaccination module, extracting feature importance with a random forest algorithm, decomposing the key features and storing them in the gene library, generating mature detectors and memory detectors capable of detecting similar attacks through negative selection and clonal variation, adding the generated mature detectors to the mature detector set, and adding the memory detectors to the memory detector set;
S4, updating the mature detectors and the memory detectors using an LRU-based detector extinction algorithm.
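The branching in steps S1 and S2 can be written as a small decision function. This is only a sketch of the control flow: the names `Action` and `triage` are hypothetical, and it assumes that a "positive" classification means the network labels the traffic as normal (self), which is how the S2 text is read here.

```python
from enum import Enum


class Action(Enum):
    KNOWN_THREAT = "recorded threat (memory detector hit)"
    ADD_TO_SELF = "add to self set for the next negative selection"
    UNKNOWN_THREAT = "unknown threat, proceed to S3 (mRNA vaccination)"
    MANUAL_REVIEW = "manual review, relabel, add to CNN training set"


def triage(memory_hit: bool, mature_hit: bool, cnn_normal: bool) -> Action:
    """Decide what to do with one encoded traffic sample."""
    if memory_hit:                        # S1: memory detectors are checked first
        return Action.KNOWN_THREAT
    if not mature_hit and cnn_normal:     # S2: both signals agree on "normal"
        return Action.ADD_TO_SELF
    if mature_hit and not cnn_normal:     # S2: both signals agree on "threat"
        return Action.UNKNOWN_THREAT
    return Action.MANUAL_REVIEW           # S2: detector and CNN disagree
```

For example, `triage(False, True, True)` returns `Action.MANUAL_REVIEW`, because a mature detector fired while the network classified the sample as normal.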
Further, the update in step S4 proceeds as follows: with a match-count threshold N and an age threshold A, detectors whose age is greater than or equal to A and whose match count has not reached N die.
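The extinction rule of step S4 amounts to a filter over the detector sets. A minimal sketch follows, assuming each detector carries an age and a match count; the class `AgedDetector` and the function `prune` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class AgedDetector:
    name: str
    age: int       # number of generations the detector has lived
    matches: int   # number of matches recorded in the detection stage


def prune(detectors, age_threshold, match_threshold):
    """Drop detectors with age >= A whose match count has not reached N."""
    return [
        d for d in detectors
        if not (d.age >= age_threshold and d.matches < match_threshold)
    ]
```

With A = 5 and N = 3, a detector of age 6 with one match is removed, while a detector of age 6 with three matches, or of age 2 with no matches, is kept.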
Each module is described separately below.
With respect to the pre-classification module:
In step S2, as shown in Fig. 2, the convolutional neural network classification comprises the following steps:
S201, collecting data, creating classification labels, and dividing the data into a training set and a data set;
S202, designing the convolutional neural network: the network comprises two 3×3 convolutional layers, 2 pooling layers, 1 fully connected layer, and 1 classification layer, wherein the pooling layers use max pooling and the fully connected layer has 256 neurons;
S203, training the model of S202 with the training set of S201, adjusting the training parameters according to the test results, and obtaining an optimal model;
S204, dividing the data set into a self set and a nonself set using the optimal model of S203.
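To make the layer layout of S202 concrete, the sketch below walks a feature map through two 3×3 convolutions, two max-pooling layers, a 256-neuron fully connected layer, and a classification layer, and counts the trainable parameters. The patent does not state the input size, channel counts, padding, pooling window, or number of classes; the values used here (28×28×1 input, 32 and 64 channels, "same" padding, 2×2 pooling, 2 classes) are assumptions chosen only for illustration.

```python
def conv_same(h, w, c_in, c_out, k=3):
    """k x k convolution with 'same' padding: spatial size is unchanged."""
    params = k * k * c_in * c_out + c_out   # weights plus biases
    return h, w, c_out, params


def max_pool(h, w, c, size=2):
    """Non-overlapping max pooling: halves each spatial dimension for size=2."""
    return h // size, w // size, c


def describe(h=28, w=28, c=1, channels=(32, 64), fc_units=256, classes=2):
    """Return (flattened feature length, total trainable parameters)."""
    total = 0
    for c_out in channels:                  # conv -> pool, twice
        h, w, c, params = conv_same(h, w, c, c_out)
        total += params
        h, w, c = max_pool(h, w, c)
    flat = h * w * c
    total += flat * fc_units + fc_units     # fully connected layer, 256 neurons
    total += fc_units * classes + classes   # classification layer
    return flat, total
```

Under these assumed sizes the fully connected layer dominates the parameter count, which is typical for a network this shallow.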
Regarding the negative selection module:
In step S3, as shown in Fig. 3, the negative selection comprises the following steps:
S301, constructing a self set from an existing data set or from data collected by the pre-classification module, and performing hierarchical clustering on the self set to obtain N cluster centers;
S302, randomly generating N initial detectors from the gene library, wherein each initial detector in a batch is defined as {character string, R, tag, fitness, age}, and the radius R is determined by the minimum distance of the previous generation of initial detectors from self,
Figure BDA0003377189690000071
where d is the Euclidean distance between two character strings; fitness is updated whenever the detector produces a match in the detection stage; tag is a label recording the state of the detector, with the values immature, mature, and memory; age records the generation count of the detector, and an initial immature detector has age 0;
S303, for each detector: matching the detector against the cluster centers in turn, calculating the distance from the detector to each cluster center, and judging whether the cluster center lies within the detector's radius; if the distance is smaller than the radius, the detector is eliminated;
S304, when negative selection has been completed for all detectors, recording the minimum distance r from this batch of detectors to the self set for use by the next generation, adding the surviving detectors to the mature detector set, and updating them with tag = mature and age = age + 1.
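Steps S303 and S304 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the `Detector` fields mirror the tuple {character string, R, tag, fitness, age}, detectors are represented as numeric tuples so that the Euclidean distance applies, and the minimum distance r is computed over the surviving detectors, which is one reading of "the same batch of detectors".

```python
import math
from dataclasses import dataclass


@dataclass
class Detector:
    genes: tuple            # encoded feature vector (the "character string")
    radius: float           # R
    tag: str = "immature"   # immature / mature / memory
    fitness: int = 0
    age: int = 0


def negative_selection(candidates, centers):
    """Eliminate detectors that cover a self cluster center; mature the rest."""
    mature = []
    for det in candidates:
        # S303: a center closer than the radius means the detector matches self.
        if any(math.dist(det.genes, c) < det.radius for c in centers):
            continue
        # S304: survivors become mature and age by one generation.
        det.tag = "mature"
        det.age += 1
        mature.append(det)
    # S304: minimum distance from this batch to self, kept for the next generation.
    next_radius = min(
        (math.dist(d.genes, c) for d in mature for c in centers),
        default=None,
    )
    return mature, next_radius
```

Because each candidate is compared only with the cluster centers rather than with every self sample, the cost per candidate depends on the number of centers, not on the size of the self set.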
Regarding clonal variation modules:
In step S3, as shown in Fig. 4, the clonal variation comprises the following steps:
S311, sending the set of parent detectors that detected the threat to the clonal variation module;
S312, selecting the detectors in order of decreasing affinity, with the initial concentration of each detector set to N-1;
S313, denoting the selected detector as A and the mutated detector as a, defining a mutation count threshold x, and randomly selecting a mutation operator and applying it; if the distance between A and a is larger than the radius R_A of A and fitness(a) > fitness(A), the concentration becomes N+1,
Figure BDA0003377189690000081
and detector a is generated, where R_a = R_A, fitness(a) = fitness(A), age(a) is initialized to 1, and age(A) = age(A) + 1;
S314, adding A and a to the memory detector set and the mature detector set.
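A rough sketch of S313 is given below. Several parts are assumptions, because the formula image is not reproduced in this text: the selection score `fitness / concentration` is a hypothetical way of letting concentration damp affinity, `mutate` and `evaluate` are caller-supplied placeholder functions (a mutation operator and an affinity measure), and the acceptance test compares the evaluated affinity of the mutant with that of the parent.

```python
import math
from dataclasses import dataclass


@dataclass
class CloneDetector:
    genes: tuple
    radius: float
    fitness: float
    age: int = 1
    concentration: int = 1


def selection_score(det):
    """Hypothetical score: affinity weakened as concentration grows."""
    return det.fitness / det.concentration


def clone_and_mutate(parent, mutate, evaluate, max_mutations):
    """Try up to max_mutations mutations of parent A; return child a or None."""
    for _ in range(max_mutations):
        genes = mutate(parent.genes)
        outside = math.dist(parent.genes, genes) > parent.radius
        if outside and evaluate(genes) > evaluate(parent.genes):
            parent.concentration += 1   # concentration N + 1
            parent.age += 1             # age(A) = age(A) + 1
            # Child inherits radius and fitness from A; its age starts at 1.
            return CloneDetector(genes, parent.radius, parent.fitness, age=1)
    return None
```

Each accepted clone raises the parent's concentration, so under the assumed score a detector that has been cloned many times is selected less readily than a fresh detector of equal affinity.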
For the mRNA vaccination module:
In step S3, as shown in Fig. 5, the mRNA vaccination comprises the following steps:
S321, when an unknown threat is found, sending it to the mRNA vaccination module;
S322, constructing a data set and outputting feature importance using a random forest algorithm;
S323, setting a threshold N, taking the features whose cumulative importance contribution reaches the threshold N, and adding those features to the gene library;
S324, randomly generating detectors that carry the important genes;
S325, entering the negative selection step;
S326, performing clonal variation to generate the corresponding detectors.
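Step S323 can be illustrated with a short selection routine. The sketch assumes the feature importances have already been produced by a random forest (for example, an attribute such as scikit-learn's `feature_importances_`) and are passed in as a plain mapping; the feature names and scores below are invented for illustration.

```python
def select_key_features(importances, threshold):
    """Take features in descending importance until their sum reaches threshold."""
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, score in ranked:
        if cumulative >= threshold:
            break
        selected.append(name)
        cumulative += score
    return selected


if __name__ == "__main__":
    # Hypothetical importances for one detected threat.
    scores = {"dst_port": 0.4, "pkt_len": 0.3, "flags": 0.2, "ttl": 0.1}
    print(select_key_features(scores, 0.8))
```

The selected feature names are the "important genes" that would be written to the gene library before detectors are generated in S324.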
Compared with the prior art, the method of the invention has the following beneficial effects:
1. Pre-classification module
The pre-classification module effectively solves the self-set adaptation problem and gives the self set dynamic coverage. Specifically, the module serves two functions: generating the initial self set and nonself set, and classifying network traffic so as to expand the self set continuously and dynamically.
2. Detector training optimization algorithm based on hierarchical clustering
Traditional unknown threat detection models based on artificial immunity rarely adopt any strategy to reduce the cost of distance calculation: the distance from every candidate detector to the self set must be computed, which reduces efficiency. By introducing a detector training optimization algorithm based on hierarchical clustering, the invention performs cluster-based dynamic tolerance training of the detectors, in which the cluster centers of the self set take part in immune tolerance training. Because the cluster centers are far fewer than the elements of the self set and are relatively stable, detector training efficiency is greatly improved compared with the traditional method.
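The reduction from self samples to cluster centers can be sketched with a small agglomerative procedure. The patent does not specify the linkage criterion; centroid linkage is used here as an assumption, and the quadratic pair search is kept deliberately simple rather than efficient.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))


def cluster_centers(points, k):
    """Merge the two closest clusters (by centroid) until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters.pop(j))   # j > i, so index i stays valid
    return [centroid(c) for c in clusters]
```

With six self samples reduced to two centers, each candidate detector needs two distance computations during tolerance training instead of six; the saving grows with the size of the self set.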
3. Genetic algorithm-based detector model clonal variation optimization mechanism
If high-affinity detectors that have already evolved many times keep being selected, many high-affinity detectors in the next generation may overlap one another across large areas of the recognition space, which shrinks the recognition space covered by the detector population as a whole. The genetic-algorithm-based clonal mutation optimization mechanism uses detector concentration to weaken the effect of affinity: after a certain number of generations, a frequently evolved high-affinity detector becomes much less likely to be selected and is eventually no longer selected. Conversely, a newly added high-affinity detector has a low concentration, so its chance of being evolved is much higher than that of a detector with equal affinity but high concentration. Controlling detector evolution through concentration in this way preserves the diversity of the detector population to a certain extent and prevents its recognition space from collapsing to a subset of the nonself set.
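A toy sketch can show how concentration weakens affinity during selection. The damping rule used here (affinity divided by concentration) is purely an assumption chosen for illustration; it is not the formula used in the patent, and the field names are hypothetical.

```python
def pick_for_evolution(detectors):
    """Pick the detector with the best concentration-damped score and raise its concentration.

    The score affinity / concentration is an assumed damping rule used only
    to illustrate the principle; it is not the patent's formula.
    """
    chosen = max(detectors, key=lambda d: d["affinity"] / d["concentration"])
    chosen["concentration"] += 1  # each selection makes the next one less likely
    return chosen["name"]


# Two detectors of equal affinity; one has already been evolved several times.
population = [
    {"name": "established", "affinity": 0.9, "concentration": 4},
    {"name": "newcomer", "affinity": 0.9, "concentration": 1},
]

order = [pick_for_evolution(population) for _ in range(3)]
```

Under this assumed rule the newly added detector is chosen in all three rounds, because its low concentration outweighs the equal affinity of the frequently evolved detector.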
4. LRU-based detector extinction algorithm
As more network traffic is inspected, the mature detector set and the memory detector set grow; unlimited growth requires a large amount of storage space and also degrades query efficiency. The LRU-based detector extinction algorithm keeps the sizes of the mature detector set and the memory detector set under control, reducing storage cost and improving efficiency.
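The extinction rule can be sketched using the criterion stated in claim 8: a detector dies when its age has reached the age threshold A while its match count is still below the match threshold N. The dictionary fields `age` and `matches` and the threshold values are hypothetical names and numbers chosen for illustration.

```python
def prune_detectors(detectors, age_threshold, match_threshold):
    """Return the detectors that survive the extinction rule.

    A detector is removed when it is old enough (age >= age_threshold)
    and has matched too rarely (matches < match_threshold).
    """
    return [d for d in detectors
            if not (d["age"] >= age_threshold and d["matches"] < match_threshold)]


mature_set = [
    {"id": "d1", "age": 12, "matches": 0},   # old and never used: dies
    {"id": "d2", "age": 12, "matches": 7},   # old but useful: kept
    {"id": "d3", "age": 2, "matches": 0},    # young: kept for now
]
mature_set = prune_detectors(mature_set, age_threshold=10, match_threshold=3)
surviving_ids = [d["id"] for d in mature_set]
```

Old detectors that still match traffic are retained, so the rule removes only detectors that have had enough generations to prove useful and have not done so.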
5. mRNA vaccination mechanism based on feature importance ranking
When a known or unknown viral threat appears in the real world, the most effective countermeasure is targeted immunization with an mRNA vaccine. Vaccination introduces an inactivated antigen into the human body and induces the immune system to produce the corresponding antibodies and memory cells. By analogy, in immunity-based unknown threat detection, when the system finds an unknown threat, it makes full use of the useful information carried by that threat: injecting the "mRNA vaccine" produces specific antibodies and thereby achieves specific immunity to the antigen.
In conclusion, the invention provides an unknown threat detection method based on the artificial immunity idea that effectively detects both known and unknown threats. The detection model comprises a pre-classification module, a negative selection module, a clone mutation module and an mRNA vaccination module. The pre-classification module introduces a convolutional neural network to solve the pre-collection of the self set and the pre-classification of network traffic during detection. The negative selection module introduces a gene library to solve the random generation of initial detectors and the specific immunity to unknown threats, and introduces hierarchical clustering to improve detector training efficiency. The clone mutation module solves the overlapping detection problem of high-affinity detectors by introducing a detector optimization algorithm based on a genetic algorithm; it also introduces an LRU-based memory detector extinction mechanism, which effectively releases storage space and improves detection efficiency. The mRNA vaccination module introduces an mRNA vaccine algorithm based on feature importance ranking, which decomposes each detected unknown threat according to gene importance, injects the result into the gene library, and generates the corresponding detectors to achieve specific immunity to the unknown threat and its variants.
The above examples are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may be modified, or some of their technical features may be replaced by equivalents, and that such modifications or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. An unknown threat detection method based on the artificial immunity idea, characterized in that the detection model comprises a pre-classification module, a negative selection module, a clone mutation module and an mRNA vaccination module, wherein the pre-classification module introduces a convolutional neural network; the negative selection module introduces a gene library and hierarchical clustering; the clone mutation module introduces a detector optimization algorithm based on a genetic algorithm, which controls detector affinity through detector concentration, and also introduces an LRU-based memory detector extinction mechanism; and the mRNA vaccination module introduces an mRNA vaccine algorithm based on feature importance ranking, which decomposes detected unknown threats according to gene importance, injects them into the gene library, and generates the corresponding detectors.

2. The unknown threat detection method based on the artificial immunity idea according to claim 1, characterized by comprising a training phase and a detection phase, wherein:
Training phase: a labeled data set is used to train the convolutional neural network so that it has an initial classification ability; the initial nonself set is used to construct the gene library and generate the initial detectors; the initial self set takes part in negative selection to generate the first generation of the mature detector set;
Detection phase: network traffic is input, its features are extracted and encoded, and the following steps are performed:
S1, match against the memory detector population; if the traffic is detected, it is an already recorded threat; if it is not detected, go to S2;
S2, classify with the convolutional neural network of the pre-classification module, then match against the mature detector population; if no detector matches and the classification is positive, the sample enters the self set and takes part in the subsequent negative selection process; if a detector matches and the classification is negative, the sample is judged to be an unknown threat and goes to S3; if the matching result and the classification result are inconsistent, the sample enters a manual review process, is given the correct label and is added to the training set of the neural network;
S3, the discovered unknown threat enters the mRNA vaccination module; a random forest algorithm is used to extract feature importances; the key features are decomposed and injected into the gene library; mature detectors and memory detectors capable of detecting similar attacks are generated through negative selection and clonal mutation; the generated mature detectors join the mature detector population and the memory detectors join the memory detector population;
S4, the mature detectors and memory detectors are updated using the LRU-based detector extinction algorithm.

3. The unknown threat detection method based on the artificial immunity idea according to claim 2, characterized in that, in step S2, a convolutional neural network is trained in advance using a correctly labeled data set, the convolutional neural network having 2 convolutional layers, 2 pooling layers, 1 fully connected layer and 1 classification layer.

4. The unknown threat detection method based on the artificial immunity idea according to claim 2, characterized in that, in step S2, the steps of convolutional neural network classification are:
S201, collect data, assign classification labels, and divide the data into two parts, a training set and a data set;
S202, design the convolutional neural network: the network contains 2 3×3 convolutional layers, 2 pooling layers, 1 fully connected layer and 1 classification layer; the pooling layers use max pooling, and the fully connected layer has 256 neurons;
S203, train the model of S202 with the training set of S201, adjust the training parameters according to the test results, and obtain the best model;
S204, use the best model of S203 to divide the data set into a self set and a nonself set.

5. The unknown threat detection method based on the artificial immunity idea according to claim 2, characterized in that the steps of negative selection in step S3 are:
S301, use an existing data set or the pre-classification module to collect the data for constructing the self set, and perform hierarchical clustering on the self set to obtain N cluster centers;
S302, randomly generate N initial detectors from the gene library, each initial detector of the same batch being defined as {string, R, tag, fitness, age}, where the radius R is determined by the minimum distance between the previous generation of initial detectors and self (formula given in Figure FDA0003377189680000021), and D is the Euclidean distance between two strings; fitness is updated when the detector matches in the detection phase; tag is a label that records the state of the detector and takes the values Immature, mature and memory; age records the generation of the detector, and an initial immature detector has age=0;
S303, for each detector: match the detector against the cluster centers in turn, calculate the distance from the detector to each cluster center and determine whether the cluster center lies within the radius; if the distance is less than the radius, the detector dies;
S304, when the negative selection process has finished for all detectors, record the minimum distance r from this batch of detectors to the self set for use by the next generation, then add the detectors to the mature detector set and update tag=mature, age=age+1.

6. The unknown threat detection method based on the artificial immunity idea according to claim 5, characterized in that the steps of clonal mutation in step S3 are:
S311, send the set of parent detectors that detected the threat into the clone mutation module;
S312, select the detectors in turn, highest affinity first, and set the initial concentration of each detector to N=1;
S313, denote the selected detector as A and the mutated detector as a, define a mutation count threshold x, and randomly select a mutation operator and mutate; if the distance between A and a is greater than the radius RA of A and fitness(a)>fitness(A), then the concentration N=N+1 (formula given in Figure FDA0003377189680000031) and detector a is generated, where Ra=RA, fitness(a)=fitness(A), age(a) is initialized to 1, and age(A)=age(A)+1;
S314, add A and a to the memory detector set and the mature detector set.

7. The unknown threat detection method based on the artificial immunity idea according to claim 6, characterized in that the steps of mRNA vaccination in step S3 are:
S321, when an unknown threat is found, send the unknown threat to the mRNA vaccination module;
S322, construct a data set and output the feature importances using a random forest algorithm;
S323, set a threshold N, take the features whose cumulative importance contribution reaches the threshold N, and add them to the gene library;
S324, randomly generate detectors carrying the important genes;
S325, enter the negative selection step;
S326, perform cloning and mutation to generate the corresponding detectors.

8. The unknown threat detection method based on the artificial immunity idea according to claim 7, characterized in that the update method in step S4 is: set a match count threshold N and an age threshold A; detectors whose age is A or above and whose match count has not reached N die.
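The decision logic of steps S1 and S2 in claim 2 can be summarized in a short routing sketch. It assumes the three checks have already been reduced to booleans and that a "positive" classification means the convolutional neural network considers the sample normal; the function name and the returned labels are hypothetical.

```python
def route_sample(memory_hit, cnn_says_normal, mature_hit):
    """Route one encoded traffic sample according to steps S1 and S2 of claim 2."""
    if memory_hit:
        return "known_threat"        # S1: a memory detector already records this threat
    if not mature_hit and cnn_says_normal:
        return "add_to_self_set"     # S2: both checks agree the sample is normal
    if mature_hit and not cnn_says_normal:
        return "unknown_threat"      # S2: both checks agree it is abnormal, go to S3
    return "manual_review"           # S2: the detector match and the CNN disagree


decision = route_sample(memory_hit=False, cnn_says_normal=False, mature_hit=True)
```

Only samples on which the mature detectors and the classifier agree are handled automatically; every disagreement is sent to manual review and then fed back as a labeled training sample.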
CN202111420523.8A 2021-11-26 2021-11-26 Unknown threat detection method based on artificial immunity thought Active CN114065933B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111420523.8A CN114065933B (en) 2021-11-26 2021-11-26 Unknown threat detection method based on artificial immunity thought

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111420523.8A CN114065933B (en) 2021-11-26 2021-11-26 Unknown threat detection method based on artificial immunity thought

Publications (2)

Publication Number Publication Date
CN114065933A true CN114065933A (en) 2022-02-18
CN114065933B CN114065933B (en) 2024-07-12

Family

ID=80276664

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111420523.8A Active CN114065933B (en) 2021-11-26 2021-11-26 Unknown threat detection method based on artificial immunity thought

Country Status (1)

Country Link
CN (1) CN114065933B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115037558A (en) * 2022-08-10 2022-09-09 军事科学院系统工程研究院网络信息研究所 Anomaly detection and evolution method for antagonistic driving
CN115296856A (en) * 2022-07-12 2022-11-04 四川大学 Evolutionary Learning Method for Encrypted Traffic Network Threat Detector Based on ResNet-AIS

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101299691A (en) * 2008-06-13 2008-11-05 南京邮电大学 Method for detecting dynamic gridding instruction based on artificial immunity
CN101478534A (en) * 2008-12-02 2009-07-08 广东海洋大学 Network exception detecting method based on artificial immunity principle
CN102750490A (en) * 2012-03-23 2012-10-24 南京邮电大学 Virus detection method based on collaborative immune network evolutionary algorithm
CN104168152A (en) * 2014-09-19 2014-11-26 西南大学 Network intrusion detection method based on multilayer immunization
CN109462578A (en) * 2018-10-22 2019-03-12 南开大学 Threat intelligence use and propagation method based on statistical learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
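杨义先; 李丽香; 彭海朋; 袁静; 陈永刚; 张浩: "Swarm intelligence algorithms and their application in information security" (群体智能算法及其在信息安全中的应用探索), 信息安全学报 (Journal of Cyber Security), no. 01, 15 January 2016 (2016-01-15) *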

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115296856A (en) * 2022-07-12 2022-11-04 四川大学 Evolutionary Learning Method for Encrypted Traffic Network Threat Detector Based on ResNet-AIS
CN115296856B (en) * 2022-07-12 2024-04-19 四川大学 Evolutionary learning method for encrypted traffic network threat detector based on ResNet-AIS
CN115037558A (en) * 2022-08-10 2022-09-09 军事科学院系统工程研究院网络信息研究所 Anomaly detection and evolution method for antagonistic driving
CN115037558B (en) * 2022-08-10 2022-10-21 军事科学院系统工程研究院网络信息研究所 Anomaly detection and evolution method for antagonistic driving

Also Published As

Publication number Publication date
CN114065933B (en) 2024-07-12

Similar Documents

Publication Publication Date Title
CN113190699A (en) Remote sensing image retrieval method and device based on category-level semantic hash
CN112087447B (en) A Rare Attack-Oriented Network Intrusion Detection Method
CN110321957A (en) It merges triple loss and generates the multi-tag image search method of confrontation network
CN113657561A (en) Semi-supervised night image classification method based on multi-task decoupling learning
CN103258147B (en) A kind of parallel evolution super-network DNA micro array gene data categorizing system based on GPU and method
CN110110753B (en) An Efficient Hybrid Feature Selection Method Based on Elite Flower Pollination Algorithm and ReliefF
CN111105045A (en) Method for constructing prediction model based on improved locust optimization algorithm
CN110136773A (en) A method for constructing plant-protein interaction network based on deep learning
CN113764034B (en) Method, device, equipment and medium for predicting potential BGC in genome sequence
CN114065933A (en) An Unknown Threat Detection Method Based on Artificial Immune Thought
CN107465664A (en) Intrusion detection method based on parallel more artificial bee colony algorithms and SVMs
CN110991518A (en) Two-stage feature selection method and system based on evolution multitask
CN115296837B (en) A sustainable integrated intrusion detection method based on SSA optimization
CN113179276B (en) Intelligent intrusion detection method and system based on explicit and implicit feature learning
CN112104602A (en) Network intrusion detection method based on CNN transfer learning
CN118116574A (en) Traditional Chinese medicine syndrome classification method and device based on improved Harris eagle optimization algorithm
CN110516599A (en) Group Behavior Recognition Model and Its Training Method Based on Progressive Relational Learning
CN108737429B (en) A method of network intrusion detection
CN118335200B (en) Lung adenocarcinoma subtype classification system, medium and equipment based on causal feature selection
CN119132405A (en) Characteristic gene screening method based on NSGA-ANN and ceRNA network construction method
CN116647409A (en) Invasion detection method based on WK-1DCNN-GRU hybrid model
Wang et al. MAHyNet: Parallel Hybrid Network for RNA-Protein Binding Sites Prediction Based on Multi-Head Attention and Expectation Pooling
CN115472305A (en) Method and system for predicting microorganism-drug association effect
CN119993281B (en) A method and system for inferring neutrophil differentiation trajectory
CN112784948A (en) Hybrid evolution method based on octopus learning memory system bionics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant