CN116192530A - Unknown threat self-adaptive detection method based on deceptive defense - Google Patents


Info

Publication number
CN116192530A
Authority
CN
China
Prior art keywords
class
minority
data set
cbn
samples
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310238699.4A
Other languages
Chinese (zh)
Inventor
丁旭阳
刘子为
谢盈
韩幸
张小松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202310238699.4A
Publication of CN116192530A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/1491 Countermeasures against malicious traffic using deception as countermeasure, e.g. honeypots, honeynets, decoys or entrapment

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses an unknown threat adaptive detection method based on deceptive defense. The method is implemented with an unknown threat adaptive traffic detection system built on deceptive defense technology. The system comprises a malicious traffic detection module, a honeypot module, and an adaptive detection upgrade module: the malicious traffic detection module discovers attacks in the network, the honeypot module captures unknown threat traffic in the network, and the adaptive detection upgrade module analyzes and learns the unknown threat traffic captured by the honeypot module and updates the learned model into the malicious traffic detection module. The invention combines deceptive defense technology with malicious traffic detection technology: unknown threats are captured through deceptive defense, and an improved dataset balancing algorithm expands these unknown threats, which are few-sample classes, so that the malicious traffic identification algorithm can recognize them. The set of attack patterns the algorithm can recognize thus grows adaptively, which improves the generality of the malicious traffic detection system.

Description

Unknown threat self-adaptive detection method based on deceptive defense
Technical Field
The invention relates to the field of industrial control security, and in particular to an unknown threat adaptive detection method based on deceptive defense.
Background
As a product of the deep integration of new-generation network information technology and manufacturing, the industrial internet is an important infrastructure and key technical support for the digital, networked, and intelligent development of industry, and is widely regarded as a cornerstone of the fourth industrial revolution. In recent years, 5G infrastructure has been steadily built out, and new technologies, new applications, and industrial internet technologies have been continuously developed, popularized, and put to use. This brings great opportunities for the development of the industrial internet, but also serious challenges.
Many attacks can damage an industrial control system through the industrial internet, so a security protection system is required to inspect traffic in the industrial internet and discover malicious traffic. However, new attack methods emerge endlessly over time, and detection is updated more slowly than the attack methods evolve. Deceptive defense technology can lure and capture attack traffic, capturing attacks in a targeted way while avoiding false triggering by normal traffic. Threats that are captured by deceptive defense technology but not by existing detection systems are referred to as unknown threats. Unknown threats are often potentially serious threats to the network, so improving the ability to detect this class of attack can greatly improve the overall security of the network.
Deceptive defense technology is a highly deceptive security protection technology that can run in various networks and terminal systems in which characteristic vulnerabilities have been deliberately left. It can induce an intruder to launch an attack and, on that basis, capture the attack source, implementing security defense and protecting important system terminals from damage. The attack techniques used by unknown threats are typically undisclosed and are not captured by the detection system. Deceptive defense technology can disguise itself as an important asset and attract these threats to attack it, so that the attacks are captured. Unknown threats are characterized by a small number of samples, strong concealment, and high harm.
Analyzing and defending against unknown threats automatically compensates for the lag of manual analysis and strengthens the ability of existing defense technology to defend against unknown attacks. Attack traffic in the network is captured through deceptive defense technology, the captured attack traffic is analyzed and classified, and attack traffic that cannot be identified is added to the dataset of the malicious traffic detection system. When unknown threats are added to the dataset, an improved dataset sample balancing algorithm is applied so that the imbalance of the dataset does not affect the malicious traffic detection system. The malicious traffic detection system then updates its original detection model with the updated dataset, so that it can detect a growing set of attack types, adaptively discover potential attack traffic in the network, and reduce human intervention.
The invention concerns the discovery and detection of unknown threats in the local area network of an industrial control system. It uses deceptive defense technology to lure and capture attack traffic, inspects and filters the captured attack traffic, adaptively detects unknown threats, and automatically upgrades the traffic detection system in the local area network so that the unknown threats can be identified.
(1) An unknown threat adaptive traffic detection system (adaptive detection system for short) based on deceptive defense technology is proposed. The adaptive detection system consists of a malicious traffic detection module, a honeypot module, and an adaptive detection upgrade module. The malicious traffic detection module discovers attacks in the network in a timely manner. The honeypot module obtains unknown threat traffic in the network. The adaptive detection upgrade module analyzes and learns the unknown threat traffic in the honeypot module and updates the learned model into the malicious traffic detection module. The adaptive detection system can analyze and discover unknown threats in the local area network.
(2) The dataset sample balancing algorithm is improved: minority-class samples are expanded according to the average distance to neighboring minority classes and the number of neighboring majority-class samples, so that the number of samples in each class of the dataset is balanced and the influence of an imbalanced dataset on the deep learning algorithm is avoided. This algorithm runs in the adaptive detection upgrade module described above, as the core algorithm for analyzing and learning unknown threat traffic.
Disclosure of Invention
To remedy the shortcomings of malicious traffic detection in current industrial control systems, the invention provides an unknown threat adaptive detection method based on deceptive defense, implemented with an unknown threat adaptive traffic detection system based on deceptive defense technology. Fig. 1 shows a schematic of the system deployed in an existing local area network. The industrial control equipment, router, and switch are existing equipment; the gateway server and the honeypots are the devices on which the system of the invention is installed. The malicious traffic detection module and the adaptive detection upgrade module of the adaptive traffic detection system are deployed on the gateway server. Each honeypot device carries the honeypot module of the system and can feed captured attack traffic back to the gateway server. Each honeypot device also carries existing deceptive defense technology and can lure and capture unknown threats. (The invention is software that runs on existing local area network equipment; the gateway server and the honeypots are the carriers that host it.)
One or more honeypot devices should be deployed in the industrial control network. The malicious traffic detection module performs threat detection with an existing neural network model. The adaptive detection upgrade module upgrades and updates the malicious traffic detection module by retraining the neural network model, which improves the detection accuracy of the adaptive traffic detection system and increases the number of threat types the system can identify.
Attack traffic is scarcer than normal traffic. When unknown threats are captured, the principles and means of these unknown attacks are generally undisclosed, strongly concealed, and highly targeted, and the captured samples are generally few. As a result, when unknown threat data is merged into the original dataset, it is isolated and scattered in the sample space of the dataset. Training the neural network model (an artificial intelligence detection technique, which is what the present invention acts on) on such data causes the model to overfit, so sample balancing must be applied to the dataset when unknown threats are merged in. The dataset originally carried by the malicious traffic detection module is referred to as the original dataset and is denoted $D_{org}$.
The invention provides an unknown threat self-adaptive detection method based on deceptive defense, which specifically comprises the following steps:
S1. Construct the unknown threat adaptive traffic detection system based on deceptive defense technology.
S2. Use the malicious traffic detection module to analyze the threat traffic captured by the honeypot module and obtain the unknown threat traffic. Build the unknown threat traffic into a preprocessed dataset $D_{pre}$.
S3. Merge the preprocessed dataset $D_{pre}$ with the original dataset $D_{org}$ to obtain a merged dataset $D_{cbn}^{*}$. Cluster the sample space of $D_{cbn}^{*}$ to obtain a clustered dataset $D_{cbn}$, and determine the division into minority classes and majority classes in $D_{cbn}$. The clustering formula for $D_{cbn}^{*}$ and the process that yields $D_{cbn}$ are as follows.
Before the clustering operation, $D_{cbn}^{*}$ consists of two parts, $D_{org}$ and $D_{pre}$. The samples of $D_{cbn}$ that originally belonged to $D_{org}$ have a complete set of class-cluster identifiers $C_{old}$:
$$C_{old} = \{c_1, c_2, \ldots, c_S\}$$
where $c_k$ denotes the position of the center point of the k-th class, $k = 1, 2, \ldots, S$, and $S$ is the total number of classes contained in the original dataset. The samples of $D_{cbn}$ that originally belonged to $D_{pre}$ are each treated as an independent class cluster. The clustering of $D_{cbn}^{*}$ is completed with the formula below, which yields the new set of class-cluster identifiers $C_{new}$ in $D_{cbn}$. The formula computes the distance between every pair of class clusters in $D_{cbn}^{*}$; if the distance between two class clusters is smaller than a preset threshold $\eta$, the two class clusters are merged.
The class-cluster distance is calculated as follows:
$$d(C_a, C_b) = \frac{1}{|C_a|\,|C_b|} \sum_{x \in C_a} \sum_{y \in C_b} \lVert x - y \rVert$$
where $C_a$ and $C_b$ denote any two class clusters, $x$ and $y$ are sample points in $C_a$ and $C_b$ respectively, and $|C_a|$ and $|C_b|$ denote the total number of samples in each class cluster. If $d(C_a, C_b) < \eta$, the two class clusters are merged; this is repeated until no further merge is possible. The resulting class clusters are sorted from largest to smallest by number of samples:
$$C_{new} = \{C_1, C_2, \ldots, C_N\}$$
where $N$ is the number of new class clusters. This yields the clustered dataset $D_{cbn}$.
As the calculation flow shows, each new sample point ends up in one of two places: either it is merged into one of the $S$ existing classes, or it forms a new class. Because unknown threat samples are few, the first $S$ known classes are the majority classes and the last $N-S$ classes are the minority classes. The unknown threats are concentrated in the minority classes.
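The merging procedure of S3 can be sketched as follows. This is a minimal illustration rather than the patented implementation: it assumes Euclidean distance between sample points, represents each class cluster as a list of coordinate tuples, and merges the closest qualifying pair first (the text only requires merging pairs whose distance is below the threshold). The names `cluster_distance` and `merge_clusters` are hypothetical.

```python
import math
from itertools import combinations


def cluster_distance(ca, cb):
    """Average pairwise Euclidean distance between two class clusters."""
    total = sum(math.dist(x, y) for x in ca for y in cb)
    return total / (len(ca) * len(cb))


def merge_clusters(clusters, eta):
    """Merge class clusters whose distance is below eta until none qualify.

    Assumption: the closest qualifying pair is merged first.
    """
    clusters = [list(c) for c in clusters]
    while True:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = cluster_distance(clusters[a], clusters[b])
            if d < eta and (best is None or d < best[0]):
                best = (d, a, b)
        if best is None:
            break
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]  # b > a, so index a stays valid
    # Sort from largest to smallest by sample count, as in C_new.
    return sorted(clusters, key=len, reverse=True)


# Two known classes plus three unknown samples, each starting as its own cluster.
known_a = [(0, 0), (0, 1), (1, 0), (1, 1)]
known_b = [(10, 10), (10, 11), (11, 10)]
unknown = [[(0.5, 0.5)], [(30, 30)], [(30.5, 30)]]
c_new = merge_clusters([known_a, known_b] + unknown, eta=2.0)
print([len(c) for c in c_new])
```

In this toy input, one unknown sample is absorbed by a known class and the other two form a new, small class cluster, which is the minority class in the sense of S3.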
S4. For the clustered dataset $D_{cbn}$, measure the learning complexity $ld$ of the minority-class samples using the neighbor-minority-class average distance $d_{sa}$ and the neighbor-majority-class count $N_{ma}$ in the sample space.
For a given minority class $C_i$, the neighbor-minority-class average distance is computed as follows. $C_i$ is a class in the set $\{C_{S+1}, C_{S+2}, \ldots, C_N\}$, where $i$ is an integer from $S+1$ to $N$. The average distance $d_{sa}^{i}$ between $C_i$ and its $m$ nearest minority classes is the neighbor-minority-class average distance of $C_i$ and represents how sparse the samples of this class are within the minority-class space. Here $m$ is a preset constant, and the $m$ nearest minority classes are the first $m$ minority classes when all minority classes are sorted by increasing distance from $C_i$.
For a given minority class $C_i$, the local neighbor-majority-class count is computed as follows. Taking $C_i$ as the center and the distance to its $m$-th nearest minority class as the radius, the number of majority-class samples inside the resulting circle is $N_{ma}^{i}$, the local neighbor-majority-class count of $C_i$.
The larger the neighbor-minority-class average distance $d_{sa}^{i}$, the sparser the spatial distribution of the minority-class samples; such samples are hard to learn and should be given a higher learning weight. Conversely, the denser the distribution, the easier the class is to learn and the lower its learning complexity.
The larger the local neighbor-majority-class count $N_{ma}^{i}$, the closer the class lies to the boundary or center region of the majority-class samples, and the higher the learning cost of the class, because the class is affected by the majority-class samples. Conversely, the smaller the count, the lower the learning complexity and the easier the class is to learn.
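The two neighborhood measures can be sketched as below. Several details are assumptions made only for illustration: the distance between two minority classes is taken to be the average pairwise Euclidean distance between their samples, the "center" of $C_i$ is taken to be the arithmetic mean of its samples, and the function names are hypothetical.

```python
import math


def centroid(cluster):
    """Arithmetic mean of the sample points (assumed center of the class)."""
    dim = len(cluster[0])
    return tuple(sum(p[k] for p in cluster) / len(cluster) for k in range(dim))


def cluster_distance(ca, cb):
    """Average pairwise Euclidean distance between two class clusters."""
    total = sum(math.dist(x, y) for x in ca for y in cb)
    return total / (len(ca) * len(cb))


def neighbor_stats(i, minority, majority, m):
    """Return (d_sa, n_ma) for minority[i].

    d_sa: mean distance to the m nearest other minority classes.
    n_ma: number of majority samples within the radius given by the
          distance to the m-th nearest minority class.
    Requires at least one other minority class.
    """
    ci = minority[i]
    dists = sorted(
        cluster_distance(ci, c) for j, c in enumerate(minority) if j != i
    )
    nearest = dists[:m]
    d_sa = sum(nearest) / len(nearest)
    radius = nearest[-1]
    center = centroid(ci)
    n_ma = sum(
        1
        for cluster in majority
        for x in cluster
        if math.dist(center, x) <= radius
    )
    return d_sa, n_ma


minority = [[(0, 0)], [(3, 0)], [(0, 4)]]
majority = [[(1, 0), (2, 0), (10, 10)]]
print(neighbor_stats(0, minority, majority, m=2))  # (3.5, 2)
```

In the toy input, the first minority class is at distances 3 and 4 from the other two, so its average distance is 3.5, and two majority samples fall inside the radius of 4.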
The neighbor-minority-class average distance is normalized as follows:
$$\hat{d}_{sa}^{i} = \frac{d_{sa}^{i}}{\max_{S+1 \le j \le N} d_{sa}^{j}}$$
where $j$ is a positive integer from $S+1$ to $N$, $d_{sa}^{j}$ denotes the neighbor-minority-class average distance of the j-th minority class, and $\hat{d}_{sa}^{i}$ denotes the normalized neighbor-minority-class average distance of the i-th minority class.
The local neighbor-majority-class count is normalized as follows:
$$\hat{N}_{ma}^{i} = \frac{N_{ma}^{i}}{\max_{S+1 \le j \le N} N_{ma}^{j}}$$
where $N_{ma}^{j}$ denotes the j-th local neighbor-majority-class count and $\hat{N}_{ma}^{i}$ denotes the normalized value of the i-th local neighbor-majority-class count.
The learning complexity $ld_i$ of minority class $C_i$ is then calculated as:
$$ld_i = \alpha\,\hat{d}_{sa}^{i} + (1 - \alpha)\,\hat{N}_{ma}^{i}$$
where $\hat{d}_{sa}^{i}$ and $\hat{N}_{ma}^{i}$ are the normalized neighbor-minority-class average distance and the normalized local neighbor-majority-class count of $C_i$, and $\alpha$ is a weighting coefficient that balances the relationship between the two. The larger the learning complexity $ld_i$, the harder the sample class $C_i$ is to learn and the more neighbor samples need to be generated.
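A short sketch of the learning complexity computation follows. It rests on two assumptions that the text does not state outright: each quantity is normalized by dividing by its maximum over the minority classes, and the two normalized values are combined as a convex combination weighted by α. With these assumptions, the learning complexity never exceeds 1. The function name and the input values are illustrative only.

```python
def learning_complexity(d_sa, n_ma, alpha):
    """Learning complexity per minority class.

    d_sa:  neighbor-minority-class average distances, one per minority class
    n_ma:  local neighbor-majority-class counts, one per minority class
    alpha: weighting coefficient in [0, 1]
    Assumption: normalization divides by the maximum over minority classes.
    """
    d_max = max(d_sa)
    n_max = max(n_ma)
    result = []
    for d, n in zip(d_sa, n_ma):
        d_hat = d / d_max if d_max > 0 else 0.0
        n_hat = n / n_max if n_max > 0 else 0.0
        result.append(alpha * d_hat + (1 - alpha) * n_hat)
    return result


# Illustrative values for two minority classes (not taken from the embodiment).
ld = learning_complexity(d_sa=[4.0, 2.0], n_ma=[5, 10], alpha=0.6)
print([round(v, 4) for v in ld])  # [0.8, 0.7]
```

The first class is sparser but has fewer majority neighbors; with α = 0.6 the sparsity term dominates, so it receives the higher complexity.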
S5. For the clustered dataset $D_{cbn}$, calculate the number of minority-class samples to synthesize from the learning complexity $ld$. Generate new minority-class samples with the synthesis formula and merge them into $D_{cbn}$ to form a new sample dataset $D_{nrb}$, whose sample space tends toward balance.
For a given minority class $C_i$, the number of samples to generate is calculated as:
$$G_i = ld_i \times (M_{max} - M_i)$$
where $G_i$ is the number of samples to generate, $M_i$ is the number of samples in the i-th class, that is, in sample class $C_i$, and $M_{max}$ is the maximum sample count, that is, the number of samples in the most populous class of $D_{cbn}$, class $C_1$. Since the learning complexity is at most 1, $M_{max}$ serves as the upper limit on the resulting sample count, which keeps the number of generated samples balanced.
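The sample count can be checked against the figures reported in the embodiment ($M_{max}$ = 16200, minority classes of 102 and 308 samples, learning complexities 0.9652 and 0.86521). The sketch below assumes the count is the learning complexity times the gap to the largest class, rounded down; the rounding rule is an assumption, chosen because it reproduces the reported counts of 15537 and 13749. The function name is hypothetical.

```python
import math


def synthesis_count(ld_i, m_i, m_max):
    """Number of samples to synthesize for one minority class.

    Assumption: the product is rounded down to an integer.
    """
    return math.floor(ld_i * (m_max - m_i))


# Values from the embodiment.
print(synthesis_count(0.9652, 102, 16200))   # 15537
print(synthesis_count(0.86521, 308, 16200))  # 13749
```

Because the learning complexity is at most 1, a minority class never grows beyond $M_{max}$ samples after synthesis.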
For a given minority class $C_i$, sample points are generated with the following formula:
$$x_{new} = x_p + \mathrm{rand}(0,1) \times (x_q - x_p)$$
where $x_{new}$ is the newly synthesized sample and $x_p$ is the coordinate vector of the center point of minority class cluster $C_i$. Each minority class has exactly one center point: $p$ corresponds to $C_i$, $p+1$ corresponds to $C_{i+1}$, and $p$ is a natural number. $x_q$ is the coordinate vector of a random sample point in $C_i$, where $q$ is a random integer from 1 to $|C_i|$ and $|C_i|$ is the number of samples in class cluster $C_i$. The formula is used to generate $G_i$ sample points, which are added to minority class $C_i$; at this point the sample balancing of $C_i$ is complete.
Applying the above procedure to every minority sample class in $D_{cbn}$ completes the overall sample balancing of $D_{cbn}$ and yields the new dataset $D_{nrb}$.
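The synthesis step can be sketched as an interpolation between the class center and a randomly chosen sample of the same class. The sketch assumes the center point is the arithmetic mean of the class samples, which the text does not specify; the function name `synthesize` and the optional `rng` parameter are illustrative.

```python
import random


def synthesize(cluster, count, rng=None):
    """Generate `count` new points for one minority class.

    Each point is x_p + r * (x_q - x_p), with x_p the class center
    (assumed to be the mean), x_q a random sample, and r drawn from [0, 1).
    """
    if rng is None:
        rng = random.Random()
    dim = len(cluster[0])
    center = [sum(p[k] for p in cluster) / len(cluster) for k in range(dim)]
    new_points = []
    for _ in range(count):
        x_q = rng.choice(cluster)
        r = rng.random()
        new_points.append(tuple(c + r * (q - c) for c, q in zip(center, x_q)))
    return new_points


cluster = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
generated = synthesize(cluster, count=5, rng=random.Random(0))
balanced = cluster + generated
print(len(balanced))  # 9
```

Every generated point lies on a segment between the center and an existing sample, so the synthetic samples stay inside the region already occupied by the class.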
S6. Retrain the neural network model on dataset $D_{nrb}$, and update the retrained model into the malicious traffic detection module through the adaptive detection upgrade module of the adaptive traffic detection system. This completes the automatic update of the adaptive traffic detection system, which can now detect the unknown threat traffic in the local area network (the detection algorithm belongs to the malicious traffic detection module).
The invention combines deceptive defense technology with malicious traffic detection technology: unknown threats are captured through deceptive defense, and an improved dataset balancing algorithm expands these unknown threats, which are few-sample classes, so that the malicious traffic identification algorithm can recognize them. The set of attack patterns the algorithm can recognize thus grows adaptively, which further improves the generality of the malicious traffic detection system.
Drawings
FIG. 1 is a deployment diagram of the unknown threat adaptive traffic detection system based on deceptive defense technology constructed according to the invention;
FIG. 2 is a schematic flow chart of the method of the present invention.
Detailed Description
The following describes specific embodiments of the invention to help those skilled in the art understand it. It should be understood, however, that the invention is not limited to the scope of these embodiments. To those of ordinary skill in the art, variations are apparent so long as they fall within the spirit and scope of the invention as defined by the appended claims, and all inventions that make use of the inventive concept are protected.
The adaptive malicious traffic detection method is implemented on the basis of honeypots. While protecting the security of the industrial control network, the method uses the attack data in the honeypots to improve the accuracy and generality of its own identification, so that the malicious traffic detection module can adaptively detect and discover new network attacks.
The invention provides an unknown threat adaptive detection method based on deceptive defense, implemented according to the industrial control network deployment shown in Fig. 1. The industrial control network contains normal industrial control equipment, and the honeypots are deployed in the network in a way that does not affect normal business processes. The malicious traffic detection module is deployed at a backbone position of the industrial control network, preferably at the gateway server.
As shown in Fig. 2, the unknown threat adaptive detection method based on deceptive defense includes the following steps:
Step 1: Use the malicious traffic detection system to analyze the threat traffic captured by the honeypots and obtain the unknown traffic. Build the unknown traffic into the preprocessed dataset $D_{pre}$.
In this embodiment, the threat classes in the original dataset include Bot, Brute Force-Web, SQL Injection, and DoS attacks-GoldenEye. Part of the CIC-IDS-2017 dataset is used: Friday-WorkingHours-Morning.pcap_ISCX.csv serves as the original dataset $D_{org}$ of the experiment.
Each row in the file corresponds to a session flow. 82-dimensional features are selected, some of which are shown in Table 1 below:
Table 1. Partial features of the CIC-IDS-2017 network malicious traffic dataset
Feature name       Description
Dst Port           Destination port
Protocol           Protocol type
Flow Duration      Duration of the flow
Tot Fwd Pkts       Total number of forward packets
Tot Bwd Pkts       Total number of backward packets
TotLen Fwd Pkts    Total length of forward packets
TotLen Bwd Pkts    Total length of backward packets
Fwd Pkt Len Max    Maximum forward packet length
Fwd Pkt Len Min    Minimum forward packet length
Fwd Pkt Len Mean   Mean forward packet length
The initial order of the features in the CIC-IDS-2017 network malicious traffic dataset can be set to the top-to-bottom order of the table (the table shows only part of the features), expressed as Dst Port, Protocol, Flow Duration, Tot Fwd Pkts, Tot Bwd Pkts, TotLen Fwd Pkts, TotLen Bwd Pkts, Fwd Pkt Len Max, Fwd Pkt Len Min, Fwd Pkt Len Mean, ….
Attack traffic of the FTP-BruteForce and SSH-Bruteforce types is injected into the industrial control network. Once the attack traffic has been injected, the honeypot module collects it (honeypot collection starts after the attack is injected) and feeds it back to the adaptive traffic detection system. The malicious traffic detection module in the adaptive traffic detection system uses a convolutional neural network model to inspect the attack traffic fed back by the honeypots. If any of the returned attack traffic is detected as normal, that traffic is extracted, labeled as unknown, and used to build the preprocessed dataset $D_{pre}$.
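The filtering rule of Step 1 can be sketched as follows: a flow captured by the honeypot is known to be an attack, so if the detector still classifies it as normal, it is treated as an unknown threat. The `classify` callable stands in for the convolutional neural network model, and the record layout and the toy detector below are hypothetical and used only for illustration.

```python
def build_preprocessed_dataset(honeypot_flows, classify):
    """Collect honeypot-captured flows that the detector labels as normal.

    honeypot_flows: iterable of feature dictionaries, one per flow
    classify:       callable returning a class label for one flow
    """
    d_pre = []
    for features in honeypot_flows:
        if classify(features) == "normal":
            d_pre.append({"features": features, "label": "unknown"})
    return d_pre


def toy_detector(flow):
    """Stand-in for the trained model: it only recognizes Bot traffic on port 8080."""
    return "Bot" if flow["Dst Port"] == 8080 else "normal"


flows = [
    {"Dst Port": 21, "Tot Fwd Pkts": 9},    # FTP brute force, missed by the detector
    {"Dst Port": 8080, "Tot Fwd Pkts": 3},  # recognized as Bot
    {"Dst Port": 22, "Tot Fwd Pkts": 14},   # SSH brute force, missed by the detector
]
d_pre = build_preprocessed_dataset(flows, toy_detector)
print(len(d_pre))  # 2
```

Flows the detector already recognizes are left out, so only the traffic the current model cannot identify enters the preprocessed dataset.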
Step 2: Merge the preprocessed dataset $D_{pre}$ with the original dataset $D_{org}$ to obtain a merged dataset $D_{cbn}^{*}$. Cluster the sample space of $D_{cbn}^{*}$ to obtain a clustered dataset $D_{cbn}$, and determine the division into minority classes and majority classes in the dataset.
After the class clusters are merged, 2 minority class clusters and 4 majority class clusters are found.
Step 3: For the clustered dataset $D_{cbn}$, measure the learning complexity $ld$ of the minority-class samples using the neighbor-minority-class average distance $d_{sa}$ and the neighbor-majority-class count $N_{ma}$ in the sample space.
The weighting coefficient $\alpha$ is set to 0.6. The learning complexities of the two minority class clusters are 0.9652 and 0.86521 respectively.
Step 4: For the clustered dataset $D_{cbn}$, calculate the number of minority-class samples to synthesize from the learning complexity $ld$. Generate new minority-class samples with the synthesis formula and merge them into $D_{cbn}$ to form a new sample dataset $D_{nrb}$.
In this embodiment, the Bot class has the most samples, so the maximum sample count $M_{max}$ is 16200. In $D_{cbn}$, the first minority class cluster has 102 samples and the second has 308 samples. The numbers of samples the two minority classes need to generate are 15537 and 13749 respectively. The generated samples are merged into $D_{cbn}$ to obtain the new sample dataset $D_{nrb}$.
Step 5: Retrain the convolutional neural network model on dataset $D_{nrb}$, and update the retrained model into the malicious traffic detection module through the adaptive detection upgrade module of the adaptive traffic detection system. This completes the automatic update of the adaptive traffic detection system, which can now detect the unknown threat traffic in the local area network.
After the update, the adaptive traffic detection system can discover FTP-BruteForce and SSH-Bruteforce threat traffic.
From the above description, it will be apparent to those skilled in the art that embodiments of the method may be implemented on a general-purpose hardware platform as required by the system. On this understanding, a practitioner can implement the invention in a suitable development environment, may choose a recurrent neural network according to actual model requirements, and should adjust the type of honeypot to the actual scenario.
In summary, the accuracy of the existing machine learning model can be further improved by the method of the invention. The embodiments described above are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Claims (3)

1. An unknown threat adaptive detection method based on deceptive defense, characterized in that the method is implemented with an unknown threat adaptive traffic detection system based on deceptive defense technology, and the system is deployed in an existing local area network as follows: the industrial control equipment, router, and switch are existing equipment, and the adaptive traffic detection system is installed on the gateway server and the honeypot devices; the malicious traffic detection module and the adaptive detection upgrade module of the adaptive traffic detection system are deployed on the gateway server; each honeypot device carries the honeypot module of the adaptive traffic detection system and can feed captured attack traffic back to the gateway server, and each honeypot device also carries existing deceptive defense technology and can lure and capture unknown threats; one or more honeypot devices are deployed in the industrial control network; the malicious traffic detection module of the adaptive traffic detection system performs threat detection with an existing neural network model, and the adaptive detection upgrade module upgrades and updates the malicious traffic detection module by retraining the neural network model;
the method comprises the following steps:
S1. Use the malicious traffic detection module to analyze the threat traffic captured by the honeypot module and obtain the unknown threat traffic, build the unknown threat traffic into a preprocessed dataset $D_{pre}$, and record the dataset originally carried by the malicious traffic detection module as the original dataset $D_{org}$;
S2. Merge the preprocessed dataset $D_{pre}$ with the original dataset $D_{org}$ to obtain a merged dataset $D_{cbn}^{*}$, cluster the sample space of $D_{cbn}^{*}$ to obtain a clustered dataset $D_{cbn}$, and determine the division into minority classes and majority classes in $D_{cbn}$;
the clustering formula for the merged dataset $D_{cbn}^{*}$ and the process that yields the clustered dataset $D_{cbn}$ are as follows:
before the clustering operation, $D_{cbn}^{*}$ consists of two parts, $D_{org}$ and $D_{pre}$; the samples of $D_{cbn}$ that originally belonged to $D_{org}$ have a complete set of class-cluster identifiers $C_{old}$:
$$C_{old} = \{c_1, c_2, \ldots, c_S\}$$
where $c_k$ denotes the position of the center point of the k-th class, $k = 1, 2, \ldots, S$, and $S$ is the total number of classes contained in the original dataset; the samples of $D_{cbn}$ that originally belonged to $D_{pre}$ are each treated as an independent class cluster; the clustering of $D_{cbn}^{*}$ is completed with the formula below, which yields the new set of class-cluster identifiers $C_{new}$ in $D_{cbn}$; the formula computes the distance between every pair of class clusters in $D_{cbn}^{*}$, and if the distance between two class clusters is smaller than a preset threshold $\eta$, the two class clusters are merged;
the class-cluster distance is calculated as follows:
$$d(C_a, C_b) = \frac{1}{|C_a|\,|C_b|} \sum_{x \in C_a} \sum_{y \in C_b} \lVert x - y \rVert$$
where $C_a$ and $C_b$ denote any two class clusters, $x$ and $y$ are sample points in $C_a$ and $C_b$ respectively, and $|C_a|$ and $|C_b|$ denote the total number of samples in each class cluster; if $d(C_a, C_b) < \eta$, the two class clusters are merged, and this is repeated until no further merge is possible; the resulting class clusters are sorted from largest to smallest by number of samples:
$$C_{new} = \{C_1, C_2, \ldots, C_N\}$$
where $N$ is the number of new class clusters, which yields the clustered dataset $D_{cbn}$;
according to the calculation flow, each new sample point ends up in one of two places: either it is merged into one of the $S$ existing classes, or it forms a new class; because unknown threat samples are few, the first $S$ known classes are the majority classes and the last $N-S$ classes are the minority classes, and the unknown threats are concentrated in the minority classes;
S3: for the clustered data set $D_{cbn}$, measure the learning complexity $ld$ of the minority-class samples by the average distance $d_{sa}$ to the neighbor minority classes and the number $N_{ma}$ of local neighbor majority-class samples in the sample space;
For a given minority class $C_i$, the average distance to its neighbor minority classes is calculated as follows, wherein $C_i$ is a class in the set of minority classes $\{C_{S+1}, \ldots, C_N\}$ and i is an integer from S+1 to N:
The average distance from $C_i$ to its m neighbor minority classes, $d_{sa}^{i}$, is the neighbor-minority-class average distance of $C_i$ and represents the sparseness of this class's samples in the minority-class space, wherein m is a preset constant and the m neighbor minority classes of $C_i$ are the first m minority classes when all minority classes are sorted by their distance to $C_i$ from small to large;
For a given minority class $C_i$, the local neighbor majority number is calculated as follows:
Taking the distance between $C_i$ and its m-th neighbor minority class as the radius, the number of majority-class samples inside the resulting circle, $N_{ma}^{i}$, is the local neighbor majority number of $C_i$;
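The two measurements described above can be sketched as follows. The claim does not state how the distance between two classes is measured or where the circle is centered; this sketch assumes Euclidean distance between class center points and a circle centered on the center point of $C_i$. The function names and toy data are hypothetical.

```python
import math

def centroid(points):
    """Center point of a class cluster (coordinate-wise mean)."""
    dim = len(points[0])
    return tuple(sum(p[k] for p in points) / len(points) for k in range(dim))

def neighbor_stats(minority, majority, i, m):
    """Return (d_sa, n_ma) for minority[i]; needs at least m other minority classes."""
    center = centroid(minority[i])
    # Distances to all other minority classes, smallest first
    dists = sorted(
        math.dist(center, centroid(c)) for j, c in enumerate(minority) if j != i
    )
    nearest = dists[:m]
    d_sa = sum(nearest) / len(nearest)  # average distance to the m neighbors
    radius = nearest[-1]                # distance to the m-th neighbor
    # Count majority-class samples inside the circle of that radius
    n_ma = sum(
        1 for cluster in majority for x in cluster
        if math.dist(center, x) <= radius
    )
    return d_sa, n_ma

minority = [[(0.0, 0.0)], [(3.0, 0.0)], [(0.0, 4.0)], [(10.0, 0.0)]]
majority = [[(1.0, 0.0), (2.0, 0.0), (5.0, 0.0)], [(0.0, 3.5), (8.0, 8.0)]]

d_sa, n_ma = neighbor_stats(minority, majority, i=0, m=2)
print(d_sa, n_ma)
```

For the first minority class the two nearest minority classes lie at distances 3 and 4, so the average distance is 3.5 and the circle of radius 4 contains three majority-class samples.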
The larger the neighbor-minority-class average distance $d_{sa}^{i}$, the sparser the spatial distribution of the minority-class samples and the harder they are to learn, so a higher learning weight is given; conversely, the denser the distribution, the easier the samples are to learn and the lower the learning complexity;
The larger the local neighbor majority number $N_{ma}^{i}$, the higher the learning cost of the class; conversely, the smaller the value, the lower the learning complexity and the easier the class is to learn;
The average distance of the neighbor minority classes is normalized as follows:
Figure FDA0004123367080000026
wherein j is a positive integer from S+1 to N, $d_{sa}^{j}$ represents the neighbor-minority-class average distance of the j-th minority class, and the symbol shown in Figure FDA0004123367080000028 represents the normalized neighbor-minority-class average distance of the i-th minority class;
The local neighbor majority number is normalized as follows:
Figure FDA0004123367080000029
wherein $N_{ma}^{j}$ represents the local neighbor majority number of the j-th minority class, and the symbol shown in Figure FDA00041233670800000211 represents the normalized local neighbor majority number of the i-th minority class;
The learning complexity $ld_i$ of minority class $C_i$ is then calculated as follows:
Figure FDA0004123367080000031
wherein α represents a weighting coefficient that balances the average distance of the neighbor minority classes against the local neighbor majority number; the larger the learning complexity $ld_i$, the more difficult sample class $C_i$ is to learn and the more neighboring samples need to be generated;
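The normalization and weighting formulas of step S3 survive only as image references in this text, so their exact form cannot be read from it. The sketch below is therefore an assumption, not the claimed formula: it normalizes each quantity by its sum over all minority classes and combines the two normalized values as a convex combination weighted by α. The function names are hypothetical; the default α = 0.6 is the value given in claim 3.

```python
def normalize(values):
    """Assumed normalization: divide each value by the sum over all minority classes."""
    total = sum(values)
    return [v / total for v in values] if total else [0.0 for _ in values]

def learning_complexity(d_sa, n_ma, alpha=0.6):
    """Assumed form: ld_i = alpha * d_hat_i + (1 - alpha) * n_hat_i."""
    d_hat = normalize(d_sa)
    n_hat = normalize(n_ma)
    return [alpha * d + (1 - alpha) * n for d, n in zip(d_hat, n_hat)]

# Two minority classes: the second is three times sparser, both have
# the same number of majority-class samples nearby.
ld = learning_complexity(d_sa=[1.0, 3.0], n_ma=[4, 4])
print(ld)
```

Under these assumptions the sparser class receives the larger learning complexity, which matches the stated intent that sparsely distributed minority classes are harder to learn and need more synthesized samples.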
S4: for the clustered data set $D_{cbn}$, calculate the number of minority-class samples to synthesize according to the learning complexity $ld$, generate new minority-class samples through a synthesis formula, and merge the generated samples into $D_{cbn}$ to form a new sample data set $D_{nrb}$ whose sample space tends toward balance;
For a given minority class $C_i$, the number of samples to generate is calculated as follows:
Figure FDA0004123367080000032
wherein the symbol shown in Figure FDA0004123367080000033 is the number of samples to generate, $M_i$ represents the number of samples of the i-th class, i.e. of sample class $C_i$, and $M_{max}$ is the maximum sample count, i.e. the number of samples in the class of data set $D_{cbn}$ that has the most samples (the class shown in Figure FDA0004123367080000034);
For a given minority class $C_i$, sample points are generated by the following formula:
$$x_{new} = x_p + \mathrm{rand}(0,1) \times (x_q - x_p)$$
wherein $x_{new}$ is the newly synthesized sample, $x_p$ is the coordinate vector of the center point of minority class cluster $C_i$ (each minority class has exactly one center point), $x_q$ is the coordinate vector of a random sample point of $C_i$, p is a natural number, q is a random integer from 1 to $|C_i|$, and $|C_i|$ represents the total number of samples in class cluster $C_i$. The formula is used to generate the number of sample points given by Figure FDA0004123367080000035, and the newly generated sample points are added to minority class $C_i$, at which point sample balancing of $C_i$ is complete;
The above procedure is applied to each minority sample class of data set $D_{cbn}$ to complete the overall sample balancing of $D_{cbn}$, resulting in the new data set $D_{nrb}$;
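The synthesis formula $x_{new} = x_p + \mathrm{rand}(0,1) \times (x_q - x_p)$ can be sketched as follows. Each new point is placed on the segment between the class center point and a randomly chosen existing sample. The number of points to generate is passed in as a parameter, because the formula that determines it is only available as an image in this text; `synthesize` and `centroid` are hypothetical helper names.

```python
import random

def centroid(points):
    """Center point x_p of a class cluster (coordinate-wise mean)."""
    dim = len(points[0])
    return tuple(sum(p[k] for p in points) / len(points) for k in range(dim))

def synthesize(cluster, count, rng=None):
    """Generate `count` new points between the class center and random samples."""
    rng = rng or random.Random()
    x_p = centroid(cluster)
    new_points = []
    for _ in range(count):
        x_q = rng.choice(cluster)   # random existing sample of the class
        t = rng.random()            # rand(0, 1), drawn from [0, 1)
        new_points.append(tuple(p + t * (q - p) for p, q in zip(x_p, x_q)))
    return new_points

cluster = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
generated = synthesize(cluster, count=5, rng=random.Random(0))
balanced = cluster + generated
print(len(balanced))
```

Because every synthesized point lies between the center point and an existing sample, the new points stay inside the region already occupied by the minority class rather than spreading toward neighboring majority classes.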
S5: retrain the neural network model using data set $D_{nrb}$, and update the retrained neural network model into the malicious traffic detection module through the adaptive detection upgrade module of the adaptive traffic detection system; the automatic update of the adaptive traffic detection system is thereby completed, and unknown threat traffic in the local area network can be detected.
2. The unknown threat self-adaptive detection method based on deceptive defense according to claim 1, characterized in that the threat categories of the original data set $D_{org}$ include Bot, Brute Force-Web, SQL Injection, and DoS attacks-GoldenEye.
3. The unknown threat self-adaptive detection method based on deceptive defense according to claim 2, wherein the value of α in step S3 is 0.6.
CN202310238699.4A 2023-03-13 2023-03-13 Unknown threat self-adaptive detection method based on deceptive defense Pending CN116192530A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310238699.4A CN116192530A (en) 2023-03-13 2023-03-13 Unknown threat self-adaptive detection method based on deceptive defense

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310238699.4A CN116192530A (en) 2023-03-13 2023-03-13 Unknown threat self-adaptive detection method based on deceptive defense

Publications (1)

Publication Number Publication Date
CN116192530A true CN116192530A (en) 2023-05-30

Family

ID=86442395

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310238699.4A Pending CN116192530A (en) 2023-03-13 2023-03-13 Unknown threat self-adaptive detection method based on deceptive defense

Country Status (1)

Country Link
CN (1) CN116192530A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117811841A (en) * 2024-02-29 2024-04-02 深圳市常行科技有限公司 Threat monitoring defense system, method and equipment for internal network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination