CN113660273B - Intrusion detection method and device based on deep learning under super fusion architecture - Google Patents

Publication number
CN113660273B
CN113660273B CN202110948874.XA CN202110948874A CN113660273B CN 113660273 B CN113660273 B CN 113660273B CN 202110948874 A CN202110948874 A CN 202110948874A CN 113660273 B CN113660273 B CN 113660273B
Authority
CN
China
Prior art keywords
bird nest
deep learning
virtual machine
intrusion detection
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110948874.XA
Other languages
Chinese (zh)
Other versions
CN113660273A (en)
Inventor
张运厚
任吉媛
刘阜阳
宋阳
王哲
尹路
彭玉怀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeast Branch Of State Grid Corp Of China
Original Assignee
Northeast Branch Of State Grid Corp Of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeast Branch Of State Grid Corp Of China
Priority to CN202110948874.XA
Publication of CN113660273A
Application granted
Publication of CN113660273B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/142 Network analysis or design using statistical or mathematical methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/1458 Denial of Service
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/1483 Countermeasures against malicious traffic service impersonation, e.g. phishing, pharming or web spoofing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/20 Network architectures or network communication protocols for network security for managing network security; network security policies in general
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2463/00 Additional details relating to network architectures or network communication protocols for network security covered by H04L63/00
    • H04L2463/146 Tracing the source of attacks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/50 Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computing Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computer Hardware Design (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Algebra (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Probability & Statistics with Applications (AREA)
  • Pure & Applied Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides an intrusion detection method, an intrusion detection device and a storage medium based on deep learning under a super-fusion architecture, and relates to the technical field of intrusion detection. In particular, to reduce redundant information, an improved binary cuckoo algorithm is used for feature selection, which preserves detection accuracy while reducing both the size of the deep learning model and the time needed to extract network traffic features.

Description

Intrusion detection method and device based on deep learning under super fusion architecture
Technical Field
The present invention relates to the field of intrusion detection technologies, and in particular, to an intrusion detection method and apparatus based on deep learning under a super fusion architecture, and a storage medium.
Background
The super fusion (hyper-converged) architecture is an emerging cloud computing architecture that takes virtualization as its core and pre-integrates resources such as computation, storage and network into a standard server (such as an X86 or ARM server) to form a standardized super-fusion unit. Basic functions such as storage, computation and networking are virtualized in a software-defined manner. The super-fusion units are aggregated into a data center through the network to form an IT infrastructure, which allows the infrastructure to be deployed quickly and operated and maintained simply. Compared with traditional data center architectures, the most fundamental change in the super-fusion architecture is the storage model: centralized shared storage (SAN/NAS) is replaced by software-defined storage (e.g., distributed storage), and storage is allocated by a controller virtual machine (CVM, Controller VM) rather than by a physical machine. Therefore, when a virtual machine receives or sends a DDoS (Distributed Denial of Service) attack, not only is the attacked virtual machine paralyzed and unable to provide services to users, but the storage resources of the current super-fusion node may also be maliciously occupied, which affects the normal operation of other virtual machines. An intrusion detection module therefore needs to be installed under the super fusion architecture to protect super-fusion virtual machines from the malicious impact of DDoS attacks.
Traditional intrusion detection provides protection by monitoring traffic or logs on hosts and networks in order to discover intrusions and malicious operations. The common approaches are misuse-based intrusion detection and anomaly-based intrusion detection. Misuse-based intrusion detection detects intrusion behavior directly from known attack signatures, whereas anomaly-based intrusion detection detects deviations from normal data. For example, Chinese patent CN 111464510A discloses a real-time network intrusion detection method based on a fast gradient boosting tree model, which captures network traffic data in continuous time windows and feeds the feature vectors of the traffic data into a fast gradient boosting tree classification model to determine whether the behavior is normal or a network intrusion. For the super-fusion architecture, Chinese patent CN 112165495A provides a DDoS attack prevention method based on the super-fusion architecture, a corresponding DDoS attack prevention device, and a super-fusion cluster. By analyzing the trend of the data traffic, it judges whether a super-fusion virtual machine is likely to be under DDoS attack.
Misuse-based intrusion detection requires the signature library to be updated manually and signatures to be extracted, and it cannot find unknown attacks. Anomaly-based intrusion detection requires a model of normal behavior to be trained and established, and its detection rate depends directly on the quality of that model. Most researchers use machine learning to build intrusion detection models and improve the detection rate, but this is often accompanied by problems such as an excessive number of model features and high overhead. For example, although Chinese patent CN 111464510A improves detection accuracy, its feature dimension is too large and its model overhead is high. In addition, most traditional intrusion detection modules are deployed on physical hosts, and because information acquisition from virtual machines is transparent in a virtualized environment, such modules cannot detect malicious behavior among the virtual machines in the virtual environment. If the traditional approach is imitated and an intrusion detection module is installed inside each super-fusion virtual machine, the module occupies virtual machine resources and produces considerable overhead, unified management is lacking, the modules of all super-fusion virtual machines cannot be upgraded at the same time, and the differing levels of protection across virtual machines may create security risks.
Disclosure of Invention
Accordingly, the present invention provides an intrusion detection method, device and storage medium based on deep learning under a super-fusion architecture, so as to be suitable for high-precision, low-overhead and high-efficiency network DDoS attack detection of a super-fusion data center virtual environment.
For this purpose, the invention provides the following technical scheme:
In one aspect, the present invention provides an intrusion detection method based on deep learning under a super fusion architecture, the method comprising a detection model training stage and an actual flow analysis stage;
wherein the detection model training stage comprises the steps of:
S1, acquiring a network intrusion detection data set, and extracting features of the data set;
S2, removing redundant features from the data set features by adopting a binary cuckoo algorithm, and taking the result as the input of deep learning;
S3, normalizing the data of the corresponding features according to the result of S2, and then inputting the data into a deep learning model for training, wherein the output of the deep learning model is a probability value of belonging to normal traffic or to traffic containing a DDoS attack;
the actual flow analysis stage comprises the steps of:
S4, acquiring the IP address and the MAC address of each virtual machine through a virtual machine management program, and storing the IP addresses and MAC addresses of the virtual machines in the same super fusion node;
S5, capturing network traffic packets entering and leaving the virtual machine, and verifying the network traffic packets flowing out of the virtual machine to prevent IP spoofing and MAC spoofing;
S6, performing data processing and feature extraction on the traffic packets whose IP addresses and MAC addresses were found correct by the traffic verification, and identifying whether they contain DDoS attack behavior by using the trained deep learning model.
Further, in the binary cuckoo algorithm, Tent chaotic mapping is adopted for initialization, and n 0-1 arrays of size 1×D are generated to represent the positions of the bird nests; wherein the Tent chaotic mapping formula is as follows:
[Formula image not reproduced in the extracted text]
wherein q is a random number in (0, 1), and the mapped variable is the ith feature of the kth bird nest; when i=0, it is a random number in (0, 1);
after the n arrays are created, a binarization operation with a threshold of 0.5 is executed, completing the initialization; "1" indicates that the ith feature in the kth bird nest is selected, and "0" indicates that the ith feature in the kth bird nest is not selected.
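For reference, a commonly used form of the Tent (skew tent) chaotic map with breakpoint q is sketched below. Because the formula image is not available in this text, the notation $x_k^{i}$ and the handling of the boundary case $x_k^{i} = q$ are assumptions, not the patent's exact expression.

```latex
x_k^{i+1} =
\begin{cases}
\dfrac{x_k^{i}}{q}, & 0 \le x_k^{i} < q \\[2ex]
\dfrac{1 - x_k^{i}}{1 - q}, & q \le x_k^{i} \le 1
\end{cases}
\qquad q \in (0, 1)
```

Under this form every iterate stays in [0, 1], so thresholding the sequence at 0.5 yields a 0-1 array of the kind described above.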
Further, in the binary cuckoo algorithm, an evaluation function of the bird nest position is set as follows:
[Formula image not reproduced in the extracted text]
wherein the formula is expressed in terms of the value of the ith feature of the kth bird nest in the t-th iteration and the variance of the ith feature of the kth bird nest in the t-th iteration.
Further, in the binary cuckoo algorithm, after the best bird nest position is selected in the current iteration, the occurrence probability P_i of each feature among the best bird nest positions is calculated; if P_i ≥ 0.5, the corresponding feature is set to 1, otherwise to 0, which yields a high-quality feature combination solution. The evaluation function value of this solution is then calculated, and if it is larger than the evaluation function value of the current best position among the best bird nest positions, the high-quality feature combination solution replaces that current best position.
Further, the deep learning model is a BP neural network model, which includes an input layer, L hidden layers (L > 3), and an output layer, with adjacent layers fully connected; the activation functions of the input layer and the hidden layers are ReLU functions, and the activation function of the output layer is a Sigmoid function; the number of neurons in the input layer depends on the result of S2; the output layer has 2 neurons, corresponding to the probability values of normal traffic and DDoS-containing traffic, respectively.
Further, after the BP neural network model is trained, the method further comprises evaluating the performance of the BP neural network model in intrusion detection by using the confusion matrix.
Further, capturing network traffic packets entering and leaving the virtual machine and verifying the network traffic packets flowing out of the virtual machine includes:
collecting the original traffic data entering and leaving each virtual machine through packet capturing software in the kernel layer;
and extracting the source IP address and MAC address of the traffic data flowing out of the virtual machine and comparing them with the information stored in the kernel layer, wherein if either the IP address or the MAC address fails to match, the traffic data is considered to involve IP/MAC address spoofing.
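The address check described above can be sketched as follows. This is a minimal illustration only: the registry layout, the `vm_id` key and the function name `is_spoofed` are hypothetical, and treating an unknown virtual machine as a failed match is an assumption.

```python
from typing import Dict, Tuple

# Hypothetical registry: VM identifier -> (IP address, MAC address),
# standing in for the address information stored in the kernel layer.
VmRegistry = Dict[str, Tuple[str, str]]


def is_spoofed(registry: VmRegistry, vm_id: str, src_ip: str, src_mac: str) -> bool:
    """Return True if an outgoing packet's source IP or MAC fails to match the stored record."""
    record = registry.get(vm_id)
    if record is None:
        # Assumption: a VM with no stored record cannot be verified.
        return True
    stored_ip, stored_mac = record
    # Either mismatch is enough to flag the packet; MACs are compared case-insensitively.
    return src_ip != stored_ip or src_mac.lower() != stored_mac.lower()


registry = {"vm1": ("10.0.0.5", "52:54:00:AA:BB:CC")}
print(is_spoofed(registry, "vm1", "10.0.0.5", "52:54:00:aa:bb:cc"))
print(is_spoofed(registry, "vm1", "10.0.0.9", "52:54:00:aa:bb:cc"))
```

Only packets for which the check returns False would be passed on to feature extraction and classification.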
Further, S6 further includes:
discarding traffic packets containing DDoS attacks;
if the network traffic is found to contain malicious behavior, sending an alarm to an administrator and recording the related information of the data packet to form a log; the related information of the data packet at least comprises: the source and destination, the occurrence time of the event, the duration, and the virtual machine port number;
if a virtual machine of the current super fusion node is found to be sending out malicious behavior, extracting detailed port information from the data packet header; the applications associated with the port are identified and declared suspicious; the manager alerts the tenant to delete these applications; if the malicious traffic of the tenant virtual machine exceeds a threshold, the virtual machine is isolated until it is found to be benign.
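The logging and isolation behavior above can be illustrated with a small sketch. The class and field names are hypothetical, and counting malicious packets per virtual machine is an assumed way of measuring "malicious traffic exceeds a threshold"; the text does not specify the measure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class AlertRecord:
    """Log entry with the packet information listed in the text."""
    source: str
    destination: str
    event_time: float
    duration: float
    vm_port: int


@dataclass
class ResponseManager:
    isolation_threshold: int  # assumed unit: number of malicious packets
    log: List[AlertRecord] = field(default_factory=list)
    malicious_counts: Dict[str, int] = field(default_factory=dict)
    isolated: Set[str] = field(default_factory=set)

    def report(self, vm_id: str, record: AlertRecord) -> bool:
        """Log a malicious packet sent by vm_id; return True if the VM is now isolated."""
        self.log.append(record)
        count = self.malicious_counts.get(vm_id, 0) + 1
        self.malicious_counts[vm_id] = count
        if count > self.isolation_threshold:
            self.isolated.add(vm_id)
        return vm_id in self.isolated

    def release(self, vm_id: str) -> None:
        """Lift isolation once the VM has been found benign."""
        self.isolated.discard(vm_id)
        self.malicious_counts[vm_id] = 0
```

A real deployment would also trigger the administrator alarm and the tenant notification at the point where `report` is called.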
In still another aspect, the present invention further provides an intrusion detection device based on deep learning under a super fusion architecture, where the intrusion detection device includes:
the training data acquisition unit is used for acquiring a network intrusion detection data set and extracting the characteristics of the data set;
the redundant feature removing unit is used for removing redundant features in the data set features by adopting a binary cuckoo algorithm, and taking the result as the input of deep learning;
the training unit is used for normalizing the data of the corresponding features according to the result of the redundant feature removing unit and then inputting the data into the deep learning model for training, the output of the deep learning model being a probability value of belonging to normal traffic or to traffic containing a DDoS attack;
the address information storage unit is used for acquiring the IP address and the MAC address of the virtual machine through the virtual machine management program and storing the IP address and the MAC address of the virtual machine in the same super fusion node;
the flow verification unit is used for capturing network flow packets which are in and out of the virtual machine, and verifying the network flow packets which are out of the virtual machine so as to prevent IP spoofing and MAC spoofing;
and the intrusion detection unit is used for carrying out data processing and feature extraction on the flow packets with correct IP addresses and MAC addresses after the verification of the flow verification unit, and identifying whether DDoS attack behaviors are contained or not by utilizing the deep learning model trained by the training unit.
In yet another aspect, the present invention further provides a computer readable storage medium, where a computer instruction set is stored, where the computer instruction set, when executed by a processor, implements a deep learning based intrusion detection method under a super fusion architecture as provided above.
The invention has the advantages and positive effects that:
(1) The invention provides an intrusion detection method based on deep learning under a super fusion architecture, which uses an improved binary cuckoo algorithm to perform feature selection on the sample data before model training. This reduces redundant information and the feature dimension while preserving detection accuracy, and thereby reduces model overhead. In addition, the lower feature dimension shortens the time needed to extract features from real-time traffic data, which reduces the overall time required for detection.
(2) The invention designs a traffic verification method for the super-fusion architecture: IP address/MAC address verification is first performed on the network data flowing out of the virtual machine, and only then are traffic analysis and detection carried out. Computing resources are therefore not wasted on running detection on packets with spoofed IP/MAC addresses, which further improves detection efficiency.
(3) For the virtual environment under the super fusion architecture, a BP neural network in the virtualized kernel layer detects DDoS attack behavior in the traffic entering and leaving the virtual machines. Compared with methods that judge whether a virtual machine is under DDoS attack by monitoring changes in its traffic, the method provided by the invention has higher detection accuracy and provides auxiliary information that helps the manager trace the attack back to its source.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to the drawings without inventive effort to a person skilled in the art.
Fig. 1 is a schematic flow chart of a network intrusion detection method based on deep learning according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a BP neural network model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a specific process for removing redundant features by using the modified binary cuckoo algorithm according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Referring to fig. 1, an intrusion detection method based on deep learning under a super fusion architecture provided by an embodiment of the present invention is composed of a detection model training stage and an actual flow analysis stage.
Wherein, the detection model training stage comprises the steps of:
S1, acquiring the CICIDS2017 network intrusion detection data set, and extracting data set features by using CICFlowMeter.
S2, removing redundant features by adopting an improved binary cuckoo search algorithm (Binary Cuckoo Search, BCS), and taking the result as the input of deep learning.
S3, normalizing the data of the corresponding features according to the result of S2, and then inputting the data into a deep learning model for training, wherein the output is a probability value of belonging to normal traffic or to traffic containing a DDoS attack. In a specific implementation, the deep learning model is a BP neural network model.
The actual flow analysis stage comprises the steps of:
S4, acquiring the IP address and the MAC address of each virtual machine through a virtual machine management program (such as Libvirt), and storing the IP addresses and MAC addresses of the virtual machines in the same super fusion node;
S5, capturing network traffic packets entering and leaving the virtual machine, and verifying the network traffic packets flowing out of the virtual machine to prevent IP spoofing and MAC spoofing;
S6, performing data processing and feature extraction on the traffic packets with correct IP addresses and MAC addresses, and identifying whether they contain DDoS attack behavior by using the trained deep learning model;
S7, discarding the traffic packets containing a DDoS attack, sending out alarm information, and logging the detailed information of the related virtual machine and the information of the traffic packets.
Preferably, the step S1 specifically includes the following steps:
S101: the CICIDS2017 network intrusion detection data set contains benign traffic and the latest common attacks, resembles real-world data, and comprises 8 CSV files. The file Friday-WorkingHours-Afternoon-DDoS.pcap_ISCX.csv is used for deep learning training; it contains 97718 records labeled normal and 128027 records labeled DDoS. 90% of the data is used as the training set and the rest as the test set.
S102: the CICFlowMeter software is used to analyze the traffic data; it provides 77 statistical features such as forward and reverse duration, number of data packets, number of bytes, and data packet length.
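The 90%/10% split in S101 can be sketched as below. The text does not say how records are assigned to each subset, so the seeded shuffle and the function name `split_train_test` are assumptions made for illustration.

```python
import random


def split_train_test(rows, train_fraction=0.9, seed=0):
    """Shuffle the records and split them into training and test subsets."""
    rng = random.Random(seed)       # fixed seed so the split is reproducible
    shuffled = list(rows)           # copy, so the caller's data is left untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy stand-in for the labeled flow records (the real file has 225745 rows).
records = [("flow-%d" % i, "DDoS" if i % 2 else "BENIGN") for i in range(100)]
train, test = split_train_test(records)
print(len(train), len(test))
```

With the 97718 + 128027 = 225745 records mentioned in S101, the same call would put 203170 records in the training set and 22575 in the test set.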
Preferably, in the above step S2, an improved Binary Cuckoo Search (BCS) algorithm is adopted: the original nest initialization is optimized, an evaluation function suited to feature selection is proposed, and a "high-quality feature retention" step is added, which ensures the diversity of the initial data and accelerates the convergence of the original cuckoo algorithm. The method specifically comprises the following steps:
S201: inputting the number n of bird nests, the discovery probability P_a, the number of features D, the optimal bird nest position ω (i.e., the selected optimal feature combination), and the number of iterations T.
S202: to accelerate algorithm convergence, Tent chaotic map initialization is adopted in place of the random initialization of the original BCS (improvement one), and n 0-1 arrays of size 1×D are generated to represent the positions of the bird nests. The Tent chaotic mapping formula is as follows:
[Formula image not reproduced in the extracted text]
where q is a number in (0, 1), here set to q = 0.7, and the mapped variable is the ith feature of the kth bird nest; when i=0, it is a random number in (0, 1).
After the n arrays are created, a binarization operation with a threshold of 0.5 is executed, completing the initialization. "1" indicates that the ith feature in the kth bird nest is selected, and "0" indicates that the ith feature in the kth bird nest is not selected.
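The initialization in S202 can be sketched as follows. Since the formula image is unavailable, the code assumes the standard skew Tent map with breakpoint q; treating the random seed value as feature 0 and using "≥ 0.5" for the threshold are also assumptions.

```python
import random


def tent_map(x, q):
    """One step of the (skew) Tent map on [0, 1] with breakpoint q (assumed standard form)."""
    if x < q:
        return x / q
    return (1.0 - x) / (1.0 - q)


def init_nests(n, d, q=0.7, seed=None):
    """Generate n binary arrays of length d from Tent-map sequences thresholded at 0.5."""
    rng = random.Random(seed)
    nests = []
    for _ in range(n):
        x = rng.random()
        while x == 0.0:              # keep the starting value strictly inside (0, 1)
            x = rng.random()
        chaotic = [x]                # feature 0 is the random starting value
        for _ in range(d - 1):
            x = tent_map(x, q)       # feature i+1 is the map applied to feature i
            chaotic.append(x)
        nests.append([1 if value >= 0.5 else 0 for value in chaotic])
    return nests


for nest in init_nests(n=3, d=10, q=0.7, seed=42):
    print(nest)
```

Each row is one bird nest; a 1 in column i means feature i is selected in that candidate solution.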
S203: in the original BCS algorithm, the cuckoo searches for bird nests by Levy flight, and its position update formula is expressed as follows:
$x_k(t+1) = x_k(t) + \mathrm{Levy}(\lambda)$;
$\mathrm{Levy}(\lambda) \sim u, \quad 1 < \lambda < 3$;
wherein $x_k(t)$ represents the position of the kth bird nest in the t-th iteration, and u is the random step size of the Levy flight. Because this update is mainly intended for continuous problems, a Sigmoid function is used to transform the result into discrete data. The specific transformation formulas are as follows:
[Two formula images not reproduced in the extracted text]
wherein $\sigma \sim U(0, 1)$, and the remaining symbol represents the ith feature of the kth bird nest in the t-th iteration.
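The update in S203 can be illustrated with the sketch below. Two things are assumptions: the Levy step is drawn with Mantegna's algorithm (a common choice that the text does not name), and the discretization follows the usual binary cuckoo search rule of setting a bit to 1 when the Sigmoid of the updated value exceeds σ ~ U(0, 1).

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step with Mantegna's algorithm (assumed; beta=1.5 is conventional)."""
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:                  # avoid division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def update_nest(nest, rng):
    """Apply a Levy-flight move to each feature bit, then map back to 0/1."""
    new_nest = []
    for bit in nest:
        x = bit + levy_step(rng)     # continuous update x(t+1) = x(t) + Levy
        sigma = rng.random()         # sigma ~ U(0, 1)
        new_nest.append(1 if sigmoid(x) > sigma else 0)
    return new_nest


rng = random.Random(7)
print(update_nest([1, 0, 1, 1, 0, 0, 1, 0], rng))
```

The stochastic threshold keeps the search exploratory: a feature with a large positive updated value is almost always selected, while values near zero are selected about half the time.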
S204: Levy flight is used to update the n initially generated bird nest positions; if the evaluation function value of an updated position is higher than that of the original position, the updated position replaces the original one; otherwise, the original position is kept unchanged. To ensure that as few features as possible are selected while the corresponding data remain diverse, the evaluation function is set as follows (improvement two):
[Formula image not reproduced in the extracted text]
wherein the formula is expressed in terms of the value of the ith feature of the kth bird nest in the t-th iteration and the variance of the ith feature of the kth bird nest in the t-th iteration.
S205: for each of the n bird nests, a random number a ∈ (0, 1) is generated; if a < P_a, the host bird has found the cuckoo's egg, so the bird nest is abandoned and a new bird nest is built.
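The abandonment step S205 can be sketched as below. The text does not say how the new nest is built, so drawing each bit uniformly at random is an assumption, and the function name `abandon_nests` is hypothetical.

```python
import random


def abandon_nests(nests, p_a, rng):
    """Replace each nest with probability p_a; return the new nests and the replaced indices."""
    d = len(nests[0])
    result = []
    replaced = []
    for k, nest in enumerate(nests):
        a = rng.random()                     # a drawn uniformly from [0, 1)
        if a < p_a:
            # Host bird discovers the egg: abandon and rebuild (assumed: uniform random bits).
            result.append([rng.randint(0, 1) for _ in range(d)])
            replaced.append(k)
        else:
            result.append(list(nest))
    return result, replaced


nests = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 0]]
new_nests, replaced = abandon_nests(nests, p_a=0.25, rng=random.Random(5))
print(replaced, new_nests)
```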
S206: all bird nest positions generated in this iteration are evaluated and the position with the best quality is selected; it is compared with the quality of the bird nests currently in ω, and if it is better, it is put into ω.
S207: high-quality feature retention. The occurrence probability P_i of each feature in ω is calculated; if P_i ≥ 0.5, the corresponding feature is set to 1, otherwise to 0, which yields a high-quality feature combination solution. Its evaluation function value is calculated, and if it is larger than the evaluation function value of the current optimal position in ω, the high-quality feature combination solution replaces the current optimal solution.
This high-quality feature retention step accelerates convergence.
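The retention step S207 can be sketched as follows. The patent's evaluation function is not available in this text, so it is passed in as a callable; the toy function used in the demonstration is purely hypothetical and is not the patent's formula.

```python
def retain_high_quality(omega, evaluate):
    """Build a majority-vote feature mask from omega and keep it if it scores higher."""
    n = len(omega)
    d = len(omega[0])
    # P_i: fraction of nests in omega in which feature i is selected.
    probs = [sum(nest[i] for nest in omega) / n for i in range(d)]
    candidate = [1 if p >= 0.5 else 0 for p in probs]
    best = max(omega, key=evaluate)          # current optimal position in omega
    if evaluate(candidate) > evaluate(best):
        return candidate, probs
    return best, probs


def toy_score(nest):
    """Hypothetical stand-in: prefers nests that select more features."""
    return sum(nest)


omega = [[1, 1, 0], [1, 0, 1]]
solution, probs = retain_high_quality(omega, toy_score)
print(probs, solution)
```

Features that recur across the best nests of several iterations are kept together in one candidate, which is why this step can shorten the search.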
S208: S204-S207 are executed in a loop until the number of iterations T is reached. The final set of bird nest positions in ω is output as the optimal feature combination.
Preferably, the step S3 specifically includes the following steps:
S301: The data of the corresponding features are normalized using the min-max method, that is, all values are scaled into the range [0, 1]. The specific formula is:
$$\hat{x}_j^z = \frac{x_j^z - x_j^{\min}}{x_j^{\max} - x_j^{\min}}$$
where $x_j^z$ is the z-th value of the j-th feature in the dataset, and $x_j^{\max}$ and $x_j^{\min}$ are the maximum and minimum of the j-th feature in the dataset, respectively.
S302: The number of neurons in the input layer depends on the result of S2; the output layer has 2 neurons, which correspond to the probability values of normal traffic and of traffic containing DDoS, respectively.
S303: An N-layer classical BP neural network model is built, comprising an input layer, L hidden layers (L > 3), and an output layer, with full connections between the layers. The input layer and the hidden layers use the ReLU activation function, which gives forward propagation its nonlinear character; the output layer uses the Sigmoid activation function. The ReLU and Sigmoid functions are defined as follows:
$$f_{\mathrm{ReLU}}(x) = \max(x, 0)$$
$$f_{\mathrm{Sigmoid}}(x) = \frac{1}{1 + e^{-x}}$$
In the above steps, the BCS algorithm is combined with deep learning: the BCS removes redundant features from the CICIDS2017 dataset, and the remaining features are used as the input of the deep learning model (the BP neural network model), which improves accuracy and reduces model training time and model size.
S304: The performance of the BP neural network model in intrusion detection is evaluated using the confusion matrix: TP is normal data predicted as normal; TN is normal data classified as abnormal; FP is abnormal data predicted as normal; FN is abnormal data classified as abnormal. The BP neural network model is evaluated in terms of accuracy (Accuracy), recall (Recall), and disturbance (Disturb):
[Formula images BDA0003217874710000112, BDA0003217874710000113, and BDA0003217874710000114: definitions of Accuracy, Recall, and Disturb]
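The normalization of S301 and the network structure of S302 and S303 can be sketched with NumPy as follows. This is only an illustration of the forward pass under stated assumptions: the hidden-layer widths, the weight initialization, and the helper names (`min_max_normalize`, `init_network`, `forward`) are not taken from the patent, training by backpropagation is omitted, and constant feature columns are mapped to 0 to avoid division by zero.

```python
import numpy as np


def min_max_normalize(X):
    """S301: scale every feature column of X into [0, 1]."""
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    # Constant columns would divide by zero; use a span of 1 so they map to 0.
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return (X - x_min) / span


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_network(n_inputs, hidden_sizes, rng):
    """Fully connected layers: n_inputs -> hidden layers -> 2 output neurons (S302)."""
    sizes = [n_inputs, *hidden_sizes, 2]
    return [(rng.normal(0.0, 0.1, size=(a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]


def forward(X, params):
    """S303: ReLU on the input and hidden layers, Sigmoid on the output layer."""
    h = relu(X)  # identity for inputs already scaled into [0, 1]
    for W, b in params[:-1]:
        h = relu(h @ W + b)
    W, b = params[-1]
    return sigmoid(h @ W + b)  # columns: normal traffic, traffic containing DDoS
```

A call such as `forward(min_max_normalize(X), init_network(X.shape[1], [64, 64, 64, 64], rng))` uses four hidden layers, which satisfies the L > 3 condition; the width 64 is an arbitrary choice.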
Preferably, the storing in S4 specifically stores the port number, domain name, IP address, and MAC address of each running virtual machine in the same super fusion node.
Preferably, the step S5 specifically includes the following steps:
S501: At the kernel layer, the raw traffic data entering and leaving each virtual machine is collected with packet capture software (such as tcpdump or Wireshark).
S502: The source IP address and MAC address of the traffic data flowing out of a virtual machine are extracted and compared with the information stored in the kernel layer. If either the IP address or the MAC address fails to match, the traffic data is considered to be IP/MAC address spoofed and step S7 is executed. Step S6 is executed for traffic with the correct IP address and MAC address.
This traffic verification step improves the efficiency of DDoS attack detection under the super fusion architecture.
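The address check of S502 can be sketched as a lookup against the records stored in S4. The sketch below makes several assumptions: the registry is a plain dictionary keyed by a hypothetical virtual machine identifier `vm_id` (the patent does not say how the originating virtual machine is identified), a packet from an unregistered virtual machine is treated as a failed match, and MAC addresses are compared case-insensitively.

```python
from typing import Dict, NamedTuple


class VmAddress(NamedTuple):
    """Address record stored for a running virtual machine (see S4)."""
    ip: str
    mac: str


def is_spoofed(vm_id: str, src_ip: str, src_mac: str,
               registry: Dict[str, VmAddress]) -> bool:
    """Return True when the source IP or MAC of an outgoing packet fails to match."""
    record = registry.get(vm_id)
    if record is None:
        # Assumption: traffic from an unregistered VM counts as a failed match.
        return True
    return src_ip != record.ip or src_mac.lower() != record.mac.lower()
```

Packets for which `is_spoofed` returns `True` would go to step S7, and all others continue to step S6.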
Preferably, the step S6 specifically includes the following steps:
S601: The raw traffic data that passed S5 is sent to the CICFlowMeter software for flow analysis.
S602: The relevant feature data is normalized according to the result of the BCS.
S603: The processed data is fed into the BP neural network for analysis and detection, and the label with the highest probability value is taken as the network intrusion detection result. If the current traffic is normal behavior, the virtual switch forwards it; if the current traffic is a DDoS attack, the method proceeds to S7.
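Steps S602 and S603 can be sketched as a single decision function. The sketch assumes the flow features have already been extracted (for example by CICFlowMeter) into a numeric vector, that `selected_mask` is the 0-1 feature mask produced by the BCS, that `x_min` and `x_max` are the per-feature bounds of the selected features from training, and that `predict_proba` is a hypothetical callable wrapping the trained model. Clipping values outside the training range into [0, 1] is an assumption of this sketch.

```python
import numpy as np

LABELS = ("normal", "ddos")  # order of the two output neurons


def detect(flow_features, selected_mask, x_min, x_max, predict_proba):
    """Return (label, action) for one flow record."""
    mask = np.asarray(selected_mask, dtype=bool)
    x = np.asarray(flow_features, dtype=float)[mask]  # keep BCS-selected features
    x_min = np.asarray(x_min, dtype=float)
    x_max = np.asarray(x_max, dtype=float)
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    x = np.clip((x - x_min) / span, 0.0, 1.0)  # S602: min-max normalization
    probs = np.asarray(predict_proba(x))
    label = LABELS[int(np.argmax(probs))]  # S603: label with the highest probability
    action = "forward" if label == "normal" else "handle_attack"
    return label, action
```

Here `"forward"` stands for the virtual switch forwarding the traffic and `"handle_attack"` for proceeding to S7.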
Preferably, the step S7 specifically includes the following steps:
S701: If the network traffic is found to contain malicious behavior, an alarm is sent to the administrator, and the relevant information of the data packets (such as the source and destination, the time of occurrence of the event, the duration, and the virtual machine port number) is recorded to form a log.
S702: If a virtual machine of the current super fusion node is found to be issuing malicious behavior, the detailed port information is extracted from the data packet header. The applications associated with the port are identified and declared suspicious, and the administrator alerts the tenant to delete these applications. If the malicious traffic of a tenant virtual machine exceeds a threshold, the virtual machine is isolated until it is found to be benign.
S703: The network traffic packets containing the DDoS attack are discarded.
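The response logic of S701 to S703 can be sketched as a small bookkeeping class. The patent does not define how malicious traffic is measured against the threshold, so counting malicious packets per virtual machine is an assumption here, as are the class name `AttackResponder`, the packet dictionary keys, and the use of Python's standard `logging` module for the alarm and log.

```python
import logging
from collections import Counter

logger = logging.getLogger("ids")


class AttackResponder:
    """Log malicious packets, isolate offending VMs, and drop the traffic."""

    def __init__(self, threshold):
        self.threshold = threshold        # malicious packets tolerated per VM
        self.malicious_counts = Counter()
        self.isolated = set()

    def handle(self, packet):
        """packet: dict with hypothetical keys src, dst, time, duration, vm_port, vm_id."""
        # S701: raise an alarm and record the packet information as a log entry.
        logger.warning(
            "DDoS traffic: src=%s dst=%s time=%s duration=%s vm_port=%s",
            packet.get("src"), packet.get("dst"), packet.get("time"),
            packet.get("duration"), packet.get("vm_port"),
        )
        # S702: if the packet came from a local VM, count it and isolate the VM
        # once its malicious traffic exceeds the threshold.
        vm_id = packet.get("vm_id")
        if vm_id is not None:
            self.malicious_counts[vm_id] += 1
            if self.malicious_counts[vm_id] > self.threshold:
                self.isolated.add(vm_id)
        # S703: the packet itself is always discarded.
        return "drop"

    def release(self, vm_id):
        """Lift isolation once the VM has been found to be benign."""
        self.isolated.discard(vm_id)
        self.malicious_counts.pop(vm_id, None)
```

Identifying the applications bound to the offending port and notifying the tenant are left out, since they depend on the management interfaces of the specific platform.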
TABLE 1 flow extracted features
Figure BDA0003217874710000121
Figure BDA0003217874710000131
The intrusion detection method based on deep learning under the super fusion architecture is applied at the virtualized kernel layer of the super fusion architecture. Network traffic entering and leaving the virtual machines is captured at the kernel layer; IP/MAC address verification is performed first, and the relevant traffic features are then extracted and fed into a deep learning model trained on the CICIDS2017 dataset for analysis, thereby achieving low-cost, high-precision DDoS attack detection under the super fusion architecture. In particular, to reduce redundant information, an improved binary cuckoo algorithm is used for feature selection, which reduces the size of the deep learning model while maintaining detection accuracy and also reduces the time needed to extract network traffic features.
The invention also provides an intrusion detection device based on deep learning under the super-fusion architecture, which comprises:
the training data acquisition unit is used for acquiring a network intrusion detection data set and extracting the characteristics of the data set;
the redundant feature removing unit is used for removing redundant features in the data set features by adopting a binary cuckoo algorithm, and taking the result as the input of deep learning;
the training unit is used for normalizing the data of the corresponding features according to the result of the redundant feature removing unit and then inputting the data into a deep learning model for training, the output of the model being the probability that the traffic is normal traffic or traffic containing a DDoS attack;
the address information storage unit is used for acquiring the IP address and the MAC address of the virtual machine through the virtual machine management program and storing the IP address and the MAC address of the virtual machine in the same super fusion node;
the flow verification unit is used for capturing network flow packets which are in and out of the virtual machine, and verifying the network flow packets which are out of the virtual machine so as to prevent IP spoofing and MAC spoofing;
and the intrusion detection unit is used for carrying out data processing and feature extraction on the flow packets with correct IP addresses and MAC addresses after the verification of the flow verification unit, and identifying whether DDoS attack behaviors are contained or not by utilizing the deep learning model trained by the training unit.
Since the intrusion detection device based on deep learning under the super fusion architecture according to the embodiment of the present invention corresponds to the intrusion detection method based on deep learning under the super fusion architecture in the above embodiment, its description is relatively brief; for the relevant details, refer to the description of the method in the above embodiment, which is not repeated here.
The embodiment of the invention also discloses a computer readable storage medium, wherein a computer instruction set is stored in the computer readable storage medium, and when the computer instruction set is executed by a processor, the intrusion detection method based on deep learning under the super fusion architecture provided by any embodiment is realized.
In the several embodiments provided in the present invention, it should be understood that the disclosed technology may be implemented in other manners. The above-described embodiments of the apparatus are merely exemplary, and the division of the units, for example, may be a logic function division, and may be implemented in another manner, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some interfaces, units or modules, or may be in electrical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in whole or in part, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium capable of storing program code.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims (7)

1. An intrusion detection method based on deep learning under a super fusion architecture is characterized by comprising the following steps: a detection model training stage and an actual flow analysis stage;
wherein, the detection model training stage comprises the steps of:
S1: acquiring a network intrusion detection data set, and extracting features of the data set;
S2: removing redundant features from the data set features by using a binary cuckoo algorithm, and taking the result as the input of deep learning;
S3: normalizing the data of the corresponding features according to the result of S2, and then inputting the data into a deep learning model for training, wherein the output of the deep learning model is the probability of normal traffic or DDoS attack traffic;
the actual flow analysis stage comprises the steps of:
S4: acquiring the IP address and the MAC address of the virtual machine through a virtual machine management program, and storing the IP address and the MAC address of the virtual machine in the same super fusion node;
S5: capturing network traffic packets entering and leaving the virtual machine, and verifying the network traffic packets flowing out of the virtual machine to prevent IP spoofing and MAC spoofing;
S6: performing data processing and feature extraction on the traffic packets that have the correct IP address and MAC address after traffic verification, and identifying whether DDoS attack behavior is contained by using the trained deep learning model;
in the binary cuckoo algorithm, Tent chaotic mapping is used for initialization, and n 0-1 arrays of size 1×D are generated to represent the bird nest positions; wherein the Tent chaotic mapping formula is as follows:
[Formula image FDA0004062218080000011: Tent chaotic mapping]
wherein q is a random number in (0, 1); [symbol image FDA0004062218080000012] is the i-th feature of the k-th bird nest; when i = 0, [symbol image FDA0004062218080000013] is a random number in (0, 1);
after n arrays are created, binarization operation with a threshold value of 0.5 is executed, and initialization is completed; wherein "1" indicates that the ith feature in the kth bird nest is selected and "0" indicates that the ith feature in the kth bird nest is not selected;
in the binary cuckoo algorithm, the n bird nest positions which are initially generated are updated by using Levy flight, and if the updated bird nest position evaluation function is higher than the original position, the updated position replaces the original position; otherwise, the original position is kept unchanged; setting an evaluation function of bird nest positions as follows:
[Formula image FDA0004062218080000021: evaluation function of a bird nest position]
wherein [symbol image FDA0004062218080000022] is the value of the i-th feature of the k-th bird nest in the t-th iteration, D is the number of features, and [symbol image FDA0004062218080000023] is the variance of the i-th feature of the k-th bird nest in the nth iteration;
in the binary cuckoo algorithm, after the optimal bird nest position is selected in the iteration, the occurrence probability P_i of each feature in the optimal bird nest positions is calculated; if P_i ≥ 0.5, the corresponding feature is 1, otherwise 0, and a high-quality feature combination solution is obtained; the evaluation function value of the high-quality feature combination solution is calculated, and if it is larger than the evaluation function value of the current best bird nest position among the best bird nest positions, the current best bird nest position is replaced with the high-quality feature combination solution.
2. The intrusion detection method based on deep learning under a super fusion architecture according to claim 1, wherein the deep learning model is a BP neural network model, and the BP neural network model comprises: an input layer, L hidden layers, L >3, and an output layer, all of which are connected by a full connection mode; wherein, the activating functions of the input layer and the hidden layer are all ReLU functions, and the activating function of the output layer is Sigmoid function; the number of neurons of the input layer depends on the outcome of S2; the output layer has 2 neurons corresponding to the probability values of normal traffic and DDoS-containing traffic, respectively.
3. The intrusion detection method based on deep learning under the super fusion architecture according to claim 2, wherein after training the BP neural network model, further comprising: and evaluating the performance of the BP neural network model in intrusion detection by using the confusion matrix.
4. The intrusion detection method based on deep learning under the super fusion architecture according to claim 1, wherein capturing network traffic packets coming in and going out of the virtual machine, and verifying the network traffic packets flowing out of the virtual machine, comprises:
collecting, at the kernel layer, the raw traffic data entering and leaving each virtual machine through packet capture software;
extracting the source IP address and the MAC address of the traffic data flowing out of the virtual machine, and comparing them with the information stored in the kernel layer, wherein if either the IP address or the MAC address fails to match, the traffic data is considered to be IP/MAC address spoofed.
5. The intrusion detection method based on deep learning under the super fusion architecture according to claim 1, wherein after S6, further comprises:
discarding traffic packets containing DDoS attacks;
if the network traffic is found to contain malicious behavior, sending an alarm to an administrator, and recording the related information of the data packet to form a log; the related information of the data packet at least comprises: the source and destination, the time of occurrence of the event, the duration, and the virtual machine port number;
if a virtual machine of the current super fusion node is found to be issuing malicious behavior, extracting the detailed port information from the data packet header; the applications associated with the port are identified and declared suspicious; the administrator alerts the tenant to delete the applications; if the malicious traffic of the tenant virtual machine exceeds a threshold, the virtual machine is isolated until it is found to be benign.
6. An intrusion detection device based on deep learning under a super fusion architecture, the device comprising:
the training data acquisition unit is used for acquiring a network intrusion detection data set and extracting the characteristics of the data set;
the redundant feature removing unit is used for removing redundant features in the data set features by adopting a binary cuckoo algorithm, and taking the result as the input of deep learning;
the training unit is used for normalizing the data of the corresponding features according to the result of the redundant feature removing unit and then inputting the data into the deep learning model for training, the output of the deep learning model being the probability that the traffic is normal traffic or traffic containing a DDoS attack;
the address information storage unit is used for acquiring the IP address and the MAC address of the virtual machine through the virtual machine management program and storing the IP address and the MAC address of the virtual machine in the same super fusion node;
the flow verification unit is used for capturing network flow packets which are in and out of the virtual machine, and verifying the network flow packets which are out of the virtual machine so as to prevent IP spoofing and MAC spoofing;
the intrusion detection unit is used for carrying out data processing and feature extraction on the flow packets with correct IP addresses and MAC addresses after the verification of the flow verification unit, and identifying whether DDoS attack behaviors are contained or not by utilizing the deep learning model trained by the training unit;
in the binary cuckoo algorithm, Tent chaotic mapping is used for initialization, and n 0-1 arrays of size 1×D are generated to represent the bird nest positions; wherein the Tent chaotic mapping formula is as follows:
[Formula image FDA0004062218080000041: Tent chaotic mapping]
wherein q is a random number in (0, 1); [symbol image FDA0004062218080000042] is the i-th feature of the k-th bird nest; when i = 0, [symbol image FDA0004062218080000043] is a random number in (0, 1);
after n arrays are created, binarization operation with a threshold value of 0.5 is executed, and initialization is completed; wherein "1" indicates that the ith feature in the kth bird nest is selected and "0" indicates that the ith feature in the kth bird nest is not selected;
in the binary cuckoo algorithm, the n bird nest positions which are initially generated are updated by using Levy flight, and if the updated bird nest position evaluation function is higher than the original position, the updated position replaces the original position; otherwise, the original position is kept unchanged; setting an evaluation function of bird nest positions as follows:
[Formula image FDA0004062218080000044: evaluation function of a bird nest position]
wherein [symbol image FDA0004062218080000045] is the value of the i-th feature of the k-th bird nest in the t-th iteration, D is the number of features, and [symbol image FDA0004062218080000046] is the variance of the i-th feature of the k-th bird nest in the nth iteration;
in the binary cuckoo algorithm, after the optimal bird nest position is selected in the iteration, the occurrence probability P_i of each feature in the optimal bird nest positions is calculated; if P_i ≥ 0.5, the corresponding feature is 1, otherwise 0, and a high-quality feature combination solution is obtained; the evaluation function value of the high-quality feature combination solution is calculated, and if it is larger than the evaluation function value of the current best bird nest position among the best bird nest positions, the current best bird nest position is replaced with the high-quality feature combination solution.
7. A computer readable storage medium, wherein a computer instruction set is stored in the computer readable storage medium, and when the computer instruction set is executed by a processor, the intrusion detection method based on deep learning under a super fusion architecture according to any one of claims 1 to 5 is implemented.
CN202110948874.XA 2021-08-18 2021-08-18 Intrusion detection method and device based on deep learning under super fusion architecture Active CN113660273B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110948874.XA CN113660273B (en) 2021-08-18 2021-08-18 Intrusion detection method and device based on deep learning under super fusion architecture


Publications (2)

Publication Number Publication Date
CN113660273A CN113660273A (en) 2021-11-16
CN113660273B true CN113660273B (en) 2023-06-02

Family

ID=78480984

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110948874.XA Active CN113660273B (en) 2021-08-18 2021-08-18 Intrusion detection method and device based on deep learning under super fusion architecture

Country Status (1)

Country Link
CN (1) CN113660273B (en)




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant