CN115664784A - Network attack immune defense method and system adopting multi-module learning

Info

Publication number: CN115664784A
Application number: CN202211298612.4A
Authority: CN
Inventors: 陈玉强; 秦峰; 吴昊; 陆月明; 韩道岐; 高佳琪; 王成月; 樊明睿; 王秦君; 王大明; 徐文杰; 陆文强; 王占峰
Original assignee: Beijing Guoxin Blue Shield Technology Co., Ltd.
Current assignee: Beijing Guoxin Blue Shield Technology Co., Ltd.
Priority date: 2022-10-20
Filing date: 2022-10-20
Publication date: 2023-01-31

Abstract

The invention provides a network attack immune defense method based on multi-module learning, which aims to capture attack characteristics as soon as an attack occurs and to detect and defend against the corresponding attack, thereby ensuring the network security of the whole system. Its primary functions are to discover, collect, report, and block or prevent malicious activity. The intrusion prevention system is an extension of the intrusion detection system, and the two are integrated. The invention combines multi-source data from the information side and the physical side to perceive multi-modal data, and its self-learning method continuously detects anomalies in the multi-modal data through cognitive learning. Finally, it provides corresponding prediction and early warning, uncovers carefully designed hidden attack behaviors, and supports attack tracing/blocking and network resource adjustment on the information side.

Description

Network attack immune defense method and system adopting multi-module learning

Technical Field

The invention relates to the field of network technology and information security, in particular to a network attack immune defense method and system adopting multi-module learning.

Background

With the great improvement in communication and computing capabilities, the number of networked digital devices has grown sharply, and information and physical systems have become increasingly coupled. This has also introduced more network attacks, making it very difficult to maintain the security of a system's information and communication technology. In widely used network security systems, firewall technology is the first line of defense against intrusions, and the intrusion detection system supplements it; together they defend against network attacks to a certain extent. An Intrusion Detection and Prevention System (IDPS) builds on intrusion detection, can protect a network comprehensively, deeply and actively, and performs well in defending against network attacks. Research on intrusion prevention systems is therefore of great significance for protecting system security.

An Intrusion Detection System (IDS) identifies and reacts to malicious attacks on computer or network resources. IDS schemes are largely divided into misuse detection and anomaly detection. Misuse detection, i.e. signature-based detection, relies to a large extent on signatures of attacks and malicious behavior and supports multi-class classification; it is more accurate in identifying known malicious behavior and its variants. However, it cannot detect a new attack, because no signature is yet available for it. An IDS based on anomaly detection, on the other hand, relies on the characteristics of normal user behavior to detect new attacks, but supports only binary classification. In dynamic organizations where user roles occasionally change, the behavior profiles must be updated accordingly. In addition, anomaly detection schemes may suffer from false positives.

Intrusion Detection and Prevention Systems (IDPS) monitor a system or network for suspicious activity or abnormal behavior and take appropriate action on it. An Intrusion Detection System (IDS) only detects malicious activity in the network and issues alerts to administrators, who must then decide how to handle them. In addition to generating alarms, an Intrusion Prevention System (IPS) can react automatically to abnormal activity, for example by stopping the attack or resetting the connection. A common criticism of all these systems is that they generate a large number of false alarms. One way to reduce false alarms is to use a hybrid system that combines the advantages of two or more technologies. The goal of an IDPS is to block attacks before they succeed and to take measures that protect the network from different types of attacks, such as DoS attacks and FTP attacks.

Traditional machine learning lacks labeled training data sets and relies heavily on manually extracted features, which makes it difficult to deploy on large platforms. Deep learning is a subset of machine learning with short training time, high accuracy and higher performance than traditional methods; it aims to find suitable high-level features from raw input data using a hierarchical structure instead of hand-crafted features. Deep learning has therefore received wide attention in research on intrusion detection and defense methods.

Existing deep-learning-based intrusion detection and defense systems have weak perception capability, high false alarm rates and insufficient adaptive learning capability.

Disclosure of Invention

In view of the above, the present invention is proposed to provide a network attack immune defense method and system using multi-module learning, which overcomes or at least partially solves the above problems.

According to one aspect of the invention, a network attack immune defense method adopting multi-module learning is provided, and the defense method comprises the following steps:

collecting information side data and physical side data;

preprocessing the information side data and the physical side data to obtain multi-source data;

performing feature learning on the multi-source data;

detecting the abnormality by using a self-learning model;

and processing the detected abnormal data packet, and then taking measures to recover the normal communication of the host.

Optionally, the information side data specifically includes network flow and segment data, and the physical side data includes time, location, device id, network segment, area, device type feature and user organization.

Optionally, the preprocessing the information side data and the physical side data to obtain multi-source data specifically includes:

data preprocessing: intrusion detection on a system first requires a comprehensive understanding of the target system, together with general statistics on and an understanding of the source and structure of the data;

Optionally, the performing feature learning on the multi-source data specifically includes:

pre-training based feature representation: with the development of deep-learning-based single-modality techniques, deep learning expects a machine-learned representation to contain the complete semantic information of the data, as human cognition does; for a model to construct a representation with rich and comprehensive semantic information for a specific sample, single-modality data alone is not enough, so multi-modal techniques have gradually attracted more attention;

representation learning is performed on the physical layer information, the raw bytes of packets and the network flows, and joint defense is achieved on the information side and the physical side.

Optionally, the detecting the abnormality by using the self-learning model specifically includes:

a self-supervised cognitive memory network is adopted to realize representation learning;

and the model is optimized using feature sparsity, feature reconstruction and sample reconstruction so as to better extract latent-space features.

The invention also provides a network attack immune defense system adopting multi-module learning, which comprises:

the data collection module is used for collecting information side data and physical side data;

the characteristic calculation module is used for preprocessing the information side data and the physical side data to obtain multi-source data; performing feature learning on the multi-source data;

the attack detection module is used for detecting the abnormality by utilizing the self-learning model;

the attack judging module is used for judging whether an attack has been launched;

the defense module is used for handling the detected abnormal data packet;

and the recovery module is used for taking measures to recover the normal communication of the host.

The invention provides a network attack immune defense method adopting multi-module learning, which comprises the following steps: collecting information side data and physical side data; preprocessing the information side data and the physical side data to obtain multi-source data; performing feature learning on the multi-source data; detecting the abnormality by using a self-learning model; and processing the detected abnormal data packet, and then taking measures to recover the normal communication of the host. The attack characteristics are immediately captured when the attack occurs, and the corresponding attack is detected and defended, so that the network security of the whole system is ensured.

The above description is only an overview of the technical solutions of the present invention, and the present invention can be implemented in accordance with the content of the description so as to make the technical means of the present invention more clearly understood, and the above and other objects, features, and advantages of the present invention will be more clearly understood.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a system architecture diagram of an intrusion detection and defense model according to an embodiment of the present invention;

FIG. 2 is a flow chart of intrusion detection and prevention provided by an embodiment of the present invention;

FIG. 3 is a flowchart of a representation learning method based on a BEiT-3 pre-training model according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a BEiT-3 pre-training model according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of an anomaly detection process based on a cognitive memory network according to an embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

The terms "comprises" and "comprising," and any variations thereof, in the described embodiments of the invention and in the claims and drawings, are intended to cover a non-exclusive inclusion, such as, for example, a list of steps or elements.

The technical solution of the present invention is further described in detail with reference to the accompanying drawings and embodiments.

The invention provides a network attack immune defense method based on multi-module learning, which aims to capture attack characteristics as soon as an attack occurs and to detect and defend against the corresponding attack, thereby ensuring the network security of the whole system. Its primary functions are to discover, collect, report, and block or prevent malicious activity. The intrusion prevention system is an extension of the intrusion detection system, and the two are integrated. The invention combines multi-source data from the information side and the physical side to perceive multi-modal data, so that the representation learned by the machine contains the complete semantic information of the data, as human cognition does. A deep-learning-based self-supervised cognitive memory network is then used, so that the self-learning method continuously detects anomalies through cognitive learning. Finally, it provides corresponding prediction and early warning, uncovers carefully designed hidden attack behaviors, and supports attack tracing/blocking and network resource adjustment on the information side.

As shown in FIG. 1, the model of the present invention comprises two main modules: the intrusion detection module and the intrusion prevention module. In the intrusion detection module, a deep learning model is created to detect intrusions and any possible network threats. This is done through a series of small sub-modules, so that the model achieves the highest possible accuracy with negligible loss. Specifically, the intrusion detection module comprises data preprocessing, pre-training based feature representation, and an anomaly detection model based on a cognitive memory network. The process starts with collecting a suitable data set, which is then preprocessed. The data is next fed into the pre-training based feature representation module to create a pre-trained model, which performs representation learning on the multi-modal data to capture the complete semantic information it contains. Finally, the anomaly detection module based on the cognitive memory network detects non-compliant behavior, and the model is evaluated on test data.

In the intrusion prevention module, intrusions are blocked by scripts that run in the background with full administrator privileges. The scripts are written to block any malicious request, such as a DoS attack, by terminating the connection and notifying the administrator of the malicious event. The detection and defense phases are integrated and deployed together as software.

The modules in each stage of the network attack immune defense method adopting multi-module learning are explained in further detail below with reference to the drawings and embodiments.

The network attack immune defense adopting multi-module learning provided by the invention comprises two parts: an intrusion detection module and an intrusion prevention module. The focus of the invention is on the first module. The intrusion detection module mainly comprises three steps, namely data preprocessing, pre-training based feature representation, and anomaly detection based on a cognitive memory network, which are described in detail later. The intrusion prevention module mainly uses Linux-based command operations to stop the attacks detected by the intrusion detection module. If a probe attack occurs, the IP addresses of the sender and the attacker are obtained, and all packets from the attacker's IP are discarded or blocked. Similarly, if a DoS attack occurs, the system records the number of the attacked port and blocks all packets passing through that port. If the output of the detection stage is normal, nothing is done. Running these scripts in the background with full administrator privileges thus prevents intrusions and protects the system against the types of attack that may occur in the network.
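The mapping from a detection verdict to a Linux blocking command can be sketched as follows. This is a minimal illustration only: the verdict labels (`"probe"`, `"dos"`, `"normal"`), the function name `build_block_commands`, and the choice of `iptables` as the Linux command are assumptions, since the source does not name the commands it uses. The function only builds command strings and does not execute them.

```python
import ipaddress
from typing import List, Optional


def build_block_commands(label: str,
                         attacker_ip: Optional[str] = None,
                         port: Optional[int] = None) -> List[str]:
    """Return iptables commands for a detection verdict (hypothetical labels)."""
    if label == "normal":
        # Normal traffic: the prevention module does nothing.
        return []
    if label == "probe":
        if attacker_ip is None:
            raise ValueError("a probe verdict needs the attacker's IP address")
        # Validate the address so that arbitrary text never reaches a shell.
        ip = ipaddress.ip_address(attacker_ip)
        return [f"iptables -A INPUT -s {ip} -j DROP"]
    if label == "dos":
        if port is None or not 1 <= port <= 65535:
            raise ValueError("a DoS verdict needs a valid attacked port")
        # Block packets addressed to the attacked port for both TCP and UDP.
        return [
            f"iptables -A INPUT -p tcp --dport {port} -j DROP",
            f"iptables -A INPUT -p udp --dport {port} -j DROP",
        ]
    raise ValueError(f"unknown verdict: {label}")
```

In a deployment the returned strings would be run by a privileged background script and the administrator notified; both steps are left out here.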

The specific intrusion detection and defense flow is shown in FIG. 2. The data collection module periodically collects information side data, such as network flows and segment data, and physical side data, such as time, location, device id, network segment, area, device type feature, user organization and other associated data. The feature calculation module performs feature learning on the multi-source data obtained by preprocessing the collected data. It uses a pre-trained model, which generalizes well, can learn representations rich in semantic information from large-scale data, and can be fine-tuned on the small data set of a downstream task. The attack detection module detects anomalies using a self-learning model, which is an unsupervised method. The defense module and the recovery module handle the abnormal data packets detected in the previous step and then take measures to restore normal communication of the host.

Intrusion detection module

The intrusion detection module comprises three processes: data preprocessing, pre-training based feature representation, and an anomaly detection model based on a cognitive memory network. The module determines the data that the system needs to collect, performs feature learning on the multi-source data with a pre-trained model to obtain features that are more representative from a security standpoint, and detects problematic data packets in the existing data with the cognitive memory network. It raises a warning for each such packet and passes it to the next module for selection of a specific defense strategy.

1) Data preprocessing:

Intrusion detection on a system requires a relatively comprehensive understanding of the target system, together with general statistics on and an understanding of the source and structure of the data. In this scheme, attention is focused on the information side data, the physical side data, and the associations between them. Once this information is obtained, the data is preprocessed according to the requirements of the chosen model. For network flow data this generally includes discretizing continuous data, removing redundant features, selecting important features, standardizing feature values, and addressing class imbalance in the data set.
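Three of the preprocessing operations listed above can be sketched in a few lines. The function names, the equal-width binning scheme and the "constant column" notion of redundancy are assumptions made for illustration; the source does not specify how each operation is carried out.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def drop_constant_columns(rows: List[List[float]]) -> Tuple[List[List[float]], List[int]]:
    """Remove features that take a single value (one simple form of redundancy)."""
    keep = [j for j in range(len(rows[0])) if len({r[j] for r in rows}) > 1]
    return [[r[j] for j in keep] for r in rows], keep


def standardize(column: List[float]) -> List[float]:
    """Z-score standardization of one feature column."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in column]


def discretize(column: List[float], bins: int) -> List[int]:
    """Equal-width discretization of a continuous feature into `bins` buckets."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0] * len(column)
    width = (hi - lo) / bins
    # The maximum value would fall just outside the last bucket, so clamp it.
    return [min(int((v - lo) / width), bins - 1) for v in column]
```

Feature selection and class rebalancing depend on labels and on the downstream model, so they are not sketched here.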

2) Pre-training based feature representation:

With the development of deep-learning-based single-modality techniques, representation learning has become their core component. Deep learning expects the representation learned by a machine to contain the complete semantic information of the data, as human cognition does. For a model to construct a representation with rich and comprehensive semantic information for a specific sample, single-modality data alone is not enough, and multi-modal techniques have therefore gradually attracted more attention.

Representation learning on the physical layer information, on the raw bytes of packets, or on the network flows can each achieve good anomaly detection and defense results on its own. However, none of them alone extracts a complete semantic representation of a specific sample. Joint defense across the information side and the physical side makes effective use of the complementary information among the data, reduces defensive blind spots, strengthens the defense system, and enables more robust classification or prediction.

Multi-modal learning follows two criteria. The complementarity criterion uses the complementary information among multiple modalities to enhance the model; integrating multi-modal information expresses the target object more comprehensively. The consistency criterion is based on the assumption that multi-modal data share certain consistent semantic information. BEiT-3 is used here: it uses a shared Multiway Transformer structure and completes pre-training through masked data modeling on single-modality and multi-modal data, so that it can be transferred to various downstream tasks. FIG. 3 shows the representation learning method based on the BEiT-3 pre-training model.

The bit stream information of the physical side and associated data such as time, location, device id, network segment, area, device type feature and user organization are converted into image information using a tool, and the network flow features of the information side are treated as text information. Representation learning is then performed in the BEiT-3 pre-training model; in this step a Bayesian parameter optimization method is used to select the number of masks, so that anomaly recognition is supported with as little exposed information as possible. Finally, the output of the previous step is used to complete the downstream task, namely anomaly detection.
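The source does not say which tool converts the physical-side bit stream into an image. One common and simple layout, assumed here purely for illustration, treats each byte as a grayscale pixel and pads the sequence into a square grid; the function name `bytes_to_square_image` is hypothetical.

```python
import math
from typing import List


def bytes_to_square_image(payload: bytes, pad: int = 0) -> List[List[int]]:
    """Lay out raw bytes row by row as a square grayscale image (values 0-255)."""
    n = len(payload)
    # Smallest side length whose square holds all bytes, i.e. ceil(sqrt(n)).
    side = math.isqrt(n - 1) + 1 if n > 0 else 1
    padded = list(payload) + [pad] * (side * side - n)
    return [padded[r * side:(r + 1) * side] for r in range(side)]


image = bytes_to_square_image(b"\x01\x02\x03\x04\x05")
# image is a 3 x 3 grid: five payload bytes followed by four padding pixels
```

A real pipeline would also have to encode the associated fields (time, location, device id and so on) and resize the result to the input resolution expected by the vision branch of the pre-trained model.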

The structure of BEiT-3 is shown in FIG. 4:

A shared Multiway Transformer is used as the backbone network to perform masked data modeling on single-modality data (i.e., images and texts) and multi-modal data (i.e., image-text pairs). The innovations of BEiT-3 cover three aspects:

First, the backbone network is a Multiway Transformer. Multiway Transformers are used as the backbone to encode different modalities. Each Multiway Transformer block consists of a shared self-attention module and a pool of modality experts, each of which is a feed-forward network. The shared self-attention module effectively learns the alignment of different modalities and deeply fuses their information, so that it can be better applied to multi-modal understanding tasks. Depending on the modality of the current input, the Multiway Transformer selects a different modality expert for encoding, so as to learn more modality-specific information. Each layer contains one vision expert and one language expert, and the top three layers additionally have vision-language experts designed for the fusion encoder.

Second, the pre-training task is masked data modeling. BEiT-3 is pre-trained on both single-modality data (i.e., images and texts) and multi-modal data (i.e., image-text pairs) through a unified mask-then-predict task. During pre-training, a certain percentage of text tokens or image patches is randomly masked, and the model learns the representations of the different modalities and the alignment between them by being trained to recover the masked text tokens or visual tokens. Unlike previous vision-language models, which typically employ multiple pre-training tasks, BEiT-3 uses only one unified pre-training task, which is friendlier to training larger models. Because a generative task is used for pre-training, BEiT-3 does not need a large training batch size, unlike models based on contrastive learning, which alleviates problems such as excessive GPU memory usage.

Third, the model is scaled up. BEiT-3 consists of 40 Multiway Transformer layers and contains 1.9 billion parameters. BEiT-3 is pre-trained on several single-modality and multi-modal data sets: the multi-modal data comprises approximately 15 million images and 21 million image-text pairs collected from five public data sets, and the single-modality data comprises 14 million images and a 160 GB text corpus.
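The routing idea in the first point, one attention module shared by all tokens and a separate feed-forward expert per modality, can be illustrated with a toy block. This is a heavily simplified sketch, not the BEiT-3 implementation: the attention uses identity projections and a single head, residual connections and layer normalization are omitted, and the experts are stand-in functions (`make_expert` is a hypothetical helper).

```python
import math
from typing import Callable, Dict, List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def shared_self_attention(tokens: List[Vector]) -> List[Vector]:
    """Single-head scaled dot-product attention shared by all modalities."""
    dim = len(tokens[0])
    output = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in tokens]
        weights = softmax(scores)
        output.append([sum(w * t[j] for w, t in zip(weights, tokens)) for j in range(dim)])
    return output


def make_expert(scale: float) -> Callable[[Vector], Vector]:
    """Toy stand-in for a feed-forward modality expert: ReLU followed by scaling."""
    return lambda v: [scale * max(0.0, x) for x in v]


def multiway_block(tokens: List[Vector], modalities: List[str],
                   experts: Dict[str, Callable[[Vector], Vector]]) -> List[Vector]:
    """Shared attention over all tokens, then a per-token expert chosen by modality."""
    attended = shared_self_attention(tokens)
    return [experts[m](v) for v, m in zip(attended, modalities)]


experts = {"vision": make_expert(1.0), "language": make_expert(2.0)}
out = multiway_block([[1.0, 0.0], [0.0, 1.0]], ["vision", "language"], experts)
```

Because attention is shared, every token can attend to tokens of the other modality, while the expert applied afterwards depends only on the token's own modality.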

Bayesian optimization is then introduced and combined with the self-supervised pre-training framework of the Multiway Transformer, so that the number of unmasked regions is minimized while the same pre-training effect is achieved, which makes the model more robust. In particular, Bayesian optimization can select the unmasked regions step by step more efficiently, optimizing mask selection so that the exposed information needed to support recognition is minimized.
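The quantity being tuned here is how much of the input is masked. A minimal sketch of random masking with an adjustable ratio is shown below; the `[MASK]` placeholder string, the function name `mask_tokens` and the use of a seeded generator are assumptions for illustration, and the same routine applies to image patches if each patch is treated as one token.

```python
import random
from typing import Dict, List, Optional, Tuple


def mask_tokens(tokens: List[str], mask_ratio: float,
                mask_token: str = "[MASK]",
                seed: Optional[int] = None) -> Tuple[List[str], Dict[int, str]]:
    """Randomly mask a fraction of tokens and return the recovery targets."""
    rng = random.Random(seed)
    n_mask = round(len(tokens) * mask_ratio)
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    for p in positions:
        masked[p] = mask_token
    # The model would be trained to predict these original tokens.
    targets = {p: tokens[p] for p in positions}
    return masked, targets


tokens = [f"t{i}" for i in range(10)]
masked, targets = mask_tokens(tokens, mask_ratio=0.4, seed=0)
```

In the scheme described above, `mask_ratio` (or the set of unmasked positions) would be the variable that the Bayesian search adjusts.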

One of the most difficult parts of the machine learning (ML) workflow is finding the best hyper-parameters for the model, since the performance of an ML model is directly related to them. Bayesian optimization belongs to the class of sequential model-based optimization (SMBO) algorithms. Mathematically, Bayesian optimization mainly executes the following steps:

(1) Define the function f(x) to be estimated and the domain of x.

(2) Take a finite number n of values of x and evaluate f(x) at each of them (obtaining the observed values).

(3) Estimate the function from the finite observations (the assumption made in this estimate is called prior knowledge in Bayesian optimization) and obtain the target value (maximum or minimum) of the estimate of f.

(4) Define a rule that determines the next observation point to evaluate.

Steps 2-4 are repeated until the target value on the estimated function meets the chosen criterion or all computational resources are exhausted (e.g., at most m observations, or at most t minutes of running time).
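The loop over steps 1-4 can be sketched on a one-dimensional problem. To stay dependency-free, the surrogate below is deliberately crude: a kernel-weighted mean of the observations plus a distance-based uncertainty, which stands in for a Gaussian process or TPE and is not equivalent to either. All names and default values (`smbo_maximize`, `kappa`, `bandwidth`, the grid of candidates) are assumptions for illustration.

```python
import math
from typing import Callable, Dict, Tuple


def surrogate(x: float, observed: Dict[float, float], bandwidth: float) -> Tuple[float, float]:
    """Crude stand-in for a probabilistic surrogate: (estimated mean, uncertainty)."""
    weights = {xi: math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in observed}
    total = sum(weights.values())
    if total > 0.0:
        mean = sum(w * observed[xi] for xi, w in weights.items()) / total
    else:  # every observation is too far away to carry weight
        mean = sum(observed.values()) / len(observed)
    uncertainty = min(abs(x - xi) for xi in observed)
    return mean, uncertainty


def smbo_maximize(f: Callable[[float], float], low: float, high: float,
                  n_init: int = 3, max_obs: int = 15, kappa: float = 1.0,
                  grid_size: int = 101, bandwidth: float = 0.5) -> Tuple[float, float, int]:
    # Step 1: the domain of x, represented by a finite grid of candidates.
    grid = [low + (high - low) * i / (grid_size - 1) for i in range(grid_size)]
    # Step 2: n initial observations, evenly spread over the domain.
    stride = (grid_size - 1) // (n_init - 1)
    observed = {grid[i * stride]: f(grid[i * stride]) for i in range(n_init)}
    while len(observed) < max_obs:  # stopping rule: at most m observations
        candidates = [x for x in grid if x not in observed]
        if not candidates:
            break

        # Steps 3 and 4: estimate f, then score candidates with an
        # upper-confidence-bound style rule and observe the best one.
        def acquisition(x: float) -> float:
            mean, uncertainty = surrogate(x, observed, bandwidth)
            return mean + kappa * uncertainty

        x_next = max(candidates, key=acquisition)
        observed[x_next] = f(x_next)
    best_x = max(observed, key=lambda x: observed[x])
    return best_x, observed[best_x], len(observed)


best_x, best_y, n_obs = smbo_maximize(lambda x: -(x - 2.0) ** 2, 0.0, 5.0)
```

Here `n_init` and `max_obs` play the roles of n and m in the steps above; a time budget of t minutes could be added as a second stopping condition.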

In actual operation, and especially in hyper-parameter optimization (HPO), the following details need to be noted:

When Bayesian optimization is not used for HPO, f(x) can generally be a complete black-box function, that is, only the correspondence between x and f(x) is known, nothing is known about the internal rules of the function, and no explicit expression can be written for it. Bayesian optimization is therefore regarded as a classic method for estimating black-box functions. In HPO, however, the f(x) to be defined is generally a cross-validation result or a loss function. The expression of the loss function is usually known very clearly, even though the specific behavior inside it is not, so f(x) in HPO is not a black-box function in the strict sense.

In HPO, the argument x ranges over the hyper-parameter space. In a two-dimensional plot x is one-dimensional, but in practice the hyper-parameter space being optimized is often high-dimensional and extremely complex.

The number of initial observations n and the maximum number of observations m are hyper-parameters of Bayesian optimization itself, and the maximum number of observations m also determines the number of iterations of the whole optimization.

In step 3, the tool that estimates the function from the finite observations is called the probabilistic surrogate model. It carries certain assumptions and can estimate the distribution of the objective function f from a small number of observations, including the value of f at each point and the confidence associated with that point. In practice the probabilistic surrogate model is usually a powerful algorithm in its own right, such as a Gaussian process or a Gaussian mixture model. The Gaussian process is often used in traditional mathematical derivations, but the most popular optimization libraries now mostly default to the Tree-structured Parzen Estimator (TPE), which is based on Gaussian mixture models.

The rule used in step 4 to determine the next observation point is called the acquisition function. It measures how much an observation point would influence the fit of f and selects the point with the greatest influence for the next observation, so attention usually goes to the point where the acquisition function is largest. The most common acquisition functions are Probability of Improvement (PI), Expected Improvement (EI), Upper Confidence Bound (UCB), and information entropy (Entropy).
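For a surrogate that returns a Gaussian prediction with mean `mu` and standard deviation `sigma` at a candidate point, the first three acquisition functions have standard closed forms, sketched below for a maximization problem. The parameter names `best` (best value observed so far), `xi` (exploration margin) and `kappa` (exploration weight) are conventional but are assumptions here, not taken from the source.

```python
import math


def _normal_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _normal_cdf(z: float) -> float:
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_of_improvement(mu: float, sigma: float, best: float, xi: float = 0.0) -> float:
    """P(f(x) > best + xi) under the Gaussian prediction."""
    if sigma <= 0.0:
        return 1.0 if mu > best + xi else 0.0
    return _normal_cdf((mu - best - xi) / sigma)


def expected_improvement(mu: float, sigma: float, best: float, xi: float = 0.0) -> float:
    """E[max(f(x) - best - xi, 0)] under the Gaussian prediction."""
    if sigma <= 0.0:
        return max(mu - best - xi, 0.0)
    z = (mu - best - xi) / sigma
    return (mu - best - xi) * _normal_cdf(z) + sigma * _normal_pdf(z)


def upper_confidence_bound(mu: float, sigma: float, kappa: float = 2.0) -> float:
    """Optimistic estimate: mean plus kappa standard deviations."""
    return mu + kappa * sigma
```

Entropy-based acquisition functions need a model of the location of the optimum and have no comparably short closed form, so they are omitted.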

3) Anomaly detection based on a cognitive memory network, as shown in FIG. 5.

Representation learning is realized with a self-supervised cognitive memory network, and the model is optimized using feature sparsity, feature reconstruction and sample reconstruction so as to better extract latent-space features. A new intrusion detection method based on a deep neural network, the cognitive-memory-oriented autoencoder (CMAE), is used. First, a Memory module is introduced to strengthen the memorization of normal-sample features while the autoencoder architecture is retained. To obtain better intrusion detection performance, a feature reconstruction loss and a feature sparsity loss are proposed, in addition to the reconstruction loss, to constrain the Memory module, which improves both its discriminative capability and its ability to represent normal data.

The network uses a convolutional neural network as its main structure and processes structured data effectively. To obtain an efficient Memory module, the feature reconstruction loss and the feature sparsity loss are proposed. The feature reconstruction loss improves the representation capability of the Memory module and eliminates the difference between the query features and the retrieved features. The feature sparsity loss requires the memory items to be well distinguished from one another and to record the diversity of normal patterns.

Let $x \in \mathbb{R}^N$ denote the input, where $N$ is the dimensionality of the input data; let $En(\cdot)$ denote the encoder and $De(\cdot)$ the decoder. The input $x$ is fed to $En(\cdot)$ to obtain $z_{query} \in \mathbb{R}^D$, where $D$ is the feature dimension; $z_{retrieve} \in \mathbb{R}^D$ is then read out through the Memory module. Finally, the decoder $De(\cdot)$ reconstructs $z_{retrieve}$ back to $x$. The formulas are as follows:

$$z_{query} = En(x; \theta_{En})$$

$$x_{rec} = De(z_{retrieve}; \theta_{De})$$

where $\theta_{En}$ and $\theta_{De}$ denote the parameters of the encoder $En(\cdot)$ and the decoder $De(\cdot)$ respectively. In particular, convolutional neural networks (CNNs) are used as the basic building block of the encoder and decoder.

The Memory module memorizes and locates the latent-space feature representations of normal data. It is represented by a matrix $M \in \mathbb{R}^{K \times D}$, where $D$ is the dimension of the memory features and $K$ is the number of memory items.

Each memory item $m_i$ represents a feature representation stored in the Memory module. The retrieved feature representation is obtained as a specific combination of the memory items. The Memory module $M$ represents $z_{retrieve}$ by this combination as follows:

$$z_{retrieve} = \sum_{i=1}^{K} w_i m_i$$

where $w_i$ denotes the similarity between $z_{query}$ and memory item $m_i$. The formula for computing each similarity $w_i$ is not reproduced in this text.

Each similarity is multiplied by the corresponding memory item to obtain the combination of memory items, and the operation of obtaining this specific combination is regarded as an attention-based query operation. $z_{retrieve}$ is then reconstructed back to the original input data.
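The attention-based query can be sketched as follows. Since the text does not preserve the similarity formula, this sketch assumes a form that is common in memory-augmented autoencoders: cosine similarity between the query and each memory item, normalized with a softmax. That choice, and the function names, are assumptions and not the patent's stated formula.

```python
import math
from typing import List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def read_memory(z_query: Vector, memory: List[Vector]) -> Tuple[Vector, List[float]]:
    """Return z_retrieve as a weighted sum of memory items, plus the weights w_i."""
    sims = [cosine(z_query, m) for m in memory]
    peak = max(sims)
    exps = [math.exp(s - peak) for s in sims]  # softmax over similarities (assumed)
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(z_query)
    z_retrieve = [sum(w * m[j] for w, m in zip(weights, memory)) for j in range(dim)]
    return z_retrieve, weights


memory = [[1.0, 0.0], [0.0, 1.0]]          # K = 2 memory items of dimension D = 2
z_retrieve, weights = read_memory([1.0, 0.0], memory)
```

A query that matches a stored normal pattern is pulled toward that memory item, whereas an abnormal query can only be expressed as a mixture of normal patterns, which is what makes its reconstruction poor.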

The improved loss function is:

$$L = L_{rec} + \lambda_r L_{fea\_rec} + \lambda_s L_{fea\_spa}$$

where $L_{rec}$, $L_{fea\_rec}$ and $L_{fea\_spa}$ denote the reconstruction loss, the feature reconstruction loss and the feature sparsity loss respectively, and $\lambda_r$ and $\lambda_s$ are parameters that balance the loss terms.

Three loss functions are explained below:

loss in reconstruction: the loss function is the basic loss function for training the AE model, and the loss function is the basic loss function for converging the model, so that the data reconstructed by the Decoder is closer to the input data, and the formula is as follows:

The reconstruction loss also serves as the criterion for judging whether input data is abnormal, because abnormal input data yields a larger reconstruction loss than normal data.
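A minimal sketch of this decision rule is given below. It assumes mean squared error as the reconstruction error and a threshold set to the mean plus three standard deviations of the errors observed on normal data; neither choice is specified in the text, and the function names and sample values are hypothetical.

```python
def reconstruction_error(x, x_rec):
    """Mean squared error between the input and its reconstruction (assumed form)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_rec)) / len(x)


def fit_threshold(normal_errors, k=3.0):
    """Hypothetical rule: mean + k * standard deviation of errors on normal data."""
    n = len(normal_errors)
    mean = sum(normal_errors) / n
    var = sum((e - mean) ** 2 for e in normal_errors) / n
    return mean + k * var ** 0.5


def is_abnormal(x, x_rec, threshold):
    """Flag the input as abnormal when its reconstruction error exceeds the threshold."""
    return reconstruction_error(x, x_rec) > threshold


# Reconstruction errors measured on normal traffic (illustrative values only).
normal_errors = [0.010, 0.012, 0.009, 0.011, 0.013]
threshold = fit_threshold(normal_errors)

flag_normal = is_abnormal([0.5, 0.5], [0.45, 0.58], threshold)  # well reconstructed
flag_attack = is_abnormal([0.5, 0.5], [0.1, 0.9], threshold)    # poorly reconstructed
```

In practice the threshold would be tuned on held-out normal traffic, since it trades false alarms against missed detections.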

Feature reconstruction loss: this loss minimizes the representation error between $z_{query}$ and $z_{retrieve}$, so that the feature representations memorized by the Memory module are more accurate. The formula is as follows, where $D$ is the dimension of $z_{query}$ and $z_{retrieve}$ and each term refers to the value of one dimension of the vector:

Feature sparsity loss: this loss ensures that the memory items $m_i$ are distinguishable from one another, which improves the ability to memorize diverse normal feature representations. It prevents the Memory module from memorizing a large number of similar feature representations, and prevents abnormal samples from being represented by a combination of many memory items $m_i$. The formula is as follows, where $d$ is the cosine similarity function and $K$ is the number of memory items $m_i$:
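The three loss terms and their weighted sum can be sketched as follows. The combination $L = L_{rec} + \lambda_r L_{fea\_rec} + \lambda_s L_{fea\_spa}$ is taken from the text; the individual expressions are assumptions chosen to match the verbal descriptions (mean squared error for both reconstruction terms, and mean pairwise cosine similarity between distinct memory items for the sparsity term). The default weights are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity d(a, b) between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def reconstruction_loss(x, x_rec):
    """Assumed form of L_rec: mean squared error over the N input dimensions."""
    return sum((a - b) ** 2 for a, b in zip(x, x_rec)) / len(x)


def feature_reconstruction_loss(z_query, z_retrieve):
    """Assumed form of L_fea_rec: mean squared error over the D feature dimensions."""
    return sum((q - r) ** 2 for q, r in zip(z_query, z_retrieve)) / len(z_query)


def feature_sparsity_loss(memory):
    """Assumed form of L_fea_spa: mean cosine similarity over distinct pairs of the K items."""
    k = len(memory)
    total = 0.0
    for i in range(k):
        for j in range(k):
            if i != j:
                total += cosine_similarity(memory[i], memory[j])
    return total / (k * (k - 1))


def total_loss(x, x_rec, z_query, z_retrieve, memory, lambda_r=1.0, lambda_s=0.1):
    """L = L_rec + lambda_r * L_fea_rec + lambda_s * L_fea_spa (weights are hypothetical)."""
    return (reconstruction_loss(x, x_rec)
            + lambda_r * feature_reconstruction_loss(z_query, z_retrieve)
            + lambda_s * feature_sparsity_loss(memory))


memory_diverse = [[1.0, 0.0], [0.0, 1.0]]    # orthogonal items
memory_redundant = [[1.0, 0.0], [1.0, 0.0]]  # duplicated items
spa_diverse = feature_sparsity_loss(memory_diverse)
spa_redundant = feature_sparsity_loss(memory_redundant)
loss = total_loss([0.2, 0.4], [0.2, 0.4], [1.0, 0.0], [1.0, 0.0], memory_diverse)
```

Under these assumptions, duplicated memory items are penalized more heavily than orthogonal ones, which matches the stated goal of keeping the memory items distinguishable.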

Intrusion defense module

The intrusion prevention system is more complete than an intrusion detection system: it provides multi-layer, in-depth and active protection of the network, and thereby effectively ensures network security. It not only detects network threats but also actively starts a prevention mechanism to block them. The intrusion detection module inspects external communication data, allows normal data to pass through the firewall into the internal network for interaction, and blocks abnormal data, ensuring that the network is not exposed to security threats.

Beneficial effects: intrusion detection and defense systems play a crucial role in network security by preventing attacks on the network. To improve the versatility of the system, it is implemented as anomaly detection within a deep learning framework; once the intrusion detection model has been implemented using deep learning, a script for defense is generated.
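The generation of a defense script from detection results might look like the sketch below. The text does not specify the script format; emitting `iptables` drop rules for IPv4 source addresses whose anomaly score exceeds a threshold is an assumption made purely for illustration, and the function name, the scores and the threshold are hypothetical.

```python
import ipaddress


def generate_block_script(detections, threshold):
    """Build a shell script that drops traffic from sources flagged as abnormal.

    detections: iterable of (source_ip, anomaly_score) pairs.
    """
    lines = ["#!/bin/sh"]
    blocked = set()
    for src_ip, score in detections:
        # Validate the address; raises ValueError on malformed IPv4 input.
        ip = str(ipaddress.IPv4Address(src_ip))
        if score > threshold and ip not in blocked:
            blocked.add(ip)
            lines.append(f"iptables -A INPUT -s {ip} -j DROP")
    return "\n".join(lines) + "\n"


# Addresses from documentation ranges; scores are illustrative values only.
detections = [
    ("192.0.2.10", 0.004),   # below threshold: traffic is allowed
    ("203.0.113.7", 0.160),  # above threshold: traffic is blocked
    ("203.0.113.7", 0.210),  # same source again: no duplicate rule
]
script = generate_block_script(detections, threshold=0.015)
```

Validating each address before interpolating it into a command avoids writing malformed or injected text into the generated script.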

Specifically, several problems exist between the physical side and the information side: making heterogeneous data isomorphic, unifying the time scales of continuous data and discrete events, and performing association analysis on multi-source data. Information and physics are also strongly coupled: a change on one side causes and reflects a change on the other, so the two sides are complementary and consistent.

Anomaly detection is then performed using a deep-learning-based self-supervised cognitive memory network. Built on cognitive theory, the intrusion prevention system has strong self-learning capability and is updated through continuous iterative feedback learning. With this learning capability, meaningful information is continuously extracted from unknown data, which alleviates to a certain extent the large number of missed detections and false alarms that occur in traditional intrusion detection and defense systems. The database of an intrusion detection and defense system with learning capability is dynamic and continuously changing, and the missed detections and false alarms are gradually reduced as the knowledge base keeps improving.

In conclusion, performance analysis shows that the method provided by the invention improves the efficiency of intrusion detection and attack blocking while allowing normal data to pass, effectively reduces the impact of false alarms and missed detections on the network and applications, and improves the intelligence and comprehensive defense capability of the intrusion defense system.

The above embodiments describe the objects, technical solutions and advantages of the present invention in further detail. It should be understood that the above embodiments are only examples of the present invention and are not intended to limit its scope; any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention shall fall within the scope of the present invention.

Claims

1. A network attack immune defense method adopting multi-module learning, characterized by comprising the following steps:

collecting information-side data and physical-side data;

performing feature learning on the multi-source data;

detecting anomalies by using a self-learning model;

2. The network attack immune defense method adopting multi-module learning according to claim 1, wherein the information-side data specifically comprises network flows and segment data, and the physical-side data comprises time, location, device ID, network segment, area, device type features and user organization.

3. The network attack immune defense method adopting multi-module learning according to claim 1, wherein preprocessing the information-side data and the physical-side data to obtain the multi-source data specifically comprises:

data preprocessing: intrusion detection on a system requires a comprehensive overall understanding of the target system, together with general statistics on, and an understanding of, the source and structure of the data.

4. The network attack immune defense method adopting multi-module learning according to claim 1, wherein the feature learning on the multi-source data specifically comprises:

pre-training-based feature representation: with the development of deep-learning-based single-modal techniques, the representation learned by a machine is expected to contain the complete semantic information of the data, as human cognition does; using only single-modal data is not sufficient for the model to build a representation with rich and comprehensive semantic information for a specific sample, so multi-modal techniques have gradually drawn more attention;

feature learning is performed on physical-layer information, the raw bytes of packets, and network flows, and joint defense is achieved by using the information side and the physical side.

5. The network attack immune defense method adopting multi-module learning according to claim 1, wherein detecting anomalies by using the self-learning model specifically comprises:

adopting a self-supervised cognitive memory network to realize representation learning;

6. A network attack immune defense system adopting multi-module learning, the defense system comprising:

an attack judging module for judging whether an attack has been initiated;

a defense module for handling the detected abnormal data packets