CN117527295A - Self-adaptive network threat detection system based on artificial intelligence - Google Patents

Info

Publication number
CN117527295A
Authority
CN
China
Prior art keywords
data
data set
module
network traffic
network
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311254098.9A
Other languages
Chinese (zh)
Inventor
王文佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Information Security Evaluation Center
Original Assignee
Guangdong Information Security Evaluation Center
Application filed by Guangdong Information Security Evaluation Center
Priority to CN202311254098.9A
Publication of CN117527295A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/16 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
    • H04L63/20 Network architectures or network communication protocols for network security for managing network security; network security policies in general
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40 Network security protocols
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/50 Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science
  • Computer Security & Cryptography
  • Computer Networks & Wireless Communication
  • Signal Processing
  • Computing Systems
  • General Engineering & Computer Science
  • Computer Hardware Design
  • Artificial Intelligence
  • Computer Vision & Pattern Recognition
  • Databases & Information Systems
  • Evolutionary Computation
  • Medical Informatics
  • Software Systems
  • Data Exchanges In Wide-Area Networks

Abstract

The invention relates to the technical field of network threat detection, and in particular to an artificial intelligence-based self-adaptive network threat detection system comprising: a data collection module configured to collect network traffic data and generate a first data set; a data preprocessing module that performs denoising and standardization to generate a second data set; a feature selection module that performs feature selection on the second data set according to a preset feature selection algorithm to generate a third data set; an artificial intelligence model module that generates a fourth data set; a self-adaptive module that compares the threat judgments in the fourth data set with the original network traffic data in the first data set and automatically adjusts the parameters of the artificial intelligence model module; and a prediction correction module that generates a fifth data set. The invention improves the accuracy of threat detection and, because the data flow and parameter adjustment between the modules are fully automatic, also improves the response speed and flexibility of the system.

Description

Self-adaptive network threat detection system based on artificial intelligence
Technical Field
The invention relates to the technical field of network threat detection, in particular to an artificial intelligence-based self-adaptive network threat detection system.
Background
Network security has become an integral part of today's information society. As cyber attacks keep escalating and diversifying, conventional threat detection methods, such as rule-based and signature-based detection, struggle to cope with increasingly complex and changeable threats. These conventional methods rely on predefined rules or known attack signatures and lack sufficient detection capability for new or unknown attacks. In addition, because network environments and data traffic are complex, their false-positive and false-negative rates are relatively high.
Artificial intelligence techniques, particularly machine learning and deep learning, have shown strong performance in fields such as image recognition and natural language processing. Applying these techniques to network threat detection, however, still faces a number of challenges. One key issue is how to extract features useful for threat detection from massive, multidimensional network traffic data, and how to build an accurate, real-time threat detection model on those features.
Most existing artificial intelligence-based network threat detection systems are static and cannot adapt in real time. As a result, their detection performance may degrade as the network environment and threat patterns change. Moreover, these systems usually do not predict threats in future network traffic, so they have difficulty providing comprehensive, forward-looking network security protection.
To solve these problems, the invention provides an artificial intelligence-based self-adaptive network threat detection system that aims to achieve highly accurate and fast-responding network threat detection through the integration and self-adaptive optimization of multiple modules.
Disclosure of Invention
Based on the above object, the present invention provides an artificial intelligence based adaptive cyber threat detection system.
An artificial intelligence-based self-adaptive network threat detection system comprises:
a data collection module configured to collect network traffic data and generate a first data set;
a data preprocessing module, connected to the data collection module, that receives the first data set, performs denoising and standardization, and generates a second data set;
a feature selection module, connected to the data preprocessing module, that receives the second data set, performs feature selection on it according to a preset feature selection algorithm, and generates a third data set;
an artificial intelligence model module, connected to the feature selection module, that receives the third data set, performs threat detection on it using an artificial intelligence algorithm, and generates a fourth data set, the fourth data set comprising a judgment of whether the network traffic poses a threat;
a self-adaptive module, connected to the artificial intelligence model module and the data collection module, that receives the fourth data set and the first data set, compares the threat judgments in the fourth data set with the original network traffic data in the first data set, automatically adjusts the parameters of the artificial intelligence model module, and feeds the adjusted parameters back to that module;
a prediction correction module, connected to the self-adaptive module and the artificial intelligence model module, that performs threat prediction on future network traffic using the artificial intelligence model as adjusted by the self-adaptive module and generates a fifth data set, the prediction correction module further adjusting the feature selection algorithm of the feature selection module according to the prediction results in the fifth data set so as to improve the accuracy of threat detection.
Further, the data collection module specifically performs the following:
accessing a network switch or router and capturing transport-layer and application-layer network packets in real time through deep packet inspection or traffic mirroring;
performing protocol parsing and classification on the captured packets, and sorting and indexing the parsed and classified packets by source address, destination address, port number, and protocol type;
grouping the sorted and indexed packets by time period using a time window, and calculating the statistical characteristics of each group on each preset field; the statistical characteristics of each group are then integrated into a matrix or data frame as the first data set.
Further, the data preprocessing module smooths each statistical characteristic field in the first data set with a Gaussian filtering denoising algorithm to suppress noise and abnormal values, applies Z-score standardization to the denoised data so that the values of every statistical characteristic field are on a uniform scale, and then re-integrates the values into a new matrix or data frame as the second data set.
Further, the feature selection module specifically performs the following:
applying a recursive feature elimination algorithm, or feature ranking based on information gain, to evaluate the importance of the statistical characteristic fields in the second data set, specifically by removing the lowest-ranked feature field in each iteration until the preset number of features K is reached;
according to the evaluation and ranking results of the recursive feature elimination algorithm or the information gain-based feature ranking, selecting the 10 top-ranked statistical characteristic fields as first features, and extracting all data corresponding to these 10 fields from the second data set;
reconstructing a new data set from the extracted 10 first feature fields, in which each row contains only the 10 selected fields and the rows keep the order they had in the original second data set, yielding a new 10-column matrix as the third data set.
Further, the artificial intelligence model module has a built-in pre-trained neural network model; the neural network model receives each row of data in the third data set, each row comprising the 10 first feature fields selected by the feature selection module;
the neural network model performs a forward propagation operation on each row of data and outputs a value in the interval [0,1], the value representing the probability that the corresponding network traffic data is threatening network traffic;
network traffic whose output probability is greater than or equal to a preset threshold is labeled "threatening network traffic"; otherwise, it is labeled "non-threatening network traffic";
the neural network model output and the threat label of each row of data are integrated into a new data set to generate the fourth data set, the fourth data set comprising the 10 first feature fields of the original third data set and a second feature field, the second feature field recording the judgment of whether the corresponding network traffic is threatening network traffic.
Further, the self-adaptive module first performs a data alignment operation that pairs each row of data in the fourth data set with the corresponding network traffic data in the first data set;
for each pair, the self-adaptive module compares the threat judgment in the fourth data set with the actual network traffic label in the first data set, and calculates a misjudgment rate over the data labeled "threatening network traffic" that is in fact "non-threatening network traffic" and the data labeled "non-threatening network traffic" that is in fact "threatening network traffic";
based on the calculated misjudgment rate, the self-adaptive module generates a correction factor that directly affects the parameters of the neural network model in the artificial intelligence model module;
the self-adaptive module applies the generated correction factor to the neural network model of the artificial intelligence model module and adjusts its parameters: if the misjudgment rate exceeds a preset upper limit, the threshold of the neural network model is lowered or the learning rate is raised; if the misjudgment rate is below a preset lower limit, the threshold of the neural network model is raised or the learning rate is lowered.
Further, the misjudgment rate is calculated as follows:
the self-adaptive module maintains a set of counters, comprising a true positive (TP) counter, a true negative (TN) counter, a false positive (FP) counter, and a false negative (FN) counter;
for data labeled "threatening network traffic" in the fourth data set that is in fact "threatening network traffic" in the first data set, the TP counter is incremented by one;
for data labeled "non-threatening network traffic" in the fourth data set that is in fact "non-threatening network traffic" in the first data set, the TN counter is incremented by one;
for data labeled "threatening network traffic" in the fourth data set that is in fact "non-threatening network traffic" in the first data set, the FP counter is incremented by one;
for data labeled "non-threatening network traffic" in the fourth data set that is in fact "threatening network traffic" in the first data set, the FN counter is incremented by one;
the misjudgment rate is calculated from the four counters: misjudgment rate = (FP + FN) / (TP + TN + FP + FN).
The misjudgment rate is used to evaluate the threat detection accuracy of the system and serves as the basis on which the self-adaptive module generates the correction factor.
Further, the prediction correction module receives the artificial intelligence model parameters adjusted by the self-adaptive module, the parameters comprising the learning rate, the threshold, and the activation function parameters;
the prediction correction module has an embedded traffic prediction sub-module, which predicts the network traffic over a future period using a time series analysis algorithm and generates a prediction data set of future network traffic;
the prediction correction module passes the prediction data set to the artificial intelligence model module as adjusted by the self-adaptive module, which performs threat detection on the prediction data set with the adjusted parameters: the adjusted neural network model is applied to each row of the prediction data set, a forward propagation operation is performed, and a value in the interval [0,1] is output, the value representing the probability that the corresponding predicted network traffic data is threatening network traffic;
the network traffic in the prediction data set is labeled "threatening network traffic" or "non-threatening network traffic" according to the output probability and the threshold adjusted by the self-adaptive module;
the neural network model output and the threat label of each row of predicted data are integrated into a new data set to generate the fifth data set, the fifth data set comprising the statistical characteristic fields of the predicted network traffic and a third feature field recording the judgment of whether the corresponding predicted network traffic is threatening network traffic.
The beneficial effects of the invention are as follows:
The invention achieves strong network security protection through multi-module integration. The data collection module, the data preprocessing module, the feature selection module, the artificial intelligence model module, the self-adaptive module, and the prediction correction module work cooperatively, so that the whole process from raw network traffic to the final threat judgment is carried out within a unified framework. This improves threat detection accuracy, and because the data flow and parameter adjustment between the modules are automatic, it also improves the response speed and flexibility of the system.
In the invention, the self-adaptive module automatically adjusts the parameters of the artificial intelligence model module according to the misjudgment rate, so that the system can optimize itself; the prediction correction module further uses the adjusted parameters and its correction of the feature selection to predict threats in future network traffic more accurately, giving the system forward-looking protection.
Drawings
In order to illustrate the technical solutions of the invention or of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only embodiments of the invention, and a person skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic diagram of the system modules according to an embodiment of the invention.
Detailed Description
The present invention will be further described in detail with reference to specific embodiments in order to make the objects, technical solutions and advantages of the present invention more apparent.
It is to be noted that, unless otherwise defined, technical or scientific terms used herein have the ordinary meaning understood by one of ordinary skill in the art to which the present invention belongs. The terms "first," "second," and the like, as used herein, do not denote any order, quantity, or importance, but are used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that the element or item preceding the word covers the elements or items listed after the word and their equivalents, without excluding other elements or items. The terms "connected" or "coupled," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Upper", "lower", "left", "right", etc. are used merely to indicate relative positional relationships, which may change when the absolute position of the described object changes.
As shown in Fig. 1, an artificial intelligence-based self-adaptive network threat detection system comprises:
a data collection module configured to collect network traffic data and generate a first data set;
a data preprocessing module, connected to the data collection module, that receives the first data set, performs denoising and standardization, and generates a second data set;
a feature selection module, connected to the data preprocessing module, that receives the second data set, performs feature selection on it according to a preset feature selection algorithm, and generates a third data set;
an artificial intelligence model module, connected to the feature selection module, that receives the third data set, performs threat detection on it using an artificial intelligence algorithm, and generates a fourth data set, the fourth data set comprising a judgment of whether the network traffic poses a threat;
a self-adaptive module, connected to the artificial intelligence model module and the data collection module, that receives the fourth data set and the first data set, compares the threat judgments in the fourth data set with the original network traffic data in the first data set, automatically adjusts the parameters of the artificial intelligence model module, and feeds the adjusted parameters back to that module;
a prediction correction module, connected to the self-adaptive module and the artificial intelligence model module, that performs threat prediction on future network traffic using the artificial intelligence model as adjusted by the self-adaptive module and generates a fifth data set, the prediction correction module further adjusting the feature selection algorithm of the feature selection module according to the prediction results in the fifth data set so as to improve the accuracy of threat detection.
The data collection module specifically performs the following:
accessing a network switch or router and capturing transport-layer and application-layer network packets in real time through deep packet inspection or traffic mirroring;
performing protocol parsing and classification on the captured packets, and sorting and indexing the parsed and classified packets by source address, destination address, port number, and protocol type;
grouping the sorted and indexed packets by time period using a time window, and calculating, for each group, its statistical characteristics on each preset field, such as average packet size and data transmission rate;
integrating the statistical characteristics of each group into a matrix or data frame as the first data set;
the data collection module thus captures network packets in real time and, through protocol parsing, classification, sorting, indexing, and statistical characteristic calculation, generates a structured first data set that serves as the input for the subsequent modules.
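The windowing and statistics step can be sketched as follows. This is a minimal illustration only: the packet fields `ts` and `size`, the 60-second window, and the function name `build_first_dataset` are assumptions made for the example and are not specified in the text.

```python
from collections import defaultdict
from statistics import mean


def build_first_dataset(packets, window_seconds=60.0):
    """Group packets into fixed time windows and compute per-window statistics.

    Each packet is a dict with the hypothetical keys 'ts' (capture time in
    seconds) and 'size' (bytes); real captures would carry more fields.
    """
    groups = defaultdict(list)
    for pkt in packets:
        # Integer window index: packets in [k*w, (k+1)*w) share index k.
        groups[int(pkt["ts"] // window_seconds)].append(pkt)

    rows = []
    for window in sorted(groups):
        sizes = [p["size"] for p in groups[window]]
        rows.append({
            "window_start": window * window_seconds,
            "packet_count": len(sizes),
            "avg_packet_size": mean(sizes),
            "bytes_per_second": sum(sizes) / window_seconds,
        })
    return rows


packets = [
    {"ts": 0.5, "size": 100},
    {"ts": 30.0, "size": 300},
    {"ts": 61.0, "size": 600},
]
first_dataset = build_first_dataset(packets)
```

Each returned row corresponds to one time window, so the list of rows plays the role of the matrix or data frame that the text calls the first data set.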
The data preprocessing module smooths each statistical characteristic field in the first data set with a Gaussian filtering denoising algorithm to suppress noise and abnormal values, applies Z-score standardization to the denoised data so that the values of every statistical characteristic field are on a uniform scale, and then re-integrates the values into a new matrix or data frame as the second data set;
the data preprocessing module thus receives the first data set from the data collection module and generates the second data set through these denoising and standardization steps, so that the subsequent modules process data of uniform scale and consistent quality.
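A minimal sketch of the two preprocessing steps on a single statistical field, assuming a one-dimensional Gaussian kernel with hypothetical parameters `sigma` and `radius` and edge handling by clamping; the text does not fix any of these choices.

```python
import math
from statistics import mean, pstdev


def gaussian_kernel(sigma=1.0, radius=2):
    """Return normalized Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_smooth(values, sigma=1.0, radius=2):
    """Smooth a sequence with a Gaussian kernel, clamping indices at the edges."""
    kernel = gaussian_kernel(sigma, radius)
    n = len(values)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for offset, w in zip(range(-radius, radius + 1), kernel):
            j = min(max(i + offset, 0), n - 1)  # replicate the border value
            acc += w * values[j]
        smoothed.append(acc)
    return smoothed


def z_score(values):
    """Standardize to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - mu) / sd for v in values]


# One statistical field with a spike at index 2 (illustrative values).
column = [10.0, 11.0, 50.0, 12.0, 11.0]
second_column = z_score(gaussian_smooth(column))
```

Applying the same two calls to every field of the first data set and reassembling the columns would give the second data set described above.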
The feature selection module specifically performs the following:
applying a recursive feature elimination algorithm, or feature ranking based on information gain, to evaluate the importance of the statistical characteristic fields (packet size, data transmission rate, protocol type, etc.) in the second data set, specifically by removing the lowest-ranked feature field in each iteration until the preset number of features K is reached (K is a positive integer, e.g., K = 10);
according to the evaluation and ranking results of the recursive feature elimination algorithm or the information gain-based feature ranking, selecting the 10 top-ranked statistical characteristic fields as first features, and extracting all data corresponding to these 10 fields from the second data set;
reconstructing a new data set from the extracted 10 first feature fields, in which each row contains only the 10 selected fields and the rows keep the order they had in the original second data set, yielding a new 10-column matrix as the third data set;
the feature selection module thus receives the second data set and, by applying the recursive feature elimination algorithm, selects the 10 statistical characteristic fields most relevant to network threat detection to generate the third data set. This improves data processing efficiency and supports the accuracy of threat detection in the subsequent modules, particularly the artificial intelligence model module.
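The information gain variant of the ranking can be sketched as below. Several things here are assumptions rather than details from the text: each numeric field is split at its median to compute the gain, labeled rows are available for ranking, and the names `information_gain` and `select_top_k` are hypothetical. The example uses two fields and K = 1 for brevity, where the text uses K = 10.

```python
import math
from collections import Counter
from statistics import median


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(column, labels):
    """Gain from splitting the rows at the column's median (an assumed split rule)."""
    cut = median(column)
    low = [y for x, y in zip(column, labels) if x <= cut]
    high = [y for x, y in zip(column, labels) if x > cut]
    conditional = sum(len(part) / len(labels) * entropy(part)
                      for part in (low, high) if part)
    return entropy(labels) - conditional


def select_top_k(rows, labels, feature_names, k):
    """Rank fields by information gain and keep the k best, preserving row order."""
    gains = {name: information_gain([row[name] for row in rows], labels)
             for name in feature_names}
    ranked = sorted(feature_names, key=lambda name: gains[name], reverse=True)
    selected = ranked[:k]
    third_dataset = [[row[name] for name in selected] for row in rows]
    return selected, third_dataset


rows = [
    {"rate": 1, "noise": 4},
    {"rate": 2, "noise": 7},
    {"rate": 8, "noise": 4},
    {"rate": 9, "noise": 7},
]
labels = [0, 0, 1, 1]  # 1 = threatening (illustrative labels)
selected, third_dataset = select_top_k(rows, labels, ["rate", "noise"], k=1)
```

In this toy data the `rate` field separates the two classes perfectly and `noise` carries no information, so only `rate` survives the selection.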
The artificial intelligence model module has a built-in pre-trained neural network model; the neural network model receives each row of data in the third data set, each row comprising the 10 first feature fields selected by the feature selection module;
the neural network model performs a forward propagation operation on each row of data and outputs a value in the interval [0,1], the value representing the probability that the corresponding network traffic data is threatening network traffic;
network traffic whose output probability is greater than or equal to a preset threshold (e.g., a threshold of 0.8) is labeled "threatening network traffic"; otherwise, it is labeled "non-threatening network traffic";
the neural network model output and the threat label of each row of data are integrated into a new data set to generate the fourth data set, the fourth data set comprising the 10 first feature fields of the original third data set and a second feature field, the second feature field recording the judgment of whether the corresponding network traffic is threatening network traffic;
the artificial intelligence model module thus performs threat detection on the third data set with its built-in pre-trained neural network model and generates the fourth data set from the detection results. The fourth data set contains the statistical characteristic fields of the original third data set together with the added second feature field recording whether the network traffic poses a threat.
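The forward pass and thresholding can be sketched with a tiny network. The architecture (one ReLU hidden layer and a sigmoid output), the weights in `params`, and the two-field rows are all assumptions chosen to keep the example short; the text specifies only a pre-trained neural network with an output in [0,1] and, as an example, a threshold of 0.8.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(row, hidden_weights, hidden_bias, out_weights, out_bias):
    """One ReLU hidden layer followed by a sigmoid output in [0, 1]."""
    hidden = [max(0.0, sum(w * x for w, x in zip(ws, row)) + b)
              for ws, b in zip(hidden_weights, hidden_bias)]
    return sigmoid(sum(w * h for w, h in zip(out_weights, hidden)) + out_bias)


def build_fourth_dataset(third_dataset, params, threshold=0.8):
    """Attach the model output and a threat label to every feature row."""
    fourth = []
    for row in third_dataset:
        probability = forward(row, **params)
        label = ("threatening network traffic" if probability >= threshold
                 else "non-threatening network traffic")
        fourth.append({"features": list(row),
                       "probability": probability,
                       "label": label})
    return fourth


# Hypothetical "pre-trained" parameters for a 2-input, 1-hidden-unit network.
params = {
    "hidden_weights": [[1.0, -1.0]],
    "hidden_bias": [0.0],
    "out_weights": [2.0],
    "out_bias": -1.0,
}
fourth_dataset = build_fourth_dataset([[3.0, 0.0], [0.0, 2.0]], params)
```

The `label` entry corresponds to the second feature field; the probability is kept alongside it because the text integrates both the model output and the label into the fourth data set.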
The self-adaptive module first performs a data alignment operation that pairs each row of data in the fourth data set with the corresponding network traffic data in the first data set;
for each pair, the self-adaptive module compares the threat judgment in the fourth data set with the actual network traffic label in the first data set, and calculates a misjudgment rate over the data labeled "threatening network traffic" that is in fact "non-threatening network traffic" and the data labeled "non-threatening network traffic" that is in fact "threatening network traffic";
based on the calculated misjudgment rate, the self-adaptive module generates a correction factor that directly affects the parameters of the neural network model in the artificial intelligence model module;
the self-adaptive module applies the generated correction factor to the neural network model of the artificial intelligence model module and adjusts its parameters: if the misjudgment rate exceeds a preset upper limit (for example, 5%), the threshold of the neural network model is lowered or the learning rate is raised; if the misjudgment rate is below a preset lower limit (for example, 1%), the threshold of the neural network model is raised or the learning rate is lowered;
through its tight integration with the artificial intelligence model module and the data collection module, the self-adaptive module can analyze network traffic data and threat detection results in real time and automatically adjust the parameters of the artificial intelligence model module according to observed performance, thereby improving the threat detection accuracy of the system.
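The adjustment rule can be sketched as a small function. The 5% and 1% limits are the examples given in the text; the step size `step`, the scaling factor `lr_factor`, and the choice to adjust both the threshold and the learning rate at once (the text allows either one) are assumptions of this sketch.

```python
def adjust_parameters(threshold, learning_rate, misjudgment_rate,
                      upper=0.05, lower=0.01, step=0.05, lr_factor=1.5):
    """Return (threshold, learning_rate) adjusted from the misjudgment rate.

    Above `upper`: lower the threshold and raise the learning rate.
    Below `lower`: raise the threshold and lower the learning rate.
    In between: leave both unchanged.
    """
    if misjudgment_rate > upper:
        threshold = max(0.0, threshold - step)
        learning_rate = learning_rate * lr_factor
    elif misjudgment_rate < lower:
        threshold = min(1.0, threshold + step)
        learning_rate = learning_rate / lr_factor
    return threshold, learning_rate


too_many_errors = adjust_parameters(0.8, 0.01, misjudgment_rate=0.10)
very_few_errors = adjust_parameters(0.8, 0.01, misjudgment_rate=0.005)
within_limits = adjust_parameters(0.8, 0.01, misjudgment_rate=0.03)
```

Clamping the threshold to [0, 1] keeps it comparable with the probability output of the model, whatever the sequence of adjustments.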
The misjudgment rate is calculated as follows:
the self-adaptive module maintains a set of counters, comprising a true positive (TP) counter, a true negative (TN) counter, a false positive (FP) counter, and a false negative (FN) counter;
for data labeled "threatening network traffic" in the fourth data set that is in fact "threatening network traffic" in the first data set, the TP counter is incremented by one;
for data labeled "non-threatening network traffic" in the fourth data set that is in fact "non-threatening network traffic" in the first data set, the TN counter is incremented by one;
for data labeled "threatening network traffic" in the fourth data set that is in fact "non-threatening network traffic" in the first data set, the FP counter is incremented by one;
for data labeled "non-threatening network traffic" in the fourth data set that is in fact "threatening network traffic" in the first data set, the FN counter is incremented by one;
the misjudgment rate is calculated from the four counters: misjudgment rate = (FP + FN) / (TP + TN + FP + FN).
The misjudgment rate is used to evaluate the threat detection accuracy of the system and serves as the basis on which the self-adaptive module generates the correction factor;
this calculation provides a quantitative measure of the accuracy of the artificial intelligence model module on the threat detection task and gives the self-adaptive module a basis for automatically adjusting the parameters of the artificial intelligence model module.
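The counter logic can be sketched as follows. The sketch reads the misjudgment rate as the share of misjudged rows, (FP + FN) / (TP + TN + FP + FN), which is an interpretation of the counter definitions above; the boolean encoding (True means threatening) and the function name are assumptions.

```python
def misjudgment_rate(predicted, actual):
    """Return (rate, counters) for paired predictions and ground-truth labels.

    Both inputs are sequences of booleans where True means
    "threatening network traffic".
    """
    counters = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for pred, truth in zip(predicted, actual):
        if pred and truth:
            counters["TP"] += 1
        elif not pred and not truth:
            counters["TN"] += 1
        elif pred and not truth:
            counters["FP"] += 1
        else:
            counters["FN"] += 1
    total = sum(counters.values())
    # Guard against an empty comparison set.
    rate = (counters["FP"] + counters["FN"]) / total if total else 0.0
    return rate, counters


predicted = [True, True, False, False, True]
actual = [True, False, False, True, True]
rate, counters = misjudgment_rate(predicted, actual)
```

With these five illustrative pairs, one row is a false positive and one is a false negative, so two of the five judgments are wrong.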
The prediction correction module receives the artificial intelligence model parameters adjusted by the self-adaptive module, the parameters comprising the learning rate, the threshold and the activation function parameters;
the prediction correction module has an embedded traffic prediction sub-module, which predicts the network traffic over a future period of time using a time series analysis algorithm and generates a prediction data set of future network traffic;
the prediction correction module passes the prediction data set to the artificial intelligence model module adjusted by the self-adaptive module, which performs threat detection on the prediction data set using the adjusted parameters: the adjusted neural network model is applied to each row of data in the prediction data set, a forward propagation operation is performed, and a value in the interval [0,1] is output, the value representing the probability that the corresponding predicted network traffic data is threatening network traffic;
the network traffic in the prediction data set is marked as "threatening network traffic" or "non-threatening network traffic" according to the output probability value and the threshold adjusted by the self-adaptive module;
the neural network model output and the threat label of each row of predicted data are integrated into a new data set to generate a fifth data set, the fifth data set comprising the statistical characteristic fields of the predicted network traffic and a third feature field recording the determination of whether the corresponding predicted network traffic is threatening network traffic.
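The prediction flow above can be sketched as follows. The description does not name a particular time series analysis algorithm, so this sketch assumes simple exponential smoothing as a stand-in, and it replaces the adjusted neural network with a single sigmoid unit. All function names, field names, weights and sample values are hypothetical.

```python
import math

def exp_smooth_forecast(series, alpha=0.5, steps=3):
    """Simple exponential smoothing; the forecast is flat at the final level."""
    level = series[0]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return [level] * steps

def threat_probability(row, weights, bias):
    """Hypothetical stand-in for the adjusted neural network: one sigmoid unit."""
    z = sum(w * x for w, x in zip(weights, row)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def build_fifth_dataset(history, weights, bias, threshold, steps=3):
    """Forecast each statistical field, score each predicted row, attach a label."""
    fields = list(history)
    forecasts = [exp_smooth_forecast(history[name], steps=steps) for name in fields]
    fifth = []
    for values in zip(*forecasts):  # one predicted row per future time step
        p = threat_probability(values, weights, bias)
        label = ("threatening network traffic" if p >= threshold
                 else "non-threatening network traffic")
        fifth.append({**dict(zip(fields, values)), "probability": p, "label": label})
    return fifth

# Hypothetical, already-normalized history of two statistical fields.
history = {"pkt_rate": [0.1, 0.2, 0.9, 1.5], "syn_ratio": [0.0, 0.1, 0.8, 1.2]}
fifth_dataset = build_fifth_dataset(history, weights=[1.0, 1.0], bias=-1.0, threshold=0.5)
for row in fifth_dataset:
    print(row)
```

Each output row keeps the predicted statistical fields and adds the model output and the threat label, mirroring the structure of the fifth data set.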
Those of ordinary skill in the art will appreciate that the discussion of any of the above embodiments is merely exemplary and is not intended to suggest that the scope of the invention is limited to these examples; within the idea of the invention, the technical features of the above embodiments or of different embodiments may also be combined, the steps may be implemented in any order, and there are many other variations of the different aspects of the invention as described above, which are not provided in detail for the sake of brevity.
The present invention is intended to embrace all such alternatives, modifications and variations that fall within the broad scope of the appended claims. Therefore, any omission, modification, equivalent replacement or improvement of the present invention should be included in the scope of protection of the present invention.

Claims (8)

1. An adaptive network threat detection system based on artificial intelligence, comprising:
a data collection module configured to collect network traffic data and generate a first data set;
a data preprocessing module, connected to the data collection module, configured to receive the first data set, perform denoising and standardization, and generate a second data set;
a feature selection module, connected to the data preprocessing module, configured to receive the second data set, perform feature selection on the second data set according to a preset feature selection algorithm, and generate a third data set;
an artificial intelligence model module, connected to the feature selection module, configured to receive the third data set, perform threat detection on the third data set using an artificial intelligence algorithm, and generate a fourth data set, wherein the fourth data set comprises a determination of whether the network traffic is threatening;
a self-adaptive module, connected to the artificial intelligence model module and the data collection module, configured to receive the fourth data set and the first data set, compare the threat determinations in the fourth data set with the original network traffic data in the first data set, automatically adjust the parameters of the artificial intelligence model module, and feed the adjusted parameters back to the artificial intelligence model module;
a prediction correction module, connected to the self-adaptive module and the artificial intelligence model module, configured to perform threat prediction on future network traffic based on the artificial intelligence model adjusted by the self-adaptive module and to generate a fifth data set, wherein the prediction correction module adjusts the feature selection algorithm of the feature selection module according to the prediction results in the fifth data set so as to improve the accuracy of threat detection.
2. The adaptive network threat detection system based on artificial intelligence of claim 1, wherein the data collection module is specifically configured for:
accessing a network switch or router and capturing transport-layer and application-layer network data packets in real time through deep packet inspection or traffic mirroring;
performing protocol parsing and classification on the captured data packets, and sorting and indexing the parsed and classified network data packets by source address, destination address, port number and protocol type;
grouping the sorted and indexed network data packets by time period using a time window, and calculating the statistical characteristics of each group on each preset field;
integrating the statistical characteristics of the generated groups into a matrix or data frame as the first data set.
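As an illustration of the windowing and statistics steps of claim 2, the following Python sketch groups already-parsed packets into fixed time windows and computes a few per-window statistics. The packet dictionary layout, the chosen statistics and the name `build_first_dataset` are assumptions made for this sketch; the claim does not fix the preset fields.

```python
from collections import defaultdict
from statistics import mean

def build_first_dataset(packets, window_s=10):
    """Group parsed packets into fixed time windows and compute per-window stats."""
    # Sort by the indexing keys named in the claim, then by timestamp.
    ordered = sorted(
        packets,
        key=lambda p: (p["src"], p["dst"], p["port"], p["proto"], p["ts"]),
    )
    groups = defaultdict(list)
    for pkt in ordered:
        groups[int(pkt["ts"] // window_s)].append(pkt)
    rows = []
    for window in sorted(groups):
        pkts = groups[window]
        rows.append({
            "window_start": window * window_s,
            "packet_count": len(pkts),
            "mean_size": mean(p["size"] for p in pkts),
            "distinct_ports": len({p["port"] for p in pkts}),
        })
    return rows

# Hypothetical parsed packets (timestamps in seconds, sizes in bytes).
packets = [
    {"ts": 1.0, "src": "10.0.0.1", "dst": "10.0.0.9", "port": 443, "proto": "TCP", "size": 100},
    {"ts": 4.5, "src": "10.0.0.2", "dst": "10.0.0.9", "port": 80, "proto": "TCP", "size": 300},
    {"ts": 12.0, "src": "10.0.0.1", "dst": "10.0.0.9", "port": 443, "proto": "TCP", "size": 50},
]
first_dataset = build_first_dataset(packets, window_s=10)
for row in first_dataset:
    print(row)
```

Each returned dictionary corresponds to one row of the matrix or data frame that forms the first data set.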
3. The adaptive network threat detection system based on artificial intelligence of claim 2, wherein the data preprocessing module smooths each statistical characteristic field in the first data set with a Gaussian filter denoising algorithm to eliminate noise or outliers, applies Z-Score normalization to the denoised data so that the values of each statistical characteristic field are converted to a uniform scale or range, and then re-integrates the values into a new matrix or data frame as the second data set.
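The two preprocessing steps of claim 3 can be sketched for a single statistical characteristic field as follows. The kernel radius, the value of sigma and the edge handling (clamping indices at the boundaries) are assumptions of this sketch, as are the function names; the claim specifies only Gaussian filtering followed by Z-Score normalization.

```python
import math
from statistics import mean, pstdev

def gaussian_smooth(values, sigma=1.0, radius=2):
    """Smooth a 1-D series with a normalized Gaussian kernel (edges clamped)."""
    kernel = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in range(-radius, radius + 1)]
    norm = sum(kernel)
    kernel = [w / norm for w in kernel]
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - radius, 0), n - 1)  # clamp at the boundaries
            acc += w * values[idx]
        out.append(acc)
    return out

def z_score(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]

# Hypothetical column with one spike at index 3.
column = [10, 11, 10, 60, 10, 11, 10]
smoothed = gaussian_smooth(column)
normalized = z_score(smoothed)
print(smoothed)
print(normalized)
```

Applying the two functions to every column of the first data set and reassembling the columns would give the second data set.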
4. The adaptive network threat detection system based on artificial intelligence of claim 3, wherein the feature selection module is specifically configured for:
evaluating the importance of the statistical characteristic fields in the second data set by applying a recursive feature elimination algorithm or information-gain-based feature ranking, specifically by removing the lowest-ranked feature field in each of a series of iterations until a preset number of features K is reached;
selecting, according to the evaluation and ranking result of the recursive feature elimination algorithm or the information-gain-based feature ranking, the 10 top-ranked statistical characteristic fields as first features, and extracting all data corresponding to these 10 fields from the second data set;
reconstructing a new data set from the 10 extracted first feature fields, wherein each row of data comprises only the data of the 10 selected fields, re-integrating the data in the order in which they are arranged in the original second data set, and generating a new matrix with 10 columns as the third data set.
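The information-gain variant of the feature ranking in claim 4 can be sketched in Python as below. The sketch assumes that labels are available for the rows at this stage and that each numeric field is split at its median when computing the gain; neither detail is stated in the claim. The demonstration keeps K = 2 fields for brevity, whereas the claim keeps 10. All names and sample values are hypothetical.

```python
import math
from statistics import median

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    h = 0.0
    for cls in set(labels):
        p = labels.count(cls) / n
        h -= p * math.log2(p)
    return h

def information_gain(column, labels):
    """Gain from splitting a numeric column at its median (assumed split rule)."""
    cut = median(column)
    left = [y for x, y in zip(column, labels) if x <= cut]
    right = [y for x, y in zip(column, labels) if x > cut]
    n = len(labels)
    remainder = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(labels) - remainder

def select_top_k(rows, field_names, labels, k=10):
    """Rank fields by information gain and keep the top k, preserving row order."""
    gains = {
        name: information_gain([row[name] for row in rows], labels)
        for name in field_names
    }
    keep = sorted(field_names, key=lambda name: gains[name], reverse=True)[:k]
    return keep, [[row[name] for name in keep] for row in rows]

# Hypothetical second data set with three fields and six labeled rows.
fields = ["syn_ratio", "mean_size", "noise"]
rows = [
    {"syn_ratio": 0.1, "mean_size": 500, "noise": 1},
    {"syn_ratio": 0.2, "mean_size": 480, "noise": 5},
    {"syn_ratio": 0.1, "mean_size": 90, "noise": 3},
    {"syn_ratio": 0.8, "mean_size": 100, "noise": 1},
    {"syn_ratio": 0.9, "mean_size": 80, "noise": 5},
    {"syn_ratio": 0.7, "mean_size": 490, "noise": 3},
]
labels = [0, 0, 0, 1, 1, 1]
kept, third_dataset = select_top_k(rows, fields, labels, k=2)
print(kept)
print(third_dataset)
```

The returned matrix has one column per kept field and keeps the rows in their original order, as the claim requires for the third data set.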
5. The adaptive network threat detection system based on artificial intelligence of claim 4, wherein the artificial intelligence model module has a built-in pre-trained neural network model that receives each row of data in the third data set, each row comprising the 10 first feature fields selected by the feature selection module;
the neural network model performs a forward propagation operation on each row of data and outputs a value in the interval [0,1], the value representing the probability that the corresponding network traffic data is threatening network traffic;
network traffic whose output probability value is greater than or equal to a preset threshold is marked as "threatening network traffic"; otherwise, it is marked as "non-threatening network traffic";
the neural network model output and the threat label of each row of data are integrated into a new data set to generate the fourth data set, wherein the fourth data set comprises the 10 first feature fields of the original third data set and a second feature field, the second feature field recording the determination of whether the corresponding network traffic is threatening network traffic.
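The forward propagation and thresholding of claim 5 can be sketched with a very small network. The architecture (one ReLU hidden layer and a sigmoid output), the weights and the two-feature rows are all hypothetical; the claim specifies only a pre-trained neural network with an output in [0,1] and 10 input fields.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(row, w_hidden, b_hidden, w_out, b_out):
    """One ReLU hidden layer followed by a sigmoid output unit."""
    hidden = [
        max(0.0, sum(w * x for w, x in zip(weights, row)) + b)
        for weights, b in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)

def detect(third_dataset, params, threshold=0.5):
    """Append a label column (the second feature field) to each row."""
    fourth = []
    for row in third_dataset:
        p = forward(row, *params)
        label = ("threatening network traffic" if p >= threshold
                 else "non-threatening network traffic")
        fourth.append(list(row) + [label])
    return fourth

# Hypothetical fixed weights standing in for a pre-trained model.
params = ([[2.0, 0.0], [0.0, 2.0]], [0.0, 0.0], [1.0, 1.0], -2.0)
third_dataset = [[1.5, 1.0], [0.1, 0.2]]
fourth_dataset = detect(third_dataset, params, threshold=0.5)
for row in fourth_dataset:
    print(row)
```

The comparison uses `>=`, matching the claim's "greater than or equal to a preset threshold".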
6. The adaptive network threat detection system based on artificial intelligence of claim 5, wherein the self-adaptive module first performs a data alignment operation to pair each row of data in the fourth data set with the corresponding network traffic data in the first data set;
for each pair, the self-adaptive module compares the threat determination in the fourth data set with the actual network traffic label in the first data set, and calculates a misjudgment rate over the data marked as "threatening network traffic" but actually "non-threatening network traffic" and the data marked as "non-threatening network traffic" but actually "threatening network traffic";
based on the calculated misjudgment rate, the self-adaptive module generates a correction factor that directly affects the parameters of the neural network model in the artificial intelligence model module;
the self-adaptive module applies the generated correction factor to the neural network model of the artificial intelligence model module and adjusts its parameters: if the misjudgment rate exceeds a preset upper limit, the threshold of the neural network model is lowered or the learning rate is increased; if the misjudgment rate is below a preset lower limit, the threshold of the neural network model is raised or the learning rate is decreased.
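The adjustment rule of claim 6 can be sketched as below. The claim does not define how the correction factor is derived from the misjudgment rate, so this sketch assumes a factor proportional to how far the rate lies outside the preset limits. It also adjusts both the threshold and the learning rate, whereas the claim allows either one. The limits, the gain and the name `adapt` are hypothetical.

```python
def adapt(params, rate, lower=0.05, upper=0.20, gain=0.5):
    """Return adjusted copies of 'threshold' and 'learning_rate'.

    The correction factor is assumed to be proportional to the distance
    between the misjudgment rate and the violated limit.
    """
    new = dict(params)
    if rate > upper:
        factor = gain * (rate - upper)
        # Too many misjudgments: lower the threshold, raise the learning rate.
        new["threshold"] = max(0.0, params["threshold"] - factor)
        new["learning_rate"] = params["learning_rate"] * (1 + factor)
    elif rate < lower:
        factor = gain * (lower - rate)
        # Very few misjudgments: raise the threshold, lower the learning rate.
        new["threshold"] = min(1.0, params["threshold"] + factor)
        new["learning_rate"] = params["learning_rate"] * (1 - factor)
    return new

params = {"threshold": 0.5, "learning_rate": 0.01}
print(adapt(params, rate=0.40))  # above the upper limit
print(adapt(params, rate=0.10))  # within limits: unchanged
print(adapt(params, rate=0.01))  # below the lower limit
```

The threshold is clamped to [0,1] so that it stays comparable with the probability output of the model.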
7. The adaptive network threat detection system based on artificial intelligence of claim 6, wherein the misjudgment rate is calculated as follows:
the self-adaptive module maintains a set of counters: a true positive (TP) counter, a true negative (TN) counter, a false positive (FP) counter and a false negative (FN) counter;
for data marked as "threatening network traffic" in the fourth data set that is actually "threatening network traffic" in the first data set, the TP counter is incremented by one;
for data marked as "non-threatening network traffic" in the fourth data set that is actually "non-threatening network traffic" in the first data set, the TN counter is incremented by one;
for data marked as "threatening network traffic" in the fourth data set that is actually "non-threatening network traffic" in the first data set, the FP counter is incremented by one;
for data marked as "non-threatening network traffic" in the fourth data set that is actually "threatening network traffic" in the first data set, the FN counter is incremented by one;
calculating the misjudgment rate according to the four counters: misjudgment rate = (FP + FN) / (TP + TN + FP + FN);
the misjudgment rate is used for evaluating threat detection accuracy of the system and is used as a basis for generating a correction factor by the self-adaptive module.
8. The adaptive network threat detection system based on artificial intelligence of claim 7, wherein the prediction correction module receives the artificial intelligence model parameters adjusted by the self-adaptive module, the parameters comprising the learning rate, the threshold and the activation function parameters;
the prediction correction module has an embedded traffic prediction sub-module, which predicts the network traffic over a future period of time using a time series analysis algorithm and generates a prediction data set of future network traffic;
the prediction correction module passes the prediction data set to the artificial intelligence model module adjusted by the self-adaptive module, which performs threat detection on the prediction data set using the adjusted parameters: the adjusted neural network model is applied to each row of data in the prediction data set, a forward propagation operation is performed, and a value in the interval [0,1] is output, the value representing the probability that the corresponding predicted network traffic data is threatening network traffic;
the network traffic in the prediction data set is marked as "threatening network traffic" or "non-threatening network traffic" according to the output probability value and the threshold adjusted by the self-adaptive module;
the neural network model output and the threat label of each row of predicted data are integrated into a new data set to generate a fifth data set, the fifth data set comprising the statistical characteristic fields of the predicted network traffic and a third feature field recording the determination of whether the corresponding predicted network traffic is threatening network traffic.
CN202311254098.9A 2023-09-26 2023-09-26 Self-adaptive network threat detection system based on artificial intelligence Pending CN117527295A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311254098.9A CN117527295A (en) 2023-09-26 2023-09-26 Self-adaptive network threat detection system based on artificial intelligence

Publications (1)

Publication Number Publication Date
CN117527295A (en) 2024-02-06

Family

ID=89761447

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311254098.9A Pending CN117527295A (en) 2023-09-26 2023-09-26 Self-adaptive network threat detection system based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN117527295A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117951695A (en) * 2024-03-27 2024-04-30 南京中科齐信科技有限公司 Industrial unknown threat detection method and system
CN117951695B (en) * 2024-03-27 2024-06-11 南京中科齐信科技有限公司 Industrial unknown threat detection method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination