CN111800389A - Port network intrusion detection method based on Bayesian network

Info

Publication number: CN111800389A
Application number: CN202010519908.9A
Authority: CN
Inventors: 王成; 汤文韬
Original assignee: Tongji University
Current assignee: Tongji University
Priority date: 2020-06-09
Filing date: 2020-06-09
Publication date: 2020-10-20

Abstract

The invention relates to the field of the industrial internet and provides a Bayesian network-based port network intrusion detection method comprising the following steps: S1: collecting and preprocessing an abnormal port network traffic data set to obtain a network traffic feature set; S2: constructing a Bayesian network model from the network traffic feature set; S3: inputting a training set to train the parameters of the Bayesian network model, and obtaining the conditional probability parameters of the model by Bayes' theorem; S4: detecting an input prediction set by using the conditional probability parameters and Bayes' theorem to obtain a detection result. Built on a Bayesian network model, the method detects network intrusion by modeling network traffic behavior and feature attributes, and the detection model can be adjusted dynamically online to cope with changes in the network environment, which improves both the accuracy of intrusion detection and protection and the robustness of the model.

Description

Port network intrusion detection method based on Bayesian network

Technical Field

The invention relates to the field of intrusion detection of industrial internet network security.

Background

In recent years, intrusion detection has become a research hotspot in both industry and academia, and many new intrusion detection technologies, algorithms and systems have appeared and will continue to appear. According to an analysis of the 2016 ICS-CERT industrial internet security situation report, more than 80% of national critical infrastructure depends on the industrial internet to automate production processes, yet existing industrial internet intrusion detection still has many problems. With the rise of intelligent electronic terminal equipment, network traffic is growing explosively. This huge volume of traffic promotes the convergence of the internet economy and the real economy, but enjoying these internet dividends also brings a series of network security challenges. In the industrial internet field in particular, network security intrusion detection also figures prominently in national guidelines. Because the TCP/IP protocol suite widely used on today's Internet was not designed with security specifically in mind, security incidents on the Internet emerge one after another, and intrusion detection, as an active security technology, is increasingly a subject of research. Some anomaly detection models based on machine learning and even deep learning already exist, most of which are discriminative models based on expectation maximization. For online network intrusion detection, deep learning models outperform other methods on numerical indicators, but a deep learning model is a typical black-box model: its process is difficult to visualize and interpret, so its results lack interpretability and do not inspire sufficient confidence.

Disclosure of Invention

To address the above shortcomings of the prior art, the invention provides a port network intrusion detection method based on a Bayesian network model. The method detects intrusion attacks by integrating and modeling port network traffic data, allows the detection model to be adjusted dynamically online, and improves both the accuracy of intercepting intrusion attacks and the robustness of the model.

To achieve the above object, the invention provides a port network intrusion detection method based on a Bayesian network model, comprising the following steps:

S1: collecting network data packets at a port data interface with packet capture software (such as tcpdump or Wireshark) and preprocessing them to obtain a network traffic feature set;

S2: constructing a Bayesian network model from the port network traffic feature set;

S3: inputting a training set to train the parameters of the Bayesian network model, and obtaining the conditional probability parameters of the model by Bayes' theorem;

S4: performing intrusion detection on an input prediction set by using the conditional probability parameters and Bayes' theorem to obtain a detection result.

Preferably, step S1 further comprises the following steps:

S11: a data set cleaning step: resolving data inconsistencies in the network traffic data by filling in missing values, discretizing and converting symbols to numeric values, so as to format the data, remove or correct abnormal data, and deduplicate repeated data;

S12: a data integration step: unifying and storing the network traffic data packets captured by a plurality of port data receiving interfaces to form a database;

S13: normalizing the port network traffic data set in the database to form the network traffic feature set.

Preferably, step S2 further comprises the following steps:

S21: acquiring the port network traffic feature set S, and inputting a candidate feature set S', a relation set R, a label attribute Y and a threshold λ; where S' = ∅ and R = ∅ initially, and ∅ denotes the empty set.

S22: calculating the mutual information I(X_i; Y) between a feature X_i of the port network traffic feature set S and the label attribute Y according to formula (1):

$$I(X_i; Y) = \sum_{x} \sum_{y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)} \tag{1}$$

where X_i denotes the i-th feature; i is a natural number greater than or equal to 1; Y denotes the label attribute; x denotes a value of X_i; y denotes a value of Y; p(x, y) denotes the joint probability of x and y; p(x) denotes the marginal probability of x; p(y) denotes the marginal probability of y; and I(X_i; Y) denotes the mutual information between X_i and Y;

S23: judging whether the mutual information I(X_i; Y) is greater than or equal to the preset threshold λ; if so, continuing with the subsequent steps;

S24: updating the candidate feature set S' according to formula (2):

$$S' := S' + X_i \tag{2}$$

updating the network traffic feature set S according to formula (3):

$$S := S - X_i \tag{3}$$

S25: obtaining the dependency relationship r accordingly:

$$r: X_i \rightarrow Y \tag{4}$$

S26: updating the relation set R according to formula (4);

S27: judging whether the number of features in the current candidate feature set S' is greater than or equal to 2; if so, continuing with the subsequent steps, otherwise returning to step S23;

S28: calculating the mutual information between every two features in the current candidate feature set S' according to formula (5):

$$I(X_i; X_j) = \sum_{x} \sum_{x'} p(x, x') \log \frac{p(x, x')}{p(x)\, p(x')} \tag{5}$$

where X_i denotes the i-th feature in S' and X_j denotes the j-th feature in S'; i and j are natural numbers greater than 0; x denotes a value of X_i; x' denotes a value of X_j; p(x, x') denotes the joint probability of x and x'; p(x) denotes the marginal probability of x; p(x') denotes the marginal probability of x'; and I(X_i; X_j) denotes the mutual information between X_i and X_j; updating the relation set R through formula (4);

S29: assigning the current S' to S and clearing the set S'; calculating the mutual information between every two features of the set S through formula (5); if I(X_i; X_j) ≥ λ, determining the dependency relationship r between the two features based on prior knowledge, and updating the relation set R through formula (4);

S30: repeating step S28 until S is empty or I(X_i; X_j) ≤ λ for all features, and obtaining the Bayesian network model from the current relation set R.
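As a sanity check on formula (1), the following worked example evaluates the mutual information for two extreme cases. It assumes a binary feature X_i, a binary label Y, and a base-2 logarithm; the text does not fix the base of the logarithm, so the unit (bits) is an assumption of this sketch.

```latex
% Case 1: X_i always equals Y, both values equally likely,
% so p(0,0) = p(1,1) = 1/2 and p(x) = p(y) = 1/2.
I(X_i; Y) = \tfrac{1}{2}\log_2\frac{1/2}{(1/2)(1/2)}
          + \tfrac{1}{2}\log_2\frac{1/2}{(1/2)(1/2)}
          = \tfrac{1}{2}\cdot 1 + \tfrac{1}{2}\cdot 1 = 1 \text{ bit}

% Case 2: X_i independent of Y, so p(x,y) = p(x)\,p(y) = 1/4 for every pair.
I(X_i; Y) = \sum_{x}\sum_{y} \tfrac{1}{4}\log_2\frac{1/4}{(1/2)(1/2)}
          = 4 \cdot \tfrac{1}{4}\cdot \log_2 1 = 0
```

Under these assumptions, any threshold λ strictly between 0 and 1 would keep the fully informative feature in step S23 and discard the independent one.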

Preferably, step S3 further comprises the following steps:

S31: inputting a training set, wherein the training set comprises feature attributes and label attributes;

S32: calculating the conditional probability parameter according to formula (6):

$$p_{train}(A_i \mid B) = \frac{P(A_i)\, P(B \mid A_i)}{\sum_{j} P(A_j)\, P(B \mid A_j)} \tag{6}$$

where A_i denotes the i-th parent node of the Bayesian network model; B denotes a child node of A_i; p_train(A_i | B) denotes the conditional probability parameter between A_i and B; P(A_i) denotes the marginal probability of A_i; P(B | A_i) denotes the probability that event B occurs under condition A_i; A_j denotes the j-th parent node; P(A_j) denotes the marginal probability of A_j; and P(B | A_j) denotes the probability that event B occurs under condition A_j;

S33: judging whether formula (6) converges; if so, continuing with the subsequent steps, otherwise returning to step S31.
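A small numeric instance may help in reading formula (6). The probabilities below are purely illustrative and are not taken from the patent: A_1 and A_2 are assumed to be two mutually exclusive parent states (say, normal and attack), and B is an observed child event.

```latex
% Assumed values: P(A_1) = 0.9, P(A_2) = 0.1,
%                 P(B \mid A_1) = 0.05, P(B \mid A_2) = 0.8
p_{train}(A_2 \mid B)
  = \frac{P(A_2)\,P(B \mid A_2)}{P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2)}
  = \frac{0.1 \times 0.8}{0.9 \times 0.05 + 0.1 \times 0.8}
  = \frac{0.08}{0.125}
  = 0.64
```

Even though A_2 has a low marginal probability, observing B raises its conditional probability to 0.64 because B is far more likely under A_2 than under A_1.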

Preferably, step S4 further comprises the following steps:

S41: inputting a test set, wherein the test set comprises a characteristic attribute Y';

S42: calculating the posterior probability according to formula (7), and outputting the prediction result according to the posterior probability:

$$p(Y' \mid X_1, \ldots, X_n) = \frac{p(X_1, \ldots, X_n \mid Y')\, p(Y')}{p(X_1, \ldots, X_n)} \tag{7}$$

where p(Y' | X_1, …, X_n) denotes the probability that event Y' occurs under conditions X_1, …, X_n; p(X_1, …, X_n | Y') denotes the joint probability that events X_1, …, X_n occur under condition Y'; p(Y') denotes the marginal probability of event Y'; and p(X_1, …, X_n) denotes the joint probability of events X_1, …, X_n.

Preferably, step S4 is followed by the step of:

S5: verifying the prediction result.

Preferably, step S5 further comprises the following steps:

S51: counting, from the detection results of the model of formula (7), the number TP of positive samples judged as positive, the number FP of negative samples judged as positive, the number FN of positive samples judged as negative, and the number TN of negative samples judged as negative;

S52: calculating the precision according to formula (8):

$$\mathrm{precision} = \frac{TP}{TP + FP} \tag{8}$$

calculating the recall according to formula (9):

$$\mathrm{recall} = \frac{TP}{TP + FN} \tag{9}$$

calculating the disturbance rate according to formula (10):

$$\mathrm{disturbance} = \frac{FP}{FP + TN} \tag{10}$$

S53: evaluating the detection result according to the precision, the recall rate and the disturbance rate.
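The evaluation in steps S51 to S53 can be sketched in a few lines of Python. This is a minimal illustration rather than the patent's implementation: the function name `evaluate` and the counts are hypothetical, and the disturbance rate is assumed here to be the false positive rate FP / (FP + TN).

```python
def evaluate(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute precision, recall and disturbance rate from confusion counts."""
    def ratio(num: int, den: int) -> float:
        # Return 0.0 when the denominator is zero to avoid a division error.
        return num / den if den else 0.0

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        # Assumption: the disturbance rate is the false positive rate.
        "disturbance": ratio(fp, fp + tn),
    }


# Hypothetical counts for illustration only.
metrics = evaluate(tp=80, fp=20, fn=20, tn=880)
print(metrics)
```

With these made-up counts, 80 of the 100 flagged records are real attacks and 20 of the 900 normal records are disturbed by a false alarm.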

By adopting the above technical solution, the invention has the following beneficial effects:

Compared with the black-box nature of deep learning, a Bayesian network model based on Bayes' theorem generally offers strong interpretability and reasoning ability when predicting port network data. The model is trained on the training set to obtain the conditional probability parameters; when predicting on the test set, it combines prior knowledge with the conditions observed in the test set to obtain the conditional probabilities and finally infer the posterior probability, so the result is credible and convincing. The Bayesian network model can also handle cases in which hidden variables are present. Building on the Bayesian network model improves the interpretability and reasoning ability of the model, allows its parameters to be adjusted dynamically as the specific network environment changes, and better supports abnormal traffic detection, intrusion detection and the protection of network security for industrial internet enterprises.

Given the data characteristics of port network traffic, the Bayesian network model is also the most suitable highly robust model, with strong reasoning and interpretation capabilities.

Drawings

Fig. 1 is a general flowchart of the network intrusion detection method based on a Bayesian network model according to an embodiment of the present invention;

Fig. 2 is a diagram of a Bayesian network model obtained by modeling network traffic data according to an embodiment of the present invention.

Detailed Description

The preferred embodiment of the present invention is described in detail below with reference to Figs. 1-2, so that the functions and features of the invention can be better understood. In this embodiment, the algorithm environment is based on Python, the pgmpy Bayesian network library, the pandas data analysis library and the NumPy library.

In this embodiment, the multiple data sources refer specifically to the data sets obtained from a plurality of port data receiving interfaces; the network traffic data refers specifically to the network data packets captured at the port data interface with tcpdump and Wireshark.

Referring to Figs. 1 and 2, a port network intrusion detection method based on a Bayesian network according to an embodiment of the present invention comprises the following steps:

S1: collecting network data packets at the port data interface with packet capture software (such as Wireshark) and preprocessing them to obtain a network traffic feature set S.

Step S1 further comprises the following steps:

S11: a data cleaning step: formatting the data, removing or correcting abnormal data, and removing duplicate data by filling in missing values, smoothing noise, and identifying and resolving inconsistencies in the network traffic data;

S12: a data integration step: storing the network traffic data from the multiple data sources uniformly to form a database;

S13: normalizing the network traffic data in the database to form the required network traffic feature set.

Traffic occurring in the real network environment must first be captured, which requires tcpdump. tcpdump is a tool that intercepts network packets and outputs their contents. With its powerful functions and flexible interception strategies, it has become a preferred tool for network analysis and troubleshooting on UNIX-like systems. Details are available on the official website, https://www.tcpdump.org/. Although a large number of network traffic data packets can be collected through the port's network traffic monitoring devices and packet capture software, data in a real port network environment is mostly incomplete, inconsistent dirty data that lacks the features and data format needed to build a Bayesian network model. The raw data therefore cannot take part in Bayesian network computation directly and must be preprocessed as follows.

(1) Data cleaning: the data is cleaned by filling in missing values, smoothing noisy data, and identifying or resolving inconsistencies. The main objectives are to standardize data formats (such as time), remove abnormal data, correct errors and remove duplicate data.

(2) Data integration: the data from the plurality of port data receiving interfaces is merged and stored uniformly, and a database is established.

(3) Feature extraction: the features required by the learning model are extracted from the raw network packet data through data calculation, feature extraction and similar means. After this simple preprocessing, the port traffic data is used to train a model that classifies traffic as normal or abnormal. Since the data set used to train the model must have certain standard features, the captured traffic information needs to be converted into the standard format of the KDD99 data set. Here the open source software package kdd99_feature_extra is used, whose GitHub address is https://github.com/AI-IDS/kdd99_feature_extra. Passing the file name of a pcap file generated by a tcpdump capture to kdd99_feature_extra as a parameter outputs traffic information that conforms to the KDD99 data set format. Notably, kdd99_feature_extra has a built-in traffic capture tool based on libpcap, so tcpdump does not need to be run separately to capture traffic, which simplifies the workflow.

(4) Data transformation: the data is converted into the form required by the learning model through smoothing, aggregation, data generalization, normalization, symbol-to-numeric conversion, discretization and similar means. The data set fields after port traffic feature extraction and their types after preprocessing are shown in Table 1.

Table 1. Data set fields after port traffic feature extraction

Field name	Data type	Field description	Type after preprocessing
duration	Continuous	Duration of the connection	Discrete
protocol_type	Discrete	Type of protocol	Discrete
service	Discrete	Network service type	Discrete
flag	Discrete	Normal or error status of the connection	Discrete
src_bytes	Continuous	Number of source bytes	Discrete
dst_bytes	Continuous	Number of destination bytes	Discrete
wrong_fragment	Continuous	Number of wrong fragments	Discrete
IP	Discrete	Whether the IP is on the whitelist	Discrete
urgent	Continuous	Urgent flag	Discrete

As Table 1 shows, most of the available original fields are continuous variables, whereas the Bayesian network model can only process discrete variables. In addition to data cleaning and data integration, preprocessing therefore converts continuous floating-point values into discrete variables that the Bayesian network model can compute with during data transformation. This yields the network traffic feature set, which is divided into a training set S and a test set T.
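The conversion from continuous fields to discrete ones can be illustrated with a short Python sketch. The patent does not say which discretization scheme is used, so equal-width binning is an assumption here, and the function name `equal_width_bins` and the sample `src_bytes` values are hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Map continuous values to integer bin indices in the range 0..n_bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so a single bin is enough.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would fall into bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical src_bytes readings (assumed data, not from the patent).
src_bytes = [0.0, 120.0, 480.0, 1500.0, 3000.0]
bins = equal_width_bins(src_bytes, n_bins=3)
print(bins)
```

Equal-width bins are easy to reason about but sensitive to outliers; quantile-based bins would be a reasonable alternative for heavily skewed fields such as byte counts.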

S2: constructing a Bayesian network model from the port network traffic feature set.

A complete Bayesian network is constructed by analyzing the dependence and independence among the features. Constructing the Bayesian network amounts to constructing a joint distribution over the random variables of the data features, and dependence and independence are the two main properties of that distribution. Independence is important when answering queries and can be used to greatly reduce the computational cost of inference.

Step S2 further comprises the following steps:

S21: acquiring the network traffic feature set S from S1, and inputting a candidate feature set S', a relation set R, a label attribute Y and a threshold λ; where S' = ∅ and R = ∅ initially, and ∅ denotes the empty set.

S22: calculating the mutual information I(X_i; Y) between a feature X_i of the port network traffic feature set S and the label attribute Y according to formula (1):

$$I(X_i; Y) = \sum_{x} \sum_{y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)} \tag{1}$$

where X_i denotes the i-th feature; i is a natural number greater than or equal to 1; Y denotes the label attribute; x denotes a value of X_i; y denotes a value of Y; p(x, y) denotes the joint probability of x and y; p(x) is the marginal probability of x; p(y) is the marginal probability of y; and I(X_i; Y) denotes the mutual information between X_i and Y;

S23: judging whether I(X_i; Y) is greater than or equal to the preset threshold λ; if so, continuing with the subsequent steps;

S24: updating the candidate feature set S' according to formula (2):

$$S' := S' + X_i \tag{2}$$

updating the network traffic feature set S according to formula (3):

$$S := S - X_i \tag{3}$$

S25: obtaining the dependency relationship r accordingly:

$$r: X_i \rightarrow Y \tag{4}$$

S26: updating the relation set R according to formula (4);

S27: judging whether the number of features in the current candidate feature set S' is greater than or equal to 2; if so, continuing with the subsequent steps, otherwise returning to step S23;

S28: calculating the mutual information between every two features in the current candidate feature set S' according to formula (5):

$$I(X_i; X_j) = \sum_{x} \sum_{x'} p(x, x') \log \frac{p(x, x')}{p(x)\, p(x')} \tag{5}$$

where X_i denotes the i-th feature in S' and X_j denotes the j-th feature in S'; i and j are positive integers; x denotes a value of X_i; x' denotes a value of X_j; p(x, x') denotes the joint probability of x and x'; p(x) denotes the marginal probability of x; p(x') denotes the marginal probability of x'; and I(X_i; X_j) denotes the mutual information between X_i and X_j; updating the relation set R through formula (4);

S29: assigning the current S' to the set S and clearing the set S'; calculating the mutual information between every two features of the set S through formula (5); if I(X_i; X_j) ≥ λ, determining the dependency relationship r between the two features according to prior knowledge, and updating the relation set R through formula (4);

S30: repeating step S28 until the set S is empty or I(X_i; X_j) ≤ λ for all features, at which point the Bayesian network graph model is obtained from the current relation set R.
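The steps above can be condensed into a rough Python sketch that builds a relation set R from discrete data. It is a simplification under stated assumptions, not the patent's procedure: mutual information is estimated from empirical frequencies with the natural logarithm, the S21 to S30 loop is collapsed into a single pass, and the edge direction between two features simply follows dictionary order as a stand-in for the prior knowledge the text relies on. The names `mutual_information`, `build_relations` and the sample columns are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


def build_relations(features, labels, lam):
    """Return a relation set R of directed edges (parent, child)."""
    # S22-S26: keep features whose mutual information with Y reaches lam
    # and record the dependency X_i -> Y for each of them.
    selected = [name for name, col in features.items()
                if mutual_information(col, labels) >= lam]
    relations = {(name, "Y") for name in selected}
    # S28-S29: link pairs of selected features that are strongly dependent.
    # Assumption: direction follows dictionary order, not real prior knowledge.
    for a, b in combinations(selected, 2):
        if mutual_information(features[a], features[b]) >= lam:
            relations.add((a, b))
    return relations


# Hypothetical discretized columns (assumed data, not from the patent).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "flag": [0, 0, 0, 0, 1, 1, 1, 1],     # identical to the label
    "src_bin": [0, 0, 0, 1, 1, 1, 1, 1],  # partly informative
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],    # independent of the label
}
R = build_relations(features, labels, lam=0.1)
print(sorted(R))
```

In this toy data the `noise` column has zero mutual information with the label and is dropped, while `flag` and `src_bin` are kept and linked to each other.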

S3: inputting a training set to train the parameters of the Bayesian network model, and obtaining the conditional probability parameters of the Bayesian network graph model by Bayes' theorem.

The main role of this step is to train the parameters of the model. In essence, training the Bayesian network model works as follows: the marginal probability of each feature in the training set is counted and taken as a condition, the joint probability of the features (that is, the posterior probability) is calculated, and Bayes' theorem is then used to derive the conditional probabilities in the Bayesian network, which are the parameters of the model.

Step S3 further comprises the following steps:

S31: inputting the training set S provided by S1, wherein the training set comprises feature attributes and label attributes;

S32: calculating the conditional probability parameter according to formula (6):

$$p_{train}(A_i \mid B) = \frac{P(A_i)\, P(B \mid A_i)}{\sum_{j} P(A_j)\, P(B \mid A_j)} \tag{6}$$

where A_i denotes the i-th parent node of the Bayesian network model; B denotes a child node of A_i; p_train(A_i | B) denotes the conditional probability parameter between A_i and B; P(A_i) denotes the marginal probability of A_i; P(B | A_i) denotes the probability that event B occurs under condition A_i; A_j denotes the j-th parent node; P(A_j) denotes the marginal probability of A_j; and P(B | A_j) denotes the probability that event B occurs under condition A_j;

S33: judging whether formula (6) converges; if so, continuing with the subsequent steps, otherwise returning to step S31.
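One simple way to obtain conditional probabilities from a discrete training set is to count relative frequencies. The sketch below does exactly that for a single parent and child; it is an assumed maximum-likelihood estimate shown for illustration, not the iterative procedure of formula (6) and step S33, and the function name `estimate_cpt` and the sample records are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_cpt(parent_values, child_values):
    """Estimate P(child | parent) from paired discrete observations."""
    pair_counts = Counter(zip(parent_values, child_values))
    parent_counts = Counter(parent_values)
    cpt = defaultdict(dict)
    for (parent, child), count in pair_counts.items():
        # Relative frequency of the child value among rows with this parent value.
        cpt[parent][child] = count / parent_counts[parent]
    return dict(cpt)


# Hypothetical training records (assumed data, not from the patent).
flag = ["SF", "SF", "SF", "S0", "S0", "SF", "S0", "SF"]
label = ["normal", "normal", "normal", "attack",
         "attack", "normal", "normal", "attack"]
cpt = estimate_cpt(flag, label)
print(cpt)
```

Pure counting assigns probability zero to combinations never seen in training; a practical system would likely add smoothing or a prior, which is omitted here for brevity.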

S4: detecting an input test set with the Bayesian network model trained in S3 to obtain a detection result.

The main function of this step is to judge unknown records: for a real-time access record, the model should give a detection result, that is, determine whether the data packet is normal or belongs to an abnormal attack type. The detection process mainly relies on Bayes' theorem: the features of the access record are taken as conditions, and the conditional probabilities in the model are used with Bayes' theorem to infer the posterior probability of the record.

Step S4 further comprises the following steps:

S41: inputting the test set T obtained in S1, wherein the test set comprises a characteristic attribute Y';

S42: inference with the Bayesian network derives the posterior probability from the conditional probability distribution obtained during training and the conditions in the test set; calculating the posterior probability according to formula (7), and outputting a prediction result according to the posterior probability:

$$p(Y' \mid X_1, \ldots, X_n) = \frac{p(X_1, \ldots, X_n \mid Y')\, p(Y')}{p(X_1, \ldots, X_n)} \tag{7}$$

where p(Y' | X_1, …, X_n) denotes the probability that event Y' occurs under conditions X_1, …, X_n; p(X_1, …, X_n | Y') denotes the joint probability that events X_1, …, X_n occur under condition Y'; p(Y') denotes the marginal probability of Y'; and p(X_1, …, X_n) denotes the joint probability of X_1, …, X_n.
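To make formula (7) concrete, the following Python sketch computes the posterior for one access record. It swaps in a simplifying assumption that the patent does not make: the features are treated as conditionally independent given the label (a naive Bayes factorization), so the joint likelihood is a product of per-feature terms. The function name `posterior`, the feature names and all probabilities are hypothetical.

```python
def posterior(priors, likelihoods, record):
    """Return p(label | record) for each label via Bayes' theorem.

    priors:      {label: p(label)}
    likelihoods: {label: {feature: {value: p(value | label)}}}
    record:      {feature: value}
    """
    joint = {}
    for label, prior in priors.items():
        p = prior
        for feature, value in record.items():
            # Unseen values get probability 0.0 in this sketch (no smoothing).
            p *= likelihoods[label][feature].get(value, 0.0)
        joint[label] = p
    evidence = sum(joint.values())  # plays the role of p(X_1, ..., X_n)
    if evidence == 0.0:
        # The record is impossible under the model; return zeros.
        return {label: 0.0 for label in priors}
    return {label: p / evidence for label, p in joint.items()}


# Hypothetical parameters (assumed values, not from the patent).
priors = {"normal": 0.9, "attack": 0.1}
likelihoods = {
    "normal": {"flag": {"SF": 0.9, "S0": 0.1}, "whitelisted": {1: 0.8, 0: 0.2}},
    "attack": {"flag": {"SF": 0.2, "S0": 0.8}, "whitelisted": {1: 0.1, 0: 0.9}},
}
result = posterior(priors, likelihoods, {"flag": "S0", "whitelisted": 0})
prediction = max(result, key=result.get)
print(result, prediction)
```

A full Bayesian network would instead factor the joint distribution along the edges in the relation set R; the naive factorization is used here only to keep the example short.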

S5: verifying the detection result.

Step S5 further comprises the following steps:

S51: counting, from the detection results, the number TP of positive samples judged as positive, the number FP of negative samples judged as positive, the number FN of positive samples judged as negative, and the number TN of negative samples judged as negative;

S52: calculating the precision according to formula (8):

$$\mathrm{precision} = \frac{TP}{TP + FP} \tag{8}$$

calculating the recall according to formula (9):

$$\mathrm{recall} = \frac{TP}{TP + FN} \tag{9}$$

calculating the disturbance rate according to formula (10):

$$\mathrm{disturbance} = \frac{FP}{FP + TN} \tag{10}$$

S53: evaluating the detection result according to the precision, the recall rate and the disturbance rate.

For example, detection is run on the data set collected by the port traffic detection device to obtain the recall rate (interception rate) when the disturbance rate is below 1%, 0.5%, 0.1% and 0.05%, and the performance of the method is evaluated accordingly. According to the verification results on the Yangshan Port network traffic data set, the Bayesian network-based port network intrusion detection classification method performs remarkably well.
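Reporting recall at a capped disturbance rate can be sketched as a threshold sweep over the attack posteriors. This is an illustrative reading of the evaluation, not the patent's stated procedure: the disturbance rate is assumed to be FP / (FP + TN), the cap is applied as "less than or equal to", and the function name `recall_at_disturbance`, the scores and the labels are hypothetical.

```python
def recall_at_disturbance(scores, labels, max_disturbance):
    """Best recall over score thresholds whose disturbance rate stays within the cap.

    scores: attack posterior per record; labels: 1 = attack, 0 = normal.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = 0.0
    for threshold in sorted(set(scores)):
        # Records scoring at or above the threshold are flagged as attacks.
        flagged = [y for s, y in zip(scores, labels) if s >= threshold]
        tp = sum(flagged)
        fp = len(flagged) - tp
        # Assumption: disturbance rate = FP / (FP + TN) = FP / negatives.
        if negatives and positives and fp / negatives <= max_disturbance:
            best = max(best, tp / positives)
    return best


# Hypothetical posteriors and ground truth (assumed data, not from the patent).
scores = [0.95, 0.90, 0.80, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05, 0.02]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
strict = recall_at_disturbance(scores, labels, max_disturbance=0.0)
relaxed = recall_at_disturbance(scores, labels, max_disturbance=0.15)
print(strict, relaxed)
```

Tightening the cap trades recall for fewer false alarms: in this toy data no false alarm is tolerated at the strict setting, so one attack goes undetected, while allowing a single false alarm catches all three.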

Referring to the example Bayesian network model in Fig. 2, constructed from the port traffic data set: in practical use, the method of this embodiment describes a joint probability model between the intrusion type and the attribute features of data packets in different port network environments. When a port network data interface is accessed from different device IP addresses, the number of requests from a given device IP may be fixed, so different device IP addresses may exhibit different request patterns; if a socket uses a given protocol with a very high request count, the device IP may be judged to be carrying out a DDoS (distributed denial-of-service) attack. The access frequency and the requested packet size may show a relatively high correlation with the DDoS label, and hosts with nearby IP addresses may also show higher correlation. Whether the access comes from a commonly used IP likewise reflects the stable distribution that a user forms when making network requests. In the port network environment, the request habits of different device hosts form the network behavior distributions of those hosts; if a behavior pattern appears that does not match the previous request behavior of the port device, it is judged with high probability to be a network attack. In contrast to the black-box nature of traditional deep learning models, the method of this embodiment combines knowledge of network security, intrusion detection and abnormal traffic with the assumption that network connections are similar to construct a Bayesian network model describing the distribution of traffic behavior, and the model has very good interpretable logic.

In addition, using the Bayesian network model as the prediction model handles hidden variables among the features well. Based on the Bayesian network model, a conventional prior assumption can be supplied from prior knowledge of the port network environment; that is, when the model has unobserved variables, the state space model can give a reasonable estimate through Bayesian estimation, so the method has better robustness.

In the port network intrusion detection method based on a Bayesian network, the Bayesian network model based on Bayes' theorem generally offers strong interpretability and confidence when predicting data, and has particularly good prior judgment in application scenarios with a stable network environment, such as a port. The model is trained on the port network traffic training set after feature extraction to obtain the conditional probability parameters; when a test set is detected, the stability of the port network environment allows the prior probability and the conditions of the test set to be used to obtain the conditional probability and finally infer the posterior probability, and the result is highly persuasive. The Bayesian network model can handle hidden variables and cope with various new attacks in an unknown network environment, which existing methods based on discriminative models cannot achieve. The port network intrusion detection method based on a Bayesian network model in this embodiment therefore has advantages for the port internet that existing discriminative models lack. The method overcomes the shortcomings in reasoning and interpretability of traditional anomaly detection methods based on deep learning, improves the interpretability, reasoning ability and robustness of the model, and better supports traffic anomaly detection, intrusion detection and the protection of network security for port machinery enterprises.

While the present invention has been described in detail and with reference to the embodiments thereof as illustrated in the accompanying drawings, it will be apparent to one skilled in the art that various changes and modifications can be made therein. Therefore, certain details of the embodiments are not to be interpreted as limiting, and the scope of the invention is to be determined by the appended claims.

Claims

1. A network intrusion detection method based on a Bayesian network model, applied to the industrial internet, characterized by comprising the following steps:

S1: collecting network data packets at a port data interface with packet capture software (such as Wireshark) and preprocessing them to obtain a network traffic feature set;

S2: constructing a Bayesian network model from the network traffic feature set;

2. The Bayesian network model-based network intrusion detection method applied to the industrial internet according to claim 1, wherein step S1 further comprises the following steps:

S11: a data set cleaning step: resolving data inconsistencies in the network traffic data by filling in missing values, discretizing and converting symbols to numeric values, so as to format the data, remove or correct abnormal data, and deduplicate repeated data;

S12: a data integration step: unifying and storing the network traffic data from a plurality of data sources to form a database;

S13: normalizing the network traffic data set in the database to form the network traffic feature set.

3. The Bayesian network model-based network intrusion detection method applied to the industrial internet according to claim 1, wherein step S2 further comprises the following steps:

S21: acquiring the network traffic feature set S, and inputting a candidate feature set S', a relation set R, a label attribute Y and a threshold λ; where S' = ∅ and R = ∅ initially, and ∅ denotes the empty set;

S22: calculating the mutual information I(X_i; Y) between a feature X_i of the network traffic feature set S and the label attribute Y according to formula (1):

$$I(X_i; Y) = \sum_{x} \sum_{y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)} \tag{1}$$

where X_i denotes the i-th feature; i is a natural number greater than or equal to 1; Y denotes the label attribute; x denotes a value of X_i; y denotes a value of Y; p(x, y) denotes the joint probability of x and y; p(x) denotes the marginal probability of x; p(y) denotes the marginal probability of y; and I(X_i; Y) denotes the mutual information between X_i and Y;

S24: updating the candidate feature set S' according to formula (2):

$$S' := S' + X_i \tag{2}$$

updating the network traffic feature set S according to formula (3):

$$S := S - X_i \tag{3}$$

S25: obtaining the dependency relationship r accordingly:

$$r: X_i \rightarrow Y \tag{4}$$

S26: updating the relation set R according to formula (4);

S29: assigning the current S' to S and clearing the set S'; calculating the mutual information between every two features of the set S through formula (5); if I(X_i; X_j) ≥ λ, determining the dependency relationship r between the two features according to prior knowledge, and updating the relation set R through formula (4);

4. The Bayesian network model-based network intrusion detection method applied to the industrial internet according to claim 1, wherein step S3 further comprises the following steps:

where A_i denotes the i-th parent node of the Bayesian network model; B denotes a child node of A_i; p_train(A_i | B) denotes the conditional probability parameter between A_i and B; P(A_i) denotes the marginal probability of A_i; P(B | A_i) denotes the probability that event B occurs under condition A_i; A_j denotes the j-th parent node; P(A_j) denotes the marginal probability of A_j; and P(B | A_j) denotes the probability that event B occurs under condition A_j;

5. The Bayesian network model-based network intrusion detection method applied to the industrial internet according to claim 1, wherein step S4 further comprises the following steps:

6. The Bayesian network model-based network intrusion detection method applied to the industrial internet according to claim 1, wherein step S4 is followed by the step of:

S5: verifying the prediction result.

7. The Bayesian network model-based network intrusion detection method applied to the industrial internet according to claim 6, wherein step S5 further comprises the following steps:

S52: calculating the precision according to formula (8):

$$\mathrm{precision} = \frac{TP}{TP + FP} \tag{8}$$

calculating the recall according to formula (9):

$$\mathrm{recall} = \frac{TP}{TP + FN} \tag{9}$$

calculating the disturbance rate according to formula (10):

$$\mathrm{disturbance} = \frac{FP}{FP + TN} \tag{10}$$