CN115345236A - Industrial control intrusion detection method and device fusing neighborhood rough set and optimized SVM - Google Patents

Info

Publication number
CN115345236A
Authority
CN
China
Prior art keywords
training
intrusion detection
svm
industrial control
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210981877.8A
Other languages
Chinese (zh)
Inventor
赵国新
张博伦
张磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Petrochemical Technology
Original Assignee
Beijing Institute of Petrochemical Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Petrochemical Technology filed Critical Beijing Institute of Petrochemical Technology
Priority to CN202210981877.8A priority Critical patent/CN115345236A/en
Publication of CN115345236A publication Critical patent/CN115345236A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N10/00 Quantum computing, i.e. information processing based on quantum-mechanical phenomena
    • G06N10/60 Quantum algorithms, e.g. based on quantum optimisation, quantum Fourier or Hadamard transforms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Security & Cryptography (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Hardware Design (AREA)
  • Signal Processing (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Condensed Matter Physics & Semiconductors (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to an industrial control intrusion detection method and device fusing a neighborhood rough set and an optimized SVM. The method comprises: generating a data set from historical industrial control intrusion detection data, dividing the data set into a training set and a test set, and preprocessing both; feeding the preprocessed training set and test set into a pre-constructed SVM algorithm model, taking the negative of the classification accuracy under 5-fold cross validation as the fitness, iteratively training and testing the SVM algorithm model with a quantum particle swarm optimization algorithm, obtaining the optimal value of the objective function to determine the hyperplane for fault classification, and obtaining an industrial control intrusion detection model from that optimal value; and inputting the data to be detected into the industrial control intrusion detection model to obtain an intrusion detection result. The invention optimizes the parameters of the SVM classifier with a quantum particle swarm optimization algorithm and uses the optimized SVM to construct an industrial control intrusion detection model, which improves the accuracy of intrusion detection.

Description

Industrial control intrusion detection method and device fusing neighborhood rough set and optimized SVM (support vector machine)
Technical Field
The invention belongs to the technical field of industrial control information security, and particularly relates to an industrial control intrusion detection method and device fusing a neighborhood rough set and an optimized SVM.
Background
At present, more than 80% of critical infrastructure uses some type of Industrial Control System (ICS), so the normal operation of ICS underpins the normal functioning of national life. In recent years the information security situation of ICS has become increasingly severe, and an effective solution is urgently needed. Intrusion detection can actively defend against network intrusion and is an effective means of protection, so intrusion detection for ICS has become a hotspot of information security research.
Intrusion detection is in essence the classification of abnormal data and normal data. Common classification algorithms include decision trees, neural networks, Bayesian classifiers, and Support Vector Machines (SVM). Among them, SVM is one of the most commonly used algorithms for constructing intrusion detection systems because of its distinctive advantages. Whether SVM-based intrusion detection classifies the data correctly depends mainly on whether the penalty parameter c and the kernel function parameter g are chosen appropriately. In the prior art, SVM parameters have been optimized with the Particle Swarm Optimization (PSO) algorithm, and an ICS intrusion detection model based on PSO-SVM has been designed for anomaly detection with good results. SVM parameters have also been optimized with an Improved Bat Algorithm (IBA), yielding an ICS intrusion detection framework based on IBA-SVM whose effectiveness was verified through simulation experiments.
However, when redundant attributes are deleted by the neighborhood rough set, the above algorithms reduce the classification accuracy of the neighborhood rough set, which affects the accuracy of intrusion detection.
Disclosure of Invention
In view of the above, the present invention aims to overcome the defects of the prior art by providing an industrial control intrusion detection method and device fusing a neighborhood rough set and an optimized SVM, so as to solve the prior-art problem that, when redundant attributes are deleted by the neighborhood rough set, the above algorithms reduce the classification accuracy of the neighborhood rough set and thereby affect the accuracy of intrusion detection.
In order to realize the purpose, the invention adopts the following technical scheme: an industrial control intrusion detection method fusing a neighborhood rough set and an optimized SVM comprises the following steps:
generating a data set based on historical data of industrial control intrusion detection, dividing the data set into a training set and a test set, and preprocessing the training set and the test set;
feeding the preprocessed training set and test set into a pre-constructed SVM algorithm model, taking the negative of the classification accuracy under 5-fold cross validation as the fitness, iteratively training and testing the SVM algorithm model with a quantum particle swarm optimization algorithm, obtaining the optimal value of the objective function to determine the hyperplane for fault classification, and obtaining an industrial control intrusion detection model from the optimal value of the objective function;
and inputting the data to be detected into the industrial control intrusion detection model to obtain an intrusion detection result.
Further, the preprocessing the training set and the test set includes:
carrying out normalization processing on the training set and the test set;
and performing attribute reduction on the normalized training set and test set by using a neighborhood rough set.
Further, the iteratively training and testing the SVM algorithm model includes:
inputting the preprocessed training set into the pre-constructed SVM algorithm model, and judging whether the output reaches the precision target or whether the number of iterations exceeds the maximum number of iterations;
if the output does not reach the precision target and the number of iterations is less than the maximum number of iterations, updating the objective function of the SVM algorithm model by using the fitness, and continuing to iteratively train the SVM algorithm model until its output reaches the precision target or the number of iterations exceeds the maximum number of iterations;
and taking the SVM algorithm model output after training as an industrial control intrusion detection model.
Further, the objective function includes a penalty parameter and a kernel function parameter; the basic model of the SVM algorithm is
$$\min_{w,b,\varepsilon}\ \frac{1}{2}\|w\|^{2}+c\sum_{i=1}^{m}\varepsilon_i$$
$$\text{s.t.}\quad y_i(w^{T}x_i+b)\ge 1-\varepsilon_i,\quad \varepsilon_i\ge 0,\quad i=1,2,\dots,m$$
Transforming the basic model of the SVM algorithm with the Lagrange multiplier method gives
$$\max_{\alpha}\ \sum_{i=1}^{m}\alpha_i-\frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\alpha_i\alpha_j y_i y_j\,k(x_i,x_j)$$
$$\text{s.t.}\quad \sum_{i=1}^{m}\alpha_i y_i=0$$
$$0\le\alpha_i\le c,\quad i=1,2,\dots,m$$
$$k(x_i,x_j)=\exp\!\left(-\frac{\|x_i-x_j\|^{2}}{2\sigma^{2}}\right)=\exp\!\left(-g\,\|x_i-x_j\|^{2}\right)$$
where $c$ is the penalty parameter, which sets the degree of penalty for misclassification; $\varepsilon_i$ is the slack variable: when training sample $x_i$ is correctly classified and lies outside the classification margin, $\varepsilon_i=0$; when $x_i$ is correctly classified but lies inside the classification margin, $0<\varepsilon_i<1$; when $x_i$ is misclassified, $\varepsilon_i\ge 1$; $\alpha_i$ is the Lagrange multiplier; $k(x_i,x_j)$ is the Gaussian kernel function; $g$ is the kernel function parameter; $\sigma$ is the bandwidth, which controls the local range of influence of the Gaussian kernel function; $y_i$ is the class label; $w$ is the weight vector of the linear function; and $b$ is the intercept.
Further, iteratively training and testing the SVM algorithm model with the quantum particle swarm optimization algorithm includes:
step one, initializing the population size, the maximum number of iterations, and the position of each particle of the population in the search space, and initializing each particle's individual best position to its current position;
step two, calculating the average best position of the particles in the population;
step three, calculating the fitness value of each particle, and updating the individual best position of each particle and the global best position of the population according to the minimum-fitness principle;
step four, calculating the random position of each particle;
step five, calculating the new position of each particle;
and step six, repeating steps two to five until the preset precision target is met or the maximum number of iterations is reached.
Further, attribute reduction reduces the 26 attributes of the training set and test set to 11 attributes.
Further, the training set and the test set are normalized as follows:
$$x'=\frac{x-x_{\min}}{x_{\max}-x_{\min}}$$
where $x'$ is the normalized training sample, $x$ is the training sample to be processed, $x_{\min}$ is the minimum training sample value, and $x_{\max}$ is the maximum training sample value.
The embodiment of the application provides an industrial control intrusion detection device for fusing a neighborhood rough set and an optimized SVM, which comprises:
an acquisition module, configured to generate a data set based on historical data of industrial control intrusion detection, divide the data set into a training set and a test set, and preprocess the training set and the test set;
a training module, configured to feed the preprocessed training set and test set into a pre-constructed SVM algorithm model, take the negative of the classification accuracy under 5-fold cross validation as the fitness, iteratively train and test the SVM algorithm model with a quantum particle swarm optimization algorithm, obtain the optimal value of the objective function to determine the hyperplane for fault classification, and obtain an industrial control intrusion detection model from the optimal value of the objective function;
and a detection module, configured to input the data to be detected into the industrial control intrusion detection model and obtain an intrusion detection result.
An embodiment of the present application provides a computer device, including a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of any of the industrial control intrusion detection methods described above.
An embodiment of the present application further provides a computer storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of any of the above industrial control intrusion detection methods fusing a neighborhood rough set and an optimized SVM.
By adopting the technical scheme, the invention can achieve the following beneficial effects:
The invention provides an industrial control intrusion detection method and device fusing a neighborhood rough set and an optimized SVM (support vector machine). A training set and a test set are constructed, and the SVM algorithm model is iteratively trained and tested with a quantum particle swarm optimization algorithm to obtain an industrial control intrusion detection model. The parameters of the SVM classifier are optimized by the quantum particle swarm optimization algorithm, the optimized SVM is used to construct the industrial control intrusion detection model, and this model improves the accuracy of intrusion detection.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic diagram of steps of an industrial control intrusion detection method for fusing a neighborhood rough set and an optimized SVM of the present invention;
FIG. 2 is a schematic flow chart of an industrial control intrusion detection method for fusing a neighborhood rough set and an optimized SVM of the present invention;
FIG. 3 is a schematic diagram of the training accuracy curves of the SVM optimized by different algorithms provided by the present invention;
FIG. 4 is a schematic diagram of a detection accuracy curve for 8 attack forms provided by the present invention;
FIG. 5 is a schematic diagram of intrusion detection classification results provided by the present invention;
FIG. 6 is a schematic structural diagram of an industrial control intrusion detection device fusing a neighborhood rough set and an optimized SVM according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be described in detail below. It should be apparent that the described embodiments are only some embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the examples given herein without any inventive step, are within the scope of the present invention.
A specific industrial control intrusion detection method and device fusing a neighborhood rough set and an optimized SVM provided in the embodiments of the present application are described below with reference to the accompanying drawings.
As shown in FIG. 1, the industrial control intrusion detection method fusing a neighborhood rough set and an optimized SVM provided in the embodiment of the present application includes:
S101, generating a data set based on historical data of industrial control intrusion detection, dividing the data set into a training set and a test set, and preprocessing the training set and the test set;
In some embodiments, preprocessing the training set and the test set includes:
normalizing the training set and the test set;
and performing attribute reduction on the normalized training set and test set by using a neighborhood rough set.
The training set and the test set are normalized as follows:
$$x'=\frac{x-x_{\min}}{x_{\max}-x_{\min}}$$
where $x'$ is the normalized training sample, $x$ is the training sample to be processed, $x_{\min}$ is the minimum training sample value, and $x_{\max}$ is the maximum training sample value;
attribute reduction reduces the 26 attributes of the training set and test set to 11 attributes.
It can be understood that for large and high-dimensional data in an industrial control network, the application adopts a neighborhood rough set to perform attribute reduction on the data, so that the original industrial control data set is reduced from 26 attributes to 11 attributes. The data sets before and after attribute reduction are adopted to carry out comparison experiments, and the results show that: the attribute reduction can effectively improve the performance of the SVM intrusion detection model.
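The text does not spell out the reduction procedure itself. As a rough illustration only, the sketch below uses a common forward greedy search driven by neighborhood dependency: an attribute is added as long as it increases the share of samples whose δ-neighborhood contains a single class. The function names, the Euclidean distance, the stopping threshold `eps`, and the toy data are all assumptions; only the neighborhood radius 0.125 is taken from the experiment description.

```python
def neighborhood_dependency(X, y, attrs, delta):
    """Share of samples whose delta-neighborhood (measured on attrs) is label-pure."""
    if not attrs:
        return 0.0
    consistent = 0
    for i, xi in enumerate(X):
        pure = True
        for j, xj in enumerate(X):
            dist = sum((xi[a] - xj[a]) ** 2 for a in attrs) ** 0.5
            if dist <= delta and y[j] != y[i]:
                pure = False  # a neighbor with another label: sample is not in the positive region
                break
        consistent += pure
    return consistent / len(X)


def forward_reduct(X, y, delta=0.125, eps=1e-9):
    """Greedily add the attribute that raises dependency most; stop when nothing helps."""
    remaining = list(range(len(X[0])))
    selected, best = [], 0.0
    while remaining:
        gains = {a: neighborhood_dependency(X, y, selected + [a], delta)
                 for a in remaining}
        a_star = max(gains, key=gains.get)
        if gains[a_star] - best <= eps:
            break
        selected.append(a_star)
        remaining.remove(a_star)
        best = gains[a_star]
    return selected, best


# Toy data (assumed): attribute 0 separates the classes, attribute 1 is constant.
X_toy = [[0.0, 0.5], [0.1, 0.5], [0.9, 0.5], [1.0, 0.5]]
y_toy = [0, 0, 1, 1]
kept, dependency = forward_reduct(X_toy, y_toy)
```

On the toy data only attribute 0 is kept, because adding the constant attribute does not change any neighborhood. The pairwise distance loop is quadratic in the number of samples, so a real data set would need a faster neighbor search.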
S102, feeding the preprocessed training set and test set into a pre-constructed SVM algorithm model, taking the negative of the classification accuracy under 5-fold cross validation as the fitness, iteratively training and testing the SVM algorithm model with a quantum particle swarm optimization algorithm, obtaining the optimal value of the objective function to determine the hyperplane for fault classification, and obtaining an industrial control intrusion detection model from the optimal value of the objective function;
It will be appreciated that the SVM algorithm model is a support vector machine, whose basic idea is to maximize the classification margin: a hyperplane is sought in the feature space that separates the positive and negative samples while keeping the points closest to it as far from it as possible. The support vectors are those points closest to the hyperplane.
Following this basic idea, in the linearly inseparable case the basic formula of the support vector machine is:
$$\min_{w,b,\varepsilon}\ \frac{1}{2}\|w\|^{2}+c\sum_{i=1}^{m}\varepsilon_i$$
$$\text{s.t.}\quad y_i(w^{T}x_i+b)\ge 1-\varepsilon_i,\quad \varepsilon_i\ge 0,\quad i=1,2,\dots,m \qquad (1)$$
where $c$ is the penalty parameter, which sets the degree of penalty for misclassification, and $\varepsilon_i$ is the slack variable: when training sample $x_i$ is correctly classified and lies outside the classification margin, $\varepsilon_i=0$; when $x_i$ is correctly classified but lies inside the classification margin, $0<\varepsilon_i<1$; when $x_i$ is misclassified, $\varepsilon_i\ge 1$.
By the Lagrange multiplier method, formula (1) can be rewritten as:
$$\max_{\alpha}\ \sum_{i=1}^{m}\alpha_i-\frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\alpha_i\alpha_j y_i y_j\,k(x_i,x_j)$$
$$\text{s.t.}\quad \sum_{i=1}^{m}\alpha_i y_i=0$$
$$0\le\alpha_i\le c,\quad i=1,2,\dots,m \qquad (2)$$
where $\alpha_i$ is the Lagrange multiplier and $k(x_i,x_j)$ is the kernel function. The kernel function used in this application is the Gaussian kernel function, i.e.:
$$k(x_i,x_j)=\exp\!\left(-\frac{\|x_i-x_j\|^{2}}{2\sigma^{2}}\right)=\exp\!\left(-g\,\|x_i-x_j\|^{2}\right) \qquad (3)$$
where $g$ is the kernel function parameter; $\sigma$ is the bandwidth, which controls the local range of influence of the Gaussian kernel function; $y_i$ is the class label; $w$ is the weight vector of the linear function; and $b$ is the intercept.
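To make the roles of g and σ concrete, here is a minimal sketch of a Gaussian kernel evaluation. It assumes the usual relation g = 1/(2σ²) between the kernel parameter and the bandwidth, which the text does not state explicitly; the function names are illustrative.

```python
import math


def gaussian_kernel(xi, xj, g):
    """k(xi, xj) = exp(-g * ||xi - xj||^2) for two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-g * sq_dist)


def g_from_sigma(sigma):
    """Assumed conversion from bandwidth sigma to kernel parameter g."""
    return 1.0 / (2.0 * sigma ** 2)


# Identical points give the maximum similarity of 1.
k_same = gaussian_kernel([1.0, 2.0], [1.0, 2.0], g=0.5)
# Points at Euclidean distance 5 with sigma = 5: exp(-25 / 50) = exp(-0.5).
k_far = gaussian_kernel([0.0, 0.0], [3.0, 4.0], g=g_from_sigma(5.0))
```

A larger g (smaller σ) makes the kernel decay faster with distance, so each support vector influences a narrower region; this is why g is one of the two parameters worth tuning.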
This application converts the problem, through Lagrangian duality, into an optimization problem over the dual variables; that is, the optimal solution of the original problem is obtained by solving the equivalent dual problem. The dual problem is easier to solve and allows a kernel function to be introduced, which extends the method to nonlinear classification problems.
Specifically, as shown in FIG. 2, after the training set and the test set are constructed, they are normalized and their attributes are reduced with the neighborhood rough set. The penalty parameter c of the SVM and the kernel function parameter g are then taken as the optimization objects, the negative of the classification accuracy under 5-fold cross validation is taken as the fitness, the optimal value of the objective function is found iteratively with the quantum particle swarm optimization algorithm, and the industrial control intrusion detection model is obtained from that optimal value.
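The fitness described above can be sketched as follows. This is only an illustration of "negative 5-fold cross-validation accuracy": the classifier `kernel_vote_fit_predict` is a hypothetical stand-in (a Gaussian-kernel-weighted vote that uses g and ignores c), not the SVM of this application, and the fold assignment, seeds, and toy data are assumptions.

```python
import math
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices once and deal them into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def kernel_vote_fit_predict(X_train, y_train, X_test, c, g):
    """Hypothetical stand-in for the SVM: Gaussian-kernel-weighted class vote.
    It uses g but ignores c; a real SVM trainer would be plugged in here."""
    preds = []
    for x in X_test:
        votes = {}
        for xt, yt in zip(X_train, y_train):
            d2 = sum((a - b) ** 2 for a, b in zip(x, xt))
            votes[yt] = votes.get(yt, 0.0) + math.exp(-g * d2)
        preds.append(max(votes, key=votes.get))
    return preds


def fitness(X, y, c, g, fit_predict=kernel_vote_fit_predict, k=5):
    """Negative mean accuracy over k folds, so that lower fitness is better."""
    accuracies = []
    for held_out in k_fold_indices(len(X), k):
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        preds = fit_predict([X[i] for i in train_idx], [y[i] for i in train_idx],
                            [X[i] for i in held_out], c, g)
        hits = sum(p == y[i] for p, i in zip(preds, held_out))
        accuracies.append(hits / len(held_out))
    return -sum(accuracies) / k


# Toy data (assumed): two well-separated clusters labelled 0 (normal) and 1 (attack).
rng = random.Random(1)
X = [[rng.gauss(0, 0.1), rng.gauss(0, 0.1)] for _ in range(20)] + \
    [[rng.gauss(3, 0.1), rng.gauss(3, 0.1)] for _ in range(20)]
y = [0] * 20 + [1] * 20
score = fitness(X, y, c=1.0, g=1.0)
```

Negating the accuracy turns "maximize accuracy" into a minimization problem, which matches the minimum-fitness principle used when the particles' best positions are updated.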
In some embodiments, the iteratively training and testing the SVM algorithm model includes:
inputting the preprocessed training set into the pre-constructed SVM algorithm model, and judging whether the output reaches the precision target or whether the number of iterations exceeds the maximum number of iterations;
if the output does not reach the precision target and the number of iterations is less than the maximum number of iterations, updating the objective function of the SVM algorithm model by using the fitness, and continuing to iteratively train the SVM algorithm model until its output reaches the precision target or the number of iterations exceeds the maximum number of iterations;
and taking the SVM algorithm model output after training as an industrial control intrusion detection model.
Iteratively training and testing the SVM algorithm model with the quantum particle swarm optimization algorithm comprises the following steps:
step one, initializing the population size, the maximum number of iterations, and the position of each particle of the population in the search space, and initializing each particle's individual best position to its current position;
step two, calculating the average best position of the particles in the population;
step three, calculating the fitness value of each particle, and updating the individual best position of each particle and the global best position of the population according to the minimum-fitness principle;
step four, calculating the random position of each particle;
step five, calculating the new position of each particle;
and step six, repeating steps two to five until the preset precision target is met or the maximum number of iterations is reached.
It should be noted that the quantum-behaved particle swarm optimization (QPSO) algorithm used in this patent is an optimization algorithm with quantum behavior, in which each individual is represented as a particle in quantum space. In quantum space the velocity and position of a particle cannot be determined simultaneously, so the state of the particle must be described by a wave function ψ, and the probability distribution of the particle's position is then obtained from the Schrödinger equation. The Monte Carlo method is used to obtain the position update formula of the particle:
$$x_{id}(t+1)=p_i(t)\pm\alpha(t)\,\lvert C_d(t)-x_{id}(t)\rvert\times\ln\!\left[\frac{1}{u_{id}(t)}\right] \qquad (4)$$
In formula (4), $x_{id}(t+1)$ is the position of particle $i$ in dimension $d$ at iteration $t+1$; the sign ± is determined by $u$, a random number uniformly distributed on (0, 1): the plus sign is taken when $u>0.5$, and the minus sign otherwise; $\alpha$ is called the contraction-expansion coefficient and is the only control parameter besides the population size and the number of iterations; $p_i(t)$ is the random position of the particle at iteration $t$, calculated as:
$$p_i(t)=\varphi\,p_{id}(t)+(1-\varphi)\,p_{gd}(t) \qquad (5)$$
where $p_{id}$ is the individual best position, $p_{gd}$ is the population best position, and $\varphi$, like $u$, is a random number uniformly distributed on (0, 1).
In formula (4), $C(t)$ is the average best position, calculated as:
$$C_d(t)=\frac{1}{M}\sum_{i=1}^{M}p_{id}(t) \qquad (6)$$
where $M$ is the population size.
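The six steps and formulas (4) to (6) can be sketched as a plain QPSO loop. This is a minimal sketch of standard QPSO, not the improved algorithm of this patent: it assumes the common local-attractor form φ·pbest + (1 − φ)·gbest for the random position, draws the ± sign from a separate uniform number, decreases α linearly from 1 to 0.5, and clips positions to the search bounds. The sphere test function and all names are illustrative.

```python
import math
import random


def qpso_minimize(objective, bounds, n_particles=30, max_iter=50, target=None, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    # Step one: random initial positions; each personal best starts at the current position.
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    pbest = [p[:] for p in x]
    pbest_fit = [objective(p) for p in pbest]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for t in range(max_iter):
        alpha = 1.0 - 0.5 * t / max_iter  # assumed linear decrease within [0.5, 1]
        # Step two: average best position over all personal bests, formula (6).
        mbest = [sum(p[d] for p in pbest) / n_particles for d in range(dim)]
        for i in range(n_particles):
            # Step three: evaluate and update bests under the minimum-fitness principle.
            fit = objective(x[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = x[i][:], fit
            if fit < gbest_fit:
                gbest, gbest_fit = x[i][:], fit
            for d, (lo, hi) in enumerate(bounds):
                # Step four: random position between personal and global best.
                phi = rng.random()
                p = phi * pbest[i][d] + (1.0 - phi) * gbest[d]
                # Step five: new position, formula (4).
                u = 1.0 - rng.random()  # lies in (0, 1], so log(1/u) is finite
                step = alpha * abs(mbest[d] - x[i][d]) * math.log(1.0 / u)
                new = p + step if rng.random() > 0.5 else p - step
                x[i][d] = min(max(new, lo), hi)
        # Step six: stop early once the precision target is met.
        if target is not None and gbest_fit <= target:
            break
    return gbest, gbest_fit


def sphere(v):
    """Simple test objective with its minimum of 0 at the origin."""
    return sum(c * c for c in v)


best_pos, best_fit = qpso_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

For parameter tuning, the objective would be the cross-validation fitness of an SVM evaluated at a position (c, g), with both bounds set to the search range of the experiment.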
The main feature of the differential evolution (DE) algorithm is its difference strategy. This patent incorporates the DE/rand/1/bin difference strategy to improve the update of the particles' random positions:
$$P_i(t+1)=P_{(r_0)gd}(t)+F\left(P_{(r_1)gd}(t)-P_{(r_2)gd}(t)\right) \qquad (7)$$
where $F$ is the scaling factor. Inspired by the adaptive-weight PSO algorithm, a method for adaptively changing the parameter $F$ is proposed:
Figure BDA0003799438810000094
The maximum and minimum values of $F$ are chosen according to the literature as 0.9 and 0.4, respectively. $f$ is the current best individual fitness value of the particle, and $f_{avg}$ and $f_{min}$ are the average and minimum fitness values of all current particles, respectively. The parameter is called adaptive because it is adjusted according to changes in the particles' fitness values.
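The difference step of formula (7) can be sketched as below. Several points are assumptions: the three indices r0, r1, r2 are taken to be mutually distinct and different from i (as is usual for DE/rand/1), the positions being combined are taken to be the particles' best positions, and the binomial crossover implied by "bin" is left out. The adaptive rule for F is not reproduced here; F is simply passed in.

```python
import random


def de_rand_1_position(pbest, i, F, rng):
    """Formula (7) sketch: base position r0 plus F times the difference of r1 and r2."""
    candidates = [k for k in range(len(pbest)) if k != i]
    r0, r1, r2 = rng.sample(candidates, 3)  # three distinct indices, none equal to i
    return [pbest[r0][d] + F * (pbest[r1][d] - pbest[r2][d])
            for d in range(len(pbest[i]))]


rng = random.Random(0)
best_positions = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]

# With F = 0 the result is just the base position of some other particle.
base_only = de_rand_1_position(best_positions, 0, 0.0, rng)
# When all positions coincide, the difference term vanishes for any F.
collapsed = de_rand_1_position([[1.0, 2.0]] * 4, 0, 0.9, rng)
```

The population needs at least four particles for three distinct partners to exist; the population size of 30 used in the experiments satisfies this.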
Changes in the parameter α in formula (4) directly affect the behavior of the particles. α is usually controlled either with a fixed value or by linear decrease. Here α is adjusted with the same adaptive control method used for the scaling factor F:
Figure BDA0003799438810000101
where $\alpha_{max}$ and $\alpha_{min}$ are the maximum and minimum values of $\alpha$, chosen according to the literature as 1 and 0.5, respectively.
The Levy flight strategy models the idealized way animals search for food in an unfamiliar environment; it is a non-Gaussian random process that combines frequent short-range local searches with occasional long-range global searches. Because of these characteristics, many researchers have added the Levy flight strategy to the evolution equations of bio-inspired intelligent algorithms, improving their performance and achieving good optimization results.
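The Levy flight path expression proposed by Mantegna is used:
$$\mathrm{Levy}(\lambda)=\frac{\mu}{\lvert\nu\rvert^{1/\beta}} \qquad (10)$$
where the parameter β is usually taken as β = 1.5, and the parameters μ and ν follow normal distributions:
$$\mu\sim N(0,\sigma_\mu^{2}),\qquad \nu\sim N(0,\sigma_\nu^{2})$$
The standard deviations $\sigma_\mu$ and $\sigma_\nu$ of the normal distributions corresponding to μ and ν satisfy formula (11):
$$\sigma_\mu=\left\{\frac{\Gamma(1+\beta)\,\sin(\pi\beta/2)}{\Gamma\!\left[(1+\beta)/2\right]\,\beta\,2^{(\beta-1)/2}}\right\}^{1/\beta},\qquad \sigma_\nu=1 \qquad (11)$$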
The Levy flight path Levy(λ) is obtained by substituting formula (11) into formula (10). After the Levy flight strategy is added to the QPSO algorithm, the particle evolution equation becomes:
Figure BDA0003799438810000104
where $gbest_{id}$ is the position of the particle with the best fitness value in the current population; $x_{id}(t)$ is the position of particle $i$ in dimension $d$ at iteration $t$; $u_{id}(t)$ is a random number uniformly distributed on (0, 1) for particle $i$ in dimension $d$ at iteration $t$; $p_{id}(t)$ is the individual best position at iteration $t$; and $C_d(t)$ is the average best position in dimension $d$.
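The Levy step of formula (10) can be sampled as sketched below. The sketch assumes the standard form of Mantegna's algorithm, in which μ is drawn from a normal distribution with standard deviation σ_μ given by the gamma-function expression and ν from a standard normal distribution; the function names are illustrative, and how the step enters the particle evolution equation is not shown.

```python
import math
import random


def mantegna_sigma_mu(beta=1.5):
    """Standard deviation of mu in Mantegna's algorithm (sigma_nu is taken as 1)."""
    num = math.gamma(1.0 + beta) * math.sin(math.pi * beta / 2.0)
    den = math.gamma((1.0 + beta) / 2.0) * beta * 2.0 ** ((beta - 1.0) / 2.0)
    return (num / den) ** (1.0 / beta)


def levy_step(rng, beta=1.5):
    """One sample of mu / |nu|^(1/beta), formula (10)."""
    mu = rng.gauss(0.0, mantegna_sigma_mu(beta))
    nu = rng.gauss(0.0, 1.0)
    while nu == 0.0:  # guard against division by zero
        nu = rng.gauss(0.0, 1.0)
    return mu / abs(nu) ** (1.0 / beta)


sigma_mu = mantegna_sigma_mu(1.5)
rng = random.Random(0)
steps = [levy_step(rng) for _ in range(1000)]
```

Most samples are small, but the heavy tail occasionally produces a very large step, which is what gives the search its occasional long-range jumps.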
In this application, step two calculates the average best position of the particles in the population with formula (6), step four calculates the random position with formula (7), and step five calculates the new position of each particle with formula (12), so that the objective function value that meets the preset precision target, or that is reached at the maximum number of iterations, is finally obtained.
S103, inputting the data to be detected into the industrial control intrusion detection model to obtain an intrusion detection result.
In this method, for the large, high-dimensional data of an industrial control network, a neighborhood rough set (NRS) is used to reduce the attributes of the data, so that the original industrial control data set is reduced from 26 attributes to 11 attributes. To improve the accuracy of the industrial control intrusion detection model, a hybrid adaptive quantum particle swarm optimization (HAQPSO) algorithm is used to optimize the parameters c and g of the SVM algorithm model; the intrusion detection model is constructed with the optimal c and g found, and the model is then verified with test data. Comparison experiments on the data sets before and after attribute reduction show that attribute reduction effectively improves the performance of the SVM intrusion detection model, that nearly every performance index of the HAQPSO-optimized SVM intrusion detection model is better than those of SVM models optimized by other algorithms, and that the NRS-HAQPSO-SVM intrusion detection algorithm can be applied effectively in real engineering scenarios.
In some embodiments, the industrial control intrusion detection standard data set used in the experiments of this patent is the data set released in 2014 by the Critical Infrastructure Protection Center of Mississippi State University. It consists of network-layer data captured while a natural gas pipeline was under attack, converted to numerical form. Different attack types correspond to different attack forms. Each record in the data set contains 26 attributes and one class label. The attack forms and corresponding class labels are shown in Table 1.
TABLE 1 attack forms and corresponding class labels
Figure BDA0003799438810000111
To verify the performance of HAQPSO-SVM in industrial control system intrusion detection, a simulation experiment is carried out. 6000 records are drawn uniformly at random from the original data set and divided into a training set of 4000 records and a test set of 2000 records. The neighborhood radius in the neighborhood rough set algorithm is set to 0.125. The parameters of the HAQPSO algorithm are set as follows: maximum number of iterations 50, population size 30, search dimension d = 2, contraction-expansion coefficient α in [0.5, 1], and Levy flight parameter β = 1.5. The other algorithms in this patent use the same maximum number of iterations, population size, and search dimension as the HAQPSO algorithm. The optimization algorithms iteratively search for the SVM parameters c and g within the range [0.001, 1000].
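The sampling and split described above might be set up as in the following sketch. The settings dictionary simply restates the values given in the text; the function name, the seed, and the integer stand-in for data records are assumptions, and loading the actual data set is not shown.

```python
import random

# Values restated from the experiment description.
EXPERIMENT_SETTINGS = {
    "max_iter": 50,
    "population_size": 30,
    "search_dim": 2,                  # the two SVM parameters c and g
    "alpha_range": (0.5, 1.0),        # contraction-expansion coefficient
    "levy_beta": 1.5,
    "search_range": (0.001, 1000.0),  # range searched for c and g
    "neighborhood_radius": 0.125,
}


def sample_and_split(records, n_total=6000, n_train=4000, seed=0):
    """Draw n_total records without replacement, then split them into train and test."""
    rng = random.Random(seed)
    subset = rng.sample(records, n_total)  # sample() returns the picks in random order
    return subset[:n_train], subset[n_train:]


records = list(range(10000))  # hypothetical stand-in for the rows of the original data set
train_set, test_set = sample_and_split(records)
```

Because the sampled subset is already in random order, slicing it yields a random 4000/2000 split with no record shared between the two sets.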
The data must be preprocessed before the experiment. Normalization: to remove the influence of units so that indexes of different magnitudes can be compared easily, the data of the data set are mapped to the interval [0, 1].
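A minimal sketch of this mapping, assuming column-wise min-max scaling (the usual way to map each attribute to [0, 1]). The function name and the handling of constant columns are assumptions made for illustration.

```python
def min_max_normalize(rows):
    """Scale every column of a list of equal-length numeric rows to [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            (value - lo) / (hi - lo) if hi > lo else 0.0  # constant column -> 0
            for value, lo, hi in zip(row, lows, highs)
        ])
    return scaled
```

For example, the rows `[[1, 10], [2, 10], [3, 10]]` become `[[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]]`.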
Attribute reduction: according to the importance of each attribute to the decision attribute, attribute reduction is performed with the neighborhood rough set, reducing the attributes from 26 to 11, a reduction of more than 50%. The retained attributes and their importance are shown in Table 2.
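The reduction step can be illustrated with the commonly used forward greedy scheme for neighborhood rough sets: an attribute subset is scored by its dependency (the fraction of samples whose neighborhood contains only samples of the same class), and attributes are added while the dependency still grows. The sketch below assumes Euclidean distance, the radius 0.125 quoted earlier, and a hypothetical stopping threshold `min_gain`; the exact importance measure used in the patent may differ.

```python
def dependency(rows, labels, attrs, radius):
    """Fraction of samples whose radius-neighborhood (on attrs) is label-pure."""
    consistent = 0
    for i, xi in enumerate(rows):
        pure = True
        for j, xj in enumerate(rows):
            dist = sum((xi[a] - xj[a]) ** 2 for a in attrs) ** 0.5
            if dist <= radius and labels[j] != labels[i]:
                pure = False  # a neighbor with another label: not in positive region
                break
        consistent += pure
    return consistent / len(rows)


def greedy_reduct(rows, labels, radius=0.125, min_gain=1e-6):
    """Forward greedy selection: add the attribute with the largest dependency gain."""
    remaining = list(range(len(rows[0])))
    selected, current = [], 0.0
    while remaining:
        gains = [(dependency(rows, labels, selected + [a], radius) - current, a)
                 for a in remaining]
        best_gain, best_attr = max(gains)
        if best_gain <= min_gain:
            break  # no remaining attribute improves the dependency
        selected.append(best_attr)
        remaining.remove(best_attr)
        current += best_gain
    return selected, current
```

The pairwise distance loop is quadratic in the number of samples, which is acceptable for a sketch but would need a faster neighbor search on a data set of several thousand records.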
TABLE 2 reduced feature attributes
Figure BDA0003799438810000121
In order to verify the optimization effect of the algorithm, the method compares the optimization results of the SVM parameters by the HAQPSO algorithm, the QPSO algorithm, the PSO algorithm and the GA algorithm. The intrusion detection model is trained by using the training sets before and after attribute reduction respectively, and the running time and the training precision of the algorithm are shown in table 3 in the training process.
TABLE 3 training time and training accuracy
Figure BDA0003799438810000131
As can be seen from Table 3, the training time of every algorithm drops substantially after attribute reduction; the training time of the NRS-HAQPSO-SVM falls from 2832 seconds to 2052 seconds, a reduction of 27.5%. The training accuracy generally decreases only slightly, and the accuracy of the NRS-HAQPSO-SVM algorithm even improves by 0.03%, so the neighborhood rough set algorithm effectively improves efficiency without materially reducing training accuracy.
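The quoted time reduction follows directly from the two training times reported for the NRS-HAQPSO-SVM:

```latex
\frac{2832\,\mathrm{s} - 2052\,\mathrm{s}}{2832\,\mathrm{s}}
  = \frac{780}{2832}
  \approx 0.275 = 27.5\%
```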
The relationship curve between the fitness value and the iteration times of the training process of the SVM optimized by different algorithms is shown in FIG. 3.
As can be seen from Table 3 and FIG. 3, the SVM optimized by the HAQPSO algorithm reaches the highest accuracy, 98.81%, while the GA algorithm gives the lowest, only 97.65%. In terms of convergence rate, QPSO converges fastest, reaching its optimal value around the 15th generation; HAQPSO is second, converging to its optimum around the 18th generation; GA converges slowest.
(1) Analysis of the overall detection effect. The trained SVM model is tested on the 2000 groups of test set data, and its classification performance is evaluated by accuracy, false alarm rate and missed alarm rate. The simulation experiments are performed on the data without attribute reduction and on the reduced data, and the overall results are shown in Tables 4 and 5.
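The three indexes can be computed from true and predicted labels as sketched below. The definitions are assumptions, since this passage does not spell them out: the false alarm rate is taken as the share of normal records flagged as any attack, and the rate of missed attacks as the share of attack records classified as normal. The label `0` for normal traffic is also assumed.

```python
def detection_metrics(y_true, y_pred, normal_label=0):
    """Return (accuracy, false_alarm_rate, missed_alarm_rate) as fractions."""
    total = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    normal = sum(t == normal_label for t in y_true)
    attacks = total - normal
    # Normal traffic reported as an attack.
    false_alarms = sum(t == normal_label and p != normal_label
                       for t, p in zip(y_true, y_pred))
    # Attack traffic reported as normal.
    missed = sum(t != normal_label and p == normal_label
                 for t, p in zip(y_true, y_pred))
    return (correct / total,
            false_alarms / normal if normal else 0.0,
            missed / attacks if attacks else 0.0)
```

Note that an attack assigned to the wrong attack class lowers the accuracy but counts as neither a false alarm nor a missed attack under these assumed definitions.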
TABLE 4 detection results of unreduced data set
Figure BDA0003799438810000132
Figure BDA0003799438810000141
TABLE 5 Detection results of the reduced data set
Figure BDA0003799438810000142
Comparing Table 4 with Table 5, after attribute reduction the accuracy of each algorithm improves to some extent; the HAQPSO-SVM reaches 98.6%, the highest of all the algorithms. The false alarm rate and missed alarm rate generally decrease; for the HAQPSO-SVM the false alarm rate is only 0.5% and the missed alarm rate only 0.9%. Taking the 3 indexes together, the intrusion detection model constructed based on the HAQPSO-SVM has a good detection effect and good generalization capability, and the attribute reduction strategy based on the neighborhood rough set effectively improves the performance of the intrusion detection model.
(2) Analysis of the detection effect for each attack type. The MSU industrial control intrusion detection standard data set includes 4 attack classes, which are subdivided into 8 attack forms (including normal data), and the detection accuracy of each algorithm for the 8 attack forms is shown in FIG. 4.
As can be seen from FIG. 4, the detection accuracy of the HAQPSO-SVM is the highest for nearly every attack form, and for 2 attacks in particular, MSCI and DoS, it is significantly higher than that of the SVM intrusion detection models optimized by the other algorithms.
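A per-form accuracy of the kind plotted in FIG. 4 can be obtained by grouping test records by their true label. The helper below is a minimal sketch; its name and the choice to report the fraction of correctly classified records per true label are assumptions.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred):
    """Map each true class label to the fraction of its samples predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in sorted(totals)}
```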
Fig. 5 is a comparison graph of the result of the prediction classification and the result of the theoretical classification of the test set by the industrial control intrusion detection model, and the overall distribution of the test set data and the situation of the misclassification point can be observed from the graph.
In conclusion, the data set needs dimension reduction during preprocessing, and the neighborhood rough set can delete redundant attributes without reducing classification precision. This patent provides a Hybrid Adaptive Quantum Particle Swarm Optimization algorithm (HAQPSO) for optimizing the parameters of an SVM classifier, and constructs an industrial control system intrusion detection model from the optimized SVM. Experiments on the industrial control system standard data set proposed by Mississippi State University (MSU) verify the superiority of the model obtained by the algorithm.
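For orientation, the plain QPSO loop that such a hybrid builds on can be sketched as follows. This is not the HAQPSO of the patent: there is no Levy flight and no adaptive mechanism, the contraction-expansion coefficient is simply assumed to decrease linearly from 1.0 to 0.5, and the function name and default seed are illustrative. The fitness would in practice be the SVM cross-validation score for a candidate (c, g); any callable to minimize can be passed in.

```python
import math
import random


def qpso_minimize(fitness, bounds, dim=2, n_particles=30, max_iter=50, seed=0):
    """Plain QPSO; returns (best_position, best_fitness, history of best fitness)."""
    rng = random.Random(seed)
    lo, hi = bounds
    positions = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [p[:] for p in positions]
    pbest_fit = [fitness(p) for p in positions]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]
    for it in range(max_iter):
        # Contraction-expansion coefficient decreases linearly from 1.0 to 0.5.
        alpha = 1.0 - 0.5 * it / max(1, max_iter - 1)
        # Mean of all personal best positions.
        mbest = [sum(p[d] for p in pbest) / n_particles for d in range(dim)]
        for i in range(n_particles):
            for d in range(dim):
                phi = rng.random()
                # Random point between the personal best and the global best.
                attractor = phi * pbest[i][d] + (1.0 - phi) * gbest[d]
                u = 1.0 - rng.random()  # in (0, 1], avoids log(1/0)
                step = alpha * abs(mbest[d] - positions[i][d]) * math.log(1.0 / u)
                x = attractor + step if rng.random() < 0.5 else attractor - step
                positions[i][d] = min(hi, max(lo, x))  # keep inside the search range
            fit = fitness(positions[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = positions[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = positions[i][:], fit
        history.append(gbest_fit)
    return gbest, gbest_fit, history
```

The defaults mirror the experimental settings (30 particles, 50 iterations, 2 dimensions); calling it with bounds `(0.001, 1000.0)` searches the same range as the experiments. The recorded history never increases, because the global best is replaced only by a strictly better position.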
As shown in FIG. 6, the present application provides an industrial control intrusion detection device fusing a neighborhood rough set and an optimized SVM, which includes:
an acquisition module 201, configured to generate a data set based on historical data of industrial control intrusion detection, divide the data set into a training set and a test set, and preprocess the training set and the test set;
a training module 202, configured to send the preprocessed training set and test set into a pre-constructed SVM algorithm model, take the negative of the 5-fold cross-validation classification accuracy as the fitness, iteratively train and test the SVM algorithm model using a quantum particle swarm optimization algorithm, solve for the optimal value of the objective function to determine the hyperplane of fault classification, and obtain an industrial control intrusion detection model according to the optimal value of the objective function;
and a detection module 203, configured to input the data to be detected into the industrial control intrusion detection model to obtain an intrusion detection result.
The industrial control intrusion detection device fusing the neighborhood rough set and the optimized SVM provided by the application works as follows: the acquisition module 201 generates a data set based on historical data of industrial control intrusion detection, divides the data set into a training set and a test set, and preprocesses the training set and the test set; the training module 202 sends the preprocessed training set and test set into a pre-constructed SVM algorithm model, takes the negative of the 5-fold cross-validation classification accuracy as the fitness, iteratively trains and tests the SVM algorithm model using a quantum particle swarm optimization algorithm, solves for the optimal value of the objective function to determine the hyperplane of fault classification, and obtains an industrial control intrusion detection model according to the optimal value of the objective function; the detection module 203 inputs the data to be detected into the industrial control intrusion detection model to obtain an intrusion detection result.
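The fitness used by the training module, the negative of the 5-fold cross-validation accuracy, can be sketched independently of any particular SVM library. In the sketch below, `fit_predict` is a hypothetical callable that trains on one part of the data and predicts the held-out part; in practice it would wrap an RBF-kernel SVM configured with the candidate c and g (for instance scikit-learn's `SVC(C=c, gamma=g)`, a library choice the patent does not state). The interleaved fold assignment and the nearest-centroid stand-in classifier are assumptions made only to keep the example self-contained.

```python
def negative_cv_accuracy(fit_predict, X, y, k=5):
    """Fitness to minimize: minus the mean k-fold cross-validation accuracy.

    fit_predict(X_train, y_train, X_val) must return predicted labels for X_val.
    """
    n = len(X)
    accuracies = []
    for fold in range(k):
        val_idx = set(range(fold, n, k))  # every k-th sample, offset by fold
        X_tr = [X[i] for i in range(n) if i not in val_idx]
        y_tr = [y[i] for i in range(n) if i not in val_idx]
        X_va = [X[i] for i in sorted(val_idx)]
        y_va = [y[i] for i in sorted(val_idx)]
        preds = fit_predict(X_tr, y_tr, X_va)
        accuracies.append(sum(p == t for p, t in zip(preds, y_va)) / len(y_va))
    return -sum(accuracies) / k


def nearest_centroid(X_train, y_train, X_val):
    """Stand-in classifier (hypothetical placeholder for an RBF SVM with c and g)."""
    sums, counts = {}, {}
    for x, label in zip(X_train, y_train):
        acc = sums.setdefault(label, [0.0] * len(x))
        for d, v in enumerate(x):
            acc[d] += v
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}
    return [min(centroids, key=lambda lab: sum((a - b) ** 2
                for a, b in zip(x, centroids[lab]))) for x in X_val]
```

Because the optimizer minimizes the fitness, negating the accuracy turns "highest cross-validation accuracy" into "lowest fitness", which matches the fitness minimum principle used when updating the best positions.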
The present application provides a computer device comprising a memory and a processor. The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM); the memory is an example of a computer-readable medium. The computer device stores an operating system and a computer program. The computer program, when executed by the processor, causes the processor to perform the industrial control intrusion detection method fusing a neighborhood rough set and an optimized SVM. The structure described is only the portion relevant to the present solution and does not limit the computing device to which the solution applies; a particular computing device may include more or fewer components than shown in the figures, may combine certain components, or may have a different arrangement of components.
In one embodiment, the industrial control intrusion detection method fusing a neighborhood rough set and an optimized SVM provided by the present application may be implemented in the form of a computer program that is executable on a computer device.
In some embodiments, the computer program, when executed by the processor, causes the processor to perform the steps of: generating a data set based on historical data of industrial control intrusion detection, dividing the data set into a training set and a test set, and preprocessing the training set and the test set; sending the preprocessed training set and test set into a pre-constructed SVM algorithm model, taking the negative of the 5-fold cross-validation classification accuracy as the fitness, iteratively training and testing the SVM algorithm model using a quantum particle swarm optimization algorithm, solving for the optimal value of the objective function to determine the hyperplane of fault classification, and obtaining an industrial control intrusion detection model according to the optimal value of the objective function; and inputting the data to be detected into the industrial control intrusion detection model to obtain an intrusion detection result.
It is to be understood that the embodiments of the method provided above correspond to the embodiments of the apparatus described above, and the corresponding specific contents may be referred to each other, which is not described herein again.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (9)

1. An industrial control intrusion detection method fusing a neighborhood rough set and an optimized SVM is characterized by comprising the following steps:
generating a data set based on historical data of industrial control intrusion detection, dividing the data set into a training set and a test set, and preprocessing the training set and the test set;
sending the preprocessed training set and test set into a pre-constructed SVM algorithm model, taking the negative of the 5-fold cross-validation classification accuracy as the fitness, iteratively training and testing the SVM algorithm model using a quantum particle swarm optimization algorithm, solving for the optimal value of the objective function to determine the hyperplane of fault classification, and obtaining an industrial control intrusion detection model according to the optimal value of the objective function;
and inputting the data to be detected into the industrial control intrusion detection model to obtain an intrusion detection result.
2. The method of claim 1, wherein preprocessing the training set and test set comprises:
carrying out normalization processing on the training set and the test set;
and performing attribute reduction on the normalized training set and test set by using a neighborhood rough set.
3. The method of claim 1, wherein iteratively training and testing the SVM algorithm model comprises:
inputting the preprocessed training set into a pre-constructed SVM algorithm model, and judging whether the output reaches a precision target or whether the number of iterations exceeds the maximum number of iterations;
if the output does not reach the precision target and the number of iterations is less than the maximum number of iterations, updating the objective function of the SVM algorithm model by using the fitness, and continuing the iterative training of the SVM algorithm model until the output of the SVM algorithm model reaches the precision target or the number of iterations exceeds the maximum number of iterations;
and taking the SVM algorithm model output after training as an industrial control intrusion detection model.
4. The method of claim 1, wherein the objective function comprises: penalty parameters and parameters of kernel functions; the basis model of SVM algorithm is
Figure FDA0003799438800000021
s.t. $y_i (w^T x_i + b) \ge 1 - \varepsilon_i,\ \varepsilon_i \ge 0,\ i = 1, 2, \dots, m$
Deforming the SVM algorithm basic model by using a Lagrange multiplier method to obtain
Figure FDA0003799438800000022
Figure FDA0003799438800000023
$0 \le \alpha_i \le c,\ i = 1, 2, \dots, m$
Figure FDA0003799438800000024
Wherein c is the penalty parameter, which sets the degree of punishment for misclassification; $\varepsilon_i$ is the slack variable: when training sample $x_i$ is correctly classified and lies outside the classification margin, $\varepsilon_i = 0$; when $x_i$ is correctly classified but lies within the classification margin, $0 < \varepsilon_i < 1$; when $x_i$ is misclassified, $\varepsilon_i \ge 1$; $\alpha_i$ is the Lagrange multiplier; $k(x_i, x_j)$ is the Gaussian kernel function; g is the parameter of the kernel function; $\sigma$ is the bandwidth, which controls the local range of action of the Gaussian kernel function; $y_i$ is the class label; w is the weight (slope) vector of the linear function; and b is the intercept.
5. The method of claim 3, wherein the iterative training and testing of the SVM algorithm model using a quantum particle swarm optimization algorithm comprises:
step one, initializing the swarm size, the maximum number of iterations and the position of each particle of the swarm in the search space, and initializing the individual optimal position of each particle to its current position;
step two, calculating the average position of the particles in the swarm;
step three, calculating the fitness value of each particle, and updating the individual optimal position of each particle and the global optimal position of the population according to the fitness minimum principle;
step four, calculating the position of the random point;
step five, calculating the new position of each particle;
and step six, repeatedly executing step two to step five until a preset precision target is met or the maximum number of iterations is reached.
6. The method of claim 2,
attribute reduction reduces the 26 attributes of the training set and test set to 11 attributes.
7. The method of claim 2, wherein the training set and test set are normalized in the following manner,
Figure FDA0003799438800000031
wherein
Figure FDA0003799438800000032
is the training sample after normalization, x is the training sample to be processed, $x_{\min}$ is the minimum training sample value, and $x_{\max}$ is the maximum training sample value.
8. An industrial control intrusion detection device fusing a neighborhood rough set and an optimized SVM is characterized by comprising:
an acquisition module, configured to generate a data set based on historical data of industrial control intrusion detection, divide the data set into a training set and a test set, and preprocess the training set and the test set;
a training module, configured to send the preprocessed training set and test set into a pre-constructed SVM algorithm model, take the negative of the 5-fold cross-validation classification accuracy as the fitness, iteratively train and test the SVM algorithm model using a quantum particle swarm optimization algorithm, solve for the optimal value of the objective function to determine the hyperplane of fault classification, and obtain an industrial control intrusion detection model according to the optimal value of the objective function;
and a detection module, configured to input the data to be detected into the industrial control intrusion detection model to obtain an intrusion detection result.
9. A computer device, comprising: a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the industrial control intrusion detection method fusing a neighborhood rough set and an optimized SVM according to any one of claims 1 to 7.
CN202210981877.8A 2022-08-16 2022-08-16 Industrial control intrusion detection method and device fusing neighborhood rough set and optimized SVM Pending CN115345236A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210981877.8A CN115345236A (en) 2022-08-16 2022-08-16 Industrial control intrusion detection method and device fusing neighborhood rough set and optimized SVM

Publications (1)

Publication Number Publication Date
CN115345236A true CN115345236A (en) 2022-11-15

Family

ID=83951906

Country Status (1)

Country Link
CN (1) CN115345236A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101594361A (en) * 2009-06-02 2009-12-02 浙江大学 Network Intrusion Detection System based on shortcut calculation of support vector machine
CN105760888A (en) * 2016-02-23 2016-07-13 重庆邮电大学 Neighborhood rough set ensemble learning method based on attribute clustering
CN107016416A (en) * 2017-04-12 2017-08-04 中国科学院重庆绿色智能技术研究院 The data classification Forecasting Methodology merged based on neighborhood rough set and PCA

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
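Chen Zhilian: "Research on Intrusion Detection Technology for Industrial Control Systems Based on Machine Learning", China Master's Theses Full-text Database, 15 June 2020 (2020-06-15), pages 2 *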

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116886398A (en) * 2023-08-03 2023-10-13 中国石油大学(华东) Internet of things intrusion detection method based on feature selection and integrated learning
CN116886398B (en) * 2023-08-03 2024-03-29 中国石油大学(华东) Internet of things intrusion detection method based on feature selection and integrated learning
CN117743955A (en) * 2023-12-21 2024-03-22 广东人信工程咨询有限公司 BIM (building information modeling) acquired data processing method, system, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination