CN110825068A - Industrial control system anomaly detection method based on PCA-CNN - Google Patents

Publication number
CN110825068A
Authority
CN
China
Prior art keywords
data set
control system
industrial control
training
intrusion detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911029762.3A
Other languages
Chinese (zh)
Inventor
林晔篁
刘海洋
钟雪辉
刘蕾蕾
彭纬伟
杜伟
陈云云
陈旭腾
崔钰
王思杰
陈晓锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HUIZHOU STORAGE POWER GENERATION CO Ltd
Original Assignee
HUIZHOU STORAGE POWER GENERATION CO Ltd

Classifications

    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05B: CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B23/00: Testing or monitoring of control systems or parts thereof
    • G05B23/02: Electric testing or monitoring
    • G05B23/0205: Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults
    • G05B23/0218: Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults characterised by the fault detection method dealing with either existing or incipient faults
    • G05B23/0224: Process history based detection method, e.g. whereby history implies the availability of large amounts of data
    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05B: CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B2219/00: Program-control systems
    • G05B2219/20: Pc systems
    • G05B2219/24: Pc safety
    • G05B2219/24065: Real time diagnostics

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Automation & Control Theory (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The application relates to an intrusion detection method, an intrusion detection device, computer equipment and a computer readable storage medium of an industrial control system, wherein the method comprises the following steps: extracting an original data set of the industrial control system from a network data set of a communication protocol of the industrial control system; acquiring a training data set and a testing data set from an original data set; performing feature dimensionality reduction on the training data set and the test data set by using a principal component analysis method to obtain a training data set subjected to feature dimensionality reduction and a test data set subjected to feature dimensionality reduction; training the training data set subjected to dimensionality reduction based on the intrusion detection model to obtain a classification model; and inputting the test data set subjected to feature dimension reduction into the classification model for classification processing, and obtaining an intrusion detection result of the industrial control system. According to the scheme, the feature dimension reduction is carried out by using a principal component analysis method, so that redundant information is removed, the calculated amount is reduced, and the technical problem that the intrusion detection method of the industrial control system in the traditional technology is long in training time is solved.

Description

Industrial control system anomaly detection method based on PCA-CNN
Technical Field
The present application relates to the field of industrial control technologies, and in particular, to an intrusion detection method for an industrial control system, an intrusion detection apparatus for an industrial control system, a computer device, and a computer-readable storage medium.
Background
Industrial control systems are widely used in industries such as electric power, chemicals and petroleum. With the integration of informatization and industrialization, the communication networks inside industrial control systems are increasingly interconnected with the Internet. This breaks the original isolation of the industrial control system and makes it vulnerable to attack. An intrusion detection system can detect an external attack before it damages the system and raise an alarm. Intrusion detection technology for traditional IT networks is mature, but the security requirements of industrial control systems differ from those of traditional IT systems.
The current method for intrusion detection of industrial control systems acquires Modbus TCP data in real time as feature vectors, obtains a detection result through a support vector machine binary classification model, and raises an alarm if abnormal traffic is found. Its advantage is that it can detect abnormal traffic that some firewalls cannot identify.
However, the intrusion detection method of the industrial control system in the traditional technology has the problem of long training time.
Disclosure of Invention
Therefore, it is necessary to provide an intrusion detection method for an industrial control system, an intrusion detection apparatus for an industrial control system, a computer device, and a computer-readable storage medium, aiming at the technical problem that the intrusion detection method for the industrial control system in the conventional technology has a long training time.
An intrusion detection method of an industrial control system comprises the following steps:
extracting an original data set of an industrial control system from a network data set of a communication protocol of the industrial control system;
acquiring a training data set and a testing data set from the original data set;
performing feature dimensionality reduction on the training data set and the test data set by using a principal component analysis method to obtain a training data set subjected to feature dimensionality reduction and a test data set subjected to feature dimensionality reduction;
training the training data set subjected to dimensionality reduction based on an intrusion detection model to obtain a classification model;
and inputting the test data set subjected to feature dimension reduction into the classification model for classification processing, and obtaining an intrusion detection result of the industrial control system.
An intrusion detection device of an industrial control system, comprising:
an original data set extraction module, configured to extract an original data set of an industrial control system from a network data set of a communication protocol of the industrial control system;
an original data set classification module, configured to acquire a training data set and a test data set from the original data set;
a feature dimension reduction module, configured to perform feature dimension reduction on the training data set and the test data set by using a principal component analysis method, to obtain the training data set after feature dimension reduction and the test data set after feature dimension reduction;
a model training module, configured to train the training data set after dimension reduction based on an intrusion detection model to obtain a classification model;
and a data classification module, configured to input the test data set after feature dimension reduction into the classification model for classification processing, to obtain an intrusion detection result of the industrial control system.
A computer device comprising a processor and a memory, the memory storing a computer program that when executed by the processor performs the steps of: extracting an original data set of the industrial control system from a network data set of a communication protocol of the industrial control system; acquiring a training data set and a testing data set from an original data set; performing feature dimensionality reduction on the training data set and the test data set by using a principal component analysis method to obtain a training data set subjected to feature dimensionality reduction and a test data set subjected to feature dimensionality reduction; training the training data set subjected to dimensionality reduction based on the intrusion detection model to obtain a classification model; and inputting the test data set subjected to feature dimension reduction into the classification model for classification processing, and obtaining an intrusion detection result of the industrial control system.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of: extracting an original data set of the industrial control system from a network data set of a communication protocol of the industrial control system; acquiring a training data set and a testing data set from an original data set; performing feature dimensionality reduction on the training data set and the test data set by using a principal component analysis method to obtain a training data set subjected to feature dimensionality reduction and a test data set subjected to feature dimensionality reduction; training the training data set subjected to dimensionality reduction based on the intrusion detection model to obtain a classification model; and inputting the test data set subjected to feature dimension reduction into the classification model for classification processing, and obtaining an intrusion detection result of the industrial control system.
The intrusion detection method, the intrusion detection device, the computer equipment and the storage medium of the industrial control system extract an original data set of the industrial control system from a network data set of a communication protocol of the industrial control system; acquiring a training data set and a testing data set from an original data set; performing feature dimensionality reduction on the training data set and the test data set by using a principal component analysis method to obtain a training data set subjected to feature dimensionality reduction and a test data set subjected to feature dimensionality reduction; training the training data set subjected to dimensionality reduction based on the intrusion detection model to obtain a classification model; and inputting the test data set subjected to feature dimension reduction into the classification model for classification processing, and obtaining an intrusion detection result of the industrial control system. According to the scheme, the feature dimension reduction is carried out by using a principal component analysis method, so that redundant information is removed, the calculated amount is reduced, and the technical problem that the intrusion detection method of the industrial control system in the traditional technology is long in training time is solved.
Drawings
FIG. 1 is a flow diagram illustrating an intrusion detection method for an industrial control system according to one embodiment;
FIG. 2 is a flow diagram illustrating a method for extracting a raw data set of an industrial control system from a network data set of a communication protocol of the industrial control system in one embodiment;
FIG. 3 is a flowchart illustrating a method for performing feature dimension reduction on a training data set and a test data set by using a principal component analysis method to obtain a training data set after feature dimension reduction and a test data set after feature dimension reduction in one embodiment;
FIG. 4 is a diagram of a convolutional neural network architecture in one embodiment;
FIG. 5 is a flow diagram illustrating an intrusion detection method for an industrial control system according to one embodiment;
FIG. 6 is a block diagram of an intrusion detection device of the industrial control system in one embodiment;
FIG. 7 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In an embodiment, an intrusion detection method of an industrial control system is provided, referring to fig. 1, where fig. 1 is a schematic flow chart of the intrusion detection method of the industrial control system in an embodiment, the intrusion detection method of the industrial control system may include the following steps:
step S101, an original data set of the industrial control system is extracted from a network data set of a communication protocol of the industrial control system.
The communication protocol of the industrial control system is generally the Modbus protocol. Specifically, a network data set of the industrial control system based on the Modbus protocol can be collected, and the variables that may be affected when the industrial control system is intruded upon are extracted from the network data set as the selected features, which form the original data set.
Step S102, a training data set and a testing data set are obtained from the original data set.
The training data set is input into an intrusion detection model to obtain a classification model, and the test data set is input into the classification model for classification to obtain a classification result. Specifically, after the raw data set is obtained, it may be divided into a training data set and a test data set according to a certain ratio. For example, the raw data set may be divided into a training data set and a test data set at a ratio of 4:1 as required.
After the original data set is divided into a training data set and a testing data set, the training data set and the testing data set can be normalized to ensure that all values in the feature vector are in the same order of magnitude.
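The split and normalization described above can be sketched as follows. This is a minimal illustration only: the source does not name the normalization method, so min-max scaling is an assumption, as are the function names, the shuffle seed and the sample rows. Fitting the range on the training set and reusing it for the test set is one common design choice, not something the source specifies.

```python
import random

def split_dataset(rows, train_ratio=0.8, seed=0):
    """Shuffle a copy of rows and split it 4:1 (train_ratio = 0.8) by default."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

def fit_min_max(train_rows):
    """Per-feature minimum and maximum, taken from the training set only."""
    columns = list(zip(*train_rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def apply_min_max(rows, mins, maxs):
    """Scale each feature with the training-set range; constant features map to 0."""
    scaled = []
    for row in rows:
        scaled.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, mins, maxs)
        ])
    return scaled

# Hypothetical feature rows: [packet length, time interval in seconds]
rows = [[float(i), float(i) * 0.01] for i in range(10)]
train, test = split_dataset(rows)
mins, maxs = fit_min_max(train)
train_scaled = apply_min_max(train, mins, maxs)
test_scaled = apply_min_max(test, mins, maxs)
```

Because the range comes from the training set, scaled test values may fall slightly outside [0, 1]; that is expected with this choice.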
And S103, performing feature dimensionality reduction on the training data set and the test data set by using a principal component analysis method, and acquiring the training data set subjected to feature dimensionality reduction and the test data set subjected to feature dimensionality reduction.
Principal component analysis is a statistical method that recombines a number of correlated original indexes or variables into a new set of mutually uncorrelated indexes or variables. Specifically, principal component analysis addresses the large number of features in the training and test data sets and the correlation and redundancy that may exist among them: a small number of new features replace the original features, the correlation among correlated variables in the original data is eliminated, and a smaller set of mutually uncorrelated variables is formed, so that redundant information is removed, the amount of calculation is reduced, and the training time is shortened.
And step S104, training the training data set subjected to dimensionality reduction based on the intrusion detection model to obtain a classification model.
Specifically, the training data set after dimensionality reduction can be trained through an intrusion detection model, so that a classification model of the support vector machine is obtained.
And S105, inputting the test data set subjected to feature dimension reduction into the classification model for classification processing, and obtaining an intrusion detection result of the industrial control system.
Specifically, after the classification model of the support vector machine is obtained in step S104, the test data set with the reduced feature dimension is input into the classification model of the support vector machine, the test data set is classified, and whether or not the industrial control system is invaded, and even the type of the invasion can be determined according to the classification result.
The intrusion detection method of the industrial control system extracts an original data set of the industrial control system from a network data set of a communication protocol of the industrial control system; acquiring a training data set and a testing data set from an original data set; performing feature dimensionality reduction on the training data set and the test data set by using a principal component analysis method to obtain a training data set subjected to feature dimensionality reduction and a test data set subjected to feature dimensionality reduction; training the training data set subjected to dimensionality reduction based on the intrusion detection model to obtain a classification model; and inputting the test data set subjected to feature dimension reduction into the classification model for classification processing, and obtaining an intrusion detection result of the industrial control system. According to the scheme, the feature dimension reduction is carried out by using a principal component analysis method, so that redundant information is removed, the calculated amount is reduced, and the technical problem that the intrusion detection method of the industrial control system in the traditional technology is long in training time is solved.
In an embodiment, referring to fig. 2, fig. 2 is a flowchart illustrating a method for extracting a raw data set of an industrial control system from a network data set of a communication protocol of the industrial control system in an embodiment, and step S101 may include the following steps:
step S201, classifying the data to be processed according to the communication flow of the industrial control system, and acquiring the intrusion type of the data to be processed.
The data to be processed is data in the network data set, and the communication flow of the industrial control system may comprise the communication flows of a Modbus client and a Modbus server. For intrusion detection of the industrial control system, a state model of the normal traffic of the system can be established and unknown traffic compared against it; traffic that deviates from the established normal model is regarded as abnormal and triggers an alarm. Specifically, the data in the Modbus network data set can be classified into intrusion categories according to the communication flow of the Modbus client and the Modbus server. The intrusion categories may include one or more of: normal, reconnaissance attack, response injection attack, command injection attack, and denial of service attack.
Step S202, obtaining a command data packet and a response data packet in a network data set;
step S203, the data characteristics of the data to be processed are obtained according to the command data packet and the response data packet.
The command data packets and the response data packets are stored in the Modbus network data set; specifically, data features related to the characteristics of the industrial control system can be extracted from the Modbus network data set. For example, the data features may include: the device address and the memory start position in the command data packet and the response data packet; the number of memory bytes read or written by the command and the response; the read and write function codes of the command data packet and the response data packet; the lengths of the command data packet and the response data packet; the time interval between the command data packet and the response data packet; the error rate of the cyclic redundancy check; and feature or state values specific to the industrial control system. For example, PID parameter values and state values specific to the industrial control system, such as pipeline pressure, solenoid valve state and pump state, may be extracted according to the characteristics of different industrial control systems.
Step S204, the data characteristics and the intrusion types are set as an original data set.
In this step, the data characteristics obtained in step S203 and the intrusion type of the data obtained in step S201 may be used to form an original data set for intrusion detection.
Further, after the intrusion category of the data to be processed is obtained in step S201, the intrusion category may be assigned a value. For example, a feature vector in the normal state may be labeled 0, a reconnaissance attack 1, a response injection attack 2, a command injection attack 3, and a denial of service attack 4. The assigned intrusion categories can be used to judge the accuracy of intrusion detection, thereby verifying the reliability of the intrusion detection method of the industrial control system.
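As a rough sketch of how one record of the original data set might be assembled, the example below joins a feature vector with its assigned category value. The class name, field names, dictionary keys and sample values are hypothetical and cover only a subset of the features listed above; only the numeric labels 0 to 4 come from the text.

```python
from dataclasses import dataclass, astuple

# Label values follow the assignment described in the text; key names are hypothetical.
INTRUSION_LABELS = {
    "normal": 0,
    "reconnaissance": 1,
    "response_injection": 2,
    "command_injection": 3,
    "denial_of_service": 4,
}

@dataclass
class PacketPairFeatures:
    """Hypothetical subset of the features taken from one command/response pair."""
    device_address: int
    function_code: int
    command_length: int
    response_length: int
    time_interval: float   # seconds between command and response
    crc_error_rate: float

def build_record(features, category):
    """Return one row of the original data set: N feature values plus the label."""
    return list(astuple(features)) + [INTRUSION_LABELS[category]]

pair = PacketPairFeatures(4, 3, 8, 13, 0.012, 0.0)
record = build_record(pair, "command_injection")
```

Here N is 6, so the record holds N + 1 = 7 values with the category in the last position.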
In an embodiment, referring to fig. 3, fig. 3 is a flowchart illustrating a method for performing feature dimension reduction on a training data set and a test data set by using a principal component analysis method to obtain a training data set after feature dimension reduction and a test data set after feature dimension reduction in one embodiment, where step S103 includes the following steps:
step S301, elements in the training data set are standardized to form a standardized matrix.
Specifically, the input data set is regarded as an M × (N + 1) matrix in which the element in row i and column j is $x_{ij}$, and the mean and standard deviation of the j-th column are $\mu_j$ and $\sigma_j$, respectively. The standardized elements $y_{ij}$ form the standardized matrix Y:
$$y_{ij} = \frac{x_{ij} - \mu_j}{\sigma_j}$$
step S302, a covariance matrix is calculated according to the standardized matrix;
step S303, an eigenvalue, an eigenvector, and a contribution ratio of the covariance matrix are calculated.
Specifically, the covariance matrix S can be calculated using the following formula:
Figure BDA0002249793910000082
The eigenvalues $(\lambda_1, \lambda_2, \ldots, \lambda_p)$ of the covariance matrix S and the corresponding eigenvectors $a_i = (a_{i1}, a_{i2}, \ldots, a_{ip})$, where $i = 1, 2, \ldots, p$, are calculated, and the contribution rate η is calculated from the eigenvalue of each principal component.
Further, the contribution rate η may be calculated by the following formula:
$$\eta_i = \frac{\lambda_i}{\sum_{j=1}^{p} \lambda_j}$$
where $\eta_i$ represents the contribution rate corresponding to the i-th eigenvalue of the covariance matrix, $\lambda_i$ is the i-th eigenvalue of the covariance matrix,
Figure BDA0002249793910000084
step S304, extracting the largest eigenvalues and their corresponding eigenvectors according to the required cumulative contribution rate, and forming a transformation matrix.
Specifically, the contribution rate η reflects how much of the information in the original variables the corresponding principal component contains: the larger η is, the more information the principal component contains. The k largest eigenvalues can be selected according to the required cumulative contribution rate, and their eigenvectors $(a_1, a_2, \ldots, a_k)$ form a transformation matrix Q with p rows and k columns.
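The selection of k can be illustrated with a short sketch. It assumes the usual definition of the contribution rate, each eigenvalue divided by the sum of all eigenvalues; the function names, the sample eigenvalues and the 0.85 threshold are hypothetical, since the source does not state the required cumulative contribution rate.

```python
def contribution_rates(eigenvalues):
    """eta_i = lambda_i / sum of all eigenvalues (assumed definition)."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]

def choose_k(eigenvalues, required_cumulative=0.95):
    """Smallest k whose k largest eigenvalues reach the required cumulative rate."""
    rates = sorted(contribution_rates(eigenvalues), reverse=True)
    cumulative = 0.0
    for k, rate in enumerate(rates, start=1):
        cumulative += rate
        if cumulative >= required_cumulative:
            return k
    return len(rates)

eigenvalues = [4.0, 3.0, 2.0, 1.0]   # hypothetical values
rates = contribution_rates(eigenvalues)
k = choose_k(eigenvalues, required_cumulative=0.85)
```

With these values the first three components carry 90% of the total, so three components are kept for an 85% requirement.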
And S305, performing feature dimension reduction on the training data set and the test data set through the transformation matrix.
Specifically, the reduced k-dimensional data matrix T is obtained by $T = YQ$, where k < N + 1. The original test data set is reduced in dimension with the same transformation matrix Q, so that the training set and the test set keep the same feature dimension.
Feature dimension reduction by principal component analysis removes redundant information while retaining useful information and reduces the amount of calculation, thereby solving the technical problem that the intrusion detection method of the industrial control system in the traditional technology has a long training time.
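Steps S301 to S305 can be sketched end to end with NumPy. This is an illustrative sketch under stated assumptions, not the implementation used in the application: the 1/(M − 1) factor in the covariance, the 0.95 cumulative threshold, the function names and the synthetic data are all assumptions, and only feature columns (no label column) are passed in.

```python
import numpy as np

def fit_pca(train, required_cumulative=0.95):
    """Fit standardization statistics and the transformation matrix Q on the training set."""
    mu = train.mean(axis=0)
    sigma = train.std(axis=0, ddof=1)
    sigma[sigma == 0] = 1.0                      # guard against constant features
    Y = (train - mu) / sigma                     # S301: standardized matrix
    S = (Y.T @ Y) / (len(train) - 1)             # S302: covariance (1/(M-1) is an assumption)
    eigenvalues, eigenvectors = np.linalg.eigh(S)  # S303: eigh handles symmetric matrices
    order = np.argsort(eigenvalues)[::-1]        # largest eigenvalue first
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    k = int(np.searchsorted(cumulative, required_cumulative) + 1)
    k = min(k, len(eigenvalues))                 # S304: keep the k largest components
    return mu, sigma, eigenvectors[:, :k]        # Q has p rows and k columns

def transform(data, mu, sigma, Q):
    """S305: standardize with the training statistics and project, T = YQ."""
    return ((data - mu) / sigma) @ Q

rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
# Six hypothetical features built from two underlying signals, so they are highly redundant.
features = np.hstack([base, base * 2.0, base + 0.01 * rng.normal(size=(200, 2))])
train, test = features[:160], features[160:]
mu, sigma, Q = fit_pca(train)
T_train, T_test = transform(train, mu, sigma, Q), transform(test, mu, sigma, Q)
```

Reusing `mu`, `sigma` and `Q` from the training set for the test set is what keeps both sets in the same reduced feature space.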
In one embodiment, step S104 is to train the training data set after dimension reduction based on the convolutional neural network model to obtain a classification model.
The architecture of the convolutional neural network used can refer to fig. 4, and fig. 4 is an architecture diagram of the convolutional neural network in one embodiment. Specifically, step S104 may include the steps of: extracting data characteristics from the training data set as an input data set; training an input data set through a convolutional neural network classification model to obtain a predicted value; inputting the predicted value and the actual value into a classified cross entropy loss function to obtain a loss function value output by the classified cross entropy loss function; wherein the actual value may be an assignment of an intrusion class for the data; and when the training times reach the condition that the variation of the loss function value is less than a set threshold value, selecting the trained convolutional neural network classification model with the minimum loss function value as the classification model.
Further, after the classification model is obtained, the test data set may be input into the classification model for classification. The classification result may be a five-dimensional confusion matrix, which can then be evaluated to determine whether the network intrusion detection method of the industrial control system meets the detection requirement and compared with conventional intrusion detection methods. Comparative experiments show that the intrusion detection method of the industrial control system not only reduces the feature dimension by 40%, but also significantly improves the accuracy, the detection rate and the false alarm rate.
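A five-class confusion matrix and the three metrics mentioned above might be computed as in the sketch below. The source does not define the detection rate or the false alarm rate, so the definitions used here (attack samples flagged as any attack class, and normal samples flagged as any attack class) are assumptions, as are the function names and the sample labels.

```python
def confusion_matrix(actual, predicted, n_classes=5):
    """matrix[i][j] counts samples whose actual class is i and predicted class is j."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix

def evaluate(matrix):
    """Accuracy, detection rate and false alarm rate, treating class 0 as normal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    normal_total = sum(matrix[0])
    attack_total = total - normal_total
    attacks_flagged = sum(sum(row[1:]) for row in matrix[1:])  # attack predicted as any attack
    false_alarms = sum(matrix[0][1:])                          # normal predicted as attack
    return {
        "accuracy": correct / total,
        "detection_rate": attacks_flagged / attack_total,
        "false_alarm_rate": false_alarms / normal_total,
    }

# Hypothetical labels using the 0-4 category values
actual    = [0, 0, 0, 0, 1, 2, 3, 4, 4, 1]
predicted = [0, 0, 1, 0, 1, 2, 3, 4, 0, 2]
matrix = confusion_matrix(actual, predicted)
metrics = evaluate(matrix)
```

The sketch assumes the test set contains at least one normal and one attack sample; otherwise the two rates are undefined.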
Next, an intrusion detection method of an industrial control system provided in an embodiment of the present application is shown by an application example, as shown in fig. 5, fig. 5 is a schematic flow diagram of an intrusion detection method of an industrial control system in an embodiment, and specifically includes the following steps:
s1: a network data set of an industrial control system based on the Modbus protocol is collected, the communication flows of the Modbus client and the Modbus server are extracted, each piece of data in the data set is classified as normal, reconnaissance attack, response injection attack, command injection attack or denial of service attack, and the specific features in each command data packet and its corresponding response data packet are combined to form one record of the data set;
s2: according to the characteristics of the industrial control system, the following can be extracted from the Modbus data set: the device address and the memory start position in the command and response data packets; the number of memory bytes read or written by the command and the response; the read and write function codes of the command packet and the response packet; the lengths of the command packet and the response packet; the time interval between the two packets; and the error rate of the cyclic redundancy check. In addition, PID parameter values and state values specific to the industrial control system, such as pipeline pressure, solenoid valve state and pump state, are extracted according to the characteristics of different industrial control systems. These make up N features in total, and the last dimension is the category label, so each feature vector has N + 1 values in total;
dividing the original data set into a training set and a testing set according to the proportion of 4:1, and then carrying out normalization processing to enable all values in the characteristic vector to belong to the same order of magnitude;
s3: to address the large number of features in the data set and the correlation and redundancy that may exist among them, PCA is used to replace the original features with a small number of new features, eliminating the correlation among correlated variables in the original data and forming a smaller set of mutually uncorrelated variables. The specific method is as follows: first, the input data set is regarded as an M × (N + 1) matrix in which the element in row i and column j is $x_{ij}$, and the mean and standard deviation of the j-th column are $\mu_j$ and $\sigma_j$, respectively; the standardized elements $y_{ij}$ form the standardized matrix Y,
$$y_{ij} = \frac{x_{ij} - \mu_j}{\sigma_j}$$
computing a covariance matrix
Figure BDA0002249793910000102
Then, the eigenvalues $(\lambda_1, \lambda_2, \ldots, \lambda_p)$ of S and the eigenvectors $a_i = (a_{i1}, a_{i2}, \ldots, a_{ip})$ are calculated, where $i = 1, 2, \ldots, p$;
Figure BDA0002249793910000103
the contribution rate η is calculated from the eigenvalue of each principal component; the larger the contribution rate of a principal component, the more information about the original variables it contains. The k largest eigenvalues are then selected according to the required cumulative contribution rate, and their eigenvectors $(a_1, a_2, \ldots, a_k)$ form a transformation matrix Q with p rows and k columns; finally, the reduced k-dimensional data matrix T is obtained by $T = YQ$, where k < N + 1, and the original test data set is reduced in dimension with the same transformation matrix Q, so that the training set and the test set keep the same feature dimension;
s4: the category of a feature vector in the normal state is labeled 0, a reconnaissance attack 1, a response injection attack 2, a command injection attack 3, and a denial of service attack 4;
s5: inputting the processed feature vectors into a convolutional neural network classification model, wherein the convolutional neural network model is based on a TensorFlow deep learning framework keras and written by using a Python language, and the configuration is accelerated by using a GPU;
the CNN model is designed with 3 convolutional layers, 2 fully connected layers, and 1 Flatten layer serving as a transition layer; the number of convolution kernels increases layer by layer (8, 16, 32) to strengthen feature learning; and a Dropout layer is added after each convolutional layer and after the fully connected layer, randomly discarding neural network units with a probability of 30% to prevent overfitting;
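To make the layer stack concrete, the forward pass can be sketched in plain NumPy. This deliberately replaces the Keras code the text refers to with a framework-free illustration. The kernel size of 3, "same" padding, ReLU activations, the hidden width of 64, and the weight initialization are all assumptions that the text does not specify.

```python
import numpy as np

def conv1d_relu(x, w, b):
    """'Same'-padded 1-D convolution followed by ReLU.
    x: (length, in_ch); w: (ksize, in_ch, out_ch); b: (out_ch,)."""
    ksize = w.shape[0]
    pad = ksize // 2                       # assumes an odd kernel size
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.stack([
        np.tensordot(xp[i:i + ksize], w, axes=([0, 1], [0, 1]))
        for i in range(x.shape[0])
    ])
    return np.maximum(out + b, 0.0)

def dropout(x, rate, rng):
    """Inverted dropout; does nothing when rng is None (inference)."""
    if rng is None:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

def init_params(k, rng, ksize=3, hidden=64, classes=5):
    """Random weights for 3 conv layers (8, 16, 32 kernels) and 2 dense layers."""
    chans = [1, 8, 16, 32]
    conv = [(rng.normal(0, 0.1, (ksize, cin, cout)), np.zeros(cout))
            for cin, cout in zip(chans[:-1], chans[1:])]
    fc1 = (rng.normal(0, 0.1, (k * 32, hidden)), np.zeros(hidden))
    fc2 = (rng.normal(0, 0.1, (hidden, classes)), np.zeros(classes))
    return {"conv": conv, "fc1": fc1, "fc2": fc2}

def forward(x, params, rng=None, rate=0.3):
    """Map one k-dimensional PCA feature vector to 5 class probabilities."""
    h = x.reshape(-1, 1)                   # (k, 1): one input channel
    for w, b in params["conv"]:
        h = dropout(conv1d_relu(h, w, b), rate, rng)
    h = h.reshape(-1)                      # Flatten layer
    w1, b1 = params["fc1"]
    h = dropout(np.maximum(h @ w1 + b1, 0.0), rate, rng)
    w2, b2 = params["fc2"]
    z = h @ w2 + b2
    e = np.exp(z - z.max())                # numerically stable softmax
    return e / e.sum()
```

Passing a random generator as `rng` enables dropout (training mode); leaving it as `None` gives the deterministic inference pass.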
the preprocessed data pass through the three convolutional layers, which perform the convolution operation and extract features, and then through the fully connected layers into a softmax classifier to obtain the predicted class. The difference between the predicted value and the true value is also obtained, and the network weights in the convolutional neural network are adjusted so as to minimize this loss. The higher the output value of the loss function, the greater the difference, so the training of the convolutional neural network aims to reduce the loss value as far as possible. The categorical cross-entropy loss function is adopted (selected directly as categorical_crossentropy in Keras), which is commonly used for multi-class problems; an L2 norm term is added to control overfitting of the weights, with the parameter λ controlling the strength of this regularization. The overall loss function is as follows:
Figure BDA0002249793910000112
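A minimal NumPy sketch of such a loss is given below, since the formula image is not reproduced in this text. It assumes the common form of mean categorical cross-entropy plus λ times the sum of squared weights. The exact scaling of the L2 term in the original formula (for example a factor of 1/2) is not recoverable here, and the function name and default λ are illustrative.

```python
import numpy as np

def total_loss(probs, onehot, weights, lam=1e-3):
    """Mean categorical cross-entropy plus an L2 penalty on the weights.
    probs, onehot: (batch, classes); weights: list of weight arrays."""
    eps = 1e-12                                    # avoids log(0)
    ce = -np.mean(np.sum(onehot * np.log(probs + eps), axis=1))
    l2 = sum(np.sum(w ** 2) for w in weights)      # squared L2 norm of all weights
    return ce + lam * l2
```

A larger `lam` penalizes large weights more strongly, trading training accuracy for less overfitting.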
during training, the loss value is back-propagated using the Adam stochastic gradient descent algorithm, and the weight parameters W and bias parameters b of each layer in the network are updated, where η is the learning rate:
Figure BDA0002249793910000113
Figure BDA0002249793910000121
the training process is then repeated until the loss function value falls to a sufficiently small value, and the optimal model with the lowest loss value is saved through the ModelCheckpoint callback in Keras;
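The update-and-checkpoint loop described above can be illustrated with a single Adam step in NumPy. This is a sketch under assumptions: the hyperparameters β1 = 0.9, β2 = 0.999, and ε = 1e-8 are the commonly used defaults rather than values given in the text, the quadratic toy loss stands in for the network loss, and keeping a copy of the lowest-loss parameters only imitates what the ModelCheckpoint callback does in Keras.

```python
import numpy as np

def adam_step(param, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update of a parameter array; state holds t, m and v."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad        # first moment
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2   # second moment
    m_hat = state["m"] / (1 - beta1 ** state["t"])              # bias correction
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return param - lr * m_hat / (np.sqrt(v_hat) + eps)

# Toy loop: minimize sum(w^2) and keep the parameters with the lowest loss so far.
w = np.array([1.0, -2.0])
state = {"t": 0, "m": np.zeros_like(w), "v": np.zeros_like(w)}
best_loss, best_w = np.inf, w.copy()
for _ in range(200):
    loss = float(np.sum(w ** 2))
    if loss < best_loss:                 # "checkpoint" the best parameters
        best_loss, best_w = loss, w.copy()
    w = adam_step(w, 2 * w, state, lr=0.05)
```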
s6: inputting the test data set, labeled with the five classes, into the obtained optimal model for classification to obtain the classification result, namely a 5 × 5 confusion matrix over the five classes;
s7: evaluating the confusion matrix, using the accuracy, the detection rate, and the false alarm rate as evaluation indexes, to judge whether the industrial control system network intrusion detection method based on PCA and CNN meets the detection requirement; the method can also be compared with conventional intrusion detection methods in terms of complexity, time consumption, amount of computation, and the like.
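The evaluation of the last two steps can be sketched as follows. The text does not define the three indexes, so the definitions here are assumptions: accuracy is the fraction of correctly classified samples, the detection rate is the fraction of attack samples that are not classified as normal, and the false alarm rate is the fraction of normal samples classified as any attack.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes=5):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def evaluate(cm, normal=0):
    """Accuracy, detection rate and false alarm rate from a confusion matrix."""
    total = cm.sum()
    normal_total = cm[normal].sum()                   # true normal samples
    attack_total = total - normal_total               # true attack samples
    missed = cm[:, normal].sum() - cm[normal, normal] # attacks predicted as normal
    false_alarms = normal_total - cm[normal, normal]  # normal predicted as attack
    return {
        "accuracy": np.trace(cm) / total,
        "detection_rate": (attack_total - missed) / attack_total,
        "false_alarm_rate": false_alarms / normal_total,
    }
```

Under these definitions an attack assigned to the wrong attack class still counts as detected, but it lowers the accuracy.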
In an embodiment, an intrusion detection apparatus for an industrial control system is provided. Referring to fig. 6, which is a block diagram of the structure of the intrusion detection apparatus in an embodiment, the apparatus may include:
an original data set extraction module 601, configured to extract an original data set of the industrial control system from a network data set of a communication protocol of the industrial control system;
an original data set classification module 602, configured to obtain a training data set and a test data set from an original data set;
a feature dimension reduction module 603, configured to perform feature dimension reduction on the training data set and the test data set by using a principal component analysis method, and obtain a training data set after feature dimension reduction and a test data set after feature dimension reduction;
a model training module 604, configured to train the dimensionality-reduced training data set based on the intrusion detection model to obtain a classification model;
and a data classification module 605, configured to input the test data set subjected to feature dimension reduction into the classification model for classification processing, and obtain an intrusion detection result of the industrial control system.
In an embodiment, the original data set extraction module 601 is further configured to: classify the data to be processed according to the communication traffic of the industrial control system to obtain an intrusion category of the data to be processed, the data to be processed being data in the network data set; acquire a command data packet and a response data packet in the network data set; acquire data features of the data to be processed according to the command data packet and the response data packet; and set the data features and the intrusion category as the original data set.
In one embodiment, the data features include: the device address, the initial position of the memory, the read-write command, the byte number of the responded memory, the read-write function codes of the command data packet and the response data packet, the lengths of the command data packet and the response data packet, the time interval between the command data packet and the response data packet, the error rate of cyclic redundancy check and the characteristic or state value of the industrial control system.
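One way to picture a row of the original data set is a record with one field per listed feature plus the intrusion category. The sketch below is only illustrative: all field names are hypothetical, and treating every feature as a single number is an assumption.

```python
from dataclasses import dataclass, astuple

@dataclass
class PacketFeatures:
    """One sample: features from a command/response packet pair plus its label."""
    device_address: int
    memory_start: int
    rw_command: int
    response_memory_bytes: int
    command_function_code: int
    response_function_code: int
    command_length: int
    response_length: int
    time_interval: float        # between command and response packets
    crc_error_rate: float
    state_value: float          # characteristic or state value of the system
    intrusion_category: int     # label, stored as the last column

    def to_row(self):
        """Flatten to a numeric row: 11 features followed by the label."""
        return [float(v) for v in astuple(self)]
```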
In one embodiment, the original data set extraction module 601 is further configured to assign numeric values to the intrusion categories of the data.
In one embodiment, the feature dimension reduction module 603 is further configured to: standardize the elements in the training data set to form a standardized matrix; calculate a covariance matrix from the standardized matrix; calculate the eigenvalues, eigenvectors, and contribution rates of the covariance matrix; select the largest eigenvalues and their corresponding eigenvectors according to the required cumulative contribution rate to form a transformation matrix; and perform feature dimension reduction on the training data set and the test data set through the transformation matrix.
In one embodiment, the contribution rate is calculated by the following formula:
Figure BDA0002249793910000131
wherein η_i represents the contribution rate corresponding to the i-th eigenvalue of the covariance matrix, and λ_i is the i-th eigenvalue of the covariance matrix.
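The formula image itself is not reproduced in this text. A form consistent with the definitions of η_i and λ_i given here, and standard for principal component analysis, is shown below as a hedged reconstruction; it assumes p eigenvalues in total and that the cumulative contribution rate of the first k components is the sum of their individual rates.

```latex
\eta_i = \frac{\lambda_i}{\sum_{j=1}^{p} \lambda_j}, \qquad i = 1, 2, \ldots, p,
\qquad\text{and}\qquad
\sum_{i=1}^{k} \eta_i = \frac{\sum_{i=1}^{k} \lambda_i}{\sum_{j=1}^{p} \lambda_j}.
```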
In an embodiment, the model training module 604 is further configured to train the reduced-dimension training data set based on a convolutional neural network model to obtain a classification model.
The intrusion detection device of the industrial control system of the present application corresponds to the intrusion detection method of the industrial control system of the present application one-to-one, and for the specific limitations of the intrusion detection device of the industrial control system, reference may be made to the above limitations of the intrusion detection method of the industrial control system. The modules in the intrusion detection device of the industrial control system can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, and the computer device may be a terminal, and its internal structure diagram may be as shown in fig. 7, and fig. 7 is an internal structure diagram of the computer device in one embodiment. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement an intrusion detection method for an industrial control system. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 7 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as a particular computing device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In one embodiment, there is provided a computer device comprising a processor and a memory, the memory storing a computer program which when executed by the processor performs the steps of: extracting an original data set of the industrial control system from a network data set of a communication protocol of the industrial control system; acquiring a training data set and a testing data set from an original data set; performing feature dimensionality reduction on the training data set and the test data set by using a principal component analysis method to obtain a training data set subjected to feature dimensionality reduction and a test data set subjected to feature dimensionality reduction; training the training data set subjected to dimensionality reduction based on the intrusion detection model to obtain a classification model; and inputting the test data set subjected to feature dimension reduction into the classification model for classification processing, and obtaining an intrusion detection result of the industrial control system.
In one embodiment, the processor, when executing the computer program, further performs the steps of: classifying the data to be processed according to the communication traffic of the industrial control system to obtain the intrusion category of the data to be processed, the data to be processed being data in the network data set; acquiring a command data packet and a response data packet in the network data set; acquiring data features of the data to be processed according to the command data packet and the response data packet; and setting the data features and the intrusion category as the original data set.
In one embodiment, the data features include: the device address, the initial position of the memory, the read-write command, the byte number of the responded memory, the read-write function codes of the command data packet and the response data packet, the lengths of the command data packet and the response data packet, the time interval between the command data packet and the response data packet, the error rate of cyclic redundancy check and the characteristic or state value of the industrial control system.
In one embodiment, the processor, when executing the computer program, further performs the step of: assigning numeric values to the intrusion categories of the data.
In one embodiment, the processor, when executing the computer program, further performs the steps of: standardizing the elements in the training data set to form a standardized matrix; calculating a covariance matrix from the standardized matrix; calculating the eigenvalues, eigenvectors, and contribution rates of the covariance matrix; selecting the largest eigenvalues and their corresponding eigenvectors according to the required cumulative contribution rate to form a transformation matrix; and performing feature dimension reduction on the training data set and the test data set through the transformation matrix.
In one embodiment, the contribution rate is calculated by the following formula:
Figure BDA0002249793910000161
wherein η_i represents the contribution rate corresponding to the i-th eigenvalue of the covariance matrix, and λ_i is the i-th eigenvalue of the covariance matrix,
Figure BDA0002249793910000162
In one embodiment, the processor, when executing the computer program, further performs the step of: training the training data set subjected to dimensionality reduction based on the convolutional neural network model to obtain a classification model.
Through the computer program running on the processor, the computer device removes redundant information and reduces the amount of computation, thereby solving the technical problem that intrusion detection methods for industrial control systems in the conventional technology take a long time to train.
It will be understood by those skilled in the art that all or part of the processes of the intrusion detection method for an industrial control system according to any of the above embodiments may be implemented by a computer program, which may be stored in a non-volatile computer-readable storage medium and which, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Accordingly, in one embodiment there is provided a computer readable storage medium having a computer program stored thereon, the computer program when executed by a processor implementing the steps of: extracting an original data set of the industrial control system from a network data set of a communication protocol of the industrial control system; acquiring a training data set and a testing data set from an original data set; performing feature dimensionality reduction on the training data set and the test data set by using a principal component analysis method to obtain a training data set subjected to feature dimensionality reduction and a test data set subjected to feature dimensionality reduction; training the training data set subjected to dimensionality reduction based on the intrusion detection model to obtain a classification model; and inputting the test data set subjected to feature dimension reduction into the classification model for classification processing, and obtaining an intrusion detection result of the industrial control system.
In one embodiment, the computer program, when executed by the processor, further performs the steps of: classifying the data to be processed according to the communication traffic of the industrial control system to obtain the intrusion category of the data to be processed, the data to be processed being data in the network data set; acquiring a command data packet and a response data packet in the network data set; acquiring data features of the data to be processed according to the command data packet and the response data packet; and setting the data features and the intrusion category as the original data set.
In one embodiment, the data features include: the device address, the initial position of the memory, the read-write command, the byte number of the responded memory, the read-write function codes of the command data packet and the response data packet, the lengths of the command data packet and the response data packet, the time interval between the command data packet and the response data packet, the error rate of cyclic redundancy check and the characteristic or state value of the industrial control system.
In one embodiment, the computer program, when executed by the processor, further performs the step of: assigning numeric values to the intrusion categories of the data.
In one embodiment, the computer program, when executed by the processor, further performs the steps of: standardizing the elements in the training data set to form a standardized matrix; calculating a covariance matrix from the standardized matrix; calculating the eigenvalues, eigenvectors, and contribution rates of the covariance matrix; selecting the largest eigenvalues and their corresponding eigenvectors according to the required cumulative contribution rate to form a transformation matrix; and performing feature dimension reduction on the training data set and the test data set through the transformation matrix.
In one embodiment, the contribution rate is calculated by the following formula:
wherein η_i represents the contribution rate corresponding to the i-th eigenvalue of the covariance matrix, and λ_i is the i-th eigenvalue of the covariance matrix,
Figure BDA0002249793910000181
In one embodiment, the computer program, when executed by the processor, further performs the step of: training the training data set subjected to dimensionality reduction based on the convolutional neural network model to obtain a classification model.
Through the stored computer program, the computer-readable storage medium removes redundant information and reduces the amount of computation, thereby solving the technical problem that intrusion detection methods for industrial control systems in the conventional technology take a long time to train.
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. An intrusion detection method for an industrial control system, comprising the steps of:
extracting an original data set of an industrial control system from a network data set of a communication protocol of the industrial control system;
acquiring a training data set and a testing data set from the original data set;
performing feature dimensionality reduction on the training data set and the test data set by using a principal component analysis method to obtain a training data set subjected to feature dimensionality reduction and a test data set subjected to feature dimensionality reduction;
training the training data set subjected to dimensionality reduction based on an intrusion detection model to obtain a classification model;
and inputting the test data set subjected to feature dimension reduction into the classification model for classification processing, and obtaining an intrusion detection result of the industrial control system.
2. The method of claim 1, wherein extracting the raw data set of the industrial control system from the network data set of the communication protocol of the industrial control system comprises:
classifying data to be processed according to the communication traffic of the industrial control system to obtain the intrusion category of the data to be processed; the data to be processed is data in the network data set;
acquiring a command data packet and a response data packet in the network data set;
acquiring the data characteristics of the data to be processed according to the command data packet and the response data packet;
setting the data characteristics and the intrusion category as the original data set.
3. The method of claim 2, wherein the data characteristics comprise: the device address, the initial position of the memory, the read-write command, the byte number of the responded memory, the read-write function codes of the command data packet and the response data packet, the lengths of the command data packet and the response data packet, the time interval between the command data packet and the response data packet, the error rate of cyclic redundancy check and the characteristic or state value of the industrial control system.
4. The method of claim 2, wherein after the obtaining the intrusion category of the data to be processed, further comprising:
assigning numeric values to the intrusion categories of the data.
5. The method of claim 1, wherein the performing feature dimensionality reduction on the training dataset and the test dataset using principal component analysis comprises:
standardizing elements in the training data set to form a standardized matrix;
calculating a covariance matrix according to the normalized matrix;
calculating an eigenvalue, an eigenvector and a contribution rate of the covariance matrix;
selecting the largest eigenvalues and their corresponding eigenvectors according to the required cumulative contribution rate to form a transformation matrix;
and performing feature dimension reduction on the training data set and the test data set through the transformation matrix.
6. The method of claim 5, wherein the contribution rate is calculated by the following formula:
Figure FDA0002249793900000021
wherein η_i represents the contribution rate corresponding to the i-th eigenvalue of the covariance matrix, and λ_i is the i-th eigenvalue of the covariance matrix.
7. The method according to any one of claims 1-6, wherein the training the reduced-dimension training dataset based on an intrusion detection model to obtain a classification model comprises:
and training the training data set subjected to the dimensionality reduction based on a convolutional neural network model to obtain the classification model.
8. An intrusion detection device for an industrial control system, comprising:
an original data set extraction module, configured to extract an original data set of an industrial control system from a network data set of a communication protocol of the industrial control system;
an original data set classification module, configured to acquire a training data set and a test data set from the original data set;
a feature dimension reduction module, configured to perform feature dimension reduction on the training data set and the test data set by using a principal component analysis method to obtain the training data set after feature dimension reduction and the test data set after feature dimension reduction;
a model training module, configured to train the training data set subjected to dimensionality reduction based on an intrusion detection model to obtain a classification model;
and a data classification module, configured to input the test data set subjected to feature dimension reduction into the classification model for classification processing to obtain an intrusion detection result of the industrial control system.
9. A computer device comprising a processor and a memory, said memory storing a computer program, characterized in that said processor, when executing said computer program, implements the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN201911029762.3A 2019-09-29 2019-10-28 Industrial control system anomaly detection method based on PCA-CNN Pending CN110825068A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2019109329483 2019-09-29
CN201910932948 2019-09-29

Publications (1)

Publication Number Publication Date
CN110825068A true CN110825068A (en) 2020-02-21

Family

ID=69550679

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911029762.3A Pending CN110825068A (en) 2019-09-29 2019-10-28 Industrial control system anomaly detection method based on PCA-CNN

Country Status (1)

Country Link
CN (1) CN110825068A (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109962909A (en) * 2019-01-30 2019-07-02 大连理工大学 A kind of network intrusions method for detecting abnormality based on machine learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
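吴东方 (Wu Dongfang): "Research on Industrial Internet Intrusion Detection Methods Based on Machine Learning", China Master's Theses Full-text Database, Information Science and Technology *
李兆峰 (Li Zhaofeng): "Research on an Intrusion Detection Method Based on Principal Component Analysis and Convolutional Neural Networks", Modern Information Technology *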

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111741002A (en) * 2020-06-23 2020-10-02 广东工业大学 Method and device for training network intrusion detection model
CN111741002B (en) * 2020-06-23 2022-02-15 广东工业大学 Method and device for training network intrusion detection model
CN111898639A (en) * 2020-06-30 2020-11-06 河海大学 Dimension reduction-based hierarchical time memory industrial anomaly detection method and device
CN111898639B (en) * 2020-06-30 2022-07-26 河海大学 Dimension reduction-based hierarchical time memory industrial anomaly detection method and device
US11218502B1 (en) 2020-09-23 2022-01-04 Sichuan University Few-shot learning based intrusion detection method of industrial control system
CN111931175A (en) * 2020-09-23 2020-11-13 四川大学 Industrial control system intrusion detection method based on small sample learning
CN112688911B (en) * 2020-11-03 2023-04-18 桂林理工大学 Network intrusion detection system based on PCA + ADASYN and Xgboost
CN112688911A (en) * 2020-11-03 2021-04-20 桂林理工大学 Network intrusion detection system based on PCA + ADASYN and Xgboost
CN112437053A (en) * 2020-11-10 2021-03-02 国网北京市电力公司 Intrusion detection method and device
CN112437053B (en) * 2020-11-10 2023-06-30 国网北京市电力公司 Intrusion detection method and device
CN112464154A (en) * 2020-11-27 2021-03-09 中国船舶重工集团公司第七0四研究所 Method for automatically screening effective features based on unsupervised learning
CN112464154B (en) * 2020-11-27 2024-03-01 中国船舶重工集团公司第七0四研究所 Method for automatically screening effective features based on unsupervised learning
CN112383563A (en) * 2020-12-03 2021-02-19 中国铁建重工集团股份有限公司 Intrusion detection method and related device
CN112637165A (en) * 2020-12-14 2021-04-09 广东电网有限责任公司 Model training method, network attack detection method, device, equipment and medium
CN112637165B (en) * 2020-12-14 2022-08-30 广东电网有限责任公司 Model training method, network attack detection method, device, equipment and medium
CN113179279A (en) * 2021-05-20 2021-07-27 哈尔滨凯纳科技股份有限公司 Industrial control network intrusion detection method and device based on AE-CNN
CN113572785A (en) * 2021-08-05 2021-10-29 中国电子信息产业集团有限公司第六研究所 Honeypot defense method and device for nuclear power industrial control system
CN114422241A (en) * 2022-01-19 2022-04-29 内蒙古工业大学 Intrusion detection method, device and system
CN114595448B (en) * 2022-03-14 2022-09-27 山东省计算中心(国家超级计算济南中心) Industrial control anomaly detection method, system and equipment based on correlation analysis and three-dimensional convolution and storage medium
CN114595448A (en) * 2022-03-14 2022-06-07 山东省计算中心(国家超级计算济南中心) Industrial control anomaly detection method, system and equipment based on correlation analysis and three-dimensional convolution and storage medium
CN115996133A (en) * 2022-06-27 2023-04-21 西安电子科技大学 Industrial control network behavior detection method and related device
CN115996133B (en) * 2022-06-27 2024-04-09 西安电子科技大学 Industrial control network behavior detection method and related device
CN115208703B (en) * 2022-09-16 2022-12-13 北京安帝科技有限公司 Industrial control equipment intrusion detection method and system of fragment parallelization mechanism
CN115208703A (en) * 2022-09-16 2022-10-18 北京安帝科技有限公司 Industrial control equipment intrusion detection method and system of fragment parallelization mechanism
CN116170241A (en) * 2023-04-26 2023-05-26 国家工业信息安全发展研究中心 Intrusion detection method, system and equipment of industrial control system
CN117631599A (en) * 2024-01-26 2024-03-01 深圳一嘉智联科技有限公司 Industrial control computer data transmission method and system based on data analysis
CN117631599B (en) * 2024-01-26 2024-04-12 深圳一嘉智联科技有限公司 Industrial control computer data transmission method and system based on data analysis

Similar Documents

Publication Publication Date Title
CN110912867B (en) Intrusion detection method, device, equipment and storage medium for industrial control system
CN110825068A (en) Industrial control system anomaly detection method based on PCA-CNN
CN111210024B (en) Model training method, device, computer equipment and storage medium
AU2019201857B2 (en) Sparse neural network based anomaly detection in multi-dimensional time series
CN110287983B (en) Single-classifier anomaly detection method based on maximum correlation entropy deep neural network
WO2021037280A2 (en) Rnn-based anti-money laundering model training method, apparatus and device, and medium
CN111783442A (en) Intrusion detection method, device, server and storage medium
CN112839034A (en) Network intrusion detection method based on CNN-GRU hierarchical neural network
CN111325159B (en) Fault diagnosis method, device, computer equipment and storage medium
CN113179279A (en) Industrial control network intrusion detection method and device based on AE-CNN
CN110909348A (en) Internal threat detection method and device
CN116167010B (en) Rapid identification method for abnormal events of power system with intelligent transfer learning capability
CN116597384B (en) Space target identification method and device based on small sample training and computer equipment
CN110912908A (en) Network protocol anomaly detection method and device, computer equipment and storage medium
CN113835962A (en) Server fault detection method and device, computer equipment and storage medium
CN115630298A (en) Network flow abnormity detection method and system based on self-attention mechanism
CN115496384A (en) Monitoring management method and device for industrial equipment and computer equipment
Elmasry et al. Enhanced Anomaly-Based Fault Detection System in Electrical Power Grids
CN110166422B (en) Domain name behavior recognition method and device, readable storage medium and computer equipment
CN113541985A (en) Internet of things fault diagnosis method, training method of model and related device
CN111679953B (en) Fault node identification method, device, equipment and medium based on artificial intelligence
CN111737320A (en) Method and device for establishing group user behavior baseline and computer equipment
CN117118693A (en) Abnormal flow detection method, device, computer equipment and storage medium
CN114140246A (en) Model training method, fraud transaction identification method, device and computer equipment
CN114462510A (en) Equipment classification method and system for precise protection of Internet of things

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200221