CN113159615A

CN113159615A - Intelligent information security risk measuring system and method for industrial control system

Info

Publication number: CN113159615A
Application number: CN202110505744.9A
Authority: CN
Inventors: 麦荣章
Original assignee: Individual
Current assignee: Individual
Priority date: 2021-05-10
Filing date: 2021-05-10
Publication date: 2021-07-23

Abstract

The invention belongs to the technical field of information security and discloses an intelligent system and method for measuring the information security risk of an industrial control system. The system comprises a data set acquisition module, a data preprocessing module, a central control module, a risk assessment model construction module, a model training module, a risk prediction module, an information security assessment module, an early warning module, a data storage module and an update display module. The method adopts a random forest prediction model optimized by a time series prediction method, which reduces the influence of the many factors that affect the predicted value and improves prediction accuracy. The random forest algorithm is well suited to processing large numbers of data samples, places few requirements on the data, handles both categorical and continuous variables, and gives more stable accuracy. The invention thereby improves on the algorithms of the prior art, raises the prediction accuracy of the model, and predicts future risk values from existing risk values.

Description

Intelligent information security risk measuring system and method for industrial control system

Technical Field

The invention belongs to the technical field of information security, and particularly relates to an intelligent system and method for measuring the information security risk of an industrial control system.

Background

Currently, industrial control systems increasingly require high-speed transmission of large amounts of data (e.g., image and audio signals), which has led to a combination of Ethernet and control networks that is now very popular in the commercial field. The networking of industrial control systems integrates general-purpose technologies such as embedded technology, multi-standard industrial control network interconnection and wireless technology, thereby expanding the development space of the industrial control field and bringing new development opportunities.

With the development of computer technology, communication technology and control technology, the traditional control field has undergone unprecedented change and is developing toward networking. With the rapid development of industrialization and informatization, industrial control systems increasingly adopt information technology and communication network technology, and the information security of industrial control systems faces a serious challenge.

In some prediction problems, the dependent variable is affected by changes in multiple independent variables. For example, if the subject of a survey is a house, the relevant attributes include the house price, the number of rooms, the floor, the geographic location and the residential area. In this case, multiple linear regression or a neural network model is needed to predict the dependent variable from the independent variables. Time series prediction takes a different approach: it uses the historical data of a particular variable to predict future data for that same variable. It has two main characteristics: first, the analysis covers only the variable of interest and examines how it has changed in the past; second, time series analysis does not need to consider the properties of other variables that may affect the target. Time series prediction relies on time series decomposition, i.e., decomposing the data into trend, seasonal and noise components. The first two are called systematic components because they are predictable; the noise component is random and is therefore sometimes called the non-systematic component.
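The decomposition into trend, seasonal and noise components can be illustrated with a minimal Python sketch. It assumes an additive model and an odd seasonal period, and uses a centered moving average for the trend; the function name `decompose` and these modelling choices are illustrative assumptions, not something the text specifies.

```python
from statistics import mean

def decompose(series, period):
    """Additive decomposition sketch: series[t] = trend[t] + seasonal[t] + noise[t].

    Assumes an odd `period` so that a centered window of that length exists.
    """
    if period % 2 == 0:
        raise ValueError("this sketch assumes an odd period")
    n = len(series)
    half = period // 2

    # Trend: centered moving average; undefined (None) near the edges.
    trend = [None] * n
    for t in range(half, n - half):
        trend[t] = mean(series[t - half : t + half + 1])

    # Seasonal: average detrended value for each position in the cycle,
    # shifted so that the seasonal effects sum to zero over one period.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(series[t] - trend[t])
    raw = [mean(b) for b in buckets]
    offset = mean(raw)
    seasonal = [raw[t % period] - offset for t in range(n)]

    # Noise: whatever trend and seasonal components do not explain.
    noise = [
        None if trend[t] is None else series[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return trend, seasonal, noise
```

For a series built as a straight line plus a repeating three-step pattern, the sketch recovers the line as the trend and the pattern as the seasonal component, leaving a noise component of zero.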

Random forest algorithms have become a common tool for many researchers and are used in many fields. A random forest places few requirements on the data it processes and can handle both categorical and continuous variables, so data processing becomes easier and the range of application wider. In addition, a random forest can perform the tasks otherwise handled by discriminant analysis, logistic regression and multiple linear regression. In general, the preconditions for applying a random forest are relatively loose: it does not require statistical assumptions such as normality or homogeneity of variance of the independent variables.

Through the above analysis, the problems and defects of the prior art are as follows:

(1) Industrial control systems have difficulty eliminating information risks at the source.

(2) The core of risk assessment is to estimate the total loss that various external threats or resource losses cause to the industrial control system; on this premise, the aim is to assess the vulnerability and threat level of the whole system.

(3) The algorithm adopted by the prior art has high requirements on data, is easy to overfit and has insufficient accuracy.

Disclosure of Invention

Aiming at the problems in the prior art, the invention provides an intelligent determination system and method for information security risk of an industrial control system.

The invention is realized as an intelligent information security risk measuring system for an industrial control system, which includes:

a data set acquisition module, a data preprocessing module, a central control module, a risk assessment model construction module, a model training module, a risk prediction module, an information security assessment module, an early warning module, a data storage module and an update display module.

The data set acquisition module is connected with the central control module and used for acquiring a plurality of risk influencing elements of the industrial control system and a plurality of groups of evaluation values corresponding to the risk influencing elements through data set acquisition equipment and taking the risk influencing elements and the evaluation values as an initial sample data set;

the data preprocessing module is connected with the central control module and used for preprocessing the acquired initial sample data set of the industrial control system through a data preprocessing program, and the data preprocessing module comprises:

(1) labeling a plurality of risk influence elements of the industrial control system and a plurality of groups of evaluation values corresponding to the risk influence elements by using a plurality of labeling frames of different types to obtain a first data set;

(2) determining the data acquisition cost according to the anti-crawling mechanism triggered in the labeling process of the first data set; determining the data cleaning cost according to the structure type of the collected test data packet;

(3) determining data storage cost according to at least one data storage format and data quantity corresponding to various data storage formats;

(4) determining a data processing cost based on the data acquisition cost, the data cleaning cost and the data storage cost, and processing the initial sample data set according to the data processing cost;

the central control module is connected with the data set acquisition module, the data preprocessing module, the risk assessment model construction module, the model training module, the risk prediction module, the information security assessment module, the early warning module, the data storage module and the updating display module and is used for coordinating and controlling the normal operation of each module of the information security risk intelligent determination system of the industrial control system through the central processing unit;

the risk assessment model building module is connected with the central control module and used for building a risk assessment model through a model building program, and the risk assessment model building module comprises:

(1) acquiring a plurality of groups of evaluation value data corresponding to a plurality of risk influence elements;

(2) analyzing the correlation of the evaluation value data, and drawing a time series curve of the evaluation value data;

(3) obtaining the jump points and inflection points of the time series curve, and selecting a stationary time series ARMA model for curve fitting;

(4) constructing according to the curve fitting data by using a model construction program to obtain a risk assessment model;

the model training module is connected with the central control module and used for taking the initial sample data set after preprocessing as a training sample, and training a random forest optimized by a time sequence algorithm through a model training program to obtain a final risk assessment model, and the model training module comprises:

(1) taking the preprocessed initial sample data set as a training sample and applying time series prediction to the training sample data, i.e. predicting future data of a single variable from the historical data of that variable;

(2) constructing a random forest decision tree and determining attribute test conditions;

(3) training a random forest risk assessment model by using the regRF_train function defined in an RF toolkit;

the risk prediction module is connected with the central control module and used for inputting the evaluation value of the risk influence element into a risk evaluation model through a risk prediction program to obtain a prediction value of the information security risk of the industrial control system;

the information security evaluation module is connected with the central control module and used for evaluating the obtained predicted value of the information security risk of the industrial control system through an information security evaluation program and generating an evaluation report;

the early warning module is connected with the central control module and is used for giving early warning notification of abnormal information security risks of the industrial control system through an acousto-optic early warning device;

the data storage module is connected with the central control module and used for storing the acquired initial sample data set, data preprocessing results, risk assessment models, model training results, predicted values of information security risks of the industrial control system, information security assessment reports and early warning notifications of the industrial control system through a memory;

and the updating display module is connected with the central control module and is used for updating and displaying the acquired initial sample data set, the data preprocessing result, the risk assessment model, the model training result, the predicted value of the information security risk of the industrial control system, the information security assessment report and the real-time data of the early warning notice of the industrial control system through a display.

Further, in the data set acquisition module, the risk influencing elements include: enterprise management layer elements, process control layer elements and field control layer elements; the enterprise management layer elements include: unauthorized access, malicious code, distributed denial of service, virus trojan and forgery attacks; the process control layer elements include: denial of service attacks, DOS attacks, flooding attacks, response spoofing, and direction misleading attacks; the field control layer elements include: physical attacks, information theft, data tampering, denial of service attacks, illegal access, and replay attacks.

Further, in the data preprocessing module, the determining the data acquisition cost includes:

1) searching the acquisition difficulty corresponding to the triggered anti-crawling mechanism from the corresponding relation between the preset anti-crawling mechanism and the acquisition difficulty; the larger the cracking difficulty of the anti-crawling mechanism is, the larger the acquisition difficulty corresponding to the anti-crawling mechanism is;

2) and determining the product of the sum of the searched acquisition difficulties and the basic acquisition cost as the data acquisition cost.
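The acquisition-cost rule above (sum of the looked-up difficulties, multiplied by the basic acquisition cost) can be sketched in Python as follows. The mechanism names and difficulty values in the table are purely hypothetical placeholders; the text only requires that a mechanism that is harder to defeat maps to a larger difficulty.

```python
# Hypothetical preset correspondence between anti-crawling mechanisms and
# acquisition difficulty; names and values are illustrative only.
ACQUISITION_DIFFICULTY = {
    "rate_limit": 1.0,
    "captcha": 2.5,
    "login_wall": 4.0,
}

def data_acquisition_cost(triggered_mechanisms, base_cost,
                          difficulty_table=ACQUISITION_DIFFICULTY):
    """Return (sum of looked-up difficulties) * (basic acquisition cost)."""
    total_difficulty = sum(difficulty_table[m] for m in triggered_mechanisms)
    return total_difficulty * base_cost
```

For example, under these assumed values, triggering `rate_limit` and `captcha` with a basic cost of 10.0 gives (1.0 + 2.5) × 10.0 = 35.0.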

Further, in the model training module, the following statistics are needed in the data-based time sequence analysis:

(1) time interval: t = 1, 2, 3, ..., n;

(2) time-series data: y_1, y_2, y_3, ..., y_n;

(3) predicted value: F_{n+h}, the predicted value for the h-th time interval after n; when h = 1, it is the prediction for the time interval immediately following n; h represents the time span and may also be set to a value greater than 1;

(4) prediction error: at time t, e_t = y_t - F_t.
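To make the notation concrete, the following Python sketch computes one-step forecasts F_t and the errors e_t = y_t - F_t. The moving-average forecaster is only a placeholder assumption used to produce some F_t; the text does not prescribe it, and the function names are hypothetical.

```python
def one_step_forecasts(y, window):
    """F_t for each t after the first `window` observations.

    Placeholder forecaster (an assumption): F_t is the mean of the
    `window` observations immediately before t.
    """
    return [sum(y[t - window:t]) / window for t in range(window, len(y))]

def forecast_errors(y, window):
    """e_t = y_t - F_t for every t that has a forecast."""
    forecasts = one_step_forecasts(y, window)
    return [actual - f for actual, f in zip(y[window:], forecasts)]
```

With y = [1, 2, 3, 4, 5] and a window of 2, the forecasts are 1.5, 2.5 and 3.5, so each error is 1.5: a moving average lags behind a steadily rising series.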

Further, in the model training module, the attribute testing condition includes:

(1) binary attributes: the test condition of a binary attribute produces two output results;

(2) nominal attributes: a nominal attribute has a plurality of values, and its test condition can be expressed in two ways, namely multi-way splitting and binary splitting;

(3) ordinal attributes: the attribute values may be grouped into two or more branches as desired, provided that the grouping does not violate the order of the values;

(4) continuous attributes: for continuous attributes, the test condition is either a comparison test with binary output, (A < v) or (A ≥ v), or a range query.

Further, if multi-way splitting is applied, all possible split points and continuous intervals must be considered; continuous attributes can be discretized, with each discretized interval assigned a new ordinal value, and adjacent ordinal values can be merged into wider intervals provided that the order of the intervals is preserved; the best feature attribute is then selected, wherein:

Gini coefficient: the Gini coefficient is a relative index that is widely used in economics and statistics; in the decision tree algorithm, it represents the degree of disorder of the class labels in the data set to be classified, and if P(X, Y) = P(X) × P(Y), i.e. X and Y are independent of each other, then:

Log(XY) = Log(X) + Log(Y);
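As an illustration of selecting a binary test on a continuous attribute, the sketch below scores candidate thresholds v with the standard Gini impurity, 1 minus the sum of squared class proportions. The text does not state a Gini formula, so this definition and the function names `gini` and `best_threshold` are assumptions taken from common decision-tree practice.

```python
from collections import Counter

def gini(labels):
    """Standard Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_threshold(values, labels):
    """Pick v for the binary test (A < v) that minimizes the weighted Gini.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (v, weighted_gini); v is None if no split is possible.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_v, best_score = None, float("inf")
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        v = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_v, best_score = v, score
    return best_v, best_score
```

A lower weighted impurity means the two branches are purer, so a perfectly separating threshold scores 0.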

further, in the model training module, the random forest further includes:

the main idea of the random forest algorithm is random sampling, namely randomly drawing a fixed number of samples from the preprocessed training set, with each draw made with replacement; in this way, the same number of samples is collected each time but the contents differ, and the specific algorithm flow is as follows:

let the sample set be D = {(x1, y1), (x2, y2), ..., (xn, yn)}, let the number of weak-classifier iterations be T, and let the output be the strong classifier f(x); for t = 1, 2, ..., T:

first, randomly sample the training set for the t-th time, drawing m times in total, to obtain a sampling set Dt containing m samples;

second, train the t-th weak classifier using the sampling set Dt;

third, aggregate the results of the T weak classifiers into the result f(x) of the strong classifier.
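The sampling-and-aggregation flow above can be sketched in Python as follows. The weak learner here is a one-split regression stump on a single numeric feature, and aggregation is a plain average; both are simplifying assumptions for illustration, and all function names are hypothetical.

```python
import random
from statistics import mean

def bootstrap_sample(data, m, rng):
    """Draw m samples from data with replacement (the sampling set Dt)."""
    return [rng.choice(data) for _ in range(m)]

def fit_stump(sample):
    """Hypothetical weak learner: split at the median x, predict the mean y per side."""
    xs = sorted(x for x, _ in sample)
    split = xs[len(xs) // 2]
    left = [y for x, y in sample if x < split]
    right = [y for x, y in sample if x >= split]  # never empty: split is in xs
    right_mean = mean(right)
    left_mean = mean(left) if left else right_mean
    return lambda x: left_mean if x < split else right_mean

def bagging_fit(data, T, m, fit_weak, seed=0):
    """Train T weak learners, each on its own bootstrap sample of size m."""
    rng = random.Random(seed)
    return [fit_weak(bootstrap_sample(data, m, rng)) for _ in range(T)]

def bagging_predict(models, x):
    """Aggregate the T weak results into f(x); averaging is assumed here."""
    return mean(model(x) for model in models)
```

A call such as `bagging_fit(data, T=25, m=len(data), fit_weak=fit_stump)` followed by `bagging_predict(models, x)` gives the aggregated prediction. A full random forest would additionally restrict each split to a random subset of features.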

Further, in the model training module, the format of the regRF_train function defined in the RF toolkit is as follows:

model = regRF_train(X, Y, ntree, mtry, extra_options);

in this call, two parameters are mainly used, and the remaining parameters are optional; X represents the data matrix, and when the function is called the normalized input training set pn_train is passed as the input data to be trained; Y represents the target value, and the normalized output training set tn_train is passed as the output data set for the current training; of the optional parameters, ntree is the number of trees built by the model and is set to 20100, i.e. 20100 trees; mtry is the number of features used when branching a tree and is set to 45; the model obtained by training is returned in model;

and finally, a sequence of existing risk values is input into the prediction model, and future risk values are predicted step by step with the trained model.
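One common way to predict "step by step" is recursive forecasting: each prediction is appended to the input window and used for the next step. The Python sketch below assumes this reading; `predict_next` is a hypothetical stand-in for the trained model, taking the last `lags` values and returning the next one.

```python
def recursive_forecast(history, predict_next, lags, steps):
    """Predict `steps` future values one at a time.

    `predict_next` stands in for the trained model (an assumption): it maps
    a list of the most recent `lags` values to the next value.
    """
    if len(history) < lags:
        raise ValueError("history must contain at least `lags` values")
    window = list(history[-lags:])
    predictions = []
    for _ in range(steps):
        nxt = predict_next(window)
        predictions.append(nxt)
        # Slide the window forward: drop the oldest value, add the prediction.
        window = window[1:] + [nxt]
    return predictions
```

With a toy model that continues the last difference, `recursive_forecast([1, 2, 3], lambda w: w[-1] + (w[-1] - w[-2]), lags=2, steps=3)` yields 4, 5 and 6. Errors accumulate in this scheme because later steps depend on earlier predictions.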

Another object of the present invention is to provide a computer program product stored on a computer readable medium, which includes a computer readable program for providing a user input interface to apply the intelligent risk determination system for information security of an industrial control system when the computer program product is executed on an electronic device.

Another object of the present invention is to provide a computer-readable storage medium, which stores instructions that, when executed on a computer, cause the computer to apply the intelligent industrial control system information security risk determination system.

By combining all the technical schemes, the invention has the advantages and positive effects that: according to the intelligent measuring system for the information security risk of the industrial control system, the random forest prediction model optimized by the time sequence prediction method is adopted, the influence of a plurality of factors influencing the predicted value is reduced, the prediction precision is greatly improved, and the random forest algorithm has advantages when a large number of data samples are processed and is more stable in precision.

Meanwhile, the random forest used by the invention places few requirements on the data it processes and can handle both categorical and continuous variables, so data processing is easier and the range of application is wider. In addition, a random forest can perform the tasks otherwise handled by discriminant analysis, logistic regression and multiple linear regression; in general, the preconditions for applying a random forest are relatively loose, and it does not require statistical assumptions such as normality or homogeneity of variance of the independent variables. Therefore, the invention improves the algorithm on the basis of the prior art, improves the prediction accuracy of the model, and predicts future risk values from existing risk values.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed to be used in the embodiments of the present application will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present application, and it is obvious for those skilled in the art that other drawings can be obtained from the drawings without creative efforts.

FIG. 1 is a block diagram of an intelligent risk determination system for information security of an industrial control system according to an embodiment of the present invention;

In the figure: 1. data set acquisition module; 2. data preprocessing module; 3. central control module; 4. risk assessment model construction module; 5. model training module; 6. risk prediction module; 7. information security assessment module; 8. early warning module; 9. data storage module; 10. update display module.

Fig. 2 is a flowchart of an intelligent risk determination method for information security of an industrial control system according to an embodiment of the present invention.

Fig. 3 is a flowchart of a method for preprocessing an acquired initial sample data set of the industrial control system by using a data preprocessing program through a data preprocessing module according to an embodiment of the present invention.

Fig. 4 is a flowchart of a method for constructing a risk assessment model by using a risk assessment model construction module and a model construction program according to an embodiment of the present invention.

Fig. 5 is a flowchart of a method for obtaining a final risk assessment model by using a model training program to train a random forest optimized by a time sequence algorithm with a preprocessed initial sample data set as a training sample through a model training module according to an embodiment of the present invention.

Fig. 6 is a schematic diagram of a random forest Bagging principle provided by an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Aiming at the problems in the prior art, the invention provides an intelligent determination system and method for information security risk of an industrial control system, and the invention is described in detail below with reference to the accompanying drawings.

As shown in fig. 1, an intelligent measurement system for information security risk of an industrial control system according to an embodiment of the present invention includes: a data set acquisition module 1, a data preprocessing module 2, a central control module 3, a risk assessment model construction module 4, a model training module 5, a risk prediction module 6, an information security assessment module 7, an early warning module 8, a data storage module 9 and an update display module 10.

The data set acquisition module 1 is connected with the central control module 3, and is used for acquiring a plurality of risk influencing elements of the industrial control system and a plurality of groups of evaluation values corresponding to the risk influencing elements through data set acquisition equipment, and taking the risk influencing elements and the evaluation values as an initial sample data set;

the data preprocessing module 2 is connected with the central control module 3 and used for preprocessing the acquired initial sample data set of the industrial control system through a data preprocessing program;

the central control module 3 is connected with the data set acquisition module 1, the data preprocessing module 2, the risk assessment model construction module 4, the model training module 5, the risk prediction module 6, the information security assessment module 7, the early warning module 8, the data storage module 9 and the update display module 10, and is used for coordinating and controlling the normal operation of each module of the intelligent information security risk measuring system of the industrial control system through a central processing unit;

the risk assessment model building module 4 is connected with the central control module 3 and used for building a risk assessment model through a model building program;

the model training module 5 is connected with the central control module 3 and used for training a random forest optimized by a time sequence algorithm through a model training program by taking the preprocessed initial sample data set as a training sample to obtain a final risk assessment model;

the risk prediction module 6 is connected with the central control module 3 and used for inputting the evaluation value of the risk influence element into a risk evaluation model through a risk prediction program to obtain a prediction value of the information security risk of the industrial control system;

the information security evaluation module 7 is connected with the central control module 3 and used for evaluating the obtained predicted value of the information security risk of the industrial control system through an information security evaluation program and generating an evaluation report;

the early warning module 8 is connected with the central control module 3 and is used for giving early warning notification of abnormal information security risks of the industrial control system through an acousto-optic early warning device;

the data storage module 9 is connected with the central control module 3 and is used for storing the acquired initial sample data set, data preprocessing results, risk assessment models, model training results, predicted values of information security risks of the industrial control system, information security assessment reports and early warning notifications of the industrial control system through a memory;

and the updating display module 10 is connected with the central control module 3 and is used for updating and displaying the acquired initial sample data set, the data preprocessing result, the risk assessment model, the model training result, the predicted value of the information security risk of the industrial control system, the information security assessment report and the real-time data of the early warning notice of the industrial control system through a display.

As shown in fig. 2, the intelligent determination method for information security risk of an industrial control system according to an embodiment of the present invention includes the following steps:

S101, acquiring a plurality of risk influencing elements of the industrial control system and a plurality of groups of evaluation values corresponding to the risk influencing elements by using a data set acquisition module through data set acquisition equipment, and taking the risk influencing elements and the evaluation values as an initial sample data set;

S102, preprocessing the acquired initial sample data set of the industrial control system by a data preprocessing module through a data preprocessing program;

S103, coordinating and controlling the normal operation of each module of the intelligent information security risk measuring system of the industrial control system by a central control module through a central processing unit; constructing a risk assessment model by a risk assessment model construction module through a model construction program;

S104, using the preprocessed initial sample data set as a training sample through a model training module, and training a random forest optimized by a time series algorithm by using a model training program to obtain a final risk assessment model;

S105, inputting the evaluation values of the risk influencing elements into the risk assessment model by using a risk prediction program through a risk prediction module to obtain a predicted value of the information security risk of the industrial control system;

S106, evaluating the obtained predicted value of the information security risk of the industrial control system by using an information security evaluation program through an information security evaluation module, and generating an evaluation report; giving early warning notification of abnormal information security risks of the industrial control system by using an acousto-optic early warning device through an early warning module;

S107, storing the acquired initial sample data set, data preprocessing results, risk assessment models, model training results, predicted values of information security risks of the industrial control system, information security assessment reports and early warning notifications of the industrial control system by using a memory through a data storage module;

S108, updating and displaying the acquired initial sample data set, data preprocessing results, risk assessment models, model training results, predicted values of information security risks of the industrial control system, information security assessment reports and real-time data of early warning notifications by using a display through an update display module.

In step S101 provided in the embodiment of the present invention, the risk influencing elements include: enterprise management layer elements, process control layer elements and field control layer elements; the enterprise management layer elements include: unauthorized access, malicious code, distributed denial of service, virus trojan and forgery attacks; the process control layer elements include: denial of service attacks, DOS attacks, flooding attacks, response spoofing, and direction misleading attacks; the field control layer elements include: physical attacks, information theft, data tampering, denial of service attacks, illegal access, and replay attacks.

The invention is further described with reference to specific examples.

Example 1

The intelligent determination method for information security risk of an industrial control system provided by the embodiment of the present invention is shown in fig. 1. As a preferred embodiment, shown in fig. 3, the method by which the data preprocessing module preprocesses the acquired initial sample data set of the industrial control system using a data preprocessing program includes:

S201, labeling a plurality of risk influence elements of the industrial control system and a plurality of groups of evaluation values corresponding to the risk influence elements by using a plurality of labeling frames of different types to obtain a first data set;

S202, determining the data acquisition cost according to the anti-crawling mechanism triggered in the labeling process of the first data set; determining the data cleaning cost according to the structure type of the collected test data packet;

S203, determining the data storage cost according to at least one data storage format and the data quantity corresponding to each data storage format;

S204, determining the data processing cost based on the data acquisition cost, the data cleaning cost and the data storage cost, and processing the initial sample data set according to the data processing cost.

Example 2

The intelligent determination method for information security risk of an industrial control system provided by the embodiment of the invention is shown in fig. 1. As a preferred embodiment, shown in fig. 4, the method by which the risk assessment model construction module constructs a risk assessment model using a model construction program comprises the following steps:

S301, acquiring multiple groups of evaluation value data corresponding to multiple risk influence elements;

S302, analyzing the correlation of the evaluation value data, and drawing a time series curve of the evaluation value data;

S303, obtaining the jump points and inflection points of the time series curve, and performing curve fitting with a stationary time series ARMA model;

S304, constructing a risk assessment model from the curve fitting data by using a model construction program.
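As a rough illustration of the curve-fitting step, the Python sketch below fits only a first-order autoregressive model, y_t = c + phi * y_{t-1}, by ordinary least squares. This is a deliberate simplification of ARMA (it has no moving-average terms and a fixed order of 1), chosen so the example stays self-contained; the function names are hypothetical, and a production fit would normally rely on a statistics library.

```python
def fit_ar1(y):
    """Least-squares fit of y_t = c + phi * y_{t-1} (the AR(1) part only)."""
    prev, nxt = y[:-1], y[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_next = sum(nxt) / n
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(prev, nxt))
    var = sum((a - mean_prev) ** 2 for a in prev)
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi

def forecast_ar1(last_value, c, phi, h):
    """Iterate the fitted recursion h steps ahead from the last observation."""
    value = last_value
    for _ in range(h):
        value = c + phi * value
    return value
```

For a stationary series, the fitted phi has absolute value below 1, so repeated forecasts decay toward the long-run mean c / (1 - phi).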

Example 3

The intelligent determination method for information security risk of an industrial control system provided by the embodiment of the invention is shown in fig. 1. As a preferred embodiment, shown in fig. 5, the method by which the model training module takes the preprocessed initial sample data set as a training sample and trains a random forest optimized by a time series algorithm using a model training program, to obtain the final risk assessment model, comprises the following steps:

s401, taking the initial sample data set after preprocessing as a training sample, predicting the training sample data by adopting a time sequence, and predicting future data of a single variable by using historical data of the variable;

s402, constructing a random forest decision tree, and determining attribute test conditions;

and S403, training a random forest risk assessment model by using a regrF _ train function defined in the RF toolkit.

In step S401 provided in the embodiment of the present invention, the following statistics are required to be used in the data-based time sequence analysis:

(1) time interval: t = 1, 2, 3, ..., n;

(2) time-series data: y_1, y_2, y_3, ..., y_n;

(4) prediction error: at time t, e_t = y_t - F_t.
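The prediction error e_t = y_t - F_t can be computed as in the sketch below. The source does not say how F_t is produced in this list, so F_t is taken here to be a forecast for time t, and the moving average used to generate it is purely an illustrative assumption, as are the function names.

```python
def moving_average_forecast(history, window):
    """Forecast F_t as the mean of the previous `window` observations.

    Returns None for the first `window` time steps, where no forecast exists.
    """
    forecasts = []
    for t in range(len(history)):
        if t < window:
            forecasts.append(None)
        else:
            forecasts.append(sum(history[t - window:t]) / window)
    return forecasts


def prediction_errors(actual, forecasts):
    """e_t = y_t - F_t for every time step that has a forecast."""
    return [y - f for y, f in zip(actual, forecasts) if f is not None]


# Hypothetical risk values y_1 .. y_5.
y = [2.0, 4.0, 6.0, 8.0, 10.0]
F = moving_average_forecast(y, window=2)
errors = prediction_errors(y, F)
```

A steadily rising series gives a constant positive error here, because a moving average lags behind a trend.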

In step S402 provided in the embodiment of the present invention, the attribute test condition includes:

If multi-way partitioning is applied, all possible partition points and contiguous intervals are taken into account. The discretization method can be adopted for the continuous attribute, each discretization interval is endowed with a new ordinal value, and if the orderliness of the intervals is kept, adjacent ordinal values can be gathered into a wider interval.

Selecting the best characteristic attribute:

Gini coefficient: the Gini coefficient is a relative index that is widely used in economics and statistics. In the decision tree algorithm, it represents the degree of disorder of the attribute categories in the data set before classification. If X and Y are independent of each other, that is, P(X, Y) = P(X) × P(Y), then:

Log(XY) = Log(X) + Log(Y);
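Selecting a split with the Gini criterion can be sketched as follows. The source names the Gini coefficient but does not give its formula, so this sketch uses the standard Gini impurity of CART-style decision trees, 1 minus the sum of squared class proportions. Taking candidate thresholds at midpoints between distinct values is likewise an assumption.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 - sum of squared class proportions (0 if empty)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(left, right):
    """Impurity of a two-way split, weighted by the size of each side."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


def best_threshold(values, labels):
    """Try midpoints between sorted distinct values; return (threshold, impurity)."""
    candidates = sorted(set(values))
    best = (None, float("inf"))
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        score = weighted_gini(left, right)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical continuous attribute with two risk classes.
split = best_threshold([1, 2, 3, 4], ["low", "low", "high", "high"])
```

An impurity of 0 means each side of the split contains a single class.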

The random forest provided by the embodiment of the invention further comprises:

The core idea of the random forest is random sampling: a fixed number of samples is drawn at random from the preprocessed training set, and every draw is made with replacement. As a result, each sampling set contains the same number of samples but different contents. The specific algorithm flow is as follows:

training the t-th weak classifier using the sampling set D_t;
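Drawing the sampling sets with replacement can be sketched as below. The function names, the seed argument, and the default of drawing as many samples as the training set contains are assumptions; the source only states that a fixed number is drawn with replacement each time.

```python
import random


def bootstrap_sample(dataset, sample_size, rng):
    """Draw `sample_size` items from `dataset` with replacement."""
    n = len(dataset)
    return [dataset[rng.randrange(n)] for _ in range(sample_size)]


def make_sampling_sets(dataset, n_rounds, sample_size=None, seed=0):
    """Build one sampling set per round t; each set would train the t-th weak learner."""
    if sample_size is None:
        sample_size = len(dataset)  # assumed default: same size as the training set
    rng = random.Random(seed)
    return [bootstrap_sample(dataset, sample_size, rng) for _ in range(n_rounds)]


# Hypothetical training set of ten sample identifiers.
training_set = list(range(10))
sampling_sets = make_sampling_sets(training_set, n_rounds=5, seed=42)
```

Because the draws are made with replacement, a sampling set usually repeats some samples and omits others, which is what makes the individual learners differ.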

In step S403 provided by the embodiment of the present invention, the usage format of the regRF_train function defined in the RF toolkit is as follows:

model = regRF_train(X, Y, ntree, mtry, extra_options);

In this expression, two parameters are mainly used and the other parameters can be freely selected. X represents the data matrix: when the function is called, the normalized input training set pn_train is passed as the input data to be trained on. Y represents the target values: the normalized output training set tn_train is passed as the output data set of the current training. ntree, an optional parameter, is the number of trees constructed by the model; it is defined as 20100 in order to optimize the training result, namely 20100 trees. mtry is the number of features considered as split candidates at each node of a tree; this parameter is set to 45 in this model. Finally, the trained model is returned in the variable model.
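The roles of ntree and mtry can be illustrated with a small random forest regressor. This sketch does not call regRF_train and is not the RF toolkit's implementation. It is a pure-Python stand-in, with assumed names train_forest and predict_forest, assumed max_depth and min_leaf limits, and much smaller parameter values than the 20100 and 45 given in the text.

```python
import random
from statistics import mean


def _sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def _subset(X, Y, idx):
    return [X[i] for i in idx], [Y[i] for i in idx]


def _build_tree(X, Y, mtry, rng, depth, max_depth, min_leaf):
    """Grow one regression tree, testing `mtry` random features at each node."""
    if depth >= max_depth or len(Y) < 2 * min_leaf or len(set(Y)) == 1:
        return mean(Y)  # leaf: predict the mean target value
    best = None
    for f in rng.sample(range(len(X[0])), mtry):
        for threshold in sorted({row[f] for row in X})[:-1]:
            left = [i for i, row in enumerate(X) if row[f] <= threshold]
            right = [i for i, row in enumerate(X) if row[f] > threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = _sse([Y[i] for i in left]) + _sse([Y[i] for i in right])
            if best is None or score < best[0]:
                best = (score, f, threshold, left, right)
    if best is None:
        return mean(Y)  # no usable split among the sampled features
    _, f, threshold, left, right = best
    left_X, left_Y = _subset(X, Y, left)
    right_X, right_Y = _subset(X, Y, right)
    return (f, threshold,
            _build_tree(left_X, left_Y, mtry, rng, depth + 1, max_depth, min_leaf),
            _build_tree(right_X, right_Y, mtry, rng, depth + 1, max_depth, min_leaf))


def _predict_tree(node, row):
    while isinstance(node, tuple):  # internal nodes are tuples, leaves are numbers
        f, threshold, left, right = node
        node = left if row[f] <= threshold else right
    return node


def train_forest(X, Y, ntree, mtry, seed=0, max_depth=5, min_leaf=1):
    """Train `ntree` trees, each on a bootstrap sample of (X, Y)."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(ntree):
        idx = [rng.randrange(n) for _ in range(n)]  # sampling with replacement
        boot_X, boot_Y = _subset(X, Y, idx)
        forest.append(_build_tree(boot_X, boot_Y, mtry, rng, 0, max_depth, min_leaf))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return mean(_predict_tree(tree, row) for tree in forest)


# Hypothetical data: the target steps from 1.0 to 9.0 when the first feature reaches 5.
X = [[float(x), 0.0] for x in range(10)]
Y = [1.0 if x < 5 else 9.0 for x in range(10)]
model = train_forest(X, Y, ntree=25, mtry=2)
low = predict_forest(model, [0.0, 0.0])
high = predict_forest(model, [9.0, 0.0])
```

Here mtry must not exceed the number of columns in X. A larger ntree averages over more trees at a proportional cost in training time.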

The schematic diagram of the random forest Bagging principle provided by the embodiment of the invention is shown in fig. 6.

The above embodiments may be implemented wholly or partially in software, hardware, firmware, or any combination thereof. When implemented wholly or partially in software, they may take the form of a computer program product that includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless means (e.g., infrared, wireless, microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.

In the description of the present invention, "a plurality" means two or more unless otherwise specified; the terms "upper", "lower", "left", "right", "inner", "outer", "front", "rear", "head", "tail", and the like, indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, are only for convenience in describing and simplifying the description, and do not indicate or imply that the device or element referred to must have a particular orientation, be constructed in a particular orientation, and be operated, and thus, should not be construed as limiting the invention. Furthermore, the terms "first," "second," "third," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.

The above description only illustrates specific embodiments of the present invention and is not to be construed as limiting its scope; all modifications, equivalents and improvements made within the spirit and principle of the invention are intended to fall within the scope defined by the appended claims.

Claims

1. An intelligent measurement system for information security risk of an industrial control system, characterized in that the system comprises:

a data set acquisition module, a data preprocessing module, a central control module, a risk assessment model construction module, a model training module, a risk prediction module, an information safety assessment module, an early warning module, a data storage module and an update display module;

2. The intelligent industrial control system information security risk determination system of claim 1, wherein in the data set acquisition module, the risk influencing elements comprise: enterprise management layer elements, process control layer elements and field control layer elements; the enterprise management layer elements include: unauthorized access, malicious code, distributed denial of service, virus trojan and forgery attacks; the process control layer elements include: denial of service attacks, DOS attacks, flooding attacks, response spoofing, and direction misleading attacks; the field control layer elements include: physical attacks, information theft, data tampering, denial of service attacks, illegal access, and replay attacks.

3. The intelligent industrial control system information security risk determination system of claim 1, wherein the determining the data collection cost in the data preprocessing module comprises:

4. The intelligent industrial control system information security risk measurement system of claim 1, wherein the model training module requires the following statistics for data-based time series analysis:

(1) time interval: t = 1, 2, 3, ..., n;

(2) time-series data: y_1, y_2, y_3, ..., y_n;

(4) prediction error: at time t, e_t = y_t - F_t.

5. The intelligent industrial control system information security risk measurement system of claim 1, wherein the attribute test conditions in the model training module comprise:

6. The intelligent industrial control system information security risk measurement system of claim 5, wherein if multi-path division is applied, all possible division points and continuous intervals are fully considered; the discretization method can be adopted for the continuous attribute, each discretization interval is endowed with a new ordinal value, if the ordering of the intervals is kept, adjacent ordinal values can be gathered into a wider interval, and the optimal characteristic attribute is selected, wherein:

Log(XY) = Log(X) + Log(Y);

7. The intelligent industrial control system information security risk measurement system of claim 1, wherein the random forest in the model training module further comprises:

training the t-th weak classifier using the sampling set D_t;

8. The intelligent industrial control system information security risk measurement system of claim 1, wherein in the model training module, the format of the regRF_train function defined in the RF toolkit is as follows:

model = regRF_train(X, Y, ntree, mtry, extra_options);

9. A computer program product stored on a computer readable medium, comprising a computer readable program for providing a user input interface for applying the intelligent industrial control system information security risk determination system of any one of claims 1 to 8 when executed on an electronic device.

10. A computer readable storage medium storing instructions which, when executed on a computer, cause the computer to apply the intelligent industrial control system information security risk determination system according to any one of claims 1 to 8.