CN113159181B - Industrial control system anomaly detection method and system based on improved deep forest - Google Patents


Info

Publication number
CN113159181B
CN113159181B (application CN202110438900.4A)
Authority
CN
China
Prior art keywords
feature
feature vector
vector
class
characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110438900.4A
Other languages
Chinese (zh)
Other versions
CN113159181A (en)
Inventor
李肯立
陈伟杰
余思洋
肖国庆
段明星
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University filed Critical Hunan University
Priority to CN202110438900.4A priority Critical patent/CN113159181B/en
Publication of CN113159181A publication Critical patent/CN113159181A/en
Application granted granted Critical
Publication of CN113159181B publication Critical patent/CN113159181B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425Traffic logging, e.g. anomaly detection

Abstract

The invention discloses an anomaly detection method for industrial control networks based on a deep forest with an improved annular multi-granularity scanning structure, which specifically comprises the following steps: normalizing the constructed training-set samples with the Z-score method and mapping the feature data to the interval [-1, 1]; reducing the dimensionality of the sample feature set by principal component analysis to generate a new feature vector set whose features are mutually uncorrelated; passing the dimension-reduced feature vector set through the annular multi-granularity scanning structure to generate the feature sub-vectors of each sample; inputting the feature sub-vector sets into a semi-random forest and a completely random forest respectively to generate the corresponding class feature vectors, and combining them into a new class feature vector used as the feature input of the cascade forest; and inputting the generated class feature vector set into a diversified cascade forest structure, iterating until convergence, and generating the final class feature vector. The method addresses the low detection rate, weak generalization, and related shortcomings of existing methods for detecting abnormal behavior in industrial control system networks.

Description

Industrial control system anomaly detection method and system based on improved deep forest
Technical Field
The invention belongs to the technical field of information security, and particularly relates to an industrial control system anomaly detection method and system based on an improved deep forest.
Background
Industrial control networks typically communicate using proprietary protocols, which were often designed only around functional requirements, leaving security insufficiently robust. With the development of network technology, industrial control networks have become increasingly connected to the internet, and with this the probability of external attack or intrusion has grown. As more and more industrial control networks are connected to public networks such as the internet, their security problems become more and more apparent.
The traditional measure for protecting an industrial control network against network attacks is mainly to analyze and identify abnormal communication behavior of general internet protocols and to match it against preset rules and characteristic values, thereby realizing simple security filtering.
However, this approach to abnormal-behavior detection has non-negligible drawbacks: because it relies mainly on feature matching to identify dangerous network behavior, its recognition rate for industrial control network anomalies is low and its generalization is weak.
Disclosure of Invention
In view of the above defects or improvement requirements of the prior art, the invention provides an industrial control system anomaly detection method based on a deep forest with improved annular multi-granularity scanning, aiming to solve the technical problems of low recognition rate and weak generalization in existing industrial control network anomaly detection methods.
To achieve the above object, according to one aspect of the present invention, there is provided an improved deep forest based industrial control system anomaly detection method, comprising the steps of:
(1) acquiring network data from an industrial control system to be detected, and preprocessing the network data to obtain a sample set;
(2) and (3) inputting the sample set obtained in the step (1) into a pre-trained anomaly detection model to obtain an anomaly detection result.
Preferably, step (1) specifically comprises: performing unified numerical conversion on the data types of the acquired network data, with the labels of normal data and abnormal data represented by 0 and 1 respectively; then normalizing the numerically converted network data with the Z-score method; and finally treating each feature of each sample in the normalized network data as one data dimension, so that each sample is converted into a feature vector, the feature vectors of all samples forming the sample set.
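As an illustrative aside (not part of the patent), the Z-score normalization of the preprocessing step above can be sketched in Python; the function name and toy data are assumptions:

```python
import numpy as np

def zscore_normalize(X):
    """Z-score each feature column: (x - mean) / std, so all features
    share a unified dimension. Zero-variance columns are guarded to
    avoid division by zero."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0
    return (X - mu) / sigma

# Toy example: 4 samples x 3 numerically converted features,
# labels 0 = normal data, 1 = abnormal data
X = np.array([[1.0, 200.0, 3.0],
              [2.0, 220.0, 1.0],
              [3.0, 210.0, 2.0],
              [4.0, 230.0, 4.0]])
y = np.array([0, 0, 1, 0])
Xn = zscore_normalize(X)
```

After this step every feature column has zero mean and unit standard deviation, which is the dimensional consistency the patent requires before PCA.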
Preferably, step (2) specifically comprises: performing feature extraction on the sample set obtained in step (1) with the PCA method to obtain a new dimension-reduced feature vector set; processing each feature vector in this set with the annular multi-granularity scanning structure of the trained anomaly detection model to obtain the feature sub-vector set corresponding to that feature vector, all feature sub-vector sets forming a large set; inputting each feature sub-vector of each feature sub-vector set in the large set into a fully random forest classifier and a semi-random forest classifier respectively to obtain class feature vectors, the initial class feature vectors corresponding to all feature sub-vector sets forming a class feature vector set; then inputting the class feature vectors into the cascade forest model of the trained anomaly detection model; and finally inputting the final class feature vector into the last-layer integrated classification model to obtain a plurality of classification results and taking the average of all classification results. If the average is greater than 0.5, the industrial control system to be detected is abnormal; otherwise it is normal.
Preferably, the anomaly detection model is trained by the following steps:
(1) Obtain network data and construct a data set X from it, where X ∈ R^(n×m), R denotes the set of real numbers, n the total number of samples in the data set, and m the number of features per sample:

X = [x_1 x_2 … x_m]

where x_i = [x_1i x_2i … x_ni]^T (i = 1, 2, …, m) denotes the feature set composed of the i-th dimension feature of each sample in the data set.
(2) Normalize the data set obtained in step (1) with the Z-score normalization method, and split the normalized data set at a ratio of 5:1 into a training set X_train ∈ R^(n×m) and a test set X_test ∈ R^(n×m).
(3) Perform feature extraction on the training set X_train obtained in step (2) using the PCA method to obtain a new dimension-reduced feature vector set.
(4) Process each feature vector in the dimension-reduced feature vector set from step (3) with the annular multi-granularity scanning structure of the anomaly detection model to obtain the feature sub-vector set corresponding to that feature vector; all n feature sub-vector sets (which serve to enhance the representation learning of the subsequent diversified cascade forest structure) form a large set H.
(5) Input each feature sub-vector of every set H_d (d ∈ {1, 2, …, n}) in the large set H obtained in step (4) into a fully random forest classifier and a semi-random forest classifier respectively, each classifier producing a c-dimensional class feature vector U = [u_1, u_2, …, u_c]. For each feature sub-vector set H_d, the 2k c-dimensional class feature vectors produced for its k feature sub-vectors are combined into an initial class feature vector b_d of dimension 1 × 2kc. The n initial class feature vectors corresponding to all feature sub-vector sets in H form the initial class feature vector set B = [b_1 b_2 … b_n] of dimension n × 2kc, where c denotes the number of classification categories;
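The construction of one initial class feature vector b_d can be sketched as follows. As an assumption (the patent does not prescribe a library), scikit-learn's RandomForestClassifier and ExtraTreesClassifier stand in for the semi-random and completely random forests, and the training data are a toy illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier

def class_feature_vector(sub_vectors, rf, et):
    """Concatenate both forests' c-dimensional class-probability vectors
    for each of the k feature sub-vectors into one 1 x 2kc initial class
    feature vector b_d."""
    parts = []
    for F in sub_vectors:                      # k sub-vectors of length t
        F = F.reshape(1, -1)
        parts.append(rf.predict_proba(F)[0])   # c dims, semi-random forest
        parts.append(et.predict_proba(F)[0])   # c dims, fully random forest
    return np.concatenate(parts)               # length 2 * k * c

# Toy training data: sub-vectors of length t = 3, c = 2 classes
rng = np.random.default_rng(0)
Xs = rng.normal(size=(40, 3))
ys = (Xs.sum(axis=1) > 0).astype(int)
rf = RandomForestClassifier(n_estimators=10, random_state=0).fit(Xs, ys)
et = ExtraTreesClassifier(n_estimators=10, random_state=0).fit(Xs, ys)

H_d = rng.normal(size=(5, 3))                  # k = 5 sub-vectors of one sample
b_d = class_feature_vector(H_d, rf, et)        # dimension 2kc = 2*5*2 = 20
```

Each group of c entries in b_d is one forest's class distribution for one sub-vector, so the vector carries 2k class estimates per sample.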
(6) Input the initial class feature vector set generated in step (5) into the diversified cascade forest structure for iterative training until the structure converges, thereby obtaining the trained anomaly detection model.
Preferably, step (3) specifically comprises the following. First, the m feature sets of the training set X_train are linearly transformed with the PCA method:

y_i = a_1i·x_1 + a_2i·x_2 + … + a_mi·x_m, i = 1, 2, …, m

where Y = [y_1 y_2 … y_m] is the transformed feature set, and the sample covariance matrix A equals:

A = (1/n) · (X − X̄)^T (X − X̄)

where x_i is the feature set composed of the i-th dimension feature of each sample in step (1), X̄ is the n × m matrix each of whose rows equals x̄, and x̄ denotes the average of the feature vectors of all samples in the data set;
Then the eigenvector set α and eigenvalue set λ are obtained from the eigen-equation Aα = λα of the sample covariance matrix, where α = [α_1, α_2, …, α_m] and λ = [λ_1, λ_2, …, λ_m], with λ_i denoting the eigenvalue corresponding to the i-th principal component;
Then, from the eigenvalue λ_i corresponding to the i-th principal component, the variance contribution rate p_i of the i-th principal component is computed:

p_i = λ_i / Σ_{j=1}^{m} λ_j

where the denominator sums the eigenvalues of all m principal components.
Then, setting k = 1, the cumulative variance contribution rate of the first k principal components is obtained from the contribution rates as

P_k = Σ_{i=1}^{k} p_i

Next, k is set to 2 and P_2 is computed with the same formula, and the growth P_2 − P_1 over the cumulative contribution rate at k = 1 is examined. If this growth is less than 1%, the value of k is fixed at 2; otherwise k is set to 3, P_3 is computed, and the growth P_3 − P_2 is examined; if it is less than 1%, k is fixed at 3, otherwise k is set to 4, and so on;
Then the principal components are sorted in descending order of variance contribution rate, the eigenvalues corresponding to the variance contribution rates of the first k principal components are selected from the sorted result, and the set of their indices is recorded as index = {index_1, index_2, …, index_k}. According to this index set, the corresponding columns are selected from the new feature set Y, giving the dimension-reduced feature vector set of all samples:

Q = [z_1 z_2 … z_n]^T

where z_d (d = 1, 2, …, n) denotes the dimension-reduced feature vector of the d-th sample, which contains k features, i.e. z_d = {f_d1 f_d2 … f_dk}, of dimension 1 × k.
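The PCA step with the 1% growth rule for choosing k can be sketched as follows. Note that the growth of the cumulative contribution rate from k to k+1 equals p_{k+1}, so the stopping rule reduces to checking the next contribution rate; function names and toy data are assumptions:

```python
import numpy as np

def pca_reduce(X, growth_tol=0.01):
    """PCA via eigen-decomposition of the sample covariance matrix.
    k grows while adding one more principal component still raises the
    cumulative variance contribution rate by at least growth_tol (1%)."""
    Xc = X - X.mean(axis=0)
    A = np.cov(Xc, rowvar=False)            # m x m sample covariance matrix
    lam, alpha = np.linalg.eigh(A)          # eigenvalues in ascending order
    order = np.argsort(lam)[::-1]           # sort descending by variance
    lam, alpha = lam[order], alpha[:, order]
    p = lam / lam.sum()                     # variance contribution rates p_i
    k = 1
    while k < len(p) and p[k] >= growth_tol:   # growth from k to k+1 is p[k]
        k += 1
    return Xc @ alpha[:, :k], k             # n x k reduced feature vectors

# Toy data: 6 features, but only 2 independent directions of variance
rng = np.random.default_rng(1)
base = rng.normal(size=(100, 2))
X = np.hstack([base, base @ rng.normal(size=(2, 4)) * 0.01])
Q, k = pca_reduce(X)                        # expect k == 2 here
```

Because the extra four columns are tiny linear combinations of the first two, nearly all variance sits in the first two components and the rule fixes k at 2.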
Preferably, step (4) specifically comprises the following. First, the first feature vector z_1 = {f_11 f_12 … f_1k} of the dimension-reduced feature vector set from step (3) undergoes annular multi-granularity scanning: feature f_11 is connected to feature f_1k, so that the feature vector z_1 forms a ring joined end to end. A sliding window of length t then starts at feature f_11 and slides one unit to the right at a time, yielding one feature sub-vector per position: with starting feature f_11 the sub-vector is F_1 = {f_11, f_12, …, f_1t}; after moving the window one unit the starting feature is f_12 and the sub-vector is F_2 = {f_12, f_13, …, f_1(t+1)}; and so on, until the window's starting feature is f_1k and the sub-vector is F_k = {f_1k, f_11, …, f_1(t-1)}. All k feature sub-vectors form the feature sub-vector set H_1 = {F_1 F_2 … F_k}.
Then the second feature vector z_2 = {f_21 f_22 … f_2k} of the dimension-reduced set from step (3) undergoes the same annular multi-granularity scanning: feature f_21 is connected to feature f_2k so that z_2 forms a ring joined end to end; the sliding window of length t starts at f_21 and slides one unit at a time, giving F_1 = {f_21, f_22, …, f_2t}, then F_2 = {f_22, f_23, …, f_2(t+1)}, and so on, until the starting feature is f_2k and the sub-vector is F_k = {f_2k, f_21, …, f_2(t-1)}; all k feature sub-vectors form the feature sub-vector set H_2 = {F_1 F_2 … F_k};
… and so on;
Subsequently, the n-th feature vector z_n = {f_n1 f_n2 … f_nk} of the dimension-reduced set from step (3) undergoes the same annular multi-granularity scanning: feature f_n1 is connected to feature f_nk so that z_n forms a ring joined end to end; the sliding window of length t starts at f_n1 and slides one unit at a time, giving F_1 = {f_n1, f_n2, …, f_nt}, then F_2 = {f_n2, f_n3, …, f_n(t+1)}, and so on, until the starting feature is f_nk and the sub-vector is F_k = {f_nk, f_n1, …, f_n(t-1)}; all k feature sub-vectors form the feature sub-vector set H_n = {F_1 F_2 … F_k}.
Finally, the large set composed of the feature sub-vector sets of all n sample feature vectors is recorded as H = {H_1, H_2, …, H_n}.
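A minimal sketch of the annular (circular) multi-granularity scan described above, for one 1 × k feature vector and window length t; the function name is illustrative:

```python
import numpy as np

def circular_scan(z, t):
    """Annular multi-granularity scan: treat the 1 x k feature vector z as
    a ring joined end to end and slide a window of length t one unit at a
    time, yielding k sub-vectors F_1 ... F_k (the window wraps past the
    last feature back to the first)."""
    k = len(z)
    return np.array([[z[(start + j) % k] for j in range(t)]
                     for start in range(k)])

z = np.array([10, 20, 30, 40, 50])   # k = 5 features of one sample
H = circular_scan(z, t=3)            # k = 5 sub-vectors of length t = 3
# H[0] = [10 20 30], ..., H[4] = [50 10 20] (wraps around the ring)
```

Unlike the plain multi-granularity scan of standard deep forest, the wrap-around yields exactly k sub-vectors per sample, so no window position is lost at the end of the vector.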
Preferably, step (6) specifically comprises the following. First, the first layer of the diversified cascade forest structure is processed as follows:
First, the first initial class feature vector b_1 of the initial class feature vector set B is input into the five classifiers of the diversified cascade forest structure (semi-random forest, completely random forest, XGBoost, GBDT and CatBoost), obtaining five c-dimensional class feature vectors o_1, o_2, o_3, o_4, o_5; these are combined with the initial class feature vector b_1 to generate the (5c + 2kc)-dimensional class feature vector e_11 = {o_1 o_2 o_3 o_4 o_5 b_1};
Then, the second initial class feature vector b_2 of the set B is input into the same five classifiers, obtaining five c-dimensional class feature vectors o_1, o_2, o_3, o_4, o_5, which are combined with b_2 to generate the (5c + 2kc)-dimensional class feature vector e_12 = {o_1 o_2 o_3 o_4 o_5 b_2}; and so on;
Finally, the (5c + 2kc)-dimensional class feature vectors corresponding to all n initial class feature vectors in B form the first layer's class feature vector set E_1 = {e_11 e_12 … e_1n}.
Then, the second layer of the diversified cascade forest structure is processed as follows:
First, the first class feature vector e_11 of the first layer's output set E_1 = {e_11 e_12 … e_1n} is input into the five classifiers of the diversified cascade forest structure (semi-random forest, completely random forest, XGBoost, GBDT and CatBoost), obtaining five c-dimensional class feature vectors o_1, o_2, o_3, o_4, o_5, which are combined with the initial class feature vector b_1 to generate the (5c + 2kc)-dimensional class feature vector e_21 = {o_1 o_2 o_3 o_4 o_5 b_1}.
Next, the second class feature vector e_12 of the set E_1 is input into the same five classifiers, obtaining five c-dimensional class feature vectors o_1, o_2, o_3, o_4, o_5, which are combined with the initial class feature vector b_2 to generate the (5c + 2kc)-dimensional class feature vector e_22 = {o_1 o_2 o_3 o_4 o_5 b_2}; and so on.
Finally, the (5c + 2kc)-dimensional class feature vectors corresponding to all n initial class feature vectors in B form the second layer's class feature vector set E_2 = {e_21 e_22 … e_2n}.
Subsequent layers of the diversified cascade forest structure are then processed in the same manner as the first and second layers, until the structure converges and model training is complete.
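One layer of the diversified cascade can be sketched as follows. For availability, scikit-learn ensembles stand in for the five classifiers (GradientBoostingClassifier for GBDT; AdaBoost and Bagging replace XGBoost and CatBoost, which are separate libraries), and fitting and transforming on the same toy data is only to illustrate the shapes (real cascade-forest training would use held-out estimates):

```python
import numpy as np
from sklearn.ensemble import (RandomForestClassifier, ExtraTreesClassifier,
                              GradientBoostingClassifier, AdaBoostClassifier,
                              BaggingClassifier)

def cascade_layer(B, y, classifiers):
    """One layer of the diversified cascade: each classifier emits a
    c-dimensional class vector per sample; all five are concatenated with
    the layer input, giving (5c + input_dim)-dimensional vectors for the
    next layer."""
    outs = [clf.fit(B, y).predict_proba(B) for clf in classifiers]  # n x c each
    return np.hstack(outs + [B])

rng = np.random.default_rng(2)
B = rng.normal(size=(60, 8))            # n = 60 initial class feature vectors
y = (B[:, 0] > 0).astype(int)           # c = 2 classes
clfs = [RandomForestClassifier(n_estimators=5, random_state=0),
        ExtraTreesClassifier(n_estimators=5, random_state=0),
        GradientBoostingClassifier(n_estimators=5, random_state=0),
        AdaBoostClassifier(n_estimators=5, random_state=0),
        BaggingClassifier(n_estimators=5, random_state=0)]
E1 = cascade_layer(B, y, clfs)          # shape (60, 5c + 8) = (60, 18)
```

Carrying the input vector b_d forward alongside the five class vectors is what lets each deeper layer refine, rather than overwrite, the original class evidence.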
According to another aspect of the invention, there is provided an improved deep forest based industrial control system anomaly detection system comprising:
a first module, configured to acquire network data from the industrial control system to be detected and preprocess the network data to obtain a sample set;
and the second module is used for inputting the sample set obtained by the first module into a pre-trained anomaly detection model so as to obtain an anomaly detection result.
In general, compared with the prior art, the above technical solution conceived by the present invention achieves the following beneficial effects:
(1) Because the invention adopts step (4), the feature vectors of the samples are fully scanned by the annular multi-granularity scanning structure, yielding rich feature sub-vectors and enhancing the representation learning capacity of the subsequent diversified cascade forest structure; this improves the anomaly detection recognition rate and thus addresses the low recognition rate of existing industrial control network anomaly detection;
(2) Because the invention adopts step (5), increasing the diversity of the classifiers in the diversified cascade forest structure lets the classifiers complement one another, which addresses the weak generalization of existing industrial control network anomaly detection;
(3) Step (5) adopts the idea of ensemble learning and integrates multiple classifiers, making distributed parallel operation feasible and accelerating the detection efficiency of the anomaly detection model.
Drawings
FIG. 1 is a flow chart of an improved deep forest based industrial control system anomaly detection method of the present invention;
FIG. 2 is a schematic diagram of an anomaly detection model used in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Specifically, in view of the network particularity of industrial control systems, the invention constructs an industrial control system network anomaly detection model using deep learning ideas: on the one hand, the sample data are preprocessed and the sample features are extracted; on the other hand, an annular multi-granularity scanning structure fully scans the feature vectors to construct feature subsets, and a diversified cascade forest structure performs the anomaly detection. The detection effect is enhanced from these two angles.
As shown in FIG. 1, the invention provides an industrial control system anomaly detection method based on improved deep forest, which comprises the following steps:
(1) acquiring network data from an industrial control system to be detected, and preprocessing the network data to obtain a sample set;
specifically, according to different data acquisition methods, the data are represented in different modes, namely normal data or abnormal data, the data types of the acquired network data are subjected to unified numerical conversion, labels of the normal data and the abnormal data are represented by 0 and 1 respectively, then the network data subjected to the numerical conversion are subjected to normalization processing by adopting a Z-score method, the unified dimension among the features is ensured, finally, one feature of each sample in the normalized network data is used as a data dimension, the sample is converted into a feature vector, and the feature vectors corresponding to all the samples form a sample set.
The advantage of step (1) is that it ensures dimensional consistency among the features and avoids degrading the subsequent anomaly detection performance.
(2) And (2) inputting the sample set obtained in the step (1) into a pre-trained anomaly detection model to obtain an anomaly detection result.
Specifically, the sample set is processed in sequence by the following steps (3), (4) and (5) to obtain class feature vectors; these are input into the cascade forest model trained in step (6) to obtain the final class feature vector; the final class feature vector is input into the last-layer integrated classification model to obtain a plurality of classification results, and the mean of all classification results is computed. If the mean is greater than 0.5, the industrial control system to be detected is abnormal; otherwise it is normal.
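The final decision rule (mean of the last-layer outputs thresholded at 0.5) can be sketched in a few lines; the values below are illustrative:

```python
import numpy as np

def decide(class_outputs, threshold=0.5):
    """Average the last-layer classifiers' outputs for the abnormal class;
    above the threshold the sample is flagged abnormal (1), else normal (0)."""
    return int(np.mean(class_outputs) > threshold)

# Five last-layer classifiers each output a probability for class 1 (abnormal)
assert decide([0.9, 0.8, 0.7, 0.6, 0.4]) == 1   # mean 0.68 > 0.5 -> abnormal
assert decide([0.1, 0.2, 0.3, 0.2, 0.1]) == 0   # mean 0.18 -> normal
```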
As shown in fig. 2, the anomaly detection model of the present invention includes a circular multi-granularity scanning structure and a diversified cascade forest structure connected to each other.
The annular multi-granularity scanning structure takes as input a 1 × m-dimensional feature vector (m being the total number of features of each sample) and scans it with a sliding window of length 1 × t (t a natural number, generally adjusted adaptively to the specific feature vectors), generating the corresponding m feature sub-vectors;
The diversified cascade forest structure takes as input the m feature sub-vectors generated by the first part; these are input into the first-layer ensemble learning module of the structure to generate the corresponding class vectors, which are linearly combined with the m feature sub-vectors as the input of the second-layer ensemble learning module, and so on;
specifically, the anomaly detection model in this step is obtained by training through the following steps:
(1) and acquiring network data and constructing a data set according to the network data.
In this step a data set X ∈ R^(n×m) is constructed, where R denotes the set of real numbers and n the total number of samples in the data set:

X = [x_1 x_2 … x_m]

where x_i = [x_1i x_2i … x_ni]^T (i = 1, 2, …, m) denotes the feature set composed of the i-th dimension feature of each sample in the data set.
(2) Normalize the data set obtained in step (1) with the Z-score normalization method, and split the normalized data set at a ratio of 5:1 into a training set X_train ∈ R^(n×m) and a test set X_test ∈ R^(n×m).
(3) Perform feature extraction on the training set X_train obtained in step (2) using the principal component analysis (PCA) method to obtain a new dimension-reduced feature vector set.
Specifically, the method first linearly transforms the m feature sets of the training set X_train with the PCA method:

y_i = a_1i·x_1 + a_2i·x_2 + … + a_mi·x_m, i = 1, 2, …, m

where Y = [y_1 y_2 … y_m] is the transformed feature set, and the sample covariance matrix is recorded as A:

A = (1/n) · (X − X̄)^T (X − X̄)

where x_i is the feature set composed of the i-th dimension feature of each sample in step (1), X̄ is the n × m matrix each of whose rows equals x̄, and x̄ denotes the average of the feature vectors of all samples in the data set.
The transformed features satisfy:
1) y_i and y_j (i ≠ j) are mutually independent;
2) the variance of y_1 is greater than the variance of y_2, and so on;
3) the total variance is preserved, i.e. Σ_{i=1}^{m} Var(y_i) = Σ_{i=1}^{m} Var(x_i).
Then the eigenvector set α and eigenvalue set λ are obtained from the eigen-equation Aα = λα of the sample covariance matrix, where α = [α_1, α_2, …, α_m] and λ = [λ_1, λ_2, …, λ_m], with λ_i denoting the eigenvalue corresponding to the i-th principal component.
Then, from the eigenvalue λ_i corresponding to the i-th principal component, the variance contribution rate p_i of the i-th principal component is computed:

p_i = λ_i / Σ_{j=1}^{m} λ_j

where the denominator sums the eigenvalues of all m principal components.
Subsequently, setting k = 1, the cumulative variance contribution rate of the first k principal components is obtained from the contribution rates as

P_k = Σ_{i=1}^{k} p_i

Next, k is set to 2 and P_2 is computed with the same formula, and the growth P_2 − P_1 over the cumulative contribution rate at k = 1 is examined. If this growth is less than 1%, the value of k is fixed at 2; otherwise k is set to 3, P_3 is computed, and the growth P_3 − P_2 is examined; if it is less than 1%, k is fixed at 3, otherwise k is set to 4, and so on;
Then the principal components are sorted in descending order of variance contribution rate, the eigenvalues corresponding to the variance contribution rates of the first k principal components are selected from the sorted result, and the set of their indices (which need not be contiguous) is recorded as index = {index_1, index_2, …, index_k}. According to this index set, the corresponding columns are selected from the new feature set Y, giving the dimension-reduced feature vector set of all samples:

Q = [z_1 z_2 … z_n]^T

(the matrix Q has dimension n × k, with each row representing one sample's dimension-reduced feature vector and each column one feature), where z_d (d = 1, 2, …, n) denotes the dimension-reduced feature vector of the d-th sample, containing k features, i.e. z_d = {f_d1 f_d2 … f_dk}, of dimension 1 × k.
(4) Process each feature vector in the dimension-reduced feature vector set from step (3) with the annular multi-granularity scanning structure of the anomaly detection model to obtain the feature sub-vector set corresponding to that feature vector; all n feature sub-vector sets (which serve to enhance the representation learning of the subsequent diversified cascade forest structure) form a large set H.
Specifically, the annular multi-granularity scanning method uses a sliding window of length t (2 ≤ t ≤ k, preferably t = 3) to scan each sample feature vector zd (d = 1, 2, …, n) obtained in step (3) one by one.
Specifically, first, the first feature vector z1 = {f11 f12 … f1k} in the dimension-reduced new feature vector set of step (3) is subjected to annular multi-granularity scanning: the feature f11 and the feature f1k are connected so that z1 forms a ring joined end to end. A sliding window of length t then slides from left to right one unit at a time, starting from the feature f11, one feature sub-vector being obtained per step (when the starting feature is f11 the obtained sub-vector is recorded as F1 = {f11, f12, …, f1t}); the window is then moved one unit to the right so that the starting feature is f12, giving F2 = {f12, f13, …, f1(t+1)}; and so on, until the window is moved so that the starting feature is f1k, giving Fk = {f1k, f11, …, f1(t-1)}. All k feature sub-vectors form the feature sub-vector set H1 = {F1 F2 … Fk}.
Then, the second feature vector z2 = {f21 f22 … f2k} in the dimension-reduced new feature vector set of step (3) is subjected to annular multi-granularity scanning: the feature f21 and the feature f2k are connected so that z2 forms a ring joined end to end; a sliding window of length t slides from left to right one unit at a time, starting from the feature f21, one feature sub-vector being obtained per step (when the starting feature is f21 the obtained sub-vector is recorded as F1 = {f21, f22, …, f2t}); the window is then moved one unit to the right so that the starting feature is f22, giving F2 = {f22, f23, …, f2(t+1)}; and so on, until the starting feature is f2k, giving Fk = {f2k, f21, …, f2(t-1)}; all k feature sub-vectors form the feature sub-vector set H2 = {F1 F2 … Fk};
and so on, until the n-th feature vector zn = {fn1 fn2 … fnk} in the dimension-reduced new feature vector set of step (3) is subjected to annular multi-granularity scanning: the feature fn1 and the feature fnk are connected so that zn forms a ring joined end to end; a sliding window of length t slides from left to right one unit at a time, starting from the feature fn1 (when the starting feature is fn1 the obtained sub-vector is recorded as F1 = {fn1, fn2, …, fnt}); the window is then moved one unit to the right so that the starting feature is fn2, giving F2 = {fn2, fn3, …, fn(t+1)}; and so on, until the starting feature is fnk, giving Fk = {fnk, fn1, …, fn(t-1)}; all k feature sub-vectors form the feature sub-vector set Hn = {F1 F2 … Fk}.
Finally, the large set composed of the feature sub-vector sets corresponding to all n sample feature vectors is recorded as H = {H1, H2, …, Hn}.
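The annular (circular) scan described above can be sketched in a few lines: the vector is treated as a ring, so a vector of k features always yields exactly k sub-vectors of length t, including the wrap-around windows. The function name `ring_scan` is my own shorthand for the patent's annular multi-granularity scanning.

```python
def ring_scan(z, t=3):
    """Annular multi-granularity scan: join the last feature of z back to
    the first so z forms a ring, then slide a window of length t one step
    at a time; a vector of k features yields exactly k sub-vectors."""
    k = len(z)
    # modular indexing implements the end-to-end connection of the ring
    return [[z[(start + j) % k] for j in range(t)] for start in range(k)]
```

For example, `ring_scan([f1, f2, f3, f4], 2)` produces the four sub-vectors `[f1, f2]`, `[f2, f3]`, `[f3, f4]`, `[f4, f1]`; an ordinary (non-annular) scan would yield only three.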
(5) Each feature sub-vector in each feature sub-vector set Hd (d ∈ {1, 2, …, n}) of the large set H obtained in step (4) is input into a fully random forest classifier and a semi-random forest classifier respectively, each of which outputs a c-dimensional class vector U = {u1, u2, …, uc} (wherein c denotes the number of classification categories; anomaly detection is a binary classification problem, so c = 2), so that each sub-vector yields a 2c-dimensional class feature vector. The k 2c-dimensional class feature vectors corresponding to the k feature sub-vectors of each set Hd are concatenated into an initial class feature vector bd of dimension 1 × 2kc, and the n initial class feature vectors corresponding to all feature sub-vector sets in H form the initial class feature vector set B = {b1 b2 … bn} of dimension n × 2kc.
The advantage of step (3) is that redundant and irrelevant features are removed, which improves the efficiency and accuracy of subsequent anomaly detection.

The advantage of step (4) is that the multi-granularity scanning structure enhances the representation learning ability of the diversified cascade forest structure, thereby improving the recognition rate of anomaly detection.

The advantage of step (5) is that the diversified cascade forest structure introduces several different classifiers, which strengthens the generalization of the model.
(6) Inputting the initial class feature vector generated in the step (5) into a diversified cascade forest structure for iterative training until the diversified cascade forest structure is converged, thereby obtaining a trained anomaly detection model.
Specifically, in step (6), the initial class feature vector set B generated in step (5) is used as the training set and input into the diversified cascade forest structure for model training.

First, the first layer of the diversified cascade forest structure is processed as follows:
First, the first initial class feature vector b1 in the initial class feature vector set B is input into the five classifiers of the diversified cascade forest structure, namely the semi-random forest, the completely random forest, XGBoost, GBDT and CatBoost, to obtain five c-dimensional class feature vectors o1, o2, o3, o4, o5; these class feature vectors are combined with the initial class feature vector b1 to generate the (5c + 2kc)-dimensional class feature vector e11 = {o1 o2 o3 o4 o5 b1};
Then, the second initial class feature vector b2 in the initial class feature vector set B is input into the same five classifiers to obtain five c-dimensional class feature vectors o1, o2, o3, o4, o5, which are combined with b2 to generate the (5c + 2kc)-dimensional class feature vector e12 = {o1 o2 o3 o4 o5 b2}; and so on.
Finally, the (5c + 2kc)-dimensional class feature vectors corresponding to all n initial class feature vectors in the initial class feature vector set B form the class feature vector set E1 = {e11 e12 … e1n} of the first layer.
Then, the second layer of the diversified cascade forest structure is processed as follows:
First, the first class feature vector e11 in the class feature vector set E1 = {e11 e12 … e1n} generated by the first layer is input into the five classifiers of the diversified cascade forest structure, namely the semi-random forest, the completely random forest, XGBoost, GBDT and CatBoost, to obtain five c-dimensional class feature vectors o1, o2, o3, o4, o5, which are combined with the initial class feature vector b1 to generate the (5c + 2kc)-dimensional class feature vector e21 = {o1 o2 o3 o4 o5 b1}.
Then, the second class feature vector e12 in the class feature vector set E1 = {e11 e12 … e1n} generated by the first layer is input into the same five classifiers to obtain five c-dimensional class feature vectors o1, o2, o3, o4, o5, which are combined with the initial class feature vector b2 to generate the (5c + 2kc)-dimensional class feature vector e22 = {o1 o2 o3 o4 o5 b2}; and so on.
Finally, the (5c + 2kc)-dimensional class feature vectors corresponding to all n initial class feature vectors in the initial class feature vector set B form the class feature vector set E2 = {e21 e22 … e2n} of the second layer.
And then, processing subsequent layers of the diversified cascading forest structure in the same mode as the first layer and the second layer until the diversified cascading forest structure is converged, and finishing model training.
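One layer of the cascade described above can be sketched generically: every classifier maps a sample's vector from the previous layer to a c-dimensional class vector, and the five outputs are concatenated with the sample's original initial class feature vector bd. The stub classifier and the function name `cascade_layer` are assumptions of this sketch; in practice the five classifiers would be the semi-random forest, completely random forest, XGBoost, GBDT and CatBoost named above, each exposing a `predict_proba`-style interface.

```python
import numpy as np

def cascade_layer(B, E_prev, classifiers):
    """One layer of the diversified cascade forest: each of the (five)
    classifiers yields a c-dimensional class vector per sample; these are
    concatenated with the sample's ORIGINAL initial class feature vector
    b_d, giving a (5c + 2kc)-dimensional vector per sample."""
    out = []
    for b, e in zip(np.asarray(B), np.asarray(E_prev)):
        probs = [clf.predict_proba(e.reshape(1, -1))[0] for clf in classifiers]
        out.append(np.concatenate(probs + [b]))
    return np.asarray(out)
```

The first layer is obtained as `cascade_layer(B, B, classifiers)`, and each subsequent layer re-uses B while consuming the previous layer's output, which is exactly the e2j = {o1 o2 o3 o4 o5 bj} pattern described above.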
Experimental results
In order to illustrate the effectiveness and detection performance of the invention in the field of industrial control system network anomaly detection, verification experiments were performed on several data sets. Using the natural gas pipeline test data of the Critical Infrastructure Protection Center of Mississippi State University, the results obtained by the invention were compared with currently common methods; the evaluation results are shown in Table 1:
TABLE 1
[Table 1, reproduced as an image in the original publication, compares the accuracy, missed-report rate and false-report rate of the proposed method with three commonly used classification algorithms.]
As shown in Table 1, in the experimental comparison on the natural gas pipeline test data of the Critical Infrastructure Protection Center of Mississippi State University, the proposed classification algorithm outperforms the other three common classification algorithms in accuracy, missed-report rate and false-report rate. In the deep forest anomaly detection model based on the annular multi-granularity scanning structure and the diversified cascade forest structure proposed by the invention, the annular multi-granularity scanning structure fully scans the feature vectors to strengthen the representation learning of the subsequent cascade forest, and the diversified cascade forest structure introduces multiple weak classifiers whose disadvantages complement one another, yielding an integrated model with strong generalization and performance, so that the whole anomaly detection model performs better.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (4)

1. An industrial control system anomaly detection method based on improved deep forests is characterized by comprising the following steps:
firstly, acquiring network data from an industrial control system to be detected, and preprocessing the network data to obtain a sample set;
inputting the sample set obtained in the step one into a pre-trained anomaly detection model to obtain an anomaly detection result; the anomaly detection model is obtained by training the following steps:
(1) obtaining network data, and constructing a data set X ∈ R^(n×m) from the network data, wherein R represents the set of real numbers and n represents the total number of samples in the data set:

X = [x1 x2 … xm]

wherein xi = [x1i x2i … xni]ᵀ (i = 1, 2, …, m) represents the feature set composed of the i-th dimension features of all samples in the data set;
(2) normalizing the data set obtained in step (1) by the Z-score normalization method, and dividing the normalized data set at a ratio of 5:1 into a training set Xtrain ∈ R^(n×m) and a test set Xtest ∈ R^(n×m);
(3) performing feature extraction on the training set Xtrain obtained in step (2) by a PCA method to obtain a dimension-reduced new feature vector set; step (3) is specifically that, first, the m feature sets in Xtrain are linearly transformed by the PCA method:

Y = [y1 y2 … ym] = Xtrain·α

wherein Y = [y1 y2 … ym] is the new feature set after conversion, α is the eigenvector matrix of the sample covariance matrix A defined below, and the elements of A are equal to:

Aij = (1/(n−1))·(xi − x̄i)ᵀ(xj − x̄j), i, j = 1, 2, …, m,

wherein xi is the feature set composed of the i-th dimension features of each sample in step (1), and x̄i is the n-dimensional vector whose entries all equal the mean of xi, i.e. the average value of that feature over all samples in the data set;
then, the eigenvector set α and eigenvalue set λ are obtained from Aα = λα, wherein α = [α1, α2, …, αm], λ = [λ1, λ2, …, λm], and λi represents the eigenvalue corresponding to the i-th principal component;
then, the variance contribution rate pi of the i-th principal component is calculated from the eigenvalue λi corresponding to the i-th principal component:

pi = λi / (λ1 + λ2 + … + λm)

wherein pi represents the variance contribution rate of the i-th principal component;
then, k is set to 1, and the cumulative variance contribution rate P1 of the first principal component is obtained from the eigenvalues by the following formula:

Pk = p1 + p2 + … + pk;
then, k is set to 2 and, following the formula in the previous paragraph, the cumulative variance contribution rate P2 of the first two principal components is obtained; whether the growth rate of P2 relative to P1 is less than 1% is judged: if so, the value of k is fixed to 2, otherwise k is set to 3 and the cumulative variance contribution rate P3 is obtained; whether the growth rate of P3 relative to P2 is less than 1% is judged: if so, the value of k is fixed to 3, otherwise k is set to 4, and so on;
then, the principal components are sorted by variance contribution rate from large to small, the eigenvalues corresponding to the variance contribution rates of the first k principal components are selected from the sorted result, and the subscript set composed of the subscripts corresponding to those eigenvalues is recorded as index = {index1, index2, …, indexk}; the columns corresponding to these subscripts are selected from the new feature set Y according to the subscript set index, thereby obtaining the dimension-reduced new feature vector set of all samples, recorded as the n × k matrix Q = [z1 z2 … zn]ᵀ, wherein zd (d = 1, 2, …, n) represents the dimension-reduced feature vector of the d-th sample, which contains k features, i.e. zd = {fd1 fd2 … fdk}, with dimension 1 × k;
(4) processing each feature vector in the dimension-reduced new feature vector set of step (3) by using the annular multi-granularity scanning structure in the anomaly detection model to obtain the feature sub-vector set corresponding to that feature vector, all n feature sub-vector sets forming a large set H; step (4) is specifically that, first, the first feature vector z1 = {f11 f12 … f1k} in the dimension-reduced new feature vector set of step (3) is subjected to annular multi-granularity scanning: the feature f11 and the feature f1k are connected so that z1 forms a ring joined end to end; a sliding window of length t then slides from left to right one unit at a time, starting from the feature f11, one feature sub-vector being obtained per step (the first sub-vector being F1 = {f11, f12, …, f1t}); the window is then moved one unit to the right so that the starting feature is f12, giving F2 = {f12, f13, …, f1(t+1)}; and so on, until the starting feature is f1k, giving Fk = {f1k, f11, …, f1(t-1)}; all k feature sub-vectors form the feature sub-vector set H1 = {F1 F2 … Fk};
then, the second feature vector z2 = {f21 f22 … f2k} in the dimension-reduced new feature vector set of step (3) is subjected to annular multi-granularity scanning: the feature f21 and the feature f2k are connected so that z2 forms a ring joined end to end; a sliding window of length t slides from left to right one unit at a time, starting from the feature f21 (the first sub-vector being F1 = {f21, f22, …, f2t}); the window is then moved one unit to the right so that the starting feature is f22, giving F2 = {f22, f23, …, f2(t+1)}; and so on, until the starting feature is f2k, giving Fk = {f2k, f21, …, f2(t-1)}; all k feature sub-vectors form the feature sub-vector set H2 = {F1 F2 … Fk};
and so on, until the n-th feature vector zn = {fn1 fn2 … fnk} in the dimension-reduced new feature vector set of step (3) is subjected to annular multi-granularity scanning: the feature fn1 and the feature fnk are connected so that zn forms a ring joined end to end; a sliding window of length t slides from left to right one unit at a time, starting from the feature fn1 (the first sub-vector being F1 = {fn1, fn2, …, fnt}); the window is then moved one unit to the right so that the starting feature is fn2, giving F2 = {fn2, fn3, …, fn(t+1)}; and so on, until the starting feature is fnk, giving Fk = {fnk, fn1, …, fn(t-1)}; all k feature sub-vectors form the feature sub-vector set Hn = {F1 F2 … Fk};
finally, the large set composed of the feature sub-vector sets corresponding to all n sample feature vectors is recorded as H = {H1, H2, …, Hn};
(5) inputting each feature sub-vector in each feature sub-vector set Hd (d ∈ {1, 2, …, n}) of the large set H obtained in step (4) into a fully random forest classifier and a semi-random forest classifier respectively, each of which outputs a c-dimensional class vector U = {u1, u2, …, uc}, so that each sub-vector yields a 2c-dimensional class feature vector; the k 2c-dimensional class feature vectors corresponding to the k feature sub-vectors of each set Hd are concatenated into an initial class feature vector bd of dimension 1 × 2kc, and the n initial class feature vectors corresponding to all feature sub-vector sets in H form the initial class feature vector set B = {b1 b2 … bn} of dimension n × 2kc, wherein c represents the number of classification categories;
(6) inputting the initial class feature vectors generated in step (5) into the diversified cascade forest structure for iterative training until the diversified cascade forest structure converges, thereby obtaining the trained anomaly detection model; step (6) is specifically that,

first, the first layer of the diversified cascade forest structure is processed as follows:
first, the first initial class feature vector b1 in the initial class feature vector set B is input into the five classifiers of the diversified cascade forest structure, namely the semi-random forest, the completely random forest, XGBoost, GBDT and CatBoost, to obtain five c-dimensional class feature vectors o1, o2, o3, o4, o5, which are combined with the initial class feature vector b1 to generate the (5c + 2kc)-dimensional class feature vector e11 = {o1 o2 o3 o4 o5 b1};
then, the second initial class feature vector b2 in the initial class feature vector set B is input into the same five classifiers to obtain five c-dimensional class feature vectors o1, o2, o3, o4, o5, which are combined with b2 to generate the (5c + 2kc)-dimensional class feature vector e12 = {o1 o2 o3 o4 o5 b2}; and so on;
finally, the (5c + 2kc)-dimensional class feature vectors corresponding to all n initial class feature vectors in the initial class feature vector set B form the class feature vector set E1 = {e11 e12 … e1n} of the first layer;
Then, the second layer of the diversified cascade forest structure is processed as follows:
first, the first class feature vector e11 in the class feature vector set E1 = {e11 e12 … e1n} generated by the first layer is input into the five classifiers of the diversified cascade forest structure, namely the semi-random forest, the completely random forest, XGBoost, GBDT and CatBoost, to obtain five c-dimensional class feature vectors o1, o2, o3, o4, o5, which are combined with the initial class feature vector b1 to generate the (5c + 2kc)-dimensional class feature vector e21 = {o1 o2 o3 o4 o5 b1};
then, the second class feature vector e12 in the class feature vector set E1 = {e11 e12 … e1n} generated by the first layer is input into the same five classifiers to obtain five c-dimensional class feature vectors o1, o2, o3, o4, o5, which are combined with the initial class feature vector b2 to generate the (5c + 2kc)-dimensional class feature vector e22 = {o1 o2 o3 o4 o5 b2}; and so on;
finally, the (5c + 2kc)-dimensional class feature vectors corresponding to all n initial class feature vectors in the initial class feature vector set B form the class feature vector set E2 = {e21 e22 … e2n} of the second layer;
And then, processing subsequent layers of the diversified cascading forest structure in the same mode as the first layer and the second layer until the diversified cascading forest structure is converged, and finishing model training.
2. The industrial control system anomaly detection method based on improved deep forest according to claim 1, wherein step (1) is specifically: performing unified numerical conversion on the data types of the acquired network data, with the labels of normal data and abnormal data represented by 0 and 1 respectively; then normalizing the numerically converted network data by the Z-score method; and finally taking each feature of each sample in the normalized network data as one data dimension, so that the samples are converted into feature vectors, the feature vectors corresponding to all samples forming the sample set.
3. The method according to claim 1 or 2, wherein step (2) is specifically: performing feature extraction on the sample set obtained in step (1) by a PCA method to obtain a dimension-reduced new feature vector set; processing each feature vector in the dimension-reduced new feature vector set by the annular multi-granularity scanning structure in the trained anomaly detection model to obtain the feature sub-vector set corresponding to that feature vector, all feature sub-vector sets forming a large set; inputting each feature sub-vector in each feature sub-vector set of the large set into a fully random forest classifier and a semi-random forest classifier respectively to obtain class feature vectors, the initial class feature vectors corresponding to all feature sub-vector sets forming a class feature vector set; and finally inputting the final class feature vectors into the integrated classification model of the final layer to obtain a plurality of classification results and taking the mean of all classification results, wherein if the mean is greater than 0.5 the industrial control system to be detected is abnormal, and otherwise it is normal.
4. An improved deep forest based industrial control system anomaly detection system, comprising:
the system comprises a first module, a second module and a third module, wherein the first module is used for acquiring network data from an industrial control system to be detected and preprocessing the network data to obtain a sample set;
the second module is used for inputting the sample set obtained by the first module into a pre-trained anomaly detection model so as to obtain an anomaly detection result; the anomaly detection model is obtained by training the following steps:
(1) obtaining network data, and constructing a data set X ∈ R^(n×m) from the network data, wherein R represents the set of real numbers and n represents the total number of samples in the data set:

X = [x1 x2 … xm]

wherein xi = [x1i x2i … xni]ᵀ (i = 1, 2, …, m) represents the feature set composed of the i-th dimension features of all samples in the data set;
(2) normalizing the data set obtained in step (1) by the Z-score normalization method, and dividing the normalized data set at a ratio of 5:1 into a training set Xtrain ∈ R^(n×m) and a test set Xtest ∈ R^(n×m);
(3) performing feature extraction on the training set Xtrain obtained in step (2) by a PCA method to obtain a dimension-reduced new feature vector set; step (3) is specifically that, first, the m feature sets in Xtrain are linearly transformed by the PCA method:

Y = [y1 y2 … ym] = Xtrain·α

wherein Y = [y1 y2 … ym] is the new feature set after conversion, α is the eigenvector matrix of the sample covariance matrix A defined below, and the elements of A are equal to:

Aij = (1/(n−1))·(xi − x̄i)ᵀ(xj − x̄j), i, j = 1, 2, …, m,

wherein xi is the feature set composed of the i-th dimension features of each sample in step (1), and x̄i is the n-dimensional vector whose entries all equal the mean of xi, i.e. the average value of that feature over all samples in the data set;
then, the eigenvector set α and eigenvalue set λ are obtained from Aα = λα, wherein α = [α1, α2, …, αm], λ = [λ1, λ2, …, λm], and λi represents the eigenvalue corresponding to the i-th principal component;
then, the variance contribution rate pi of the i-th principal component is calculated from the eigenvalue λi corresponding to the i-th principal component:

pi = λi / (λ1 + λ2 + … + λm)

wherein pi represents the variance contribution rate of the i-th principal component;
then, k is set to 1, and the cumulative variance contribution rate P1 of the first principal component is obtained from the eigenvalues by the following formula:

Pk = p1 + p2 + … + pk;
then, k is set to 2 and, following the formula in the previous paragraph, the cumulative variance contribution rate P2 of the first two principal components is obtained; whether the growth rate of P2 relative to P1 is less than 1% is judged: if so, the value of k is fixed to 2, otherwise k is set to 3 and the cumulative variance contribution rate P3 is obtained; whether the growth rate of P3 relative to P2 is less than 1% is judged: if so, the value of k is fixed to 3, otherwise k is set to 4, and so on;
then, the principal components are sorted by variance contribution rate from large to small, the eigenvalues corresponding to the variance contribution rates of the first k principal components are selected from the sorted result, and the subscript set composed of the subscripts corresponding to those eigenvalues is recorded as index = {index1, index2, …, indexk}; the columns corresponding to these subscripts are selected from the new feature set Y according to the subscript set index, thereby obtaining the dimension-reduced new feature vector set of all samples, recorded as the n × k matrix Q = [z1 z2 … zn]ᵀ, wherein zd (d = 1, 2, …, n) represents the dimension-reduced feature vector of the d-th sample, which contains k features, i.e. zd = {fd1 fd2 … fdk}, with dimension 1 × k;
(4) processing each feature vector in the dimension-reduced new feature vector set of step (3) by using the annular multi-granularity scanning structure in the anomaly detection model to obtain the feature sub-vector set corresponding to that feature vector, all n feature sub-vector sets forming a large set H; step (4) is specifically that, first, the first feature vector z1 = {f11 f12 … f1k} in the dimension-reduced new feature vector set of step (3) is subjected to annular multi-granularity scanning: the feature f11 and the feature f1k are connected so that z1 forms a ring joined end to end; a sliding window of length t then slides from left to right one unit at a time, starting from the feature f11, one feature sub-vector being obtained per step (the first sub-vector being F1 = {f11, f12, …, f1t}); the window is then moved one unit to the right so that the starting feature is f12, giving F2 = {f12, f13, …, f1(t+1)}; and so on, until the starting feature is f1k, giving Fk = {f1k, f11, …, f1(t-1)}; all k feature sub-vectors form the feature sub-vector set H1 = {F1 F2 … Fk};
then, the second feature vector z2 = {f21 f22 … f2k} in the dimension-reduced new feature vector set of step (3) is subjected to annular multi-granularity scanning: the feature f21 and the feature f2k are connected so that z2 forms a ring joined end to end; a sliding window of length t slides from left to right one unit at a time, starting from the feature f21 (the first sub-vector being F1 = {f21, f22, …, f2t}); the window is then moved one unit to the right so that the starting feature is f22, giving F2 = {f22, f23, …, f2(t+1)}; and so on, until the starting feature is f2k, giving Fk = {f2k, f21, …, f2(t-1)}; all k feature sub-vectors form the feature sub-vector set H2 = {F1 F2 … Fk};
and so on, until the n-th feature vector zn = {fn1 fn2 … fnk} in the dimension-reduced new feature vector set of step (3) is subjected to annular multi-granularity scanning: the feature fn1 and the feature fnk are connected so that zn forms a ring joined end to end; a sliding window of length t slides from left to right one unit at a time, starting from the feature fn1 (the first sub-vector being F1 = {fn1, fn2, …, fnt}); the window is then moved one unit to the right so that the starting feature is fn2, giving F2 = {fn2, fn3, …, fn(t+1)}; and so on, until the starting feature is fnk, giving Fk = {fnk, fn1, …, fn(t-1)}; all k feature sub-vectors form the feature sub-vector set Hn = {F1 F2 … Fk};
finally, the large set composed of the feature sub-vector sets corresponding to all n sample feature vectors is recorded as H = {H1, H2, …, Hn};
(5) inputting each feature sub-vector in each feature sub-vector set Hd (d ∈ {1, 2, …, n}) of the large set H obtained in step (4) into a fully random forest classifier and a semi-random forest classifier respectively, each of which outputs a c-dimensional class vector U = {u1, u2, …, uc}, so that each sub-vector yields a 2c-dimensional class feature vector; the k 2c-dimensional class feature vectors corresponding to the k feature sub-vectors of each set Hd are concatenated into an initial class feature vector bd of dimension 1 × 2kc, and the n initial class feature vectors corresponding to all feature sub-vector sets in H form the initial class feature vector set B = {b1 b2 … bn} of dimension n × 2kc, wherein c represents the number of classification categories;
(6) The initial class feature vectors generated in step (5) are input into a diversified cascade forest structure for iterative training until the structure converges, yielding the trained anomaly detection model. Step (6) proceeds as follows:
firstly, the first layer of the diversified cascade forest structure is processed as follows:
First, the first initial class feature vector b1 in the initial class feature vector set B is input separately into the five base learners of the diversified cascade forest structure (semi-random forest, completely random forest, XGBoost, GBDT and CatBoost) to obtain five c-dimensional class feature vectors o1, o2, o3, o4, o5; these are combined with the initial class feature vector b1 to generate the (5c + 2kc)-dimensional class feature vector e11 = {o1, o2, o3, o4, o5, b1};
Then, the second initial class feature vector b2 in the set B is input separately into the same five base learners to obtain five c-dimensional class feature vectors o1, o2, o3, o4, o5, which are combined with the initial class feature vector b2 to generate the (5c + 2kc)-dimensional class feature vector e12 = {o1, o2, o3, o4, o5, b2}, and so on;
finally, the (5c + 2kc)-dimensional class feature vectors corresponding to all n initial class feature vectors in the set B form the first layer's class feature vector set E1 = {e11, e12, …, e1n};
Then, the second layer of the diversified cascade forest structure is processed as follows:
First, the first class feature vector e11 in the set E1 = {e11, e12, …, e1n} generated by the first layer is input separately into the five base learners (semi-random forest, completely random forest, XGBoost, GBDT and CatBoost) to obtain five c-dimensional class feature vectors o1, o2, o3, o4, o5, which are combined with the initial class feature vector b1 to generate the (5c + 2kc)-dimensional class feature vector e21 = {o1, o2, o3, o4, o5, b1};
Next, the second class feature vector e12 in the set E1 is input separately into the same five base learners to obtain five c-dimensional class feature vectors o1, o2, o3, o4, o5, which are combined with the initial class feature vector b2 to generate the (5c + 2kc)-dimensional class feature vector e22 = {o1, o2, o3, o4, o5, b2}, and so on;
finally, the (5c + 2kc)-dimensional class feature vectors corresponding to all n class feature vectors in E1 form the second layer's class feature vector set E2 = {e21, e22, …, e2n};
Subsequent layers of the diversified cascade forest structure are then processed in the same manner as the first and second layers until the structure converges, at which point model training is complete.
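The per-layer dimension arithmetic of step (6) can be sketched as follows. Again a sketch only: `forest_proba` stands in for the `predict_proba` output of the five trained base learners (semi-random forest, completely random forest, XGBoost, GBDT, CatBoost); the point is that each layer re-attaches the original initial class feature vector bd, so every layer's output stays (5c + 2kc)-dimensional.

```python
import numpy as np

rng = np.random.default_rng(1)
C, KC2 = 4, 40          # c classes; 2kc-dimensional initial class vector

def forest_proba(x, c):
    # Stand-in for one of the five trained base learners: predict_proba
    # on the layer input, returning a c-dimensional class vector.
    p = rng.random(c)
    return p / p.sum()

def cascade_layer(x_in, b, c=C):
    """One cascade layer: five c-dim class vectors + the ORIGINAL vector b."""
    outs = [forest_proba(x_in, c) for _ in range(5)]
    return np.concatenate(outs + [b])   # dimension 5c + 2kc

b1 = rng.random(KC2)
e1 = cascade_layer(b1, b1)   # layer 1 consumes b1 itself
e2 = cascade_layer(e1, b1)   # layer 2 consumes e1 but re-attaches b1
print(e1.shape, e2.shape)    # (60,) (60,)  -> 5*4 + 40 = 60 in both layers
```

Because b1 (not the previous layer's output) is re-concatenated at every layer, the layer input dimension is constant, so layers can be stacked until a convergence criterion (e.g. no validation-accuracy gain) is met.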
CN202110438900.4A 2021-04-23 2021-04-23 Industrial control system anomaly detection method and system based on improved deep forest Active CN113159181B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110438900.4A CN113159181B (en) 2021-04-23 2021-04-23 Industrial control system anomaly detection method and system based on improved deep forest

Publications (2)

Publication Number Publication Date
CN113159181A CN113159181A (en) 2021-07-23
CN113159181B true CN113159181B (en) 2022-06-10

Family

ID=76869794

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110438900.4A Active CN113159181B (en) 2021-04-23 2021-04-23 Industrial control system anomaly detection method and system based on improved deep forest

Country Status (1)

Country Link
CN (1) CN113159181B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115078552B (en) * 2022-07-06 2023-09-08 江南大学 Flip chip defect detection method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107958255A (en) * 2017-11-21 2018-04-24 Institute of Microelectronics, Chinese Academy of Sciences An image-based object detection method and device
CN109741597A (en) * 2018-12-11 2019-05-10 Dalian University of Technology A bus-section running-time prediction method based on improved deep forest
WO2020215671A1 (en) * 2019-08-19 2020-10-29 Ping An Technology (Shenzhen) Co., Ltd. Method and device for smart analysis of data, and computer device and storage medium
CN111931953A (en) * 2020-07-07 2020-11-13 Beijing University of Technology Multi-scale-feature deep forest identification method for waste mobile phones
CN112633368A (en) * 2020-12-21 2021-04-09 Sichuan University Flat vibration motor defect detection system and method based on improved multi-granularity cascade forest
CN112686313A (en) * 2020-12-31 2021-04-20 Jiangxi University of Science and Technology Improved parallel deep forest classification method based on information theory

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11277420B2 (en) * 2017-02-24 2022-03-15 Ciena Corporation Systems and methods to detect abnormal behavior in networks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
孙一恒 (Sun Yiheng). Research on an Intrusion Detection Method Based on Improved Deep Forest. China Master's Theses Full-text Database (Electronic Journal). 2021, I139-95. *
王雪宁 (Wang Xuening). A Research of GcForest Methods for Network Abnormal Behavior Detection. 2020 International Conference on Computer Engineering and Application (ICCEA). 2020, 218-221. *

Also Published As

Publication number Publication date
CN113159181A (en) 2021-07-23

Similar Documents

Publication Publication Date Title
CN108632279B (en) Multilayer anomaly detection method based on network traffic
CN110287983B (en) Single-classifier anomaly detection method based on maximum correlation entropy deep neural network
Veenman Statistical disk cluster classification for file carving
CN111901340B (en) Intrusion detection system and method for energy Internet
CN111835707B (en) Malicious program identification method based on improved support vector machine
CN113489685B (en) Secondary feature extraction and malicious attack identification method based on kernel principal component analysis
CN110602120B (en) Network-oriented intrusion data detection method
CN111143838B (en) Database user abnormal behavior detection method
CN108256449B (en) Human behavior identification method based on subspace classifier
Batal et al. A supervised time series feature extraction technique using DCT and DWT
CN111598179A (en) Power monitoring system user abnormal behavior analysis method, storage medium and equipment
CN115277189B (en) Unsupervised intrusion flow detection and identification method based on generation type countermeasure network
CN113159181B (en) Industrial control system anomaly detection method and system based on improved deep forest
Shao et al. Deep learning hierarchical representation from heterogeneous flow-level communication data
CN113098862A (en) Intrusion detection method based on combination of hybrid sampling and expansion convolution
Du et al. Large-scale signature matching using multi-stage hashing
Huang et al. A high security BioHashing encrypted speech retrieval algorithm based on feature fusion
JP4476078B2 (en) Time series data judgment program
Wu et al. Intrusion Detection System Using a Distributed Ensemble Design Based Convolutional Neural Network in Fog Computing
CN111581640A (en) Malicious software detection method, device and equipment and storage medium
CN106778775B (en) Image classification method based on SIFT feature soft matching
CN113505826B (en) Network flow anomaly detection method based on joint feature selection
Seyedghorban et al. Anomaly Detection in File Fragment Classification of Image File Formats
CN113609480B (en) Multipath learning intrusion detection method based on large-scale network flow
Yin et al. High-Quality Triggers Based Fragile Watermarking for Optical Character Recognition Model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant