CN113159310A - Intrusion detection method based on residual error sparse width learning system - Google Patents


Info

Publication number
CN113159310A
Authority
CN
China
Prior art keywords
bls
model
intrusion detection
data
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011524068.1A
Other languages
Chinese (zh)
Inventor
王振东
刘尧迪
李大海
王俊岭
曾珽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangxi University of Science and Technology
Original Assignee
Jiangxi University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangxi University of Science and Technology filed Critical Jiangxi University of Science and Technology
Priority to CN202011524068.1A priority Critical patent/CN113159310A/en
Publication of CN113159310A publication Critical patent/CN113159310A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Abstract

The invention relates to the technical field of network and host intrusion detection, and in particular to an intrusion detection method based on a residual sparse width learning system, comprising the following steps: Step 1, preprocessing the original intrusion detection data set; Step 2, dividing the standard data set into a training set and a test set; Step 3, training the BLS model and tuning its parameters; Step 4, inputting the test data into the trained RES-BLS intrusion detection model to obtain a classification result for each piece of data. The method effectively remedies the low accuracy, low true-positive rate, and high false-positive rate of the width learning system: the model solves the output weight matrix of BLS by singular value decomposition (SVD), continuously adjusts the error during network training through residual learning, and finally prunes the redundant features and output weights of the network through sparse pruning, removing redundant nodes and preventing the model from falling into a local optimum.

Description

Intrusion detection method based on residual error sparse width learning system
Technical Field
The invention belongs to the technical field of network and host intrusion detection, and particularly relates to an intrusion detection method based on a residual error sparse width learning system.
Background
The width learning system (broad learning system, BLS) is a feedforward neural network. The algorithm generates feature nodes and enhancement nodes by sparse auto-encoding or random mapping and computes the corresponding output weights through the ridge-regression generalized inverse. The implementation is simple, but a conventional BLS contains feature nodes and enhancement nodes with very small output weights, so these nodes contribute little to the final output of the network; a large number of such redundant nodes increases the complexity of the network structure and reduces learning efficiency. In addition, a conventional BLS computes the output weights only once and never adjusts the error, which affects the classification performance of the model and may leave it trapped in a local optimum. To overcome these shortcomings of the conventional BLS, a residual sparse width learning system is proposed and applied to intrusion detection.
An intrusion detection system monitors network traffic in real time and raises an alarm or takes active defense measures when suspicious traffic is found; as an active security protection technology, intrusion detection has therefore become one of the key technologies for guaranteeing network security. Detection methods fall into two main categories: misuse detection and anomaly detection. Misuse detection extracts features of known intrusion behaviors and attempts, writes them into a rule base, and pattern-matches monitored network behavior against that rule base to judge whether it is an intrusion or an intrusion attempt; its advantage is a low false-alarm rate, but its main drawbacks are that intrusion signatures are difficult to collect and update and that maintaining the signature base requires substantial work. Anomaly detection instead detects attack behavior among a large number of normal user behaviors; its obvious advantage is that it can detect unknown attacks, but it easily produces a higher false-positive rate during detection. As detection theory has matured, intrusion detection systems have developed continuously, from early pattern-matching algorithms and rule-based expert systems to today's artificial-intelligence-based algorithms, which achieve good detection results; methods such as support vector machines, artificial neural networks, swarm intelligence algorithms, and deep learning have all been applied to intrusion detection research. However, with the arrival of the big-data era, network topologies have become more complex, traffic volumes larger, and intrusion behaviors ever-changing, so existing intrusion detection systems show many shortcomings in actual use, such as high false-alarm and missed-report rates; in particular, in high-speed switched network environments, existing systems cannot inspect all packets well. Defects in the data-analysis methods keep packet-analysis accuracy low and lead to frequent missed reports, and because updates to detection rules lag behind updates to attack techniques, new attacks have no corresponding detection rules and go undetected. Because intrusion behaviors are so variable, it is difficult for an intrusion detection model to maintain stable detection performance across all attack types. How to detect intrusion behavior in the network quickly and accurately, now and in future network environments, has become a key problem that intrusion detection research urgently needs to solve; an intrusion detection method based on the residual sparse width learning system is therefore designed, and is urgently needed in the technical field of network and host intrusion detection.
Disclosure of Invention
The invention provides an intrusion detection method based on a residual error sparse width learning system, which aims to solve the problems in the prior art.
In order to achieve the above object, the embodiments of the present invention provide the following technical solutions:
according to the embodiment of the invention, the intrusion detection method based on the residual error sparse width learning system comprises the following steps:
step1, preprocessing an original intrusion detection data set, wherein the preprocessing processes of the intrusion detection data set based on the network and the intrusion detection data set based on the host are respectively as follows:
A. preprocessing the intrusion detection data set based on the network:
(a) High-dimensional feature mapping: discrete features are converted into numeric features using a high-dimensional feature mapping;
(b) Data normalization: because values of the same attribute can differ greatly, which hampers the training of the neural network, the data are normalized to real numbers in [0, 1];
B. preprocessing an intrusion detection data set based on a host:
(a) File-type conversion: the txt files are converted into xls files and the attributes of each data type are separated, to facilitate processing in MATLAB;
(b) Bag-of-words characterization of the data: because the raw data in the data set are text features and inconvenient to use directly, the data are characterized by word frequency using a bag-of-words model;
step2, standard data set partitioning: dividing a standard data set into a training set and a test set;
step3, model training: training and parameter tuning are carried out on the BLS model;
(a) Initialize the parameters of the BLS model: the number of feature nodes $n$, the number of enhancement nodes $m$, the sparsity parameter $\theta_k$ with its associated index vector, and the initial weights and thresholds of the network;
(b) Compute the feature-node mapping values $Z_i = \phi(XW_{ei} + \beta_{ei})$ and the enhancement-node mapping values $H_j = \xi(Z^n W_{hj} + \beta_{hj})$ of the BLS model;
(c) Merging the characteristic node mapping values and the enhanced node mapping values into a matrix A;
(d) Compute the residual of the pseudo-identity matrix $A^{+}A$ of the BLS model, sort the enhancement nodes by residual, compute the relative error of each enhancement node, prune the enhancement nodes according to the relative error, and retain the enhancement nodes with smaller relative error; if the iteration termination condition is met, end the iteration and go to step (e), otherwise return to step (d), where the relative error is:

$$\omega_j = \frac{\operatorname{sum}\left(\lvert \tilde{\mathbb{I}}_j - I_j \rvert\right)}{\operatorname{sum}(I)}$$
(e) outputting the optimal BLS model, namely the RES-BLS model;
and Step4, inputting the test data into the trained RES-BLS intrusion detection model, and further obtaining the classification result of each piece of data.
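Step 1 above can be sketched as follows. This is a minimal Python illustration rather than the MATLAB processing mentioned in the description, and the function names and the choice of one-hot encoding as the high-dimensional feature mapping are assumptions, not taken from the patent.

```python
import numpy as np

def one_hot_columns(values):
    """Map one discrete (categorical) feature column to one-hot numeric columns."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    out = np.zeros((len(values), len(categories)))
    for row, v in enumerate(values):
        out[row, index[v]] = 1.0
    return out

def min_max_normalize(X):
    """Scale every column of X into [0, 1]; constant columns become all zeros."""
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant attributes
    return (X - lo) / span
```

A categorical column such as a protocol field expands into one column per category, and each numeric attribute is rescaled independently, matching steps (a) and (b) of the network-based preprocessing.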
Further, the training and prediction time of the RES-BLS model is determined by its time complexity. Assume the number of training samples is $m$, the number of iterations is $l$, the numbers of neurons in the feature-node layer and the output layer are $n_1$ and $n_3$ respectively, and the number of neurons in the enhancement-node layer is $n_2$. In the original BLS model, the time complexity of processing one sample is $O((n_1+n_2)\, n_3)$, so the overall time complexity of the BLS model is $O(m\, l\, (n_1+n_2)\, n_3)$. The RES-BLS model, an improvement on BLS, must additionally sort the enhancement nodes by relative error, at a cost of $O(n_2 \log n_2)$; the overall time complexity of the RES-BLS model is therefore $O(m\, l\, ((n_1+n_2)\, n_3 + n_2 \log n_2))$.
The invention has the following advantages:
the intrusion detection method based on the residual sparse width learning system effectively remedies the low accuracy, low true-positive rate, and high false-positive rate of the width learning system. The model solves the output weight matrix of BLS by SVD, continuously adjusts the error during network training through residual learning, and finally prunes the redundant features and output weights of the network through sparse pruning, removing redundant nodes and keeping the model out of local optima, which effectively improves the detection performance of the model.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It should be apparent that the drawings in the following description are merely exemplary, and that other embodiments can be derived from the drawings provided by those of ordinary skill in the art without inventive effort.
The structures, proportions, and sizes shown in this specification are used only to complement the content disclosed in the specification, so that those skilled in the art can understand and read it; they do not limit the conditions under which the invention can be implemented and carry no technical significance in themselves. Any structural modification, change of proportion, or adjustment of size that does not affect the functions and purposes of the invention shall still fall within the scope of the invention.
FIG. 1 is a schematic diagram of a RES-BLS intrusion detection framework of the present invention;
FIG. 2 is a flow chart of the RES-BLS algorithm of the present invention;
FIG. 3 is a schematic diagram of the residual error on the KDDCup99 data set of the present invention;
Detailed Description
The present invention is described in terms of particular embodiments, other advantages and features of the invention will become apparent to those skilled in the art from the following disclosure, and it is to be understood that the described embodiments are merely exemplary of the invention and that it is not intended to limit the invention to the particular embodiments disclosed. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the present specification, the terms "upper", "lower", "left", "right", "middle", and the like are used for clarity of description, and are not intended to limit the scope of the present invention, and changes or modifications in the relative relationship may be made without substantial changes in the technical content.
Referring to fig. 1-3, the present invention provides a technical solution:
an intrusion detection method based on a residual error sparse width learning system comprises the following steps:
step1, preprocessing an original intrusion detection data set, wherein the preprocessing processes of the intrusion detection data set based on the network and the intrusion detection data set based on the host are respectively as follows:
A. preprocessing the intrusion detection data set based on the network:
(a) High-dimensional feature mapping: discrete features are converted into numeric features using a high-dimensional feature mapping;
(b) Data normalization: because values of the same attribute can differ greatly, which hampers the training of the neural network, the data are normalized to real numbers in [0, 1];
B. preprocessing an intrusion detection data set based on a host:
(a) File-type conversion: the txt files are converted into xls files and the attributes of each data type are separated, to facilitate processing in MATLAB;
(b) Bag-of-words characterization of the data: because the raw data in the data set are text features and inconvenient to use directly, the data are characterized by word frequency using a bag-of-words model;
step2, standard data set partitioning: dividing a standard data set into a training set and a test set;
step3, model training: training and parameter tuning are carried out on the BLS model;
(a) Initialize the parameters of the BLS model: the number of feature nodes $n$, the number of enhancement nodes $m$, the sparsity parameter $\theta_k$ with its associated index vector, and the initial weights and thresholds of the network;
(b) Compute the feature-node mapping values $Z_i = \phi(XW_{ei} + \beta_{ei})$ and the enhancement-node mapping values $H_j = \xi(Z^n W_{hj} + \beta_{hj})$ of the BLS model;
(c) Merging the characteristic node mapping values and the enhanced node mapping values into a matrix A;
(d) Compute the residual of the pseudo-identity matrix $A^{+}A$ of the BLS model, sort the enhancement nodes by residual, compute the relative error of each enhancement node, prune the enhancement nodes according to the relative error, and retain the enhancement nodes with smaller relative error; if the iteration termination condition is met, end the iteration and go to step (e), otherwise return to step (d), where the relative error is:

$$\omega_j = \frac{\operatorname{sum}\left(\lvert \tilde{\mathbb{I}}_j - I_j \rvert\right)}{\operatorname{sum}(I)}$$
(e) outputting the optimal BLS model, namely the RES-BLS model;
and Step4, inputting the test data into the trained RES-BLS intrusion detection model, and further obtaining the classification result of each piece of data.
In the invention: the training and prediction time of the RES-BLS model is determined by its time complexity. Assume the number of training samples is $m$, the number of iterations is $l$, the numbers of neurons in the feature-node layer and the output layer are $n_1$ and $n_3$ respectively, and the number of neurons in the enhancement-node layer is $n_2$. In the original BLS model, the time complexity of processing one sample is $O((n_1+n_2)\, n_3)$, so the overall time complexity of the BLS model is $O(m\, l\, (n_1+n_2)\, n_3)$. The RES-BLS model, an improvement on BLS, must additionally sort the enhancement nodes by relative error, at a cost of $O(n_2 \log n_2)$; the overall time complexity of the RES-BLS model is therefore $O(m\, l\, ((n_1+n_2)\, n_3 + n_2 \log n_2))$. Comparing the two models, the time complexity of the RES-BLS model is slightly higher than that of the BLS model, but its detection accuracy is superior; the detection performance of the model is thus effectively improved at a modest extra cost.
In the invention, the width learning system is an efficient, laterally expandable incremental learning system: it takes the random-vector functional-link neural network as its feature mapping and builds on the single-hidden-layer neural network by connecting the mapped features and the enhancement nodes directly to the output. Assume $N$ input data of dimension $M$, $X \in \mathbb{R}^{N \times M}$; $n$ groups of mapped features are obtained by equation (1):

$$Z_i = \phi(XW_{ei} + \beta_{ei}) \tag{1}$$

where $W_{ei}$ is the input weight matrix between the input and the feature nodes and $\beta_{ei}$ is the bias of the feature nodes. Let $Z^n = [Z_1, Z_2, \ldots, Z_n]$ be the set of the first $n$ groups of mapped features; $m$ groups of enhancement nodes can then be generated through the nonlinear activation transformation (2):

$$H_j = \xi(Z^n W_{hj} + \beta_{hj}) \tag{2}$$

where $W_{hj}$ is the weight matrix between the feature nodes and the enhancement nodes and $\beta_{hj}$ is the bias of the enhancement nodes. Let $H^m = [H_1, H_2, \ldots, H_m]$ be the feature set of the first $m$ groups of enhancement nodes.

The width learning system as a whole can thus be represented by equation (3):

$$Y = [Z_1, Z_2, \ldots, Z_n \mid H_1, H_2, \ldots, H_m]\, W^m = [Z^n \mid H^m]\, W^m = A^m W^m \tag{3}$$

where the network target weight matrix is $W^m = [Z^n \mid H^m]^{+} Y = (A^m)^{+} Y$, and $(A^m)^{+}$ can be computed through the approximation $(A^m)^{+} = \lim_{\lambda \to 0} (\lambda I + A^{\mathsf T} A)^{-1} A^{\mathsf T}$. Here $Y \in \mathbb{R}^{N \times Q}$ is the output of the network; when only binary classification and regression tasks are considered, $Y \in \mathbb{R}^{N}$.
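Equations (1)-(3) can be sketched numerically as follows. The random mappings, tanh activations, group sizes, function names, and the finite ridge parameter `lam` standing in for the limit $\lambda \to 0$ are all illustrative assumptions, not the patent's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def bls_state_matrix(X, n_groups=3, k_per_group=4, m_enhance=8):
    """Build A^m = [Z^n | H^m]: random feature maps (eq. 1) and enhancement nodes (eq. 2)."""
    Zs = []
    for _ in range(n_groups):
        We = rng.standard_normal((X.shape[1], k_per_group))
        be = rng.standard_normal(k_per_group)
        Zs.append(np.tanh(X @ We + be))          # Z_i = phi(X W_ei + beta_ei)
    Zn = np.hstack(Zs)
    Wh = rng.standard_normal((Zn.shape[1], m_enhance))
    bh = rng.standard_normal(m_enhance)
    H = np.tanh(Zn @ Wh + bh)                    # H_j = xi(Z^n W_hj + beta_hj)
    return np.hstack([Zn, H])

def output_weights(A, Y, lam=1e-6):
    """W^m = (lam*I + A^T A)^{-1} A^T Y, the ridge approximation of (A^m)^+ Y."""
    return np.linalg.solve(lam * np.eye(A.shape[1]) + A.T @ A, A.T @ Y)
```

Predictions are then simply `A @ W`, since the mapped features and enhancement nodes connect directly to the output layer.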
In the invention, the residual stage of the width learning system mainly comprises SVD-based BLS, the truncation error in SVD, the pruning of hidden neurons, and the sparse stage of BLS. The RES-BLS model consists of three main steps, as shown in fig. 2: first, the data are preprocessed and converted into a form that can be fed into the RES-BLS model; second, the BLS network is refined using the residual-sorting idea, for which the residual of the pseudo-identity matrix $A^{+}A$ of the enhancement-node neurons must be computed, and the obtained residual is then used to rank each enhancement-node neuron; finally, the ranked redundant enhancement-node neurons are separated from the network and the corresponding columns are removed from the data matrix $A$, reducing its dimension.
1. BLS based on SVD decomposition
In equation (3), $W^m = (A^m)^{+} Y$ is computed through the ridge approximation $W^m = \lim_{\lambda \to 0}(\lambda I + A^{\mathsf T} A)^{-1} A^{\mathsf T} Y$; this computation, however, is often numerically unstable. In addition, computing the output weight matrix $W^m$ in equation (3) involves a matrix inversion, and if the input sample is too large the complexity of that inversion rises, lowering the training efficiency of BLS. To overcome the numerical instability and reduce the computational complexity of BLS, an SVD-based BLS algorithm is proposed, changing the way the BLS output weight matrix is solved.
Singular value decomposition (SVD) is one of the best-known and most widely used matrix factorization methods; applying it to BLS can simplify the intrusion detection data set, remove noise points, and improve the accuracy of the algorithm. Suppose $N$ is an $m \times n$ real matrix; then there exist an orthogonal matrix $U$ of order $m$ and an orthogonal matrix $V$ of order $n$ such that $N$ can be represented as:

$$N = U D V^{\mathsf T}$$

where $U = [u_1, u_2, \ldots, u_m]$ is the orthogonal matrix of left singular vectors, whose column vectors $u_i$ are the eigenvectors of $N N^{\mathsf T}$; $V = [v_1, v_2, \ldots, v_n]$ is the orthogonal matrix of right singular vectors, whose column vectors $v_i$ are the eigenvectors of $N^{\mathsf T} N$; and they satisfy $N N^{\mathsf T} = U \Lambda_1 U^{\mathsf T}$, $N^{\mathsf T} N = V \Lambda_2 V^{\mathsf T}$, $U^{\mathsf T} U = V^{\mathsf T} V = I$. The $m \times n$ matrix $D$ has the block form

$$D = \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix}, \qquad \Sigma = \operatorname{diag}\!\left(\sqrt{\lambda_1}, \sqrt{\lambda_2}, \ldots, \sqrt{\lambda_r}\right),$$

and is uniquely determined by the decomposition; its non-zero diagonal entries are the singular values of the matrix $N$, where $\lambda_i$ ($i = 1, 2, \ldots, r$) are the non-zero diagonal elements of the matrix $\Lambda_1$ (equivalently $\Lambda_2$). This factorization $N = U D V^{\mathsf T}$ is the singular value decomposition of the matrix $N$.
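The factorization $N = U D V^{\mathsf T}$ can be checked directly with NumPy; this is only a numerical illustration of the decomposition described above, with an arbitrary example matrix.

```python
import numpy as np

def svd_factor(N):
    """Factor N = U D V^T (thin form); return U, the singular values, and V^T."""
    U, s, Vt = np.linalg.svd(N, full_matrices=False)
    return U, s, Vt
```

The singular values come back non-negative and in descending order, and the thin factors reconstruct the original matrix.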
2. Truncation error in SVD
The matrix $A$ in equation (3) is SVD-decomposed as follows:

$$A = P R Q^{\mathsf T} \tag{4}$$

At this point

$$W = A^{+} Y = Q R^{+} P^{\mathsf T} Y \tag{5}$$

where

$$\left(R^{+}\right)_{kk} = \begin{cases} 1/\theta_k, & \theta_k > \alpha \\ 0, & \theta_k \le \alpha \end{cases} \tag{6}$$

Here $\alpha$ is a threshold and $\theta_k$ is the $k$-th singular value on the diagonal of $R$. SVD presents a problem at this point: in practice, very small singular values are usually set to 0 in order to avoid small values blowing up in the computation of equation (5). Zeroing a singular value inevitably causes a small numerical error, called the residual here; a sparse model designed around these small error values can greatly improve the final classification accuracy of the model. The residual acts directly on the computation of $W^m = (A^m)^{+} Y$, since it affects the computation of the pseudoinverse of $A^m$.
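The thresholded pseudoinverse of equations (4)-(6) can be sketched as below; the threshold value and the example matrices are illustrative assumptions.

```python
import numpy as np

def truncated_pinv(A, alpha=1e-10):
    """A^+ = Q R^+ P^T with singular values theta_k <= alpha zeroed (eqs. 4-6)."""
    P, theta, Qt = np.linalg.svd(A, full_matrices=False)
    # invert only the singular values above the threshold; zero the rest
    r_plus = np.array([1.0 / t if t > alpha else 0.0 for t in theta])
    return Qt.T @ np.diag(r_plus) @ P.T
```

Zeroing a tiny singular value instead of inverting it keeps $1/\theta_k$ from exploding, at the cost of the small residual discussed above.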
3. Pruning of hidden neurons
The residual arises in the process of computing the weight matrix $W$. To design a good pruning tool, one must know where the BLS model incurs errors in the computation of $A^{+}$. In fact, if the matrix $A$ has linearly independent columns, then $A^{+}A = I$, where $I$ is the identity matrix; in practice one only has $A^{+}A \approx I$. Owing to this property, $A^{+}A$ is called the pseudo-identity matrix and is denoted $\tilde{\mathbb{I}} = A^{+}A$. The diagonal and off-diagonal elements of $\tilde{\mathbb{I}}$ deviate slightly from 1 and 0 respectively, because of the negligible, small-range singular-value nulling in equation (6).

The matrix $\tilde{\mathbb{I}}$ has $m$ rows and columns, equal to the number of enhancement-node neurons in the BLS. To understand and prune the enhancement-node neurons that cause large residual error, it suffices to know how much each row or column of $\tilde{\mathbb{I}}$ deviates from the corresponding row or column of $I \in \mathbb{R}^{m \times m}$.

FIG. 3 shows, for a BLS with tansig enhancement nodes, the difference between the matrix $A^{+}A$ and the identity matrix $I$ on the KDDCup99 data set: as the number of enhancement-node neurons increases, the absolute value of the difference between $A^{+}A$ and $I$ also rises with fluctuations, which demonstrates that the computation of $A^{+}$ via equation (6) is affected by truncation error.
Lemma 1. Let $A_{m-1} = [K(W_{hj}, \beta_{hj}, H_j)]$ ($h = 1, 2, \ldots, N$; $j = 1, 2, \ldots, m$) be the positive definite matrix of a BLS with $m$ enhancement-node neurons and $N$ instances. When the number of enhancement nodes is increased, $A_{m-1}$ is updated to $A_m = [A_{m-1} \; a_m] = [K(W_{hj}, \beta_{hj}, H_j)]$ ($h = 1, 2, \ldots, N$; $j = 1, 2, \ldots, m+1$). According to equation (6), the residual of $A_{m-1}$ can be defined as the number of its spectrum values that fall below the threshold,

$$E(A_{m-1}) = \left|\{\, j : \mu_j \le \alpha \,\}\right|$$

where $\alpha$ is the threshold and $\{\mu_j\}$, $j = 1, 2, \ldots, m$, is the eigenspectrum of $A_{m-1}^{\mathsf T} A_{m-1}$. Likewise, the residual of $A_m$ with $m+1$ enhancement nodes is

$$E(A_m) = \left|\{\, j : \rho_j \le \alpha \,\}\right|$$

where $\{\rho_j\}$, $j = 1, 2, \ldots, m+1$, is the eigenspectrum of $A_m^{\mathsf T} A_m$. It can be proved that:

$$E(A_m) \ge E(A_{m-1}) \tag{8}$$
To prove Lemma 1, assume the eigenvalues $\{\mu_j\}$ of $A_{m-1}^{\mathsf T} A_{m-1}$ are real and positive and sorted in descending order, $\mu_1 > \mu_2 > \cdots > \mu_m$, and that the eigenvalues $\{\rho_j\}$ of $A_m^{\mathsf T} A_m$ likewise satisfy $\rho_1 > \rho_2 > \cdots > \rho_{m+1}$. As guaranteed by the Courant-Fischer theorem, the eigenvalues of $A_{m-1}^{\mathsf T} A_{m-1}$ and $A_m^{\mathsf T} A_m$ interlace:

$$\rho_1 \ge \mu_1 \ge \rho_2 \ge \mu_2 \ge \cdots \ge \mu_m \ge \rho_{m+1} \tag{9}$$

According to equation (6), if $\theta_j \le \alpha$, the corresponding singular value on the diagonal of $R$ is set to 0. In the following proof the square root is ignored, as it has no effect on the argument. The largest eigenvalue cannot be 0: since $\rho_1 \ge \mu_1 > 0$, $\rho_1$ never contributes to the residual, and dropping $\rho_1$ from equation (9) gives:

$$\mu_1 \ge \rho_2 \ge \mu_2 \ge \cdots \ge \rho_m \ge \mu_m \ge \rho_{m+1} \tag{10}$$

that is, $\rho_{j+1} \le \mu_j$ for $j = 1, 2, \ldots, m$: whenever a new enhancement-node neuron is added, $\rho_{j+1} \le \mu_j$, so the spectrum values can only move downward and the error can only grow or stay the same.

Suppose that $T$ ($T < m$) eigenvalues of $A_{m-1}^{\mathsf T} A_{m-1}$ are set to 0, i.e. $\mu_j = 0$ for $j = (m-T+1), (m-T+2), \ldots, m$. Then, because $\rho_{j+1} \le \mu_j$, the same procedure yields $\rho_{j+1} = 0$ for $j = (m-T+1), (m-T+2), \ldots, m$. The $m-T$ eigenvalues $\mu_j$ remaining in iteration $(m-1)$ are all no smaller than the threshold $\alpha$, but each new value $\rho_{j+1}$ in the current iteration $m$ is no larger than the value $\mu_j$ of iteration $(m-1)$, i.e. $\rho_{j+1} \le \mu_j$, $j = 1, 2, \ldots, m-T$. Since the new values become smaller: 1) if at least one of them falls below the threshold $\alpha$, then $E(A_m) > E(A_{m-1})$; 2) if all of them remain no smaller than the threshold $\alpha$ of equation (6), then $E(A_m) = E(A_{m-1})$. In either case $E(A_m) \ge E(A_{m-1})$, and the proof of equation (8) is complete.
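The interlacing invoked for equation (9) is the Cauchy/Courant-Fischer interlacing of the Gram-matrix eigenvalues when a column is appended to $A$; it can be checked numerically (the sizes and the random matrix below are illustrative).

```python
import numpy as np

rng = np.random.default_rng(4)
A_m = rng.standard_normal((25, 7))       # A_m = [A_{m-1} a_m]
A_prev = A_m[:, :6]                      # A_{m-1}: the same matrix without a_m
mu = np.sort(np.linalg.eigvalsh(A_prev.T @ A_prev))[::-1]   # spectrum {mu_j}
rho = np.sort(np.linalg.eigvalsh(A_m.T @ A_m))[::-1]        # spectrum {rho_j}
# rho_1 >= mu_1 >= rho_2 >= ... >= mu_m >= rho_{m+1}   (eq. 9, up to round-off)
interlaced = all(rho[j] >= mu[j] - 1e-9 and mu[j] >= rho[j + 1] - 1e-9
                 for j in range(6))
```

$A_{m-1}^{\mathsf T} A_{m-1}$ is the leading principal submatrix of $A_m^{\mathsf T} A_m$, which is exactly the setting of the interlacing theorem.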
The increase of the error $E(A_m)$ in the current BLS network depends on the new column $a_m$ added to $A_m$; $a_m$ is constructed by randomly initializing the input weights $W_{hj}$ and thresholds $\beta_{hj}$ and by the choice of activation function. Since $W_{hj}$ and $\beta_{hj}$ are generated randomly, the singular values and residuals in equation (6) cannot be well controlled; therefore the enhancement-node neurons with high residual values are pruned away.
To compute the relative error, an element-by-element subtraction is first performed between $A^{+}A$ and the identity matrix $I$; for each iteration, the sum of the absolute values of the element differences is divided by the sum of the elements of the identity matrix $I$. Some neuron nodes thus help to reduce the relative error, while others increase it considerably; this relative difference serves as the criterion for pruning the enhancement-node neurons.
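The per-column relative-error criterion just described can be sketched as follows; the example with a deliberately duplicated column is an illustrative assumption.

```python
import numpy as np

def relative_errors(A):
    """Per-column relative error: sum_i |(A^+ A - I)_ij| / sum(I)."""
    I_tilde = np.linalg.pinv(A) @ A
    I = np.eye(A.shape[1])
    return np.abs(I_tilde - I).sum(axis=0) / I.sum()
```

A redundant column drags its entries of the pseudo-identity away from 0 and 1, so its relative error stands out, which is exactly what the pruning criterion exploits.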
4. Sparse phase of BLS
The BLS network is sparsified by using the residuals of the design matrix $A$, through a feed-forward feature-selection strategy performed on the pseudo-identity matrix $\tilde{\mathbb{I}} = A^{+}A$. On this basis Algorithm 1, a BLS pruned by residuals, is designed and named RES-BLS.
First, the input weights $W_{hj}$ and thresholds $\beta_{hj}$ are used to create a matrix $A$ with the maximum number of enhancement-node neurons. One enhancement-node neuron is then selected and its column taken from $A$, and progressively more enhancement-node neurons are taken as candidate neurons, so as to minimize the relative error of enhancement-node neuron $j$, computed as follows:

$$\omega_j = \frac{\operatorname{sum}\left(\lvert \tilde{\mathbb{I}}_j - I_j \rvert\right)}{\operatorname{sum}(I)} \tag{11}$$

where $I_j \in \mathbb{R}^m$ is the $j$-th column vector of the identity matrix $I$, $\tilde{\mathbb{I}}_j$ is the $j$-th column of the matrix $\tilde{\mathbb{I}} = A^{+}A$, $\operatorname{sum}(I)$ is the sum of all elements of the identity matrix $I$, and $\omega_j$ stores the relative error of enhancement node $j$. Equation (11) is used to evaluate the $m_{\max}$ enhancement nodes; among all enhancement nodes, the enhancement-node neurons with relatively small relative error are retained, and the index of each candidate neuron $\omega_j$ is stored in the index vector $\tilde{\imath}$.
The remaining $m_{\max} - 1$ enhancement nodes are then taken as candidate nodes; the relative error of the remaining enhancement-node neurons can be computed by equation (11), and the index and residual value of the next candidate node are appended to $\tilde{\imath}$ and $\omega$, respectively. This computation is repeated up to the maximum number of enhancement nodes to be retained, $m_{\max} \cdot \theta_k\%$; the parameter $\theta_k$ is the sparsity parameter, so the number retained is smaller than $m_{\max}$.

By updating $\tilde{\imath}$ and $\omega$ it can be observed which column vectors in $A$ have the higher relative-error values; the corresponding index values are easily found in $A$, and the remaining columns can be pruned accordingly. The sparse design matrix left after pruning is $\tilde{A} \in \mathbb{R}^{N \times m}$, where $m < m_{\max}$ is the number of column vectors remaining in $A$.
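The sparse stage above, keeping the enhancement-node columns with the smallest relative error, can be sketched as follows; the split between feature and enhancement columns, the sizes, and the function name are illustrative assumptions, a simplified one-shot version of the iterative candidate selection.

```python
import numpy as np

def prune_enhancement_nodes(A, n_feature_cols, keep):
    """Keep the `keep` enhancement-node columns of A whose relative error is
    smallest; feature-node columns are left untouched (simplified sparse stage)."""
    I_tilde = np.linalg.pinv(A) @ A
    omega = np.abs(I_tilde - np.eye(A.shape[1])).sum(axis=0)  # per-column deviation
    enh = np.arange(n_feature_cols, A.shape[1])               # enhancement columns
    kept = np.sort(enh[np.argsort(omega[enh])[:keep]])        # smallest relative error
    cols = np.concatenate([np.arange(n_feature_cols), kept])
    return A[:, cols], cols
```

After pruning, the output weights are re-solved on the smaller design matrix, so the redundant enhancement nodes no longer inflate the network structure.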
Although the invention has been described in detail above with reference to a general description and specific examples, it will be apparent to one skilled in the art that modifications or improvements may be made thereto based on the invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.

Claims (2)

1. An intrusion detection method based on a residual error sparse width learning system is characterized in that: the method comprises the following steps:
step1, preprocessing an original intrusion detection data set, wherein the preprocessing processes of the intrusion detection data set based on the network and the intrusion detection data set based on the host are respectively as follows:
A. preprocessing the intrusion detection data set based on the network:
(a) high-dimensional feature mapping: converting the discrete features into numeric features using a high-dimensional feature mapping;
(b) data normalization: because values of the same attribute can differ greatly, which hampers the training of the neural network, normalizing the data to real numbers in [0, 1];
B. preprocessing an intrusion detection data set based on a host:
(a) file-type conversion: converting the txt files into xls files and separating the attributes of each data type, to facilitate processing in MATLAB;
(b) bag-of-words characterization of the data: because the raw data in the data set are text features and inconvenient to use directly, characterizing the data by word frequency using a bag-of-words model;
step2, standard data set partitioning: dividing a standard data set into a training set and a test set;
step 3, model training: training and parameter tuning of the BLS model;
(a) initializing the parameters of the BLS model: the number of feature nodes n, the number of enhancement nodes m, the sparsity parameter θ_k, the vector
Figure RE-FDA0003039861920000011
and the initial weights and thresholds of the network;
(b) calculating the feature node mapping values Z_i = φ(XW_θi + β_θi) and the enhancement node mapping values H_j = ξ(Z^n W_hj + β_hj) of the BLS model;
(c) merging the feature node mapping values and the enhancement node mapping values into a matrix A;
(d) calculating the residual error A⁺A of the BLS model, sorting the enhancement nodes by residual, computing the relative error of each enhancement node, and pruning the enhancement nodes according to the relative error, retaining those with smaller relative error; if the iteration termination condition is met, ending the iteration and proceeding to step (e), otherwise returning to step (d), where the relative error is:
Figure RE-FDA0003039861920000012
(e) outputting the optimal BLS model, i.e., the RES-BLS model;
step 4, inputting the test data into the trained RES-BLS intrusion detection model to obtain the classification result for each piece of data.
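The BLS core of steps (a)–(c), together with a ridge-regularized pseudoinverse readout, can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the patent's exact RES-BLS routine: tanh is used for both φ and ξ, a single feature-node group is used, and the `ridge` parameter and function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def bls_fit(X, Y, n_feature=20, n_enhance=40, ridge=1e-3):
    """Fit a minimal broad learning system: Z -> H -> A = [Z | H] -> W."""
    d = X.shape[1]
    # (b) feature node mapping Z = phi(X W_theta + beta_theta)
    W_theta = rng.standard_normal((d, n_feature))
    b_theta = rng.standard_normal(n_feature)
    Z = np.tanh(X @ W_theta + b_theta)
    # (b) enhancement node mapping H = xi(Z W_h + beta_h)
    W_h = rng.standard_normal((n_feature, n_enhance))
    b_h = rng.standard_normal(n_enhance)
    H = np.tanh(Z @ W_h + b_h)
    # (c) merge both mappings into the design matrix A
    A = np.hstack([Z, H])
    # output weights via ridge-regularized pseudoinverse: W = (A^T A + cI)^-1 A^T Y
    W = np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ Y)
    return (W_theta, b_theta, W_h, b_h, W)

def bls_predict(model, X):
    W_theta, b_theta, W_h, b_h, W = model
    Z = np.tanh(X @ W_theta + b_theta)
    H = np.tanh(Z @ W_h + b_h)
    return np.hstack([Z, H]) @ W

# toy two-class data standing in for a preprocessed intrusion data set:
# the label is 1 when the first feature is positive
X = rng.standard_normal((200, 5))
Y = (X[:, 0] > 0).astype(float).reshape(-1, 1)
model = bls_fit(X, Y)
pred = (bls_predict(model, X) > 0.5).astype(float)
accuracy = (pred == Y).mean()
```

The residual-based enhancement-node pruning of step (d) would operate on the columns of `A` before the output weights are solved for.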
2. The intrusion detection method based on the residual error sparse width learning system according to claim 1, wherein the training or prediction time of the RES-BLS model is determined by its time complexity. Assume the number of training samples is m, the number of iterations is l, the numbers of neurons in the feature node layer and the output layer are n_1 and n_3 respectively, and the number of neurons in the enhancement node layer is n_2. In the original BLS model, the time complexity of processing one sample is o[(n_1+n_2)·n_3], so the overall time complexity of the BLS model is o{m·l·[(n_1+n_2)·n_3]}. The RES-BLS model improves on BLS by sorting the enhancement nodes according to relative error, which costs o(n_2·log n_2); the overall time complexity of the RES-BLS model is therefore o{m·l·[(n_1+n_2)·n_3 + n_2·log n_2]}.
CN202011524068.1A 2020-12-21 2020-12-21 Intrusion detection method based on residual error sparse width learning system Pending CN113159310A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011524068.1A CN113159310A (en) 2020-12-21 2020-12-21 Intrusion detection method based on residual error sparse width learning system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011524068.1A CN113159310A (en) 2020-12-21 2020-12-21 Intrusion detection method based on residual error sparse width learning system

Publications (1)

Publication Number Publication Date
CN113159310A true CN113159310A (en) 2021-07-23

Family

ID=76882698

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011524068.1A Pending CN113159310A (en) 2020-12-21 2020-12-21 Intrusion detection method based on residual error sparse width learning system

Country Status (1)

Country Link
CN (1) CN113159310A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115967631A (en) * 2022-12-19 2023-04-14 天津大学 Internet of things topology optimization method based on breadth learning and application thereof
CN117370717A (en) * 2023-12-06 2024-01-09 珠海錾芯半导体有限公司 Iterative optimization method for binary coordinate reduction

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108960339A (en) * 2018-07-20 2018-12-07 吉林大学珠海学院 A kind of electric car induction conductivity method for diagnosing faults based on width study
US20190183428A1 (en) * 2017-12-19 2019-06-20 Hill-Rom Services, Inc. Method and apparatus for applying machine learning to classify patient movement from load signals
CN111598236A (en) * 2020-05-20 2020-08-28 中国矿业大学 Width learning system network model compression method
CN111641598A (en) * 2020-05-11 2020-09-08 华南理工大学 Intrusion detection method based on width learning
CN111832748A (en) * 2020-08-24 2020-10-27 西南大学 Electronic nose width learning method for performing regression prediction on concentration of mixed gas

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190183428A1 (en) * 2017-12-19 2019-06-20 Hill-Rom Services, Inc. Method and apparatus for applying machine learning to classify patient movement from load signals
CN108960339A (en) * 2018-07-20 2018-12-07 吉林大学珠海学院 A kind of electric car induction conductivity method for diagnosing faults based on width study
CN111641598A (en) * 2020-05-11 2020-09-08 华南理工大学 Intrusion detection method based on width learning
CN111598236A (en) * 2020-05-20 2020-08-28 中国矿业大学 Width learning system network model compression method
CN111832748A (en) * 2020-08-24 2020-10-27 西南大学 Electronic nose width learning method for performing regression prediction on concentration of mixed gas

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
C. L. PHILIP CHEN等: "Broad Learning System: An Effective and Efficient Incremental Learning System Without the Need for Deep Architecture", 《IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS》 *
PEYMAN HOSSEINZADEH KASSANI: "Multimodal Sparse Classifier for Adolescent", 《IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS》 *
ZHIDA LI等: "Comparison of machine learning algorithms for detection of network intrusions", 《2018 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC)》 *
ZHANG SICONG et al.: "Intrusion detection method based on dCNN", Journal of Tsinghua University (Science and Technology) *
LI WANG: "Research on sparse broad learning systems and their applications", China Masters' Theses Full-text Database, Information Science and Technology *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115967631A (en) * 2022-12-19 2023-04-14 天津大学 Internet of things topology optimization method based on breadth learning and application thereof
CN117370717A (en) * 2023-12-06 2024-01-09 珠海錾芯半导体有限公司 Iterative optimization method for binary coordinate reduction
CN117370717B (en) * 2023-12-06 2024-03-26 珠海錾芯半导体有限公司 Iterative optimization method for binary coordinate reduction

Similar Documents

Publication Publication Date Title
Li et al. Building auto-encoder intrusion detection system based on random forest feature selection
Xu et al. An improved data anomaly detection method based on isolation forest
Rakkiyappan et al. Event-triggered H∞ state estimation for semi-Markov jumping discrete-time neural networks with quantization
CN111222133A (en) Multistage self-adaptive coupling method for industrial control network intrusion detection
CN113159310A (en) Intrusion detection method based on residual error sparse width learning system
Jagtap et al. Comparison of extreme-ANFIS and ANFIS networks for regression problems
Niranjan et al. ERCR TV: Ensemble of random committee and random tree for efficient anomaly classification using voting
Chen et al. A Generalized Matching Pursuit Approach for Graph-Structured Sparsity.
Tian et al. Adaptive normalized attacks for learning adversarial attacks and defenses in power systems
Kumar et al. Wind speed prediction using deep learning-LSTM and GRU
Zoltowski et al. Sparsity-promoting optimal control of spatially-invariant systems
Luo et al. ML-KELM: A kernel extreme learning machine scheme for multi-label classification of real time data stream in SIoT
Guang et al. Benchmark datasets for stochastic Petri net learning
Song et al. Real-time anomaly detection method for space imager streaming data based on HTM algorithm
Zeng et al. Computation of Adalines' sensitivity to weight perturbation
Wang et al. An anomaly detection method of industrial data based on stacking integration
Chakraborty et al. Brain-Inspired Spiking Neural Network for Online Unsupervised Time Series Prediction
Rao et al. Robust stability of nonlinear diffusion fuzzy neural networks with parameter uncertainties and time delays
Sun et al. Nonlinear function approximation based on least Wilcoxon Takagi-Sugeno fuzzy model
Liu et al. Network traffic big data prediction model based on combinatorial learning
Chakrabarti et al. A review on various artificial intelligence techniques used for transmission line fault location
Sharma et al. An adaptive sigmoidal activation function cascading neural networks
Liu An improved Bayesian network intrusion detection algorithm based on deep learning
Chen et al. Sparse LSTM neural network with hybrid PSO algorithm
Chen et al. Quantized minimum error entropy criterion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210723