CN101582813B - Distributed migration network learning-based intrusion detection system and method thereof - Google Patents

Publication number
CN101582813B
Authority
CN
China
Prior art keywords
sample
prime
record
network
submodule
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2009100230731A
Other languages
Chinese (zh)
Other versions
CN101582813A (en)
Inventor
缑水平
焦李成
王宇琴
田小林
王爽
马文萍
吴建设
慕彩红
冯静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN2009100230731A priority Critical patent/CN101582813B/en
Publication of CN101582813A publication Critical patent/CN101582813A/en
Application granted granted Critical
Publication of CN101582813B publication Critical patent/CN101582813B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses an intrusion detection system and method based on distributed migration network learning, which mainly address two problems of existing methods: low detection efficiency for some attack types and the difficulty of collecting data again. The system comprises a network behavior record preprocessing module, an abnormality detection module and an abnormal behavior analysis module. The network behavior record preprocessing module quantizes and normalizes network behavior records. The abnormality detection module uses an abnormality detection learning machine to classify each input record and determine whether it is normal behavior; if it is, detection ends, and if it is not, the record is passed to the abnormal behavior analysis module. The abnormal behavior analysis module uses an abnormal behavior analysis learning machine to classify the input record and outputs its attack type. The system and method use other existing resources to raise the detection rate for attack types that were previously detected poorly, avoid collecting data again, and can be used for network intrusion detection.

Description

Intrusion detection system and method based on distributed migration network learning
Technical field
The invention belongs to the field of network security and specifically relates to an intrusion detection system that can be used for intrusion detection in information security.
Background art
With the widespread use of the Internet, the growing number of malicious attacks on computer networks threatens the security of information systems. Passively managed network security systems such as firewalls are powerless against application-layer backdoors and against attacks, theft or destruction of information caused by the unauthorized operations of internal users. The firewall itself is easily attacked and is often helpless against security problems that arise inside the internal network. As an important complement to the firewall, the intrusion detection system (IDS) is becoming increasingly important. An IDS can resist attacks from the internal network, block hacker intrusions and prevent the spread of viruses, and thus acts as the security backing of the firewall. Using audit records, an IDS can identify any undesirable activity and restrict it, thereby protecting the system. In 1998, MIT Lincoln Laboratory, in cooperation with DARPA, developed an intrusion detection system evaluation program. One task of this program was to provide an intrusion detection data set containing host logs and network traffic. KDD CUP '99 processed the nine weeks of tcpdump data provided by DARPA and extracted features from it, producing a standard intrusion detection data set.
In recent years, with the development of machine learning, new intrusion detection techniques have kept emerging, such as single-machine intrusion detection methods based on neural networks, Bayesian networks and support vector machines. Faced with new environments and new network security problems, however, these techniques suffer from high false alarm rates, poor adaptivity, a low degree of automatic response and a low degree of intelligence, so distributed algorithms have become necessary for improving the detection speed and accuracy of an IDS. In 2006, Wang Shijun et al. introduced the Boosting algorithm into classifier networks, combined classifier networks with classifier ensembles, extended the approach to distributed environments, and proposed the distributed network ensemble algorithm DNB, which obtains a classifier system with stronger generalization ability through communication and cooperation among the node classifiers. Like other traditional machine learning methods, the DNB-based intrusion detection method cannot train a good classifier model when labeled data are scarce, and it requires the training data and test data to be independent and identically distributed. It therefore has the following shortcomings:
1. The detection rates for network behaviors of different attack types are imbalanced, and the detection rate for some attack types is very low;
2. Improving the detection rate requires the user to collect data again and relearn, which is costly and time-consuming;
3. Other existing available resources cannot be used to improve the detection rate for certain attack types.
Summary of the invention
The object of the invention is to overcome the shortcomings of the above intrusion detection method by introducing transfer learning into DNB, and to propose an intrusion detection system and method based on distributed migration network learning that use other existing data to guide the learning of network behaviors with low detection rates, thereby improving their detection rates.
To achieve this object, the detection system of the invention comprises:
A network behavior record preprocessing module, which quantizes and normalizes the collected network behavior records and transmits the preprocessed result to the abnormality detection module;
An abnormality detection module, which uses the abnormality detection learning machine to classify the input record and determine whether it is normal behavior; if the record is normal, no further processing is done and detection ends; otherwise, the record is passed to the abnormal behavior analysis module;
An abnormal behavior analysis module, which uses the abnormal behavior analysis learning machine to classify the input abnormal record and output the attack type to which it belongs.
The network behavior record preprocessing module comprises:
An existing-record preprocessing submodule, which quantizes and normalizes the existing labeled network behavior record set and passes the parameters obtained from quantization and normalization to the new-record preprocessing submodule;
A new-record preprocessing submodule, which uses the parameters passed in by the existing-record preprocessing submodule to quantize and normalize new network behavior records.
The abnormality detection module comprises:
An abnormality detection learning submodule, which divides the preprocessed existing labeled network behavior record set into two classes, normal and abnormal, randomly draws some samples from each class, learns with the distributed network ensemble learning algorithm to generate the abnormality detection learning machine, and passes this learning machine to the abnormality detection test submodule;
An abnormality detection test submodule, which uses the abnormality detection learning machine to classify the preprocessed new record; if the output is normal, no further processing is done and detection ends; otherwise, the record is passed to the abnormal behavior analysis module.
The abnormal behavior analysis module comprises:
A transfer sample pre-selection submodule, which designates source-domain samples and labeled target-domain samples from the existing labeled network behavior records, pre-selects source-domain samples according to the target-domain samples to be guided, and passes the selected source-domain transfer samples to the abnormal behavior analysis learning submodule;
An abnormal behavior analysis learning submodule, which uses the input source-domain transfer samples together with the labeled target-domain samples as training samples and learns with the distributed network ensemble learning algorithm incorporating transfer learning to generate the abnormal behavior analysis learning machine;
An abnormal behavior analysis test submodule, which uses the abnormal behavior analysis learning machine to classify the input abnormal behavior and output its attack type.
To achieve the above object, the detection method of the invention comprises the following steps:
(1) Input the existing labeled network behavior record set $X$, and quantize and normalize it to obtain the preprocessed result $X'$;
(2) Divide the preprocessed result $X'$ into two classes, normal and abnormal, where the abnormal class contains $M$ attack types; randomly draw some samples from the normal and the abnormal samples respectively, and run the distributed network ensemble learning algorithm for $T_1$ training rounds on a network topology containing $K_1$ nodes to generate the classifier network system of the abnormality detection learning machine;
(3) Take the normal-type samples in $X'$ as the source-domain sample set $X^S$, with $m$ samples, and the abnormal-type samples as the target-domain sample set $X^T$; take the samples in $X^T$ of the abnormal type with a low detection rate as the target-domain to-be-guided sample set $X^{T1}$, with $n_1$ samples; divide $X^S$ evenly into $[m/n_1]$ parts, written $X^S = X_1^S \cup X_2^S \cup \dots \cup X_{[m/n_1]}^S$, where $[\cdot]$ is the rounding operation; combine each $X_i^S$ with $X^{T1}$ into a training set $T_i^1$ ($i = 1, 2, \dots, [m/n_1]$); adjust the sample weights using the weight-adjustment method of the AdaBoost training process, and select the source-domain sample subset $X_{Sub}^S$ whose weights are larger;
(4) Use the source-domain sample subset $X_{Sub}^S$ and the other-type samples of the target domain as training samples, apply the weight-adjustment method of the AdaBoost training process again, remove from $X_{Sub}^S$ the source-domain samples whose weights are larger, and form the source-domain transfer sample set $TR_D$ from the samples remaining in $X_{Sub}^S$;
(5) Randomly draw some samples from the target-domain sample set $X^T$ to form the target-domain sample subset $TR_S$, and use it together with the selected source-domain transfer sample set $TR_D$ as training samples, giving $TR_D$ the same label as the target-domain samples to be guided; lay out a network topology containing $K_2$ nodes, input the sampling rate $\rho_2$ and the number of training rounds $T_2$, and distribute $TR_S$ and $TR_D$ over the nodes to generate the training sample set $S_k$ on each node, $k = 1, 2, \dots, K_2$; from these training sample sets, generate the abnormal behavior analysis learning machine as follows:
5a) Initialize the weights of the samples in each node's training sample set $S_k$;
5b) Perform weighted sampling with replacement on each node's training sample set $S_k$ to obtain the training subset of each node, train the learning algorithm of each node on it to obtain the base classifier $C_{k,t}^2$ of each node, and classify $S_k$ with the base classifier of that node, where $t$ is the current training round;
5c) From the classification results on $S_k$, compute the weighted error rate $\epsilon_{k,t}$ of the target-domain samples on each node, and from $\epsilon_{k,t}$ compute the weight $\alpha_{k,t}^2$ of each base classifier;
5d) Update the weights of the source-domain transfer samples and the target-domain samples; when $t < T_2$, return to step 5b); when $t = T_2$, end training, obtaining the classifier network system of the abnormal behavior analysis learning machine composed of all base classifiers $C_{k,t}^2$ ($k = 1, 2, \dots, K_2$, $t = 1, 2, \dots, T_2$);
(6) Input a new network behavior record $x''$, and quantize and normalize it to obtain the preprocessed record $x'''$;
(7) Input $x'''$ into the classifier network system of the abnormality detection learning machine generated in step (2) and classify it, obtaining the classification result:
$$H_1(x''') = \operatorname{sign}\left( \sum_{k=1}^{K_1} \sum_{t=1}^{T_1} \left( \alpha_{k,t}^1 h_{k,t}^1(x''') + \sum_{p} \alpha_{p,t}^1 h_{p,t}^1(x''') \right) \right)$$
where $h_{k,t}^1(x''')$ is the classification result of each base classifier in the abnormality detection learning machine for $x'''$, $\alpha_{k,t}^1$ is the weight of each base classifier, and $p$ indexes the neighboring nodes of node $k$. When $H_1(x''') = 1$, the record belongs to the normal type, no processing is done and the detection process ends; when $H_1(x''') = -1$, the record belongs to the abnormal type and the method proceeds to step (8);
(8) Input $x'''$ into the classifier network system of the abnormal behavior analysis learning machine generated in step (5) and classify it, obtaining the classification result:
$$H_2(x''') = \arg\max_{y \in Y} \left( \sum_{k=1}^{K_2} \sum_{t=1}^{T_2} \left( \alpha_{k,t}^2 I[h_{k,t}^2(x''') = y] + \sum_{p} \alpha_{p,t}^2 I[h_{p,t}^2(x''') = y] \right) \right)$$
where $h_{k,t}^2(x''')$ is the classification result of each base classifier in the abnormal behavior analysis learning machine for $x'''$, with $h_{k,t}^2(x''') \in Y$ and $Y = \{1, 2, \dots, M\}$, the values $1, 2, \dots, M$ being the index numbers of the $M$ attack types; $I[\cdot]$ is the indicator function, taking the value 0 or 1; and $H_2(x''') \in Y$;
(9) Use $H_2(x''')$ as an index number, look up the attack type corresponding to that index, and output this attack type as the final detection result.
Compared with the prior art, the invention has the following advantages:
1) Because it introduces transfer learning, the invention can use other existing labeled data to guide the learning of attack types with low detection rates, without collecting data again;
2) Because it uses the AdaBoost method of adjusting sample weights, the invention can select transfer samples that are instructive for attack types with low detection rates;
3) Because it uses the distributed network ensemble learning algorithm incorporating transfer learning, the generated abnormal behavior analysis learning machine achieves a higher detection rate for attack types whose detection rate was previously low;
4) Because it uses the distributed network ensemble learning algorithm, the generated abnormality detection learning machine has higher abnormality detection accuracy.
The invention is a network-based intrusion detection system and can be used in various complex network environments. Simulation results show that, on the standard large-scale network intrusion detection data set KDD CUP '99, the intrusion detection method based on distributed migration network learning raises the detection rate for the R2L attack type by about 87.3% relative to the rate before transfer learning was introduced.
Description of drawings
Fig. 1 is a schematic diagram of the intrusion detection system based on distributed migration network learning of the invention;
Fig. 2 is a flow chart of the intrusion detection method based on distributed migration network learning of the invention;
Fig. 3 is a flow chart of the pre-selection of source-domain transfer samples in the invention;
Fig. 4 is a flow chart of generating the abnormal behavior analysis learning machine in the invention.
Embodiment
The invention is described in detail below with reference to the drawings and specific embodiments.
With reference to Fig. 1, the intrusion detection system based on distributed migration network learning of the invention mainly comprises a network behavior record preprocessing module, an abnormality detection module and an abnormal behavior analysis module, where:
The network behavior record preprocessing module comprises an existing-record preprocessing submodule and a new-record preprocessing submodule. The existing-record preprocessing submodule quantizes and normalizes the existing labeled network behavior record set and passes the parameters obtained from quantization and normalization to the new-record preprocessing submodule. The new-record preprocessing submodule uses these parameters to quantize and normalize new records.
The abnormality detection module comprises an abnormality detection learning submodule and an abnormality detection test submodule. The abnormality detection learning submodule divides the preprocessed existing labeled network behavior record set into two classes, normal and abnormal, randomly draws some samples from each class, learns with the distributed network ensemble learning algorithm to generate the abnormality detection learning machine, and passes this learning machine to the abnormality detection test submodule. The abnormality detection test submodule uses the abnormality detection learning machine to classify the preprocessed new record; if the output is normal, no further processing is done and detection ends; otherwise, the record is passed to the abnormal behavior analysis module.
The abnormal behavior analysis module comprises a transfer sample pre-selection submodule, an abnormal behavior analysis learning submodule and an abnormal behavior analysis test submodule. The transfer sample pre-selection submodule designates source-domain samples and labeled target-domain samples from the existing labeled network behavior records, pre-selects source-domain samples according to the target-domain samples to be guided, and passes the selected source-domain transfer samples to the abnormal behavior analysis learning submodule. The abnormal behavior analysis learning submodule uses the input source-domain transfer samples together with the labeled target-domain samples as training samples and learns with the distributed network ensemble learning algorithm incorporating transfer learning to generate the abnormal behavior analysis learning machine. The abnormal behavior analysis test submodule uses the abnormal behavior analysis learning machine to classify the input abnormal behavior and output its attack type.
The intrusion detection system of the invention quantizes and normalizes the existing network behavior records and records the parameters obtained from preprocessing. The preprocessed existing records are divided into two classes, normal and abnormal; part of the data is drawn from each class as training samples, and the distributed network ensemble learning algorithm is used to learn the abnormality detection learning machine. Next, with the normal records as source-domain samples and the abnormal records as target-domain samples, the sample weights are adjusted by the AdaBoost algorithm and the source-domain transfer samples are pre-selected according to those weights. Some samples are then randomly drawn from each of the four classes of abnormal data in the preprocessed existing records and, together with the pre-selected source-domain transfer samples, used as training samples; the distributed network ensemble algorithm incorporating transfer learning is used to generate the abnormal behavior analysis learning machine. When a new network behavior record is input, it is quantized and normalized using the parameters obtained from preprocessing the existing records, and the preprocessed result is input to the abnormality detection learning machine for testing. If the record is detected as normal, no further processing is done and detection ends; otherwise, the record is input to the abnormal behavior analysis learning machine, which finally outputs the attack type to which it belongs.
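The two-stage control flow described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `detect`, the callable parameters and the ordering of `attack_names` are all hypothetical, and the two learning machines are represented only as functions that return a label.

```python
from typing import Callable, Optional, Sequence

Record = Sequence[float]


def detect(record: Record,
           preprocess: Callable[[Record], Record],
           anomaly_detector: Callable[[Record], int],
           attack_classifier: Callable[[Record], int],
           attack_names: Sequence[str]) -> Optional[str]:
    """Return None for normal behavior, otherwise the attack type name.

    anomaly_detector returns +1 (normal) or -1 (abnormal);
    attack_classifier returns an index in 1..M.
    """
    x = preprocess(record)             # quantization and normalization
    if anomaly_detector(x) == 1:       # normal: detection ends here
        return None
    index = attack_classifier(x)       # only abnormal records reach stage two
    return attack_names[index - 1]     # look up the attack type by index
```

Only records flagged as abnormal reach the second learning machine, so the attack classifier never has to model normal traffic.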
With reference to Fig. 2, the intrusion detection method of the invention comprises the following steps:
Step 1: Input the existing labeled network behavior record set $X$, and quantize and normalize it to obtain the preprocessed result $X'$.
The preprocessing proceeds as follows:
1a) For the attributes in $X$ whose values are character strings, count the distinct values of each attribute and quantize them, obtaining the quantized result $X_1$;
1b) In $X_1$, the values of the source-address byte count and destination-address byte count attributes range over $[0, 1.3 \times 10^9]$; apply the $\log_{10}(\cdot)$ transform to these two attributes, which maps their range to $[0.0, 9.14]$, obtaining the transformed result $X_2$;
1c) Normalize the transformed result $X_2$ as follows:
Suppose $X_2$ contains $n$ samples, each with $d$ features. Write the $d$ feature vectors over the $n$ samples as $X_2 = [F_1, F_2, \dots, F_d]$, where the $i$-th feature vector is $F_i = [f_{i1}, f_{i2}, \dots, f_{in}]$. All features are normalized by the following transform:
$$f'_{ij} = f_{ij} / \max(F_i), \quad i = 1, 2, \dots, d, \quad j = 1, 2, \dots, n$$
$$F'_i = [f'_{i1}, f'_{i2}, \dots, f'_{in}], \quad i = 1, 2, \dots, d$$
$$X' = [F'_1, F'_2, \dots, F'_d]$$
This gives the preprocessed result $X'$.
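Steps 1a) to 1c) can be sketched in Python as below. The function name `fit_preprocess`, the column-index arguments and the returned parameter dictionary are hypothetical. Two details are assumptions that the text does not specify: string values are coded as 1, 2, ... in order of first appearance, and byte counts below 1 are mapped to 0.0 because $\log_{10}(0)$ is undefined.

```python
import math


def fit_preprocess(records, string_cols, byte_cols):
    """Quantize, log-transform and max-normalize labeled records.

    Returns the scaled rows plus the parameters (string codes and
    per-feature maxima) needed later to preprocess new records.
    """
    codes = {c: {} for c in string_cols}
    rows = []
    for rec in records:
        row = list(rec)
        # 1a) assumption: each distinct string gets the next integer code
        for c in string_cols:
            table = codes[c]
            row[c] = table.setdefault(row[c], len(table) + 1)
        # 1b) log10 transform; assumption: values below 1 map to 0.0
        for c in byte_cols:
            row[c] = math.log10(row[c]) if row[c] >= 1 else 0.0
        rows.append([float(v) for v in row])
    # 1c) divide every feature by its maximum over all samples
    d = len(rows[0])
    maxima = [max(r[i] for r in rows) for i in range(d)]
    scaled = [[r[i] / maxima[i] if maxima[i] > 0 else 0.0 for i in range(d)]
              for r in rows]
    return scaled, {"codes": codes, "maxima": maxima}
```

Returning the codes and maxima alongside the scaled data mirrors the way the existing-record preprocessing submodule hands its parameters to the new-record preprocessing submodule.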
Step 2: Divide the preprocessed result $X'$ of the existing labeled network behavior record set into two classes, normal and abnormal, where the abnormal class contains $M$ attack types; randomly draw some samples from the normal and the abnormal samples respectively as training samples, and run the distributed network ensemble learning algorithm for $T_1$ training rounds on a network topology containing $K_1$ nodes to generate the classifier network system of the abnormality detection learning machine.
Step 3: Take the existing labeled samples of the other type in $X'$ as the source-domain sample set $X^S$, with $m$ samples, and the abnormal-type samples as the target-domain sample set $X^T$; take the samples in $X^T$ of the abnormal type with a low detection rate as the target-domain to-be-guided sample set $X^{T1}$, with $n_1$ samples; from $X^S$ and $X^{T1}$, using the weight-adjustment method of the AdaBoost training process, select the source-domain sample subset $X_{Sub}^S$ whose sample weights are larger.
The flow for selecting the source-domain sample subset $X_{Sub}^S$ is shown in Fig. 3(a); its steps are as follows:
3a) Input the source-domain sample set $X^S$, the target-domain sample set $X^T$, in which the labeled sample set of the to-be-guided type is $X^{T1}$, and the sample weight threshold $W_1$;
3b) Since $n_1 \ll m$, set the label of $X^S$ to $+1$ and the label of $X^{T1}$ to $-1$; to balance the two classes, divide $X^S$ evenly into $[m/n_1]$ parts, that is, $X^S = X_1^S \cup X_2^S \cup \dots \cup X_{[m/n_1]}^S$, where $[\cdot]$ is the rounding operation, and form the training sets $T_i^1 = X_i^S \cup X^{T1}$ ($i = 1, 2, \dots, [m/n_1]$);
3c) Input each $T_i^1$ ($i = 1, 2, \dots, [m/n_1]$) into the AdaBoost algorithm for training; after multiple training rounds, select from each $T_i^1$ the samples whose weight exceeds the threshold $W_1$ and that belong to $X_i^S$, forming the source-domain sample subset $X_{Sub}^S$.
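A sketch of steps 3a) to 3c) under stated assumptions follows. The base learner here is a one-feature decision stump, chosen only to keep the example self-contained; the simulation in this document uses a kernel matching pursuit learning machine instead. The names `fit_stump`, `adaboost_weights` and `preselect_source`, the default number of rounds and the clamping of the error rate are all hypothetical.

```python
import numpy as np


def fit_stump(X, y, w):
    """Return (error, feature, threshold, sign) of the best weighted stump."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = np.where(X[:, j] <= thr, sign, -sign)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, j, thr, sign)
    return best


def adaboost_weights(X, y, rounds):
    """Run AdaBoost and return the final normalized sample weights."""
    w = np.full(len(y), 1.0 / len(y))
    for _ in range(rounds):
        err, j, thr, sign = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # assumption: avoid log(0)
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(X[:, j] <= thr, sign, -sign)
        w = w * np.exp(-alpha * y * pred)       # misclassified samples gain weight
        w = w / w.sum()
    return w


def preselect_source(XS, XT1, W1, rounds=5):
    """Indices of source samples whose final weight exceeds W1 (step 3c)."""
    n1 = len(XT1)
    parts = max(1, len(XS) // n1)               # [m / n1] balanced parts
    keep = []
    for idx in np.array_split(np.arange(len(XS)), parts):
        X = np.vstack([XS[idx], XT1])           # training set T_i^1
        y = np.concatenate([np.ones(len(idx)), -np.ones(n1)])
        w = adaboost_weights(X, y, rounds)
        keep.extend(idx[w[:len(idx)] > W1])
    return np.array(sorted(keep), dtype=int)
```

Source samples that AdaBoost finds hard to separate from the to-be-guided target samples accumulate weight, so thresholding on $W_1$ keeps the source samples that most resemble the target type.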
Step 4: Use the source-domain sample subset $X_{Sub}^S$ and the other-type samples of the target domain as training samples, apply the weight-adjustment method of the AdaBoost training process again, remove from $X_{Sub}^S$ the source-domain samples whose weights are larger, and form the source-domain transfer sample set $TR_D$ from the samples remaining in $X_{Sub}^S$.
The flow for removing the larger-weight source-domain samples from $X_{Sub}^S$ is shown in Fig. 3(b); its steps are as follows:
4a) Denote by $X^{T2}$ the sample set of the other abnormal types, that is, the target-domain sample set $X^T$ with the to-be-guided sample set $X^{T1}$ removed; set the label of $X_{Sub}^S$ to $+1$ and the label of $X^{T2}$ to $-1$, and input the sample weight threshold $W_2$;
4b) Combine $X_{Sub}^S$ and $X^{T2}$ into the training set $T^2$ and input it into the AdaBoost algorithm for training so that the sample weights are adjusted; after multiple training rounds, remove from the training set the samples whose weight exceeds the threshold $W_2$ and that belong to $X_{Sub}^S$, and form the source-domain transfer sample set $TR_D$ from the samples remaining in $X_{Sub}^S$.
Step 5: Randomly draw some samples from the target-domain sample set $X^T$ to form the target-domain sample subset $TR_S$, and use it together with the selected source-domain transfer sample set $TR_D$ as training samples, giving $TR_D$ the same label as the target-domain samples to be guided; train with the distributed network ensemble learning algorithm incorporating transfer learning to generate the abnormal behavior analysis learning machine. The detailed process is shown in Fig. 4:
5a) Input a network topology containing $K_2$ nodes, the sampling rate $\rho$ and the number of training rounds $T_2$, and distribute the target-domain sample set $TR_S$ and the source-domain transfer sample set $TR_D$ over the nodes to generate the training sample set $S_k$ on each node, $k = 1, 2, \dots, K_2$;
5b) Initialize the weight of the $i$-th sample on each node $k$ as $D_{k,1}(x_i) = 1/l_k$, where $l_k$ is the number of samples in the training set $S_k$ on node $k$, $i = 1, 2, \dots, l_k$, $k = 1, 2, \dots, K_2$;
5c) Using weighted sampling with replacement, sample from the training set $S_k$ of each node to obtain the training subset of each node, which contains $l_k \rho$ samples, $k = 1, 2, \dots, K_2$; $t$ denotes the current training round;
5d) From the training subset of each node, train the base classifier $C_{k,t}^2$ at node $k$, and use $C_{k,t}^2$ to classify the training set $S_k$ on that node;
5e) From the classification results on $S_k$, compute the weighted error rate $\epsilon_{k,t}$ of the target-domain samples:
$$\epsilon_{k,t} = \sum_{x_i \in S_k \cap TR_S} D_{k,t}(x_i) \, I[y(x_i) \neq h_{k,t}^2(x_i)], \quad k = 1, 2, \dots, K_2;$$
where $h_{k,t}^2(x_i)$ is the classification result of the base classifier $C_{k,t}^2$ for sample $x_i$, $h_{k,t}^2(x_i) \in Y$, $Y = \{1, 2, \dots, M\}$, the values $1, 2, \dots, M$ are the index numbers of the $M$ attack types, and $y(x_i)$ is the known label of sample $x_i$;
5f) Compute the weight $\alpha_{k,t}^2$ of the base classifier $C_{k,t}^2$:
$$\alpha_{k,t}^2 = 0.5 \times \log\left( \frac{1 - \epsilon_{k,t}}{\epsilon_{k,t}} \right), \quad k = 1, 2, \dots, K_2;$$
5g) Compute the weight update parameter of the target-domain samples, $\beta_{k,t} = \dfrac{\epsilon_{k,t}}{1 - \epsilon_{k,t}}$, and the weight update parameter of the source-domain transfer samples, $\gamma_k = \dfrac{1}{1 + 2 \ln(m_k / T_2)}$, where $m_k$ is the number of source-domain transfer samples at node $k$;
5h) Update the weight $D_{k,t}(x_i)$ of sample $x_i$ at node $k$ to obtain the updated weight $D_{k,t+1}(x_i)$:
$$D_{k,t+1}(x_i) = \begin{cases} \dfrac{D_{k,t}(x_i) \cdot \beta_{k,t}^{\lambda_{k,t}(x_i)}}{Z_{k,t}}, & x_i \in S_k \cap TR_S \\[2ex] \dfrac{D_{k,t}(x_i) \cdot \gamma_k^{-\lambda_{k,t}(x_i)}}{Z_{k,t}}, & x_i \in S_k \cap TR_D \end{cases}$$
where $x_i \in S_k \cap TR_S$ means that $x_i$ is a target-domain sample in $S_k$, $x_i \in S_k \cap TR_D$ means that $x_i$ is a source-domain transfer sample in $S_k$, and
$$Z_{k,t} = \sum_{x_i \in S_k \cap TR_S} D_{k,t}(x_i) \cdot \beta_{k,t}^{\lambda_{k,t}(x_i)} + \sum_{x_i \in S_k \cap TR_D} D_{k,t}(x_i) \cdot \gamma_k^{-\lambda_{k,t}(x_i)}$$
$$\lambda_{k,t}(x_i) = -2 \alpha_{k,t}^2 \left( I(y(x_i) \neq h_{k,t}^2(x_i)) - 1/2 \right) - 2 \sum_{p} \alpha_{p,t}^2 \left( I(y(x_i) \neq h_{p,t}^2(x_i)) - 1/2 \right)$$
where $p$ indexes the neighboring nodes of node $k$;
5i) When $t < T_2$, return to step 5c); when $t = T_2$, end training, obtaining the classifier network system of the abnormal behavior analysis learning machine composed of all base classifiers $C_{k,t}^2$, $k = 1, 2, \dots, K_2$, $t = 1, 2, \dots, T_2$.
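One round of steps 5e) to 5h) on a single node can be sketched with NumPy as below. The function name `update_node_weights` and its argument layout are hypothetical. The error rate is clamped away from 0 and 1 as an added safeguard, and the sketch assumes $m_k > T_2$ so that $\gamma_k$, computed exactly as printed in step 5g), lies between 0 and 1.

```python
import numpy as np


def update_node_weights(D, y, h_own, h_neighbors, alpha_neighbors,
                        is_target, T2):
    """Return (updated weights D_{k,t+1}, alpha of this node's classifier).

    D          : current weights of the samples in S_k (sums to 1)
    y, h_own   : true labels and this node's predictions on S_k
    h_neighbors, alpha_neighbors : predictions and weights of the
                 neighboring nodes' base classifiers on the same S_k
    is_target  : boolean mask, True for target-domain samples (TR_S)
    """
    miss = (h_own != y).astype(float)
    # 5e) weighted error over target-domain samples only
    eps = float(np.sum(D[is_target] * miss[is_target]))
    eps = min(max(eps, 1e-10), 1 - 1e-10)        # assumption: clamp
    # 5f) classifier weight
    alpha = 0.5 * np.log((1 - eps) / eps)
    # 5g) update parameters for target and source transfer samples
    beta = eps / (1 - eps)
    m_k = int(np.sum(~is_target))
    gamma = 1.0 / (1.0 + 2.0 * np.log(m_k / T2))
    # lambda combines this node's result with its neighbors' results
    lam = -2.0 * alpha * (miss - 0.5)
    for h_p, a_p in zip(h_neighbors, alpha_neighbors):
        lam -= 2.0 * a_p * ((h_p != y).astype(float) - 0.5)
    # 5h) target samples use beta, source transfer samples use gamma
    new = np.where(is_target, D * beta ** lam, D * gamma ** (-lam))
    return new / new.sum(), alpha                # division by Z_{k,t}
```

With an error rate below 0.5, a misclassified target-domain sample gains weight while a misclassified source-domain transfer sample loses weight, which progressively discounts source samples that disagree with the target task.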
Step 6: Input a new network behavior record $x''$ and preprocess it to obtain the preprocessed result $x'''$.
6a) For the attributes whose values are character strings, quantize them by the method of step 1a), obtaining the quantized result $x''_1$;
6b) Apply the $\log_{10}(\cdot)$ transform to the source-address byte count and destination-address byte count attributes, obtaining the transformed result $x''_2$;
6c) Write the $d$ features of record $x''_2$ as $x''_2 = \{f''_1, f''_2, \dots, f''_d\}$, and normalize $x''_2$ as follows to obtain the preprocessed result $x'''$:
$$f'''_i = \begin{cases} f''_i / \max(F_i), & f''_i \le \max(F_i) \\ 1, & f''_i > \max(F_i) \end{cases}, \quad i = 1, 2, \dots, d$$
$$x''' = \{f'''_1, f'''_2, \dots, f'''_d\}.$$
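Step 6 can be sketched as follows, reusing parameters learned from the existing records. The function name `preprocess_new` and the layout of `params` (a dictionary with `"codes"` and `"maxima"`) are hypothetical. Mapping an unseen string value to code 0 and mapping byte counts below 1 to 0.0 are assumptions, since the text does not cover those cases.

```python
import math


def preprocess_new(record, params, string_cols, byte_cols):
    """Quantize, log-transform and normalize one new record (steps 6a-6c)."""
    row = list(record)
    # 6a) reuse the string codes learned from the existing records
    for c in string_cols:
        row[c] = params["codes"][c].get(row[c], 0)   # assumption: unseen -> 0
    # 6b) log10 transform of the byte-count attributes
    for c in byte_cols:
        row[c] = math.log10(row[c]) if row[c] >= 1 else 0.0
    # 6c) divide by the training maximum and cap values above it at 1
    out = []
    for value, top in zip(row, params["maxima"]):
        value = float(value)
        if value > top:
            out.append(1.0)
        else:
            out.append(value / top if top > 0 else 0.0)
    return out
```

Capping at 1 keeps a new record inside the range seen during training even when one of its features exceeds the training maximum.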
Step 7: Input $x'''$ into the classifier network system of the abnormality detection learning machine generated in step 2 and classify it, obtaining the classification result:
$$H_1(x''') = \operatorname{sign}\left( \sum_{k=1}^{K_1} \sum_{t=1}^{T_1} \left( \alpha_{k,t}^1 h_{k,t}^1(x''') + \sum_{p} \alpha_{p,t}^1 h_{p,t}^1(x''') \right) \right)$$
where $h_{k,t}^1(x''')$ is the classification result of each base classifier in the abnormality detection learning machine for $x'''$, $\alpha_{k,t}^1$ is the weight of each base classifier, and $p$ indexes the neighboring nodes of node $k$. When $H_1(x''') = 1$, the record belongs to the normal type, no processing is done and the detection process ends; when $H_1(x''') = -1$, the record belongs to the abnormal type and the method proceeds to step 8.
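The weighted vote $H_1$ can be sketched as below. The function name `classify_normal_vs_abnormal` and the nested-list layout (`votes[k][t]` for node $k$ in round $t$, `neighbors[k]` for the nodes adjacent to $k$) are hypothetical. Treating a sum of exactly zero as normal is an assumption, since the sign function leaves that case open.

```python
def classify_normal_vs_abnormal(votes, alphas, neighbors):
    """Return +1 (normal) or -1 (abnormal) from the classifier network.

    votes[k][t]  : output in {+1, -1} of node k's base classifier in round t
    alphas[k][t] : weight of that base classifier
    neighbors[k] : indices of the nodes adjacent to node k
    """
    total = 0.0
    for k in range(len(votes)):
        for t in range(len(votes[k])):
            total += alphas[k][t] * votes[k][t]       # node k's own vote
            for p in neighbors[k]:                    # votes of its neighbors
                total += alphas[p][t] * votes[p][t]
    return 1 if total >= 0 else -1                    # assumption: tie -> normal
```

Because each node also adds its neighbors' votes, a classifier on a well-connected node contributes to the sum several times, once for each node it is adjacent to.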
Step 8: Input $x'''$ into the classifier network system of the abnormal behavior analysis learning machine generated in step 5 and classify it, obtaining the classification result:
$$H_2(x''') = \arg\max_{y \in Y} \left( \sum_{k=1}^{K_2} \sum_{t=1}^{T_2} \left( \alpha_{k,t}^2 I[h_{k,t}^2(x''') = y] + \sum_{p} \alpha_{p,t}^2 I[h_{p,t}^2(x''') = y] \right) \right)$$
where $h_{k,t}^2(x''')$ is the classification result of each base classifier in the abnormal behavior analysis learning machine for $x'''$, with $h_{k,t}^2(x''') \in Y$ and $H_2(x''') \in Y$.
Step 9: Use $H_2(x''')$ as an index number, look up the attack type corresponding to that index, and output this attack type as the final detection result.
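Steps 8 and 9 amount to a weighted plurality vote followed by a table lookup, which can be sketched as follows. The function name `classify_attack`, the nested-list layout and the ordering of `attack_names` are hypothetical. Breaking ties in favor of the smallest index is an assumption.

```python
def classify_attack(votes, alphas, neighbors, attack_names):
    """Return (index, name) of the attack type with the largest weighted vote.

    votes[k][t]  : class index in 1..M predicted by node k in round t
    alphas[k][t] : weight of that base classifier
    neighbors[k] : indices of the nodes adjacent to node k
    """
    M = len(attack_names)
    score = {y: 0.0 for y in range(1, M + 1)}
    for k in range(len(votes)):
        for t in range(len(votes[k])):
            score[votes[k][t]] += alphas[k][t]        # node k's own vote
            for p in neighbors[k]:                    # votes of its neighbors
                score[votes[p][t]] += alphas[p][t]
    # assumption: on a tie the smallest index wins (dict keeps 1..M order)
    best = max(score, key=score.get)
    return best, attack_names[best - 1]               # step 9: index lookup
```

For example, with two mutually adjacent nodes, two rounds, and votes `[[3, 3], [1, 3]]` weighted by `[[0.5, 0.4], [0.6, 0.1]]`, class 3 accumulates a score of 2.0 against 1.2 for class 1, so the record is reported as the third attack type.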
The effect of the invention is further illustrated by the following simulation on standard intrusion detection data:
1. Simulation conditions
The simulation was run on Windows XP SP1 with a Pentium(R) 4 CPU at 2.4 GHz, using VC++ 6.0 as the software platform. The raw intrusion detection data come from the public data set KDD CUP '99, which contains one normal type, Normal, and four intrusion types: DOS, Probe, R2L and U2R; each network behavior record comprises 41 features. Two of its subsets are used, "kddcup99.data.10_percent" as the training data set and "corrected" as the test data set; their sample distribution is shown in Table 1.
Table 1. Sample distribution of the data sets
Figure G2009100230731D00111
2. Simulation results
The simulated intrusion detection proceeds as follows:
(1) Input the data set "kddcup99.data.10_percent" into the existing-record preprocessing submodule for preprocessing;
(2) Divide the data in the preprocessed data set "kddcup99.data.10_percent" into two classes: Normal is the normal class, with label 1, and DOS, Probe, R2L and U2R form the abnormal class, with label -1. Draw 5000 samples from the normal data and 10000 samples from the abnormal data as training samples and learn with the distributed network ensemble learning algorithm to obtain the anomaly detection learning machine. The simulation uses a BA scale-free network with 20 nodes, a sampling rate of 0.7, a weight update parameter of 0.8 and $T_1 = 10$ training rounds; the base classifier is the kernel matching pursuit learning machine (KMPLM);
(3) Take the Normal class samples, 97278 in number, as the source-domain samples, the R2L class samples as the target-domain to-be-trained samples, and the DOS, Probe and U2R class samples as the other-type target-domain samples. Pre-select from the Normal class samples to obtain a source-domain transfer sample set of 1874 samples. In the simulation the base classifier of the AdaBoost algorithm is the KMPLM;
(4) Randomly draw samples from the four abnormal classes of the preprocessed data set "kddcup99.data.10_percent" in the following proportions: DOS 2.5%, Probe 75%, R2L 100%, U2R 100%. Use them together with the source-domain transfer sample set as training samples, giving the source-domain transfer samples the same label as the R2L type, and train with the distributed network ensemble learning algorithm with transfer learning to obtain the abnormal behavior analysis learning machine. The simulation uses a BA scale-free network with 20 nodes, a sampling rate of 0.6 and $T_2 = 10$ training rounds; the base classifier is the KMPLM;
(5) Preprocess the data set "corrected";
(6) Input the preprocessed data set "corrected" into the anomaly detection learning machine for testing; the simulation results are shown in Table 2;
Table 2. Anomaly detection accuracy
Figure G2009100230731D00121
(7) Input the abnormal data of the preprocessed data set "corrected" into the abnormal behavior analysis learning machine for testing; the simulation results are shown in Table 3, where DTNL denotes the distributed network ensemble learning algorithm with transfer learning.
Table 3. Abnormal behavior analysis detection accuracy
Figure G2009100230731D00122
Table 2 shows that the anomaly detection learning machine generated by distributed network ensemble learning achieves a high detection rate for both the normal type and the abnormal type.
Table 3 compares the abnormal behavior analysis learning machines generated with the distributed network ensemble learning algorithm DNB and with the distributed network ensemble learning algorithm with transfer learning, DTNL. DTNL improves the detection rate for R2L by about 87.3% compared with DNB, while the detection rates for the other abnormal behaviors do not drop noticeably.
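The construction of the training set in item (4) of the simulation can be sketched as follows. The proportions and the relabeling of the source-domain transfer samples as R2L come from the text; the function name, the record layout (a features/label pair) and the rounding of the per-class sample counts are assumptions made for illustration.

```python
import random

# per-class sampling proportions stated in item (4)
RATIOS = {"DOS": 0.025, "Probe": 0.75, "R2L": 1.0, "U2R": 1.0}


def build_training_set(abnormal, transfer_features, ratios,
                       target_label="R2L", seed=0):
    """Sample each abnormal class by its ratio and add the transfer samples.

    abnormal          : list of (features, label) pairs
    transfer_features : feature vectors of the source-domain transfer samples
    """
    rng = random.Random(seed)
    by_class = {}
    for features, label in abnormal:
        by_class.setdefault(label, []).append((features, label))

    training = []
    for label, items in by_class.items():
        n = round(len(items) * ratios[label])   # rounding is an assumption
        training.extend(rng.sample(items, n))   # draw without replacement

    # the transfer samples receive the label of the to-be-trained type
    training.extend((f, target_label) for f in transfer_features)
    return training
```

Relabeling the Normal-derived transfer samples as R2L is what lets them enlarge the scarce R2L class during training.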
The entire intrusion detection process described above is implemented as a computer program, which carries out the detection of network behavior.
This embodiment is implemented on the premise of the technical solution of the invention and gives a detailed implementation and a concrete operating procedure, but the scope of protection of the invention is not limited to the embodiment above.

Claims (5)

1. An intrusion detection system based on distributed migration network learning, comprising:
a network behavior record preprocessing module, comprising an existing-record preprocessing submodule and a new-record preprocessing submodule; the existing-record preprocessing submodule quantizes and normalizes the existing labeled network behavior record set and passes the parameters obtained from quantization and normalization to the new-record preprocessing submodule; the new-record preprocessing submodule uses the parameters passed in by the existing-record preprocessing submodule to quantize and normalize a new network behavior record and transmits the result to the anomaly detection module;
an anomaly detection module, comprising an anomaly detection learning submodule and an anomaly detection test submodule; the anomaly detection learning submodule divides the preprocessed existing labeled network behavior record set into two classes, normal and abnormal, randomly draws a portion of the samples from each, learns with the distributed network ensemble learning algorithm to generate the anomaly detection learning machine, and transfers this learning machine to the anomaly detection test submodule; the anomaly detection test submodule uses the anomaly detection learning machine to classify the input preprocessed new network behavior record; if the output is normal, no processing is done and detection ends; otherwise the record is passed to the abnormal behavior analysis module;
an abnormal behavior analysis module, comprising a transfer sample pre-selection submodule, an abnormal behavior analysis learning submodule and an abnormal behavior analysis test submodule; the transfer sample pre-selection submodule designates source-domain samples and labeled target-domain samples among the existing labeled network behavior records, pre-selects source-domain samples according to the target-domain to-be-trained samples, and inputs the selected source-domain transfer samples to the abnormal behavior analysis learning submodule; the abnormal behavior analysis learning submodule uses the input source-domain transfer samples together with the labeled target-domain samples as training samples and learns with the distributed network ensemble learning algorithm with transfer learning to generate the abnormal behavior analysis learning machine; the abnormal behavior analysis test submodule uses the abnormal behavior analysis learning machine to classify the input abnormal record and outputs its attack type.
2. An intrusion detection method based on distributed migration network learning, comprising the following steps:
(1) input the existing labeled network behavior record set $X$, and quantize and normalize it to obtain the preprocessed result $X'$;
(2) divide the preprocessed result $X'$ into two classes, normal and abnormal, the abnormal class comprising $M$ attack types; randomly draw a portion of the samples from the normal samples and from the abnormal samples, and run the distributed network ensemble learning algorithm for $T_1$ training rounds on a network topology containing $K_1$ nodes to generate the classifier network system of the anomaly detection learning machine;
(3) take the samples of the normal type in $X'$ as the source-domain sample set $X_S$, with $m$ samples, and the samples of the abnormal types as the target-domain sample set $X_T$; take the samples in $X_T$ of the abnormal type with the lower detection rate as the target-domain to-be-trained sample set $X_{T1}$, with $n_1$ samples, and divide $X_S$ evenly into $m/n_1$ parts, expressed as:
Figure FSB00000500085500021
where $[\cdot]$ denotes the rounding operation; combine each part with $X_{T1}$ into a training set, adjust the sample weights by the weight adjustment used during AdaBoost training, and select the source-domain sample subset with the larger weights (Figure FSB00000500085500024);
(4) use the source-domain sample subset (Figure FSB00000500085500025) together with the other-type target-domain samples as training samples, adjust the sample weights again by the weight adjustment used during AdaBoost training, remove from (Figure FSB00000500085500026) the source-domain samples with the larger weights, and form the source-domain transfer sample set $TR_D$ from the samples remaining in (Figure FSB00000500085500027);
(5) randomly draw a portion of the samples from the target-domain sample set $X_T$ to form the target-domain sample subset $TR_S$ and use it together with the selected source-domain transfer sample set $TR_D$ as training samples, giving $TR_D$ the same label as the target-domain to-be-trained samples; lay out a network topology containing $K_2$ nodes, input the sampling rate $\rho_2$ and the number of training rounds $T_2$, and distribute $TR_S$ and $TR_D$ over the nodes to generate the training sample set $S_k$ on each node, $k = 1, 2, \ldots, K_2$; generate the abnormal behavior analysis learning machine from these training sample sets as follows:
5a) initialize the weights of the samples in each node's training sample set $S_k$;
5b) sample each node's training sample set $S_k$ with replacement according to the weights to obtain the training subset of each node, train on it with the learning algorithm of each node to obtain the base classifier $h^2_{k,t}$ of each node, and classify $S_k$ with the base classifier of each node, where $t$ is the current training round;
5c) from the classification results on $S_k$, compute the weighted error rate $\varepsilon_{k,t}$ of the target-domain samples on each node, and from $\varepsilon_{k,t}$ compute the weight of each base classifier (Figure FSB00000500085500029);
5d) update the weights of the source-domain transfer samples and the target-domain samples; when $t < T_2$, return to step 5b); when $t = T_2$, end the training and obtain the classifier network system of the abnormal behavior analysis learning machine, composed of all base classifiers $h^2_{k,t}$ ($k = 1, 2, \ldots, K_2$; $t = 1, 2, \ldots, T_2$);
(6) quantize and normalize the existing network behavior records and record the resulting preprocessing parameters; when a new network behavior record $x''$ is input, quantize and normalize it using the parameters obtained from preprocessing the existing network behavior records, obtaining the preprocessed network behavior record $x'''$;
(7) input $x'''$ into the classifier network system of the anomaly detection learning machine generated in step (2) and classify it, obtaining the classification result:
$$ H_1(x''') = \operatorname{sign}\left( \sum_{k=1}^{K_1} \sum_{t=1}^{T_1} \left( \alpha^1_{k,t}\, h^1_{k,t}(x''') + \sum_{p} \alpha^1_{p,t}\, h^1_{p,t}(x''') \right) \right) $$
where $h^1_{k,t}(x''')$ is the classification result of each base classifier in the anomaly detection learning machine for $x'''$, $\alpha^1_{k,t}$ is the weight of each base classifier, and $p$ is the index of a neighbor node of node $k$; when $H_1(x''') = 1$, the record belongs to the normal type, no processing is done and the detection process ends; when $H_1(x''') = -1$, the record belongs to the abnormal type and the process proceeds to step (8);
(8) input $x'''$ into the classifier network system of the abnormal behavior analysis learning machine generated in step (5) and classify it, obtaining the classification result:
$$ H_2(x''') = \arg\max_{y \in Y} \left( \sum_{k=1}^{K_2} \sum_{t=1}^{T_2} \left( \alpha^2_{k,t}\, I\left[h^2_{k,t}(x''') = y\right] + \sum_{p} \alpha^2_{p,t}\, I\left[h^2_{p,t}(x''') = y\right] \right) \right) $$
where $h^2_{k,t}(x''')$ is the classification result of each base classifier in the abnormal behavior analysis learning machine for $x'''$, with $h^2_{k,t}(x''') \in Y$; $Y = \{1, 2, \ldots, M\}$, where $1, 2, \ldots, M$ are the indices of the $M$ attack types; $I[\cdot]$ is the indicator function, whose value is 0 or 1; and $H_2(x''') \in Y$;
(9) use $H_2(x''')$ as an index, look up the attack type corresponding to this index, and output that attack type as the final detection result.
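Steps 5a) and 5b) of claim 2 (weight initialization and weighted sampling with replacement) can be sketched as follows. The claim does not say how the weights are initialized or how the sampling rate determines the subset size, so the uniform initialization and the reading of the rate as a fraction of $|S_k|$ are both assumptions, as are the function names.

```python
import random


def initial_weights(n):
    """5a) uniform starting weights (assumption: the claim does not specify)."""
    return [1.0 / n] * n


def weighted_resample(samples, weights, rate, seed=None):
    """5b) draw a training subset from S_k with replacement, by weight.

    rate is interpreted as the fraction of |S_k| to draw (assumption).
    """
    rng = random.Random(seed)
    n_draw = max(1, round(rate * len(samples)))
    # random.choices samples with replacement, proportionally to the weights
    return rng.choices(samples, weights=weights, k=n_draw)
```

With a rate of 0.6, as used for the abnormal behavior analysis learning machine in the simulation, each node would train on a subset 60% the size of its local sample set under this reading.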
3. The method according to claim 2, wherein the source-domain sample subset with the larger sample weights (Figure FSB00000500085500037) of step (3) is selected as follows:
3a) set the label of (Figure FSB00000500085500038) to +1 and the label of $X_{T1}$ to -1, and input the sample weight threshold $W_1$;
3b) input each training set into the AdaBoost algorithm for training, so that the sample weights are adjusted; after several rounds of training, select from each training set the samples whose weight is greater than the threshold $W_1$ and which belong to (Figure FSB000005000855000311), forming the source-domain sample subset (Figure FSB000005000855000312).
4. The method according to claim 2, wherein the source-domain samples with the larger weights are removed from (Figure FSB000005000855000313) in step (4) as follows:
4a) denote by $X_{T2}$ the samples of the target-domain sample set $X_T$ other than the to-be-trained type $X_{T1}$, that is, the other abnormal types; set the label of the source-domain sample subset to +1 and the label of $X_{T2}$ to -1, and input the sample weight threshold $W_2$;
4b) combine the source-domain sample subset with $X_{T2}$ to form the training set $T_2$ and input it into the AdaBoost algorithm for training, so that the sample weights are adjusted; after several rounds of training, remove the samples in $T_2$ whose weight is greater than the threshold $W_2$ and which belong to (Figure FSB000005000855000316), and form the source-domain transfer sample set $TR_D$ from the samples remaining in (Figure FSB00000500085500041).
5. The method according to claim 2, wherein the weights of the source-domain transfer samples and the target-domain samples in step 5d) are updated as follows:
5a) compute the weight update parameter of the target-domain samples (Figure FSB00000500085500042) and the weight update parameter of the source-domain transfer samples (Figure FSB00000500085500043), where $m_k$ is the number of source-domain transfer samples contained in $S_k$ at node $k$;
5b) update the weight $D_{k,t}(x_i)$ of sample $x_i$ at node $k$ to obtain the updated weight $D_{k,t+1}(x_i)$:
$$ D_{k,t+1}(x_i) = \begin{cases} \dfrac{D_{k,t}(x_i)\, \beta_{k,t}^{\lambda_{k,t}(x_i)}}{Z_{k,t}}, & x_i \in S_k \cap TR_S \\[2ex] \dfrac{D_{k,t}(x_i)\, \gamma_k^{-\lambda_{k,t}(x_i)}}{Z_{k,t}}, & x_i \in S_k \cap TR_D \end{cases} $$
where $x_i \in S_k \cap TR_S$ means that $x_i$ is a target-domain sample in $S_k$, $x_i \in S_k \cap TR_D$ means that $x_i$ is a source-domain transfer sample in $S_k$, and
$$ Z_{k,t} = \sum_{x_i \in S_k \cap TR_S} D_{k,t}(x_i)\, \beta_{k,t}^{\lambda_{k,t}(x_i)} + \sum_{x_i \in S_k \cap TR_D} D_{k,t}(x_i)\, \gamma_k^{-\lambda_{k,t}(x_i)} $$
$$ \lambda_{k,t}(x_i) = -2\alpha^2_{k,t} \left( I\left(y(x_i) \ne h^2_{k,t}(x_i)\right) - 1/2 \right) - 2 \sum_{p} \alpha^2_{p,t} \left( I\left(y(x_i) \ne h^2_{p,t}(x_i)\right) - 1/2 \right) $$
where $y(x_i)$ is the known label of sample $x_i$, $h^2_{k,t}(x_i)$ is the classification result of the base classifier $h^2_{k,t}$ for sample $x_i$, and $h^2_{k,t}(x_i) \in Y$.
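The weight update of claim 5 can be traced with the short sketch below. The update parameters $\beta_{k,t}$ and $\gamma_k$ are passed in as plain numbers because their formulas appear only in figures that are not reproduced in the text; the function name and argument layout are assumptions for illustration.

```python
def update_weights(D, is_source, y_true, preds_k, alpha_k,
                   neighbor_preds, neighbor_alphas, beta, gamma):
    """One weight update at node k for training round t.

    D               : current weights D_{k,t}(x_i)
    is_source       : True where x_i is a source-domain transfer sample (TR_D)
    y_true          : known labels y(x_i)
    preds_k         : predictions of node k's base classifier
    alpha_k         : weight of node k's base classifier
    neighbor_preds  : one prediction list per neighbor node p
    neighbor_alphas : one classifier weight per neighbor node p
    beta, gamma     : update parameters, supplied by the caller (assumption)
    """
    unnormalized = []
    for i, d in enumerate(D):
        # lambda is +alpha for a correct prediction and -alpha for a wrong one
        lam = -2.0 * alpha_k * ((preds_k[i] != y_true[i]) - 0.5)
        for a_p, preds_p in zip(neighbor_alphas, neighbor_preds):
            lam += -2.0 * a_p * ((preds_p[i] != y_true[i]) - 0.5)
        factor = gamma ** (-lam) if is_source[i] else beta ** lam
        unnormalized.append(d * factor)
    z = sum(unnormalized)            # normalization constant Z_{k,t}
    return [w / z for w in unnormalized]
```

For parameters below 1, this raises the weight of misclassified target-domain samples and lowers the weight of misclassified source-domain transfer samples, so source samples that disagree with the target domain gradually lose influence.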
CN2009100230731A 2009-06-26 2009-06-26 Distributed migration network learning-based intrusion detection system and method thereof Expired - Fee Related CN101582813B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009100230731A CN101582813B (en) 2009-06-26 2009-06-26 Distributed migration network learning-based intrusion detection system and method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2009100230731A CN101582813B (en) 2009-06-26 2009-06-26 Distributed migration network learning-based intrusion detection system and method thereof

Publications (2)

Publication Number Publication Date
CN101582813A CN101582813A (en) 2009-11-18
CN101582813B true CN101582813B (en) 2011-07-20

Family

ID=41364786

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009100230731A Expired - Fee Related CN101582813B (en) 2009-06-26 2009-06-26 Distributed migration network learning-based intrusion detection system and method thereof

Country Status (1)

Country Link
CN (1) CN101582813B (en)

Also Published As

Publication number Publication date
CN101582813A (en) 2009-11-18

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20110720

Termination date: 20150626

EXPY Termination of patent right or utility model