CN105704136A - Big data association-based network attack detection method

Info

Publication number: CN105704136A
Application number: CN201610131314.4A
Authority: CN
Inventors: 焦栋; 敖乃翔; 王辰; 王德勇; 徐心毅; 郭静
Original assignee: China Electronics Technology Group Corp CETC
Current assignee: China Electronics Technology Group Corp CETC; Electronic Science Research Institute of CTEC
Priority date: 2016-03-09
Filing date: 2016-03-09
Publication date: 2016-06-22
Anticipated expiration: 2036-03-09
Also published as: CN105704136B

Abstract

The present invention provides a big data association-based network attack detection method. The method comprises the steps of mining and analyzing the big data of a historical Trojan program; mining the frequently adjacent plots of the historical Trojan program; constructing a frequent plot knowledge base based on the frequently adjacent plots of the historical Trojan program; comparing and matching the adjacent plots of a target program with the frequently adjacent plots of the historical Trojan program in the frequent plot knowledge base; judging the target program is a Trojan program on the condition that the adjacent plots of the target program are matched with the frequently adjacent plots of the historical Trojan program in the frequent plot knowledge base, or judging whether the target program is a Trojan program or not based on the naive Bayes algorithm on the condition that the adjacent plots of the target program are not matched with the frequently adjacent plots of the historical Trojan program in the frequent plot knowledge base; and predicting the subsequent attack action of the target program according to the suffix event of the adjacent plots of the target program on the condition that the target program is the Trojan program. According to the technical scheme of the invention, the detection rate of unknown Trojan programs is improved. Meanwhile, an effective control means is provided for Trojan programs.

Description

A network attack detection method based on big data association

Technical field

The present invention relates to the technical field of Internet security, and in particular to a network attack detection method based on big data association.

Background art

With the development of network technology, networks have come to cover every area of daily life, work, and study on a large scale. While enjoying the great convenience that networks bring, people also face serious security threats. A Trojan horse (Trojan) program is an attack tool used to steal important information such as a user's accounts, confidential files, and private information for the attacker's profit, which seriously threatens the privacy and data security of Internet users.

At present, there are two mainstream Trojan program detection techniques: signature-based detection and detection based on Trojan program behavior features.

Signature-based Trojan detection: known Trojan programs and infected system files are analyzed, Trojan signatures are summarized from them, and a Trojan signature library is built. The signatures are derived by analyzing information such as the process name used when the Trojan runs, the characteristic strings of the original Trojan file and of the files it generates, the way it is loaded at startup, the names, sizes, and directories of the generated files, and the fixed ports it uses. To judge whether a target program is a Trojan program, its signature is compared with the signatures in the library; if a signature matches, the target program is judged to be a Trojan program.

Behavior-feature-based Trojan detection: a method that identifies a Trojan program by its characteristic behavior. Through long-term observation, analysis, study, and summarization of Trojan programs, the method extracts behavior features that are common to Trojan programs and rarely appear in normal programs. Program behavior is monitored at run time; when a Trojan behavior feature is found, the system raises a suspected-Trojan alarm and takes measures to handle the Trojan program. The main Trojan behavior features include suspicious actions such as writing to executable files, hijacking or closing system interrupts, switching between the virus program and the host program, writing to the boot sector or formatting a disk, editing the registry, modifying startup items, modifying file associations, registering as a system service, creating network communication channels, occupying well-known ports, and opening rarely used ports.

Both the signature-based method and the behavior-feature-based method identify Trojan programs by comparison with the features of known Trojan programs. They achieve high detection accuracy and a relatively low false alarm rate for known Trojan programs, but neither can effectively identify unknown Trojan programs, and both lack an effective means of controlling them.

Summary of the invention

The technical problem to be solved by the present invention is to provide a network attack detection method based on big data association that improves the detection accuracy for unknown Trojan programs.

The technical solution adopted by the present invention is a network attack detection method based on big data association, comprising:

Step one: performing big data mining analysis on historical Trojan programs, mining the frequent adjacent plots of the historical Trojan programs, and constructing a frequent plot knowledge base from these frequent adjacent plots;

Step two: comparing the adjacent plot of a target program with the frequent adjacent plots of the historical Trojan programs in the frequent plot knowledge base; if they match, judging the target program to be a Trojan program; if they do not match, using the naive Bayes algorithm to judge whether the target program is a Trojan program;

Step three: when the target program is judged to be a Trojan program, predicting the subsequent attack behavior of the target program from the suffix events of the target program's adjacent plot.

Further, step one specifically comprises:

Analyzing the crawler behavior features of each historical Trojan program in different periods, where the crawler behavior features comprise the crawler behavior feature vector S and the event sequence K of the historical Trojan program;

S=[(a₁,t₁),(a₂,t₂),…,(a_i,t_i),…,(a_n,t_n)];

K=[a₁,a₂,…,a_i,…,a_n];

where a_i is the crawler behavior feature of the historical Trojan program in period t_i;

the periods are arranged in chronological order, from earliest to latest: t₁, t₂, …, t_n;

the range of the variable i is 1≤i≤n;

n is the length of the Trojan program's event sequence;

All historical Trojan program event sequences K are stored in a database to form a historical Trojan program event sequence library;

Frequent adjacent plots are mined by an automaton from all historical Trojan program event sequences K in the library, and the frequent adjacent plots of the historical Trojan programs constitute the frequent plot knowledge base.
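The relationship between the feature vector S and the event sequence K can be illustrated with a minimal Python sketch. The function name `event_sequence`, the use of strings for features, and the use of numbers for periods are assumptions made for illustration only; the source does not specify a data representation.

```python
from typing import List, Tuple

def event_sequence(S: List[Tuple[str, float]]) -> List[str]:
    """Order the (feature, period) pairs by period and keep only the features."""
    return [a for a, _t in sorted(S, key=lambda pair: pair[1])]

# Hypothetical feature vector S with periods given out of order.
S = [("a2", 2.0), ("a1", 1.0), ("a3", 3.0)]
K = event_sequence(S)  # event sequence K in chronological order
```

Sorting by period reflects the statement that the periods t₁, t₂, …, t_n are arranged from earliest to latest.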

Further, mining frequent adjacent plots by the automaton from all historical Trojan program event sequences K in the historical Trojan program event sequence library, and constructing the frequent plot knowledge base from the frequent adjacent plots of the historical Trojan programs, specifically comprises:

Mining, by the automaton, the set E_j of frequent adjacent plots of length j from all historical Trojan program event sequences K in the library;

By analogy, mining the sets of frequent adjacent plots for each length from 2 to M using the same method as for E_j, and constructing the frequent plot knowledge base from the sets for each length in the range 2 to M;

In any program, events that occur in succession are called an adjacent plot;

The range of the variable j is 2≤j≤M;

M is the length of the longest adjacent plot of the historical Trojan program event sequences K.

Further, mining by the automaton the set E_j of adjacent plots of length j specifically comprises:

Decomposing any two historical Trojan program event sequences K in the library into two sets of adjacent plots of length j, and comparing the adjacent plots of length j in the two sets; when two adjacent plots match completely, the matched adjacent plot is an adjacent plot of length j mined by the automaton, and the occurrence count of that adjacent plot is incremented by 1;

When the occurrence count of an adjacent plot is not less than the support threshold, the adjacent plot is put into the set E_j of frequent adjacent plots of length j;

When the occurrence count of an adjacent plot is less than the support threshold, the adjacent plot is not put into the set E_j of frequent adjacent plots of length j.

Further, step two specifically comprises:

Mining the longest adjacent plot of the target program to generate the adjacent plot e_m of the target program;

e_m=[b₁,b₂,…,b_i,…,b_m];

where b_i is the i-th crawler behavior feature of the target program, in order of occurrence;

the range of the variable i is 1≤i≤m;

m is the length of the target program's event sequence;

Comparing the adjacent plot e_m of the target program with the frequent adjacent plots in the frequent plot knowledge base; if e_m matches a frequent adjacent plot in the knowledge base, the target program is judged to be a Trojan program;

If e_m does not match any frequent adjacent plot in the knowledge base, the naive Bayes algorithm is used to judge whether the target program is a Trojan program.

Further, using the naive Bayes algorithm to judge whether the target program is a Trojan program specifically comprises:

B1: defining the category set C of the random variable;

C={normal program, uncertain program, Trojan program};

let c₁=normal program, c₂=uncertain program, c₃=Trojan program;

B2: using the naive Bayes algorithm, calculating over the program sample set Z the probability p(c₁|e_m) that a program containing the target program's adjacent plot e_m is a normal program, the probability p(c₂|e_m) that it is an uncertain program, and the probability p(c₃|e_m) that it is a Trojan program;

p (c_{1} | e_{m}) = \frac{p (e_{m} | c_{1}) p (c_{1})}{p (e_{m})};

p (c_{2} | e_{m}) = \frac{p (e_{m} | c_{2}) p (c_{2})}{p (e_{m})};

p (c_{3} | e_{m}) = \frac{p (e_{m} | c_{3}) p (c_{3})}{p (e_{m})};

B3: substituting the adjacent plot [b₁,b₂,…,b_i,…,b_m] of the target program into p(c₁|e_m), p(c₂|e_m), and p(c₃|e_m) respectively gives:

p (c_{1} | e_{m}) = \frac{Π_{i = 1}^{m} p (b_{i} | c_{1}) p (c_{1})}{Π_{i = 1}^{m} p (b_{i})};

p (c_{2} | e_{m}) = \frac{Π_{i = 1}^{m} p (b_{i} | c_{2}) p (c_{2})}{Π_{i = 1}^{m} p (b_{i})};

p (c_{3} | e_{m}) = \frac{Π_{i = 1}^{m} p (b_{i} | c_{3}) p (c_{3})}{Π_{i = 1}^{m} p (b_{i})};

B4: comparing p(c₁|e_m), p(c₂|e_m), and p(c₃|e_m) to judge whether the target program is a Trojan program, as follows:

If p(c₁|e_m) is the largest, the target program is judged to be a normal program;

If p(c₂|e_m) is the largest, the target program is judged to be an uncertain program;

If p(c₃|e_m) is the largest, the target program is judged to be a Trojan program;

If the target program is judged to be a Trojan program, the most similar frequent adjacent plot is found based on Euclidean distance.

Further, step three specifically comprises:

When the target program is judged to be a Trojan program, the adjacent plot that follows the target program's adjacent plot in a historical Trojan program event sequence K is taken as the suffix event of the target program's adjacent plot, and this suffix event is the subsequent attack behavior of the target program.

With the above technical solution, the present invention has at least the following advantages:

The network attack detection method based on big data association of the present invention overcomes the high false alarm rate of the prior art when detecting unknown Trojan programs. While reducing the false alarm rate, it also anticipates the subsequent attack behavior of unknown Trojans. This makes up for the inability of existing Trojan detection methods to predict the subsequent attacks of unknown Trojans, and provides strong support for system protection decisions.

Brief description of the drawings

Fig. 1 is a flow chart of the network attack detection method based on big data association according to the first embodiment of the present invention;

Fig. 2 is a flow chart of the network attack detection method based on big data association according to the second embodiment of the present invention;

Fig. 3 is a detailed flow diagram of the network attack detection method based on big data association according to the third embodiment of the present invention.

Detailed description of the invention

To further explain the technical means adopted by the present invention to achieve its intended purpose and their effects, the present invention is described in detail below with reference to the accompanying drawings and preferred embodiments.

The first embodiment of the present invention is a network attack detection method based on big data association which, as shown in Fig. 1, comprises the following specific steps:

Step S101: performing big data mining analysis on historical Trojan programs, mining the frequent adjacent plots of the historical Trojan programs, and constructing a frequent plot knowledge base from these frequent adjacent plots.

Specifically, step S101 comprises:

Step A1: analyzing the crawler behavior features of each historical Trojan program in different periods, where the crawler behavior features comprise the crawler behavior feature vector S and the event sequence K of the historical Trojan program;

S=[(a₁,t₁),(a₂,t₂),…,(a_i,t_i),…,(a_n,t_n)];

K=[a₁,a₂,…,a_i,…,a_n];

a_i is the crawler behavior feature of the historical Trojan program in period t_i;

the range of the variable i is 1≤i≤n;

n is the length of the Trojan program's event sequence;

All historical Trojan program event sequences K are stored in a database to form a historical Trojan program event sequence library.

Step A2: mining frequent adjacent plots by an automaton from all historical Trojan program event sequences K in the library, and constructing the frequent plot knowledge base from the frequent adjacent plots of the historical Trojan programs.

Specifically, step A2 comprises:

Mining, by the automaton, the set E_j of frequent adjacent plots of length j from all historical Trojan program event sequences K in the library;

The range of the variable j is 2≤j≤M;

Specifically, mining by the automaton the set E_j of frequent adjacent plots of length j comprises:

The automaton decomposes any two historical Trojan program event sequences K in the library into two sets of adjacent plots of length j and compares the adjacent plots of length j in the two sets; when two adjacent plots match completely, the matched adjacent plot is an adjacent plot of length j mined by the automaton, and the occurrence count of that adjacent plot is incremented by 1;

When the occurrence count of an adjacent plot is less than the support threshold, the adjacent plot is not put into the set E_j of frequent adjacent plots of length j.

In any program, events that occur in succession are called an adjacent plot.

By analogy, the automaton mines the sets of frequent adjacent plots for each length from 2 to M using the same method as for E_j, and the frequent plot knowledge base is constructed from the sets for each length in the range 2 to M.
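The level-by-level construction of the knowledge base can be sketched in Python as follows. This is a brute-force enumeration rather than the automaton described here, and it rests on two assumptions that the text does not state explicitly: the occurrence count of a plot is taken to be the number of event sequences that contain it, and M is taken to be the length of the longest sequence in the library. The names `windows` and `build_knowledge_base` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Plot = Tuple[str, ...]

def windows(seq: List[str], j: int) -> Set[Plot]:
    """All distinct runs of j consecutive events in seq (empty if seq is shorter than j)."""
    return {tuple(seq[i:i + j]) for i in range(len(seq) - j + 1)}

def build_knowledge_base(library: List[List[str]], threshold: int) -> Dict[int, Set[Plot]]:
    """Map each length j in 2..M to the set E_j of frequent adjacent plots."""
    M = max((len(K) for K in library), default=0)  # assumption: M = longest sequence length
    base: Dict[int, Set[Plot]] = {}
    for j in range(2, M + 1):
        # Assumption: each sequence contributes at most 1 to a plot's occurrence count.
        counts = Counter(p for K in library for p in windows(K, j))
        base[j] = {p for p, c in counts.items() if c >= threshold}
    return base

# Hypothetical library of three event sequences, support threshold 2.
library = [["a", "b", "c"], ["a", "b", "d"], ["x", "a", "b"]]
base = build_knowledge_base(library, threshold=2)
```

With this toy library only the plot (a, b) reaches the threshold, so E₂ holds one plot and E₃ is empty.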

Step S102: comparing the adjacent plot of the target program with the frequent adjacent plots in the frequent plot knowledge base to judge whether the target program is a Trojan program.

Specifically, step S102 comprises:

Step B1: mining the longest adjacent plot of the target program to generate the adjacent plot e_m of the target program;

e_m=[b₁,b₂,…,b_i,…,b_m];

b_i is the i-th crawler behavior feature of the target program, in order of occurrence;

the range of the variable i is 1≤i≤m;

m is the length of the target program's event sequence.

Step B2: comparing the adjacent plot e_m of the target program with the frequent adjacent plots in the frequent plot knowledge base.

When e_m matches a frequent adjacent plot in the knowledge base, the target program is judged to be a Trojan program;

When e_m does not match any frequent adjacent plot in the knowledge base, the target program is judged not to be a Trojan program.

Step S103: when the target program is judged to be a Trojan program, predicting the subsequent attack behavior of the target program from the suffix events of the target program's adjacent plot.

Specifically, step S103 comprises:

The second embodiment of the present invention is a network attack detection method based on big data association. The method of this embodiment is largely the same as that of the first embodiment; the difference is that when the adjacent plot e_m of the target program does not match any frequent adjacent plot in the frequent plot knowledge base, it is further judged whether the target program is a Trojan program, and the subsequent attack behavior of a target program judged to be a Trojan program is predicted. As shown in Fig. 2, the method of this embodiment comprises the following specific steps:

Step S201: performing big data mining analysis on historical Trojan programs, mining the frequent adjacent plots of the historical Trojan programs, and constructing a frequent plot knowledge base from these frequent adjacent plots.

Specifically, step S201 comprises:

S=[(a₁,t₁),(a₂,t₂),…,(a_i,t_i),…,(a_n,t_n)];

K=[a₁,a₂,…,a_i,…,a_n];

the range of the variable i is 1≤i≤n;

n is the length of the Trojan program's event sequence;

Specifically, step A2 comprises:

The range of the variable j is 2≤j≤M;

In any program, events that occur in succession are called an adjacent plot.

Step S202: comparing the adjacent plot of the target program with the frequent adjacent plots in the frequent plot knowledge base to judge whether the target program is a Trojan program.

Specifically, step S202 comprises:

e_m=[b₁,b₂,…,b_i,…,b_m];

b_i is the i-th crawler behavior feature of the target program, in order of occurrence;

the range of the variable i is 1≤i≤m;

m is the length of the target program's event sequence.

When the adjacent plot e_m of the target program does not match any frequent adjacent plot in the frequent plot knowledge base, step S203 is performed on the target program to further judge whether it is a Trojan program.

Step S203: using the naive Bayes algorithm to further judge whether a target program not matched in step S202 is a Trojan program.

Specifically, step S203 comprises:

Using the naive Bayes algorithm, calculating over the program sample set Z the probabilities that a program containing the target program's adjacent plot e_m is a normal program, an uncertain program, or a Trojan program.

Step D1: defining the category set C of the random variable;

C={normal program, uncertain program, Trojan program};

let c₁=normal program, c₂=uncertain program, c₃=Trojan program;

Step D2: using the naive Bayes algorithm, calculating over the program sample set Z the probability p(c₁|e_m) that a program containing the target program's adjacent plot e_m is a normal program, the probability p(c₂|e_m) that it is an uncertain program, and the probability p(c₃|e_m) that it is a Trojan program;

p (c_{1} | e_{m}) = \frac{p (e_{m} | c_{1}) p (c_{1})}{p (e_{m})};

p (c_{2} | e_{m}) = \frac{p (e_{m} | c_{2}) p (c_{2})}{p (e_{m})};

p (c_{3} | e_{m}) = \frac{p (e_{m} | c_{3}) p (c_{3})}{p (e_{m})};

Step D3: substituting the adjacent plot [b₁,b₂,…,b_i,…,b_m] of the target program into p(c₁|e_m), p(c₂|e_m), and p(c₃|e_m) respectively gives:

p (c_{1} | e_{m}) = \frac{Π_{i = 1}^{m} p (b_{i} | c_{1}) p (c_{1})}{Π_{i = 1}^{m} p (b_{i})};

p (c_{2} | e_{m}) = \frac{Π_{i = 1}^{m} p (b_{i} | c_{2}) p (c_{2})}{Π_{i = 1}^{m} p (b_{i})};

p (c_{3} | e_{m}) = \frac{Π_{i = 1}^{m} p (b_{i} | c_{3}) p (c_{3})}{Π_{i = 1}^{m} p (b_{i})};

Step D4: comparing p(c₁|e_m), p(c₂|e_m), and p(c₃|e_m) to judge whether the target program is a Trojan program, as follows:

When p(c₁|e_m) is the largest, the target program is judged to be a normal program;

When p(c₂|e_m) is the largest, the target program is judged to be an uncertain program;

When p(c₃|e_m) is the largest, the target program is judged to be a Trojan program.
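Steps D1 to D4 can be sketched in Python as follows. The denominator Π p(b_i) is the same for all three classes, so the sketch compares only the numerators p(c) Π p(b_i|c). How p(b_i|c) is estimated from the sample set Z is not stated in the text; the sketch assumes relative event frequencies with Laplace (add-one) smoothing, which is an addition and not part of the source. The class labels and the function name `classify` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Sample = Tuple[List[str], str]  # (event sequence of a program, class label)

def classify(e_m: List[str], Z: List[Sample]) -> str:
    """Return the class c with the largest p(c) * prod_i p(b_i | c)."""
    classes = sorted({label for _, label in Z})
    vocab = {b for events, _ in Z for b in events} | set(e_m)
    scores: Dict[str, float] = {}
    for c in classes:
        events_c = [b for events, label in Z if label == c for b in events]
        counts = Counter(events_c)
        score = sum(1 for _, label in Z if label == c) / len(Z)  # prior p(c)
        for b in e_m:
            # Assumption: add-one smoothing so unseen events do not zero the product.
            score *= (counts[b] + 1) / (len(events_c) + len(vocab))
        scores[c] = score
    return max(scores, key=scores.get)

# Tiny hypothetical sample set Z, far smaller than a realistic one.
Z = [(["read", "read"], "normal"),
     (["open", "read"], "uncertain"),
     (["inject", "hide"], "trojan")]
verdict = classify(["inject", "hide"], Z)
```

For long plots, summing logarithms instead of multiplying probabilities would avoid floating-point underflow.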

When the target program is judged to be a Trojan, the most similar adjacent plot is found based on Euclidean distance.
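The text does not say how adjacent plots are turned into vectors before the Euclidean distance is taken. The Python sketch below assumes, purely for illustration, that each plot is represented by its vector of event counts; other encodings are equally plausible. The names `distance` and `most_similar` are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, Sequence

def distance(p: Sequence[str], q: Sequence[str]) -> float:
    """Euclidean distance between the event-count vectors of two plots (assumed encoding)."""
    cp, cq = Counter(p), Counter(q)
    return math.sqrt(sum((cp[e] - cq[e]) ** 2 for e in set(cp) | set(cq)))

def most_similar(e_m: Sequence[str], frequent_plots: Iterable[Sequence[str]]):
    """Return the frequent adjacent plot closest to e_m."""
    return min(frequent_plots, key=lambda p: distance(e_m, p))

# Hypothetical frequent adjacent plots and target plot.
plots = [("a1", "a2", "a3"), ("a4", "a5", "a6")]
nearest = most_similar(["a1", "a2", "a9"], plots)
```

A count vector ignores event order, so an order-sensitive encoding may be preferable if order matters for similarity.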

Step S204: when the target program is judged to be a Trojan program, predicting the subsequent attack behavior of the target program from the suffix events of the target program's adjacent plot.

Specifically, step S204 comprises:

The third embodiment of the present invention builds on the above embodiments and, with reference to Fig. 3, presents an application example of the present invention, describing in detail the implementation process, features, and advantages of the technical solution.

As shown in Fig. 3, the network attack detection method based on big data association of this embodiment comprises the following steps:

Step one: performing big data mining analysis on historical Trojan programs, extracting the crawler behavior features of the historical Trojan programs, and forming the frequent plot knowledge base.

Specifically, step one comprises:

1) analyzing the crawler behavior features of each historical Trojan program in different periods, where the crawler behavior features comprise the crawler behavior feature vector S and the event sequence K of the historical Trojan program;

S=[(a₁,t₁),(a₂,t₂),…,(a_i,t_i),…,(a_n,t_n)];

K=[a₁,a₂,…,a_i,…,a_n];

the range of the variable i is 1≤i≤n;

n is the length of the Trojan program's event sequence;

2) mining frequent adjacent plots by an automaton from all historical Trojan program event sequences K in the historical Trojan program event sequence library, and constructing the frequent plot knowledge base from the frequent adjacent plots of the historical Trojan programs.

Specifically, this comprises:

a) putting every historical Trojan program event sequence K in the library into the candidate adjacent plot set, each as the single adjacent plot whose length equals the length of its program event sequence;

An adjacent plot is a succession of program events.

b) mining, by the automaton, the adjacent plots of length j from any historical Trojan program event sequence K, and counting the adjacent plots of length j;

The range of the variable j is 2≤j≤M;

M is the length of the longest adjacent plot of the historical Trojan program event sequences K;

Specifically, mining by the automaton the adjacent plots of length j comprises:

Decomposing any two historical Trojan program event sequences K in the adjacent plot set into two sets of adjacent plots of length j, and comparing the adjacent plots of length j in the two sets; when two adjacent plots match completely, the matched adjacent plot is an adjacent plot of length j mined by the automaton, and the count of that adjacent plot is incremented by 1;

For example, the adjacent plot set contains 2 historical Trojan program event sequences, K₁ and K₂. The adjacent plot set is put into the automaton to mine the adjacent plots of length 3.

K₁=[a₃,a₄,a₅,a₆,a₇];

K₂=[a₄,a₅,a₆,a₈,a₉];

K₁ is decomposed into the following set of adjacent plots of length 3: [a₃,a₄,a₅], [a₄,a₅,a₆], and [a₅,a₆,a₇];

K₂ is decomposed into the following set of adjacent plots of length 3: [a₄,a₅,a₆], [a₅,a₆,a₈], and [a₆,a₈,a₉];

Comparing the adjacent plots of length 3 decomposed from K₁ with those decomposed from K₂, [a₄,a₅,a₆] matches completely; therefore [a₄,a₅,a₆] is an adjacent plot of length 3 mined by the automaton, and the occurrence count of this adjacent plot of length 3 is incremented by 1.
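The decomposition and comparison in this example can be reproduced with a short Python sketch. It uses plain list slicing rather than an automaton, and the helper name `windows` is hypothetical.

```python
def windows(seq, j):
    """Decompose an event sequence into its adjacent plots of length j, in order."""
    return [tuple(seq[i:i + j]) for i in range(len(seq) - j + 1)]

K1 = ["a3", "a4", "a5", "a6", "a7"]
K2 = ["a4", "a5", "a6", "a8", "a9"]

plots_K1 = windows(K1, 3)
plots_K2 = windows(K2, 3)
# Adjacent plots of length 3 that match completely between K1 and K2.
common = [p for p in plots_K1 if p in plots_K2]
```

Here `common` contains only the plot (a4, a5, a6), in agreement with the example.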

When the occurrence count of an adjacent plot is not less than the support threshold, the candidate adjacent plot is put into the set E_j of frequent adjacent plots of length j;

When the occurrence count of an adjacent plot is less than the support threshold, the candidate adjacent plot is not put into the set E_j of frequent adjacent plots of length j;

For example: the adjacent plot set contains 5 historical Trojan program event sequences, K₁, K₂, K₃, K₄, and K₅. The adjacent plot set is put into the automaton to mine the adjacent plots of length 3 and to count them.

K₁=[a₁,a₂,a₃,a₄,a₅,a₆,a₇];

K₂=[a₁,a₂,a₃,a₄,a₅,a₇];

K₃=[a₂,a₃,a₄,a₅,a₇];

K₄=[a₂,a₃,a₄,a₅];

K₅=[a₁,a₄,a₅,a₆,a₇];

When the adjacent plot length j=3 and the support threshold is 3, the automaton mines the adjacent plot [a₂,a₃,a₄] in the historical Trojan program event sequences K₁, K₂, K₃, and K₄; its count is 4, which is greater than the support threshold 3, so [a₂,a₃,a₄] is put into the set E₃ of frequent adjacent plots of length 3. The automaton mines the adjacent plot [a₅,a₆,a₇] in K₁ and K₅; its count is 2, which is less than the support threshold 3, so [a₅,a₆,a₇] is not put into E₃.
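The counts in this example can be checked with a small Python sketch. It assumes that a plot's count is the number of sequences containing it, which reproduces the counts 4 and 2 stated above; the function name `support_counts` is hypothetical.

```python
from collections import Counter

def support_counts(library, j):
    """Count, for each adjacent plot of length j, how many sequences contain it."""
    counts = Counter()
    for K in library:
        # A set is used so that each sequence contributes at most once per plot.
        counts.update({tuple(K[i:i + j]) for i in range(len(K) - j + 1)})
    return counts

library = [
    ["a1", "a2", "a3", "a4", "a5", "a6", "a7"],  # K1
    ["a1", "a2", "a3", "a4", "a5", "a7"],        # K2
    ["a2", "a3", "a4", "a5", "a7"],              # K3
    ["a2", "a3", "a4", "a5"],                    # K4
    ["a1", "a4", "a5", "a6", "a7"],              # K5
]
counts = support_counts(library, 3)
E3 = {p for p, c in counts.items() if c >= 3}  # support threshold 3
```

Under this counting rule the plot (a3, a4, a5) also appears in K₁ to K₄ and therefore also enters E₃, although the example discusses only [a₂,a₃,a₄] and [a₅,a₆,a₇].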

c) comparing any two frequent adjacent plots in the set E_j; if j-1 adjacent events of the two plots match, merging the two frequent adjacent plots into a candidate adjacent plot of length j+1;

For example: [a₁,a₂,a₃] and [a₂,a₃,a₄] are both frequent adjacent plots in the set E₃; through the matching test they form the candidate adjacent plot [a₁,a₂,a₃,a₄] of length 4.
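The merge in step c) can be sketched in Python. The sketch assumes that "j-1 matching" means the last j-1 events of one plot equal the first j-1 events of the other, as in the example; the text does not spell this out. The names `merge` and `candidates` are hypothetical.

```python
def merge(p, q):
    """Join two length-j plots into a length-(j+1) candidate if they overlap in j-1 events."""
    if len(p) == len(q) and p[1:] == q[:-1]:
        return p + q[-1:]
    return None

def candidates(E_j):
    """All candidate plots of length j+1 obtainable from pairs of plots in E_j."""
    merged = (merge(p, q) for p in E_j for q in E_j)
    return {c for c in merged if c is not None}

E3 = {("a1", "a2", "a3"), ("a2", "a3", "a4")}
C4 = candidates(E3)
```

Each candidate still has to be counted against the support threshold before it is accepted as frequent.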

d) on the basis of the candidate adjacent plots of length j+1 thus formed, repeating step b) and step c) to mine the sets of frequent adjacent plots for each length from 2 to M;

e) constructing the frequent plot knowledge base from the sets of frequent adjacent plots for each length in the range 2 to M.

Step two: comparing the adjacent plot of the target program with the frequent adjacent plots in the frequent plot knowledge base to judge whether the target program is a Trojan program.

Specifically, step two comprises:

1) mining the longest adjacent plot of the target program to generate the adjacent plot e_m of the target program;

e_m=[b₁,b₂,…,b_i,…,b_m];

the range of the variable i is 1≤i≤m;

m is the length of the target program's event sequence.

2) comparing the adjacent plot e_m of the target program with the frequent adjacent plots in the frequent plot knowledge base;

When e_m does not match any frequent adjacent plot in the knowledge base, step three is performed on the target program to further judge whether it is a Trojan program.

Step three: using the naive Bayes algorithm to further judge whether a target program not matched in step two is a Trojan program.

Specifically, step three comprises:

The total number of programs in the program sample set Z is not less than 100;

The number of normal programs in the program sample set Z is not less than 30% of the total number of programs in Z;

The number of uncertain programs in the program sample set Z is not less than 30% of the total number of programs in Z;

The number of Trojan programs in the program sample set Z is not less than 30% of the total number of programs in Z.
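These constraints on the sample set Z can be expressed as a simple check. The Python sketch below uses hypothetical class labels and a hypothetical function name; only the thresholds (at least 100 programs, at least 30% per class) come from the text.

```python
from collections import Counter

CLASSES = ("normal", "uncertain", "trojan")  # hypothetical label names

def sample_set_ok(labels, min_total=100, min_share=0.30):
    """Check the size and class-balance conditions on the program sample set Z."""
    total = len(labels)
    counts = Counter(labels)
    return total >= min_total and all(
        counts[c] >= min_share * total for c in CLASSES
    )

balanced = ["normal"] * 34 + ["uncertain"] * 33 + ["trojan"] * 33
skewed = ["normal"] * 80 + ["uncertain"] * 10 + ["trojan"] * 10
```

Because each of the three classes must hold at least 30% of Z, no class can exceed 40%, so the sample set is close to balanced.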

a) defining the category set C of the random variable;

C={normal program, uncertain program, Trojan program};

b) using the naive Bayes algorithm, calculating over the program sample set Z the probability p(c₁|e_m) that a program containing the target program's adjacent plot e_m is a normal program, the probability p(c₂|e_m) that it is an uncertain program, and the probability p(c₃|e_m) that it is a Trojan program;

p (c_{1} | e_{m}) = \frac{p (e_{m} | c_{1}) p (c_{1})}{p (e_{m})};

p (c_{2} | e_{m}) = \frac{p (e_{m} | c_{2}) p (c_{2})}{p (e_{m})};

p (c_{3} | e_{m}) = \frac{p (e_{m} | c_{3}) p (c_{3})}{p (e_{m})};

c) substituting the adjacent plot [b₁,b₂,…,b_i,…,b_m] of the target program into p(c₁|e_m), p(c₂|e_m), and p(c₃|e_m) respectively gives:

p (c_{1} | e_{m}) = \frac{Π_{i = 1}^{m} p (b_{i} | c_{1}) p (c_{1})}{Π_{i = 1}^{m} p (b_{i})};

p (c_{2} | e_{m}) = \frac{Π_{i = 1}^{m} p (b_{i} | c_{2}) p (c_{2})}{Π_{i = 1}^{m} p (b_{i})};

p (c_{3} | e_{m}) = \frac{Π_{i = 1}^{m} p (b_{i} | c_{3}) p (c_{3})}{Π_{i = 1}^{m} p (b_{i})};

d) comparing p(c₁|e_m), p(c₂|e_m), and p(c₃|e_m) to judge whether the target program is a Trojan program, as follows:

e) when the target program is judged to be a Trojan, the most similar adjacent plot is found based on Euclidean distance.

Step four: when the target program is judged to be a Trojan program, predicting the subsequent attack behavior of the target program from the suffix events of the target program's adjacent plot.

Specifically, step four comprises:

For example: the adjacent plot of the target program is [a₂,a₄,a₆], and the frequent adjacent plot [a₁,a₂,a₄,a₆,a₇,a₈,a₉] of a historical Trojan program event sequence K contains the target program's adjacent plot [a₂,a₄,a₆]; then [a₇,a₈,a₉] is the adjacent plot that follows the target program's adjacent plot. This following adjacent plot is the suffix event of the target program's adjacent plot, and the suffix event [a₇,a₈,a₉] is the subsequent attack behavior of the target program.
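The suffix lookup in this example can be sketched in Python. The function name `predict_suffix` is hypothetical, and returning `None` when the target plot is not contained is a design choice of the sketch, not something the text prescribes.

```python
def predict_suffix(target, frequent_plot):
    """Return the events that follow target inside frequent_plot, or None if absent."""
    m = len(target)
    for i in range(len(frequent_plot) - m + 1):
        if list(frequent_plot[i:i + m]) == list(target):
            return list(frequent_plot[i + m:])
    return None

target = ["a2", "a4", "a6"]
frequent_plot = ["a1", "a2", "a4", "a6", "a7", "a8", "a9"]
suffix = predict_suffix(target, frequent_plot)
```

The returned list holds the suffix events a7, a8, a9, the predicted subsequent attack behavior in the example.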

From the description of the specific embodiments, the technical means adopted by the present invention to achieve its intended purpose and their effects should be understood more deeply and concretely; however, the accompanying drawings are provided for reference and illustration only and are not intended to limit the present invention.

Claims

1. A network attack detection method based on big data association, characterized by comprising:

2. The network attack detection method based on big data association according to claim 1, characterized in that step one specifically comprises:

S=[(a₁,t₁),(a₂,t₂),…,(a_i,t_i),…,(a_n,t_n)];

K=[a₁,a₂,…,a_i,…,a_n];

The range of the variable i is 1≤i≤n;

n is the length of the Trojan program's event sequence;

3. The network attack detection method based on big data association according to claim 2, characterized in that mining frequent adjacent plots by the automaton from all historical Trojan program event sequences K in the historical Trojan program event sequence library, and constructing the frequent plot knowledge base from the frequent adjacent plots of the historical Trojan programs, specifically comprises:

In any program, events that occur in succession are called an adjacent plot;

The range of the variable j is 2≤j≤M;

4. The network attack detection method based on big data association according to claim 3, characterized in that mining by the automaton the set E_j of adjacent plots of length j specifically comprises:

5. The network attack detection method based on big data association according to claim 1, characterized in that step two specifically comprises:

e_m=[b₁,b₂,…,b_i,…,b_m];

The range of the variable i is 1≤i≤m;

m is the length of the target program's event sequence;

6. The network attack detection method based on big data association according to claim 1, characterized in that using the naive Bayes algorithm to judge whether the target program is a Trojan program specifically comprises:

B1: defining the category set C of the random variable;

C={normal program, uncertain program, Trojan program};

let c₁=normal program, c₂=uncertain program, c₃=Trojan program;

If the target program is judged to be a Trojan program, the most similar adjacent plot is found based on Euclidean distance.

7. The network attack detection method based on big data association according to claim 1, characterized in that step three specifically comprises: