CN105704136A - Big data association-based network attack detection method - Google Patents


Info

Publication number
CN105704136A
Authority
CN
China
Prior art keywords
program
trojan horse
plot
adjacent
adjacent plot
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610131314.4A
Other languages
Chinese (zh)
Other versions
CN105704136B (en)
Inventor
焦栋
敖乃翔
王辰
王德勇
徐心毅
郭静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Electronics Technology Group Corp CETC
Electronic Science Research Institute of CTEC
Original Assignee
China Electronics Technology Group Corp CETC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Electronics Technology Group Corp CETC
Priority to CN201610131314.4A
Publication of CN105704136A
Application granted
Publication of CN105704136B
Current legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/211 Schema design and management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24575 Query processing with adaptation to user needs using context
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • G06F21/563 Static detection by source code analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/145 Countermeasures against malicious traffic the attack involving the propagation of malware through the network, e.g. viruses, trojans or worms

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Virology (AREA)
  • Signal Processing (AREA)
  • Software Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention provides a big data association-based network attack detection method. The method comprises the steps of mining and analyzing the big data of a historical Trojan program; mining the frequently adjacent plots of the historical Trojan program; constructing a frequent plot knowledge base based on the frequently adjacent plots of the historical Trojan program; comparing and matching the adjacent plots of a target program with the frequently adjacent plots of the historical Trojan program in the frequent plot knowledge base; judging the target program is a Trojan program on the condition that the adjacent plots of the target program are matched with the frequently adjacent plots of the historical Trojan program in the frequent plot knowledge base, or judging whether the target program is a Trojan program or not based on the naive Bayes algorithm on the condition that the adjacent plots of the target program are not matched with the frequently adjacent plots of the historical Trojan program in the frequent plot knowledge base; and predicting the subsequent attack action of the target program according to the suffix event of the adjacent plots of the target program on the condition that the target program is the Trojan program. According to the technical scheme of the invention, the detection rate of unknown Trojan programs is improved. Meanwhile, an effective control means is provided for Trojan programs.

Description

A network attack detection method based on big data association
Technical Field
The present invention relates to the technical field of Internet security, and in particular to a network attack detection method based on big data association.
Background
With the development of network technology, networks have come to cover nearly every area of people's daily life, work, and study. While enjoying the great convenience that networks bring, people also face serious security threats. Trojan horse (Trojan) programs are attack tools used to steal important information such as users' accounts, confidential documents, and private information for the attacker's profit, and they seriously threaten the privacy and data security of Internet users.
At present, there are two mainstream Trojan program detection techniques: signature-based detection and detection based on Trojan program behavior features.
Signature-based Trojan program detection: known Trojan programs and infected system files are analyzed, Trojan program signatures are summarized and generalized from them, and a Trojan program signature library is built. The signatures are derived by analyzing information such as the process name used when the Trojan runs in the target program, characteristic strings in the original Trojan file and in the files it generates, the start-up loading method, the names, sizes, and directories of generated files, and the fixed ports used. To judge whether a target program is a Trojan program, the signature of the target program is compared with the signatures in the Trojan program signature library; if the signatures match, the target program is judged to be a Trojan program.
Behavior-feature-based Trojan program detection: a method that identifies Trojan programs by their distinctive behavior features. Through long-term observation, analysis, research, and generalization of Trojan programs, the method extracts behavior features that are common to Trojan programs but seldom occur in normal programs. Program behavior is monitored at run time; when a Trojan behavior feature is found, the system raises a suspected-Trojan alarm and takes measures to handle the Trojan program. The main Trojan behavior features include suspicious actions such as writing to executable files, stealing or intercepting system interrupts, switching between the virus and the host program, writing to the boot sector or formatting a disk, editing the registry, modifying start-up items, modifying file associations, registering as a system service, creating network communication channels, occupying well-known ports, and opening rarely used ports.
Both the signature-based method and the behavior-feature-based method identify Trojan programs by comparison against the features of known Trojan programs, so they detect known Trojan programs with high accuracy and a low false-positive rate. However, neither method can effectively identify unknown Trojan programs, and both lack an effective means of controlling unknown Trojan programs.
Summary of the Invention
The technical problem to be solved by the present invention is to provide a network attack detection method based on big data association that improves the detection accuracy for unknown Trojan programs.
The technical solution adopted by the present invention is a network attack detection method based on big data association, comprising:
Step 1: performing big data mining analysis on historical Trojan programs, mining the frequent adjacent plots of the historical Trojan programs, and constituting a frequent plot knowledge base from those frequent adjacent plots;
Step 2: comparing the adjacent plot of a target program with the frequent adjacent plots of the historical Trojan programs in the frequent plot knowledge base; if they match, judging the target program to be a Trojan program; if they do not match, using the naive Bayes algorithm to judge whether the target program is a Trojan program;
Step 3: when the target program is judged to be a Trojan program, predicting the subsequent attack behavior of the target program according to the suffix event of the target program's adjacent plot.
Further, Step 1 specifically includes:
Analyzing the crawler behavior features of each historical Trojan program in different periods, the crawler behavior features comprising the historical Trojan program's crawler behavior feature vector S and the historical Trojan program's event sequence K:
$$S = [(a_1,t_1),(a_2,t_2),\dots,(a_i,t_i),\dots,(a_n,t_n)]$$
$$K = [a_1,a_2,\dots,a_i,\dots,a_n]$$
where $a_i$ is the crawler behavior feature of the historical Trojan program in period $t_i$;
the periods are arranged in chronological order, from earliest to latest: $t_1, t_2, \dots, t_n$;
the index $i$ ranges over $1 \le i \le n$;
$n$ is the length of the Trojan program's event sequence;
All historical Trojan program event sequences K are stored in a database, forming the historical Trojan program event sequence library;
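The relationship between the timestamped vector S and the event sequence K can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `build_sequences` and the behavior labels are hypothetical, and it assumes each observation is a (behavior, period) pair whose period can be sorted numerically.

```python
from typing import List, Tuple

def build_sequences(observations: List[Tuple[str, float]]):
    """Return (S, K) for one program.

    S is the list of (behavior, period) pairs sorted by period, earliest first.
    K is the same behaviors in that order with the periods dropped.
    """
    S = sorted(observations, key=lambda pair: pair[1])
    K = [behavior for behavior, _ in S]
    return S, K

# Hypothetical observations, deliberately out of chronological order.
observations = [
    ("open_port", 30.0),
    ("modify_registry", 10.0),
    ("write_executable", 20.0),
]
S, K = build_sequences(observations)
```

Storing only K in the sequence library discards the absolute times and keeps the order of events, which is all the later adjacent plot mining relies on.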
Mining frequent adjacent plots from all historical Trojan program event sequences K in the historical Trojan program event sequence library by means of an automaton, and constituting the frequent plot knowledge base from the frequent adjacent plots of the historical Trojan programs.
Further, mining frequent adjacent plots from all historical Trojan program event sequences K in the historical Trojan program event sequence library by means of the automaton, and constituting the frequent plot knowledge base from the frequent adjacent plots of the historical Trojan programs, specifically includes:
From all historical Trojan program event sequences K in the historical Trojan program event sequence library, the automaton mines the set $E_j$ of frequent adjacent plots of length $j$;
By analogy, using the same method as for $E_j$, the frequent adjacent plot sets of every length from 2 to $M$ are mined, and these sets together constitute the frequent plot knowledge base;
In any program, events that occur in succession are called an adjacent plot;
The variable $j$ ranges over $2 \le j \le M$;
$M$ is the length of the longest adjacent plot of the historical Trojan program event sequences K.
Further, the mining by the automaton of the frequent adjacent plot set $E_j$ of length $j$ specifically includes:
The adjacent plots of any two historical Trojan program event sequences K in the historical Trojan program event sequence library are decomposed into two sets of adjacent plots of length $j$, and the length-$j$ adjacent plots in the two sets are compared; when two adjacent plots match completely, the matched adjacent plot is a length-$j$ adjacent plot mined by the automaton, and the occurrence count of that length-$j$ adjacent plot is incremented by 1;
When the occurrence count of an adjacent plot is not less than the support threshold, the adjacent plot is put into the frequent adjacent plot set $E_j$ of length $j$;
When the occurrence count of an adjacent plot is less than the support threshold, the adjacent plot is not put into the frequent adjacent plot set $E_j$ of length $j$.
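Mining the length-j set for a single value of j can be sketched as below. This is a simplified illustration under stated assumptions, not the automaton described in the text: instead of pairwise comparison it counts, for each contiguous run of j events, how many historical sequences contain it, and it treats that count as the occurrence count compared with the support threshold. The names `adjacent_plots`, `mine_frequent_adjacent_plots`, and `min_support` are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Set, Tuple

def adjacent_plots(seq: Sequence[str], j: int) -> Set[Tuple[str, ...]]:
    """All distinct runs of j consecutive events in one sequence."""
    return {tuple(seq[i:i + j]) for i in range(len(seq) - j + 1)}

def mine_frequent_adjacent_plots(
    sequences: List[Sequence[str]], j: int, min_support: int
) -> Set[Tuple[str, ...]]:
    """Length-j plots contained in at least min_support sequences (assumed support measure)."""
    counts: Counter = Counter()
    for seq in sequences:
        # Each sequence contributes at most 1 to a plot's count.
        counts.update(adjacent_plots(seq, j))
    return {plot for plot, n in counts.items() if n >= min_support}

# Hypothetical event sequences K for three historical Trojans.
sequences = [
    ["a", "b", "c", "d"],
    ["x", "b", "c", "d"],
    ["a", "b", "c"],
]
E2 = mine_frequent_adjacent_plots(sequences, j=2, min_support=2)
```

With a support threshold of 2, the run ("x", "b") is dropped because only one sequence contains it, while ("a", "b"), ("b", "c"), and ("c", "d") are kept.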
Further, Step 2 specifically includes:
The longest adjacent plot of the target program is mined, generating the target program's adjacent plot $e_m$:
$$e_m = [b_1,b_2,\dots,b_i,\dots,b_m]$$
where $b_i$ are the crawler behavior features of the target program in the order in which they occur;
the index $i$ ranges over $1 \le i \le m$;
$m$ is the length of the target program's event sequence;
The target program's adjacent plot $e_m$ is compared with the frequent adjacent plots in the frequent plot knowledge base; if $e_m$ matches a frequent adjacent plot in the knowledge base, the target program is judged to be a Trojan program;
If the target program's adjacent plot $e_m$ does not match any frequent adjacent plot in the frequent plot knowledge base, the naive Bayes algorithm is used to judge whether the target program is a Trojan program.
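The comparison against the knowledge base can be sketched as a lookup. The text does not spell out what counts as a match, so this sketch assumes that a match means some frequent adjacent plot from the knowledge base occurs as a run of consecutive events inside the target program's event sequence. The knowledge base layout (a dict from plot length to a set of tuples) and the name `matches_knowledge_base` are hypothetical.

```python
from typing import Dict, Sequence, Set, Tuple

KnowledgeBase = Dict[int, Set[Tuple[str, ...]]]

def matches_knowledge_base(target_events: Sequence[str], kb: KnowledgeBase) -> bool:
    """True if any stored plot appears as consecutive events in target_events."""
    for j, plots in kb.items():
        for i in range(len(target_events) - j + 1):
            if tuple(target_events[i:i + j]) in plots:
                return True
    return False

# Hypothetical knowledge base with one frequent plot of length 2.
kb: KnowledgeBase = {2: {("modify_registry", "open_port")}}

is_trojan = matches_knowledge_base(["read_file", "modify_registry", "open_port"], kb)
is_unmatched = matches_knowledge_base(["read_file", "save_file"], kb)
```

Only when the lookup returns False would the naive Bayes step be consulted.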
Further, using the naive Bayes algorithm to judge whether the target program is a Trojan program specifically includes:
B1: defining the set C of categories of the random variable:
C = {normal program, uncertain program, Trojan program};
Let $c_1$ = normal program, $c_2$ = uncertain program, $c_3$ = Trojan program;
B2: using the naive Bayes algorithm, computing over the program sample set Z the probability $p(c_1 \mid e_m)$ that a program containing the target program's adjacent plot $e_m$ is a normal program, the probability $p(c_2 \mid e_m)$ that it is an uncertain program, and the probability $p(c_3 \mid e_m)$ that it is a Trojan program:
$$p(c_1 \mid e_m) = \frac{p(e_m \mid c_1)\,p(c_1)}{p(e_m)}$$
$$p(c_2 \mid e_m) = \frac{p(e_m \mid c_2)\,p(c_2)}{p(e_m)}$$
$$p(c_3 \mid e_m) = \frac{p(e_m \mid c_3)\,p(c_3)}{p(e_m)}$$
B3: substituting the target program's adjacent plot $[b_1,b_2,\dots,b_i,\dots,b_m]$ into $p(c_1 \mid e_m)$, $p(c_2 \mid e_m)$, and $p(c_3 \mid e_m)$ respectively gives:
$$p(c_1 \mid e_m) = \frac{\prod_{i=1}^{m} p(b_i \mid c_1)\,p(c_1)}{\prod_{i=1}^{m} p(b_i)}$$
$$p(c_2 \mid e_m) = \frac{\prod_{i=1}^{m} p(b_i \mid c_2)\,p(c_2)}{\prod_{i=1}^{m} p(b_i)}$$
$$p(c_3 \mid e_m) = \frac{\prod_{i=1}^{m} p(b_i \mid c_3)\,p(c_3)}{\prod_{i=1}^{m} p(b_i)}$$
B4: comparing $p(c_1 \mid e_m)$, $p(c_2 \mid e_m)$, and $p(c_3 \mid e_m)$ to judge whether the target program is a Trojan program, as follows:
If $p(c_1 \mid e_m)$ is the largest, the target program is judged to be a normal program;
If $p(c_2 \mid e_m)$ is the largest, the target program is judged to be an uncertain program;
If $p(c_3 \mid e_m)$ is the largest, the target program is judged to be a Trojan program;
If the target program is judged to be a Trojan program, the most similar frequent adjacent plot is found based on Euclidean distance.
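Steps B1 to B4 can be sketched as a small three-class naive Bayes classifier. Several choices here are assumptions that the text does not state: the sample set Z is taken to be a list of (event sequence, label) pairs, $p(b_i \mid c)$ is estimated from event counts per class with add-one (Laplace) smoothing controlled by `alpha`, and the common denominator $\prod p(b_i)$ is omitted because it does not change which class is largest. All names and sample data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train(samples):
    """samples: list of (event_sequence, label). Returns count tables for the classes."""
    class_counts = Counter(label for _, label in samples)
    event_counts = defaultdict(Counter)
    vocab = set()
    for events, label in samples:
        event_counts[label].update(events)
        vocab.update(events)
    return class_counts, event_counts, vocab

def classify(e_m, model, alpha=1.0):
    """Return (best_label, log_scores) where each score is log p(c) + sum_i log p(b_i | c)."""
    class_counts, event_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        # Smoothed denominator; the +1 reserves mass for events never seen in training.
        denom = sum(event_counts[c].values()) + alpha * (len(vocab) + 1)
        score = math.log(n_c / total)
        for b in e_m:
            score += math.log((event_counts[c][b] + alpha) / denom)
        scores[c] = score
    return max(scores, key=scores.get), scores

# Hypothetical labeled sample set Z.
Z = [
    (["read_file", "open_window"], "normal"),
    (["read_file", "save_file"], "normal"),
    (["modify_registry", "open_port"], "trojan"),
    (["modify_registry", "steal_password", "open_port"], "trojan"),
    (["open_port", "read_file"], "uncertain"),
]
model = train(Z)
label, scores = classify(["modify_registry", "open_port"], model)
```

Working in log space avoids numerical underflow when the adjacent plot $e_m$ is long, since the product of many small probabilities becomes a sum.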
Further, Step 3 specifically includes:
When the target program is judged to be a Trojan program, the adjacent plot that follows the target program's adjacent plot in a historical Trojan program event sequence K is taken as the suffix event of the target program's adjacent plot, and this suffix event is the target program's subsequent attack behavior.
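The suffix-event lookup can be sketched as a search for the target program's adjacent plot inside each historical sequence K, returning whatever events follow it. This is an illustrative reading of the step rather than the patent's implementation; the function name `predict_suffix_events` and the event labels are hypothetical, and returning every distinct suffix (rather than a single one) is an assumption.

```python
from typing import List, Sequence

def predict_suffix_events(
    target_plot: Sequence[str], historical_sequences: List[Sequence[str]]
) -> List[List[str]]:
    """Collect the events that follow target_plot wherever it occurs in a historical K."""
    m = len(target_plot)
    predictions: List[List[str]] = []
    for K in historical_sequences:
        for i in range(len(K) - m + 1):
            if list(K[i:i + m]) == list(target_plot):
                suffix = list(K[i + m:])
                # Skip empty suffixes and duplicates.
                if suffix and suffix not in predictions:
                    predictions.append(suffix)
    return predictions

# Hypothetical historical Trojan event sequences.
history = [
    ["drop_file", "modify_registry", "open_port", "send_data"],
    ["modify_registry", "open_port", "download_payload", "send_data"],
]
predicted = predict_suffix_events(["modify_registry", "open_port"], history)
```

Each returned list is one candidate continuation of the attack, which a defender could use to decide what to block next.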
With the above technical solution, the present invention has at least the following advantages:
The network attack detection method based on big data association of the present invention overcomes the high false-positive rate of the prior art when detecting unknown Trojan programs. While reducing the false-positive rate, the present invention also predicts the subsequent attack behavior of unknown Trojans. It makes up for the lack of subsequent-attack prediction for unknown Trojans in existing Trojan detection methods, and provides strong support for system protection decisions.
Brief Description of the Drawings
Fig. 1 is a flow chart of the network attack detection method based on big data association according to the first embodiment of the present invention;
Fig. 2 is a flow chart of the network attack detection method based on big data association according to the second embodiment of the present invention;
Fig. 3 is a detailed flow diagram of network attack detection based on big data association according to the third embodiment of the present invention.
Detailed Description
To further explain the technical means adopted by the present invention to achieve its intended purpose and their effects, the present invention is described in detail below with reference to the accompanying drawings and preferred embodiments.
The first embodiment of the present invention is a network attack detection method based on big data association which, as shown in Fig. 1, includes the following specific steps:
Step S101: performing big data mining analysis on historical Trojan programs, mining the frequent adjacent plots of the historical Trojan programs, and constituting the frequent plot knowledge base from those frequent adjacent plots.
Specifically, step S101 includes:
Step A1: analyzing the crawler behavior features of each historical Trojan program in different periods, the crawler behavior features comprising the historical Trojan program's crawler behavior feature vector S and the historical Trojan program's event sequence K:
$$S = [(a_1,t_1),(a_2,t_2),\dots,(a_i,t_i),\dots,(a_n,t_n)]$$
$$K = [a_1,a_2,\dots,a_i,\dots,a_n]$$
where $a_i$ is the crawler behavior feature of the historical Trojan program in period $t_i$;
the periods are arranged in chronological order, from earliest to latest: $t_1, t_2, \dots, t_n$;
the index $i$ ranges over $1 \le i \le n$;
$n$ is the length of the Trojan program's event sequence;
All historical Trojan program event sequences K are stored in a database, forming the historical Trojan program event sequence library.
Step A2: mining frequent adjacent plots from all historical Trojan program event sequences K in the historical Trojan program event sequence library by means of an automaton, and constituting the frequent plot knowledge base from the frequent adjacent plots of the historical Trojan programs.
Specifically, step A2 includes:
From all historical Trojan program event sequences K in the historical Trojan program event sequence library, the automaton mines the set $E_j$ of frequent adjacent plots of length $j$;
The variable $j$ ranges over $2 \le j \le M$;
$M$ is the length of the longest adjacent plot of the historical Trojan program event sequences K.
Specifically, the mining by the automaton of the frequent adjacent plot set $E_j$ of length $j$ includes:
The automaton decomposes the adjacent plots of any two historical Trojan program event sequences K in the historical Trojan program event sequence library into two sets of adjacent plots of length $j$, and compares the length-$j$ adjacent plots in the two sets; when two adjacent plots match completely, the matched adjacent plot is a length-$j$ adjacent plot mined by the automaton, and the occurrence count of that length-$j$ adjacent plot is incremented by 1;
When the occurrence count of an adjacent plot is not less than the support threshold, the adjacent plot is put into the frequent adjacent plot set $E_j$ of length $j$;
When the occurrence count of an adjacent plot is less than the support threshold, the adjacent plot is not put into the frequent adjacent plot set $E_j$ of length $j$.
In any program, events that occur in succession are called an adjacent plot.
By analogy, using the same method as for $E_j$, the automaton mines the frequent adjacent plot sets of every length from 2 to $M$, and these sets together constitute the frequent plot knowledge base.
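Repeating the mining for every length from 2 to M yields the knowledge base, which can be sketched as below. The assumptions are the same simplifications as any counting-based sketch of this step: the occurrence count of a plot is taken to be the number of historical sequences containing it, M is taken to be the length of the longest sequence, and the loop stops early once some length has no frequent plot (under this counting, a longer frequent plot would imply a frequent shorter one). The name `build_knowledge_base` and the data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence, Set, Tuple

def build_knowledge_base(
    sequences: List[Sequence[str]], min_support: int
) -> Dict[int, Set[Tuple[str, ...]]]:
    """Map each length j in 2..M to its set of frequent adjacent plots E_j."""
    M = max((len(seq) for seq in sequences), default=0)
    kb: Dict[int, Set[Tuple[str, ...]]] = {}
    for j in range(2, M + 1):
        counts: Counter = Counter()
        for seq in sequences:
            # Distinct runs of j consecutive events; each sequence counts once per run.
            counts.update({tuple(seq[i:i + j]) for i in range(len(seq) - j + 1)})
        frequent = {plot for plot, n in counts.items() if n >= min_support}
        if not frequent:
            break  # no longer plot can reach the threshold either
        kb[j] = frequent
    return kb

# Hypothetical historical Trojan event sequences.
sequences = [
    ["a", "b", "c", "d"],
    ["x", "b", "c", "d"],
    ["a", "b", "c"],
]
kb = build_knowledge_base(sequences, min_support=2)
```

For this data the knowledge base holds plots of length 2 and 3 only; no run of four events is shared by two sequences.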
Step S102: comparing the adjacent plot of the target program with the frequent adjacent plots in the frequent plot knowledge base to judge whether the target program is a Trojan program.
Specifically, step S102 includes:
Step B1: mining the longest adjacent plot of the target program, generating the target program's adjacent plot $e_m$:
$$e_m = [b_1,b_2,\dots,b_i,\dots,b_m]$$
where $b_i$ are the crawler behavior features of the target program in the order in which they occur;
the index $i$ ranges over $1 \le i \le m$;
$m$ is the length of the target program's event sequence.
Step B2: comparing the target program's adjacent plot $e_m$ with the frequent adjacent plots in the frequent plot knowledge base.
When the target program's adjacent plot $e_m$ matches a frequent adjacent plot in the frequent plot knowledge base, the target program is judged to be a Trojan program;
When the target program's adjacent plot $e_m$ does not match any frequent adjacent plot in the frequent plot knowledge base, the target program is judged not to be a Trojan program.
Step S103: when the target program is judged to be a Trojan program, predicting the subsequent attack behavior of the target program according to the suffix event of the target program's adjacent plot.
Specifically, step S103 includes:
When the target program is judged to be a Trojan program, the adjacent plot that follows the target program's adjacent plot in a historical Trojan program event sequence K is taken as the suffix event of the target program's adjacent plot, and this suffix event is the target program's subsequent attack behavior.
The second embodiment of the present invention is a network attack detection method based on big data association. The method of this embodiment is substantially the same as that of the first embodiment. The difference is that, when the target program's adjacent plot $e_m$ does not match any frequent adjacent plot in the frequent plot knowledge base, it is further determined whether the target program is a Trojan program, and subsequent attack behavior is predicted for a target program judged to be a Trojan program. As shown in Fig. 2, the method of this embodiment includes the following specific steps:
Step S201: performing big data mining analysis on historical Trojan programs, mining the frequent adjacent plots of the historical Trojan programs, and constituting the frequent plot knowledge base from those frequent adjacent plots.
Specifically, step S201 includes:
Step A1: analyzing the crawler behavior features of each historical Trojan program in different periods, the crawler behavior features comprising the historical Trojan program's crawler behavior feature vector S and the historical Trojan program's event sequence K:
$$S = [(a_1,t_1),(a_2,t_2),\dots,(a_i,t_i),\dots,(a_n,t_n)]$$
$$K = [a_1,a_2,\dots,a_i,\dots,a_n]$$
where $a_i$ is the crawler behavior feature of the historical Trojan program in period $t_i$;
the periods are arranged in chronological order, from earliest to latest: $t_1, t_2, \dots, t_n$;
the index $i$ ranges over $1 \le i \le n$;
$n$ is the length of the Trojan program's event sequence;
All historical Trojan program event sequences K are stored in a database, forming the historical Trojan program event sequence library.
Step A2: mining frequent adjacent plots from all historical Trojan program event sequences K in the historical Trojan program event sequence library by means of an automaton, and constituting the frequent plot knowledge base from the frequent adjacent plots of the historical Trojan programs.
Specifically, step A2 includes:
From all historical Trojan program event sequences K in the historical Trojan program event sequence library, the automaton mines the set $E_j$ of frequent adjacent plots of length $j$;
The variable $j$ ranges over $2 \le j \le M$;
$M$ is the length of the longest adjacent plot of the historical Trojan program event sequences K.
Specifically, the mining by the automaton of the frequent adjacent plot set $E_j$ of length $j$ includes:
The automaton decomposes the adjacent plots of any two historical Trojan program event sequences K in the historical Trojan program event sequence library into two sets of adjacent plots of length $j$, and compares the length-$j$ adjacent plots in the two sets; when two adjacent plots match completely, the matched adjacent plot is a length-$j$ adjacent plot mined by the automaton, and the occurrence count of that length-$j$ adjacent plot is incremented by 1;
When the occurrence count of an adjacent plot is not less than the support threshold, the adjacent plot is put into the frequent adjacent plot set $E_j$ of length $j$;
When the occurrence count of an adjacent plot is less than the support threshold, the adjacent plot is not put into the frequent adjacent plot set $E_j$ of length $j$.
In any program, events that occur in succession are called an adjacent plot.
By analogy, using the same method as for $E_j$, the automaton mines the frequent adjacent plot sets of every length from 2 to $M$, and these sets together constitute the frequent plot knowledge base.
Step S202: comparing the adjacent plot of the target program with the frequent adjacent plots in the frequent plot knowledge base to judge whether the target program is a Trojan program.
Specifically, step S202 includes:
Step B1: mining the longest adjacent plot of the target program, generating the target program's adjacent plot $e_m$:
$$e_m = [b_1,b_2,\dots,b_i,\dots,b_m]$$
where $b_i$ are the crawler behavior features of the target program in the order in which they occur;
the index $i$ ranges over $1 \le i \le m$;
$m$ is the length of the target program's event sequence.
Step B2: comparing the target program's adjacent plot $e_m$ with the frequent adjacent plots in the frequent plot knowledge base.
When the target program's adjacent plot $e_m$ matches a frequent adjacent plot in the frequent plot knowledge base, the target program is judged to be a Trojan program;
When the target program's adjacent plot $e_m$ does not match any frequent adjacent plot in the frequent plot knowledge base, step S203 is performed on the target program to further determine whether it is a Trojan program.
Step S203: using the naive Bayes algorithm to further determine whether a target program left unmatched by step S202 is a Trojan program.
Specifically, step S203 includes:
Using the naive Bayes algorithm, computing over the program sample set Z the probabilities that a program containing the target program's adjacent plot $e_m$ is a normal program, an uncertain program, or a Trojan program.
Step D1: defining the set C of categories of the random variable:
C = {normal program, uncertain program, Trojan program};
Let $c_1$ = normal program, $c_2$ = uncertain program, $c_3$ = Trojan program;
Step D2: using the naive Bayes algorithm, computing over the program sample set Z the probability $p(c_1 \mid e_m)$ that a program containing the target program's adjacent plot $e_m$ is a normal program, the probability $p(c_2 \mid e_m)$ that it is an uncertain program, and the probability $p(c_3 \mid e_m)$ that it is a Trojan program:
$$p(c_1 \mid e_m) = \frac{p(e_m \mid c_1)\,p(c_1)}{p(e_m)}$$
$$p(c_2 \mid e_m) = \frac{p(e_m \mid c_2)\,p(c_2)}{p(e_m)}$$
$$p(c_3 \mid e_m) = \frac{p(e_m \mid c_3)\,p(c_3)}{p(e_m)}$$
Step D3: substituting the target program's adjacent plot $[b_1,b_2,\dots,b_i,\dots,b_m]$ into $p(c_1 \mid e_m)$, $p(c_2 \mid e_m)$, and $p(c_3 \mid e_m)$ respectively gives:
$$p(c_1 \mid e_m) = \frac{\prod_{i=1}^{m} p(b_i \mid c_1)\,p(c_1)}{\prod_{i=1}^{m} p(b_i)}$$
$$p(c_2 \mid e_m) = \frac{\prod_{i=1}^{m} p(b_i \mid c_2)\,p(c_2)}{\prod_{i=1}^{m} p(b_i)}$$
$$p(c_3 \mid e_m) = \frac{\prod_{i=1}^{m} p(b_i \mid c_3)\,p(c_3)}{\prod_{i=1}^{m} p(b_i)}$$
Step D4: comparing $p(c_1 \mid e_m)$, $p(c_2 \mid e_m)$, and $p(c_3 \mid e_m)$ to judge whether the target program is a Trojan program, as follows:
When $p(c_1 \mid e_m)$ is the largest, the target program is judged to be a normal program;
When $p(c_2 \mid e_m)$ is the largest, the target program is judged to be an uncertain program;
When $p(c_3 \mid e_m)$ is the largest, the target program is judged to be a Trojan program.
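A short worked note on the comparison in step D4: the denominator $\prod_{i=1}^{m} p(b_i)$ does not depend on the class, so it is the same for all three probabilities and the largest of them is determined by the numerators alone. The log form is a standard equivalent that is convenient in practice and is not stated in the text:

$$\hat{c} = \arg\max_{k \in \{1,2,3\}} \; p(c_k) \prod_{i=1}^{m} p(b_i \mid c_k) = \arg\max_{k \in \{1,2,3\}} \left[ \log p(c_k) + \sum_{i=1}^{m} \log p(b_i \mid c_k) \right]$$

The second equality holds whenever all the factors are strictly positive, because the logarithm is monotonically increasing.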
When the target program is judged to be a Trojan program, the most similar adjacent plot is found based on Euclidean distance.
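A nearest-plot search by Euclidean distance can be sketched as below. The text does not say how an adjacent plot is turned into a numeric vector, so this sketch assumes a simple encoding: each plot becomes a vector of event counts over a shared vocabulary. That encoding, the function names, and the event labels are all hypothetical, and the encoding ignores event order.

```python
import math
from collections import Counter
from typing import List, Sequence

def to_vector(plot: Sequence[str], vocabulary: List[str]) -> List[int]:
    """Assumed encoding: how many times each vocabulary event occurs in the plot."""
    counts = Counter(plot)
    return [counts[event] for event in vocabulary]

def most_similar_plot(target_plot: Sequence[str], candidates: List[Sequence[str]]):
    """Return the candidate whose count vector is closest to the target's (needs >= 1 candidate)."""
    vocabulary = sorted(set(target_plot).union(*candidates))
    target_vec = to_vector(target_plot, vocabulary)
    return min(
        candidates,
        key=lambda plot: math.dist(target_vec, to_vector(plot, vocabulary)),
    )

# Hypothetical target plot and candidate adjacent plots.
target = ("modify_registry", "open_port", "send_data")
candidates = [
    ("modify_registry", "open_port"),
    ("read_file", "save_file"),
]
nearest = most_similar_plot(target, candidates)
```

Here the first candidate differs from the target by one event (distance 1), while the second shares no events with it, so the first is returned. `math.dist` requires Python 3.8 or later.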
Step S204: when the target program is judged to be a Trojan program, predicting the subsequent attack behavior of the target program according to the suffix event of the target program's adjacent plot.
Specifically, step S204 includes:
When the target program is judged to be a Trojan program, the adjacent plot that follows the target program's adjacent plot in a historical Trojan program event sequence K is taken as the suffix event of the target program's adjacent plot, and this suffix event is the target program's subsequent attack behavior.
Third embodiment of the invention, the present embodiment is on the basis of above-described embodiment, introduces the application example of a present invention in conjunction with accompanying drawing 3, and the process that realizes of technical scheme, feature and advantage are described in detail。
As it is shown on figure 3, the network attack detecting method based on big data association of the present embodiment, comprise the steps:
Step one: perform big data mining analysis on historical trojan horse programs, extract the crawler behavior features of the historical trojan horse programs, and form the frequent episodic knowledge base.
Specifically, step one includes:
1) Analyze the crawler behavior features of each historical trojan horse program in different periods; the crawler behavior features comprise the historical trojan horse program crawler behavior characteristic vector S and the historical trojan horse program event sequence K:
S = [(a_1, t_1), (a_2, t_2), …, (a_i, t_i), …, (a_n, t_n)];
K = [a_1, a_2, …, a_i, …, a_n];
where a_i is the crawler behavior feature of the historical trojan horse program in period t_i;
the periods, arranged in chronological order from earliest to latest, are t_1, t_2, …, t_n;
the variable i ranges over 1 ≤ i ≤ n;
n is the event sequence length of the trojan horse program.
All historical trojan horse program event sequences K are stored in a database, forming the historical trojan horse program event sequence library.
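As a minimal sketch of the two structures defined above, the following Python snippet represents S as a list of (feature, period) pairs and derives K by ordering the pairs by period. The function name `event_sequence`, the string labels and the integer periods are illustrative assumptions, not part of the method as described.

```python
from typing import List, Tuple

# S: behavior characteristic vector, a list of (feature a_i, period t_i) pairs.
BehaviorVector = List[Tuple[str, int]]


def event_sequence(s: BehaviorVector) -> List[str]:
    """Return K: the features of S ordered by period, earliest first."""
    return [feature for feature, _period in sorted(s, key=lambda pair: pair[1])]


# Hypothetical observations, deliberately listed out of time order.
S = [("a2", 2), ("a1", 1), ("a3", 3)]
K = event_sequence(S)
print(K)  # ['a1', 'a2', 'a3']
```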
2) Use an automaton to mine frequent adjacent plots from all historical trojan horse program event sequences K in the historical trojan horse program event sequence library; the frequent adjacent plots of the historical trojan horse programs constitute the frequent episodic knowledge base.
Specifically, this includes:
a) Put all historical trojan horse program event sequences K in the historical trojan horse program event sequence library into the candidate adjacent plot set, each as a single adjacent plot whose length is that program's event sequence length;
An adjacent plot is a run of consecutive program events.
b) For each historical trojan horse program event sequence K, use the automaton to mine the adjacent plots of length j, and count the adjacent plots of length j;
The variable j ranges over 2 ≤ j ≤ M;
M is the length of the longest adjacent plot of the historical trojan horse program event sequences K;
Specifically, the automaton mines the adjacent plots of length j as follows:
Any two historical trojan horse program event sequences K in the adjacent plot set are each disassembled into a set of adjacent plots of length j, and the length-j adjacent plots of the two sets are compared. When two adjacent plots match completely, the matched plot is an adjacent plot of length j mined by the automaton, and the count of that length-j adjacent plot is increased by 1;
For example, the adjacent plot set contains two historical trojan horse program event sequences, K_1 and K_2. The adjacent plot set is fed into the automaton to mine the adjacent plots of length 3.
K_1 = [a_3, a_4, a_5, a_6, a_7];
K_2 = [a_4, a_5, a_6, a_8, a_9];
Disassembling K_1 into adjacent plots of length 3 gives [a_3, a_4, a_5], [a_4, a_5, a_6] and [a_5, a_6, a_7];
Disassembling K_2 into adjacent plots of length 3 gives [a_4, a_5, a_6], [a_5, a_6, a_8] and [a_6, a_8, a_9];
Comparing the length-3 adjacent plots of K_1 with those of K_2, [a_4, a_5, a_6] matches completely; therefore [a_4, a_5, a_6] is an adjacent plot of length 3 mined by the automaton, and its occurrence count is increased by 1.
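The disassemble-and-compare example above can be sketched in Python as a sliding window followed by a set intersection. This is only an illustration: the helper names `windows` and `common_plots` are hypothetical, and a plain set comparison stands in for the automaton, whose internals the text does not specify.

```python
from typing import List, Set, Tuple

Plot = Tuple[str, ...]


def windows(k: List[str], j: int) -> List[Plot]:
    """Disassemble event sequence k into all adjacent plots (contiguous runs) of length j."""
    return [tuple(k[i:i + j]) for i in range(len(k) - j + 1)]


def common_plots(k1: List[str], k2: List[str], j: int) -> Set[Plot]:
    """Length-j adjacent plots that occur in both sequences (complete matches)."""
    return set(windows(k1, j)) & set(windows(k2, j))


K1 = ["a3", "a4", "a5", "a6", "a7"]
K2 = ["a4", "a5", "a6", "a8", "a9"]
print(windows(K1, 3))
print(common_plots(K1, K2, 3))  # {('a4', 'a5', 'a6')}
```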
When the occurrence count of an adjacent plot is not less than the support threshold, the candidate adjacent plot is put into the frequent adjacent plot set E_j of length j;
When the occurrence count of an adjacent plot is less than the support threshold, the candidate adjacent plot is not put into the frequent adjacent plot set E_j of length j;
For example, the adjacent plot set contains five historical trojan horse program event sequences, K_1, K_2, K_3, K_4 and K_5. The adjacent plot set is fed into the automaton to mine the adjacent plots of length 3 and to count them.
K_1 = [a_1, a_2, a_3, a_4, a_5, a_6, a_7];
K_2 = [a_1, a_2, a_3, a_4, a_5, a_7];
K_3 = [a_2, a_3, a_4, a_5, a_7];
K_4 = [a_2, a_3, a_4, a_5];
K_5 = [a_1, a_4, a_5, a_6, a_7];
With adjacent plot length j = 3 and the support threshold set to 3, the automaton mines the adjacent plot [a_2, a_3, a_4] from the historical trojan horse program event sequences K_1, K_2, K_3 and K_4; its count is 4, which is not less than the support threshold 3, so [a_2, a_3, a_4] is put into the frequent adjacent plot set E_3 of length 3. The automaton mines the adjacent plot [a_5, a_6, a_7] from K_1 and K_5; its count is 2, which is less than the support threshold 3, so [a_5, a_6, a_7] is not put into E_3.
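The support-threshold example above can be reproduced with a short Python sketch. It assumes that the count of an adjacent plot is the number of event sequences containing it, which agrees with the counts of 4 and 2 quoted in the example; the names `support_counts` and `frequent_plots` are hypothetical.

```python
from collections import Counter
from typing import List, Set, Tuple

Plot = Tuple[str, ...]


def support_counts(sequences: List[List[str]], j: int) -> Counter:
    """For each length-j adjacent plot, count how many sequences contain it."""
    counts: Counter = Counter()
    for k in sequences:
        # A set, so a plot repeated inside one sequence is counted once (assumption).
        seen = {tuple(k[i:i + j]) for i in range(len(k) - j + 1)}
        counts.update(seen)
    return counts


def frequent_plots(sequences: List[List[str]], j: int, min_support: int) -> Set[Plot]:
    """E_j: the length-j adjacent plots whose count is not less than min_support."""
    return {p for p, n in support_counts(sequences, j).items() if n >= min_support}


SEQUENCES = [
    ["a1", "a2", "a3", "a4", "a5", "a6", "a7"],  # K1
    ["a1", "a2", "a3", "a4", "a5", "a7"],        # K2
    ["a2", "a3", "a4", "a5", "a7"],              # K3
    ["a2", "a3", "a4", "a5"],                    # K4
    ["a1", "a4", "a5", "a6", "a7"],              # K5
]
counts = support_counts(SEQUENCES, 3)
print(counts[("a2", "a3", "a4")])  # 4
print(counts[("a5", "a6", "a7")])  # 2
E3 = frequent_plots(SEQUENCES, 3, min_support=3)
```

Under this counting, [a_3, a_4, a_5] also reaches a count of 4 and therefore lands in E_3 alongside [a_2, a_3, a_4], although the example in the text only discusses the latter.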
c) Compare any two frequent adjacent plots in the frequent adjacent plot set E_j; if they match on j-1 consecutive events, merge the two frequent adjacent plots into a candidate adjacent plot of length j+1;
For example, [a_1, a_2, a_3] and [a_2, a_3, a_4] are both frequent adjacent plots in E_3; they match on [a_2, a_3], so they are merged into the candidate adjacent plot [a_1, a_2, a_3, a_4] of length 4.
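The merge in step c) can be sketched as a join on overlapping events. The sketch below assumes that "matching on j-1 events" means the last j-1 events of one plot equal the first j-1 events of the other, which is what the example shows; the function name `join` is hypothetical.

```python
from typing import Optional, Tuple

Plot = Tuple[str, ...]


def join(p: Plot, q: Plot) -> Optional[Plot]:
    """Merge two length-j plots into a length-(j+1) candidate.

    The merge succeeds when the last j-1 events of p equal the first
    j-1 events of q; otherwise None is returned.
    """
    if len(p) == len(q) and p[1:] == q[:-1]:
        return p + q[-1:]
    return None


print(join(("a1", "a2", "a3"), ("a2", "a3", "a4")))  # ('a1', 'a2', 'a3', 'a4')
print(join(("a2", "a3", "a4"), ("a1", "a2", "a3")))  # None
```

The join is order-sensitive, so a caller would try both orderings of each pair.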
d) Starting from the candidate adjacent plots of length j+1 so formed, repeat steps b) and c) to mine the frequent adjacent plot sets of every length from 2 to M;
e) The frequent adjacent plot sets of each length in the range 2 to M constitute the frequent episodic knowledge base.
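Steps b) to e) together resemble a level-wise (Apriori-style) mining loop. The following Python sketch shows one way such a loop could look; it is not the automaton of the method, whose construction is not given in the text. Seeding level 2 from every length-2 window, the stop-on-empty rule and the names `contains`, `support` and `mine` are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

Plot = Tuple[str, ...]


def contains(k: List[str], plot: Plot) -> bool:
    """True if plot occurs in k as a contiguous run."""
    j = len(plot)
    return any(tuple(k[i:i + j]) == plot for i in range(len(k) - j + 1))


def support(sequences: List[List[str]], plot: Plot) -> int:
    """Number of sequences that contain plot."""
    return sum(contains(k, plot) for k in sequences)


def mine(sequences: List[List[str]], min_support: int) -> Dict[int, Set[Plot]]:
    """Return {j: E_j} for every length j that has frequent adjacent plots."""
    max_len = max(len(k) for k in sequences)  # upper bound M on plot length
    # Level-2 candidates: every length-2 window occurring anywhere (assumption).
    candidates = {tuple(k[i:i + 2]) for k in sequences for i in range(len(k) - 1)}
    knowledge_base: Dict[int, Set[Plot]] = {}
    for j in range(2, max_len + 1):
        frequent = {p for p in candidates if support(sequences, p) >= min_support}
        if not frequent:
            break  # no longer plot can be frequent once a level is empty
        knowledge_base[j] = frequent
        # Join step: an overlap of j-1 events yields a length-(j+1) candidate.
        candidates = {p + q[-1:] for p in frequent for q in frequent if p[1:] == q[:-1]}
    return knowledge_base


SEQUENCES = [
    ["a1", "a2", "a3", "a4", "a5", "a6", "a7"],
    ["a1", "a2", "a3", "a4", "a5", "a7"],
    ["a2", "a3", "a4", "a5", "a7"],
    ["a2", "a3", "a4", "a5"],
    ["a1", "a4", "a5", "a6", "a7"],
]
kb = mine(SEQUENCES, min_support=3)
print(kb[4])  # {('a2', 'a3', 'a4', 'a5')}
```

With the five example sequences and a support threshold of 3, this sketch yields frequent adjacent plot sets of lengths 2, 3 and 4 only.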
Step two: compare the adjacent plot of the target program with the frequent adjacent plots in the frequent episodic knowledge base, and judge whether the target program is a trojan horse program.
Specifically, step two includes:
1) Mine the longest adjacent plot of the target program, generating the target program adjacent plot e_m:
e_m = [b_1, b_2, …, b_i, …, b_m];
where b_i is the crawler behavior feature occurring in order in the target program;
the variable i ranges over 1 ≤ i ≤ m;
m is the event sequence length of the target program.
2) Compare the target program adjacent plot e_m with the frequent adjacent plots in the frequent episodic knowledge base;
When e_m matches a frequent adjacent plot in the frequent episodic knowledge base, the target program is judged to be a trojan horse program;
When e_m matches no frequent adjacent plot in the frequent episodic knowledge base, step three is performed on the target program to further determine whether it is a trojan horse program.
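The comparison in step two can be sketched as a contiguous-run search. The text does not define "match" precisely; the sketch below assumes that e_m matches when it occurs as a contiguous run inside some frequent adjacent plot (equality included). The names `find_run` and `matches_knowledge_base` are hypothetical.

```python
from typing import Iterable, Sequence


def find_run(haystack: Sequence[str], needle: Sequence[str]) -> int:
    """Index at which needle occurs in haystack as a contiguous run, or -1."""
    n, m = len(haystack), len(needle)
    for i in range(n - m + 1):
        if list(haystack[i:i + m]) == list(needle):
            return i
    return -1


def matches_knowledge_base(e_m: Sequence[str],
                           knowledge_base: Iterable[Sequence[str]]) -> bool:
    """True if e_m occurs inside at least one frequent adjacent plot."""
    return any(find_run(plot, e_m) >= 0 for plot in knowledge_base)


# Hypothetical knowledge base holding a single frequent adjacent plot.
KB = [("a1", "a2", "a4", "a6", "a7", "a8", "a9")]
print(matches_knowledge_base(["a2", "a4", "a6"], KB))  # True
print(matches_knowledge_base(["a2", "a6"], KB))        # False
```

A target program for which this check fails would be passed on to the naive Bayes stage.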
Step three: use the naive Bayes algorithm to further determine whether a target program left unmatched in step two is a trojan horse program.
Specifically, step three includes:
Using the naive Bayes algorithm, compute the probabilities that a program in the program sample set Z containing the target program adjacent plot e_m is a normal program, a nondeterministic program, or a trojan horse program.
The total number of programs in the program sample set Z is not less than 100;
The number of normal programs in Z is not less than 30% of the total number of programs in Z;
The number of nondeterministic programs in Z is not less than 30% of the total number of programs in Z;
The number of trojan horse programs in Z is not less than 30% of the total number of programs in Z.
a) Define the random variable category set C;
C = {normal program, nondeterministic program, trojan horse program};
Let c_1 = normal program, c_2 = nondeterministic program, c_3 = trojan horse program;
b) Using the naive Bayes algorithm, compute the probability p(c_1|e_m) that a program in Z containing the target program adjacent plot e_m is a normal program, the probability p(c_2|e_m) that it is a nondeterministic program, and the probability p(c_3|e_m) that it is a trojan horse program:
$$p(c_1 \mid e_m) = \frac{p(e_m \mid c_1)\,p(c_1)}{p(e_m)};$$
$$p(c_2 \mid e_m) = \frac{p(e_m \mid c_2)\,p(c_2)}{p(e_m)};$$
$$p(c_3 \mid e_m) = \frac{p(e_m \mid c_3)\,p(c_3)}{p(e_m)};$$
c) Substitute the target program adjacent plot [b_1, b_2, …, b_i, …, b_m] into p(c_1|e_m), p(c_2|e_m) and p(c_3|e_m) respectively, which gives:
$$p(c_1 \mid e_m) = \frac{\left[\prod_{i=1}^{m} p(b_i \mid c_1)\right] p(c_1)}{\prod_{i=1}^{m} p(b_i)};$$
$$p(c_2 \mid e_m) = \frac{\left[\prod_{i=1}^{m} p(b_i \mid c_2)\right] p(c_2)}{\prod_{i=1}^{m} p(b_i)};$$
$$p(c_3 \mid e_m) = \frac{\left[\prod_{i=1}^{m} p(b_i \mid c_3)\right] p(c_3)}{\prod_{i=1}^{m} p(b_i)};$$
d) Compare p(c_1|e_m), p(c_2|e_m) and p(c_3|e_m) to judge whether the target program is a trojan horse program, as follows:
When p(c_1|e_m) is the largest, the target program is judged to be a normal program;
When p(c_2|e_m) is the largest, the target program is judged to be a nondeterministic program;
When p(c_3|e_m) is the largest, the target program is judged to be a trojan horse program.
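The three-class decision can be illustrated with a small Python sketch. Because the denominator is the same for all three classes, comparing the numerators p(c) · Π p(b_i | c) gives the same ordering as comparing the posteriors. The sketch estimates p(b_i | c) as the fraction of class-c programs in Z whose event sequence contains b_i, with Laplace smoothing; the estimator, the smoothing, the class labels and the toy sample set (far smaller than the 100-program minimum stated above) are all assumptions made only for illustration.

```python
from typing import Dict, List, Tuple

Sample = Tuple[List[str], str]  # (event sequence, class label)

CLASSES = ("normal", "nondeterministic", "trojan")  # c_1, c_2, c_3


def class_scores(z: List[Sample], e_m: List[str], alpha: float = 1.0) -> Dict[str, float]:
    """Return p(c) * prod_i p(b_i | c) for each class c.

    p(b_i | c) is the Laplace-smoothed fraction of class-c programs whose
    event sequence contains b_i. Smoothing (alpha) is an added assumption
    that keeps unseen events from zeroing the whole product.
    """
    scores: Dict[str, float] = {}
    for c in CLASSES:
        members = [seq for seq, label in z if label == c]
        score = len(members) / len(z)  # prior p(c)
        for b in e_m:
            hits = sum(b in seq for seq in members)
            score *= (hits + alpha) / (len(members) + 2 * alpha)
        scores[c] = score
    return scores


def classify(z: List[Sample], e_m: List[str]) -> str:
    """Pick the class with the largest score (equivalently, largest posterior)."""
    scores = class_scores(z, e_m)
    return max(scores, key=scores.get)


# Toy sample set Z with hypothetical event labels.
Z = [
    (["a1", "a2"], "trojan"),
    (["a1", "a2", "a3"], "trojan"),
    (["x1", "x2"], "normal"),
    (["x1", "x3"], "normal"),
    (["a1", "x1"], "nondeterministic"),
    (["x2", "a3"], "nondeterministic"),
]
print(classify(Z, ["a1", "a2"]))  # trojan
print(classify(Z, ["x1", "x2"]))  # normal
```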
e) When the target program is judged to be a trojan horse program, the most similar adjacent plot is then found based on Euclidean distance.
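The text does not say how an adjacent plot is turned into a numeric vector before the Euclidean distance is taken. The sketch below assumes a simple count encoding over a fixed event vocabulary, which is only one possible choice; `encode`, `most_similar` and the sample plots are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Plot = Tuple[str, ...]


def encode(plot: Sequence[str], vocabulary: Sequence[str]) -> List[int]:
    """Count vector: how often each vocabulary event appears in plot (assumed encoding)."""
    return [sum(1 for e in plot if e == v) for v in vocabulary]


def most_similar(e_m: Sequence[str], knowledge_base: Sequence[Plot],
                 vocabulary: Sequence[str]) -> Plot:
    """Frequent adjacent plot with the smallest Euclidean distance to e_m."""
    target = encode(e_m, vocabulary)
    return min(knowledge_base,
               key=lambda p: math.dist(target, encode(p, vocabulary)))


VOCAB = ["a%d" % i for i in range(1, 10)]  # a1 .. a9
KB = [("a1", "a2", "a3"), ("a2", "a4", "a6", "a7"), ("a7", "a8", "a9")]
print(most_similar(["a2", "a4", "a6"], KB, VOCAB))  # ('a2', 'a4', 'a6', 'a7')
```

A count encoding ignores event order; an order-aware encoding would be needed if order should influence similarity.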
Step four: when the target program is judged to be a trojan horse program, predict the target program's subsequent attack behavior from the adjacent plot suffix event of the target program.
Specifically, step four includes:
When the target program is judged to be a trojan horse program, the adjacent plot that follows the target program's adjacent plot in a historical trojan horse program event sequence K is taken as the adjacent plot suffix event of the target program; this suffix event is the target program's subsequent attack behavior.
For example, the adjacent plot of the target program is [a_2, a_4, a_6], and the frequent adjacent plot [a_1, a_2, a_4, a_6, a_7, a_8, a_9] of a historical trojan horse program event sequence K contains it. Then [a_7, a_8, a_9] is the adjacent plot that follows the target program's adjacent plot; it is the adjacent plot suffix event of the target program, and [a_7, a_8, a_9] is therefore the target program's subsequent attack behavior.
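The suffix lookup in the example above amounts to locating the target program's adjacent plot inside a historical plot and returning what follows. A minimal Python sketch, with the hypothetical helper name `predict_suffix`:

```python
from typing import List, Sequence


def predict_suffix(e_m: Sequence[str], history_plot: Sequence[str]) -> List[str]:
    """Events that follow e_m inside history_plot, or [] if e_m does not occur."""
    m = len(e_m)
    for i in range(len(history_plot) - m + 1):
        if list(history_plot[i:i + m]) == list(e_m):
            return list(history_plot[i + m:])
    return []


history = ["a1", "a2", "a4", "a6", "a7", "a8", "a9"]
print(predict_suffix(["a2", "a4", "a6"], history))  # ['a7', 'a8', 'a9']
```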
The description of the specific embodiments should give a deeper and more concrete understanding of the technical means the invention adopts to achieve its intended purpose and of their effects; the accompanying drawings, however, are provided for reference and illustration only and are not intended to limit the invention.

Claims (7)

1. A network attack detection method based on big data association, characterized by comprising:
Step one: perform big data mining analysis on historical trojan horse programs, mine the frequent adjacent plots of the historical trojan horse programs, and constitute the frequent episodic knowledge base from the frequent adjacent plots of the historical trojan horse programs;
Step two: compare the adjacent plot of a target program with the frequent adjacent plots of the historical trojan horse programs in the frequent episodic knowledge base; if they match, judge that the target program is a trojan horse program; if they do not match, use the naive Bayes algorithm to judge whether the target program is a trojan horse program;
Step three: when the target program is judged to be a trojan horse program, predict the target program's subsequent attack behavior from the adjacent plot suffix event of the target program.
2. The network attack detection method based on big data association according to claim 1, characterized in that step one specifically includes:
Analyzing the crawler behavior features of each historical trojan horse program in different periods, the crawler behavior features comprising the historical trojan horse program crawler behavior characteristic vector S and the historical trojan horse program event sequence K;
S = [(a_1, t_1), (a_2, t_2), …, (a_i, t_i), …, (a_n, t_n)];
K = [a_1, a_2, …, a_i, …, a_n];
where a_i is the crawler behavior feature of the historical trojan horse program in period t_i;
the periods, arranged in chronological order from earliest to latest, are t_1, t_2, …, t_n;
the variable i ranges over 1 ≤ i ≤ n;
n is the event sequence length of the trojan horse program;
Storing all historical trojan horse program event sequences K in a database, forming the historical trojan horse program event sequence library;
Using an automaton to mine frequent adjacent plots from all historical trojan horse program event sequences K in the historical trojan horse program event sequence library, the frequent adjacent plots of the historical trojan horse programs constituting the frequent episodic knowledge base.
3. The network attack detection method based on big data association according to claim 2, characterized in that using an automaton to mine frequent adjacent plots from all historical trojan horse program event sequences K in the historical trojan horse program event sequence library, the frequent adjacent plots of the historical trojan horse programs constituting the frequent episodic knowledge base, specifically includes:
Using the automaton to mine, from all historical trojan horse program event sequences K in the historical trojan horse program event sequence library, the frequent adjacent plot set E_j of length j;
By analogy, following the method for mining the frequent adjacent plot set E_j of length j, mining the frequent adjacent plot sets of every length from 2 to M, and constituting the frequent episodic knowledge base from the frequent adjacent plot sets of each length in the range 2 to M;
In any program, a run of consecutive events is called an adjacent plot;
The variable j ranges over 2 ≤ j ≤ M;
M is the length of the longest adjacent plot of the historical trojan horse program event sequences K.
4. The network attack detection method based on big data association according to claim 3, characterized in that the automaton mining the adjacent plot set E_j of length j specifically includes:
Disassembling any two historical trojan horse program event sequences K in the historical trojan horse program event sequence library into two sets of adjacent plots of length j, and comparing the length-j adjacent plots of the two sets; when two adjacent plots match completely, the matched plot is an adjacent plot of length j mined by the automaton, and the occurrence count of that length-j adjacent plot is increased by 1;
When the occurrence count of an adjacent plot is not less than the support threshold, putting the adjacent plot into the frequent adjacent plot set E_j of length j;
When the occurrence count of an adjacent plot is less than the support threshold, not putting the adjacent plot into the frequent adjacent plot set E_j of length j.
5. The network attack detection method based on big data association according to claim 1, characterized in that step two specifically includes:
Mining the longest adjacent plot of the target program, generating the target program adjacent plot e_m;
e_m = [b_1, b_2, …, b_i, …, b_m];
where b_i is the crawler behavior feature occurring in order in the target program;
the variable i ranges over 1 ≤ i ≤ m;
m is the event sequence length of the target program;
Comparing the target program adjacent plot e_m with the frequent adjacent plots in the frequent episodic knowledge base; if e_m matches a frequent adjacent plot in the frequent episodic knowledge base, judging that the target program is a trojan horse program;
If e_m matches no frequent adjacent plot in the frequent episodic knowledge base, using the naive Bayes algorithm to judge whether the target program is a trojan horse program.
6. The network attack detection method based on big data association according to claim 1, characterized in that using the naive Bayes algorithm to judge whether the target program is a trojan horse program specifically includes:
B1: define the random variable category set C;
C = {normal program, nondeterministic program, trojan horse program};
Let c_1 = normal program, c_2 = nondeterministic program, c_3 = trojan horse program;
B2: using the naive Bayes algorithm, compute the probability p(c_1|e_m) that a program in the program sample set Z containing the target program adjacent plot e_m is a normal program, the probability p(c_2|e_m) that it is a nondeterministic program, and the probability p(c_3|e_m) that it is a trojan horse program;
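B3: substitute the target program adjacent plot [b_1, b_2, …, b_i, …, b_m] into p(c_1|e_m), p(c_2|e_m) and p(c_3|e_m) respectively, which gives:
$$p(c_1 \mid e_m) = \frac{\left[\prod_{i=1}^{m} p(b_i \mid c_1)\right] p(c_1)}{\prod_{i=1}^{m} p(b_i)};$$
$$p(c_2 \mid e_m) = \frac{\left[\prod_{i=1}^{m} p(b_i \mid c_2)\right] p(c_2)}{\prod_{i=1}^{m} p(b_i)};$$
$$p(c_3 \mid e_m) = \frac{\left[\prod_{i=1}^{m} p(b_i \mid c_3)\right] p(c_3)}{\prod_{i=1}^{m} p(b_i)};$$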
B4: compare p(c_1|e_m), p(c_2|e_m) and p(c_3|e_m) to judge whether the target program is a trojan horse program, as follows:
If p(c_1|e_m) is the largest, the target program is judged to be a normal program;
If p(c_2|e_m) is the largest, the target program is judged to be a nondeterministic program;
If p(c_3|e_m) is the largest, the target program is judged to be a trojan horse program;
If the target program is judged to be a trojan horse program, the most similar adjacent plot is found based on Euclidean distance.
7. The network attack detection method based on big data association according to claim 1, characterized in that step three specifically includes:
When the target program is judged to be a trojan horse program, the adjacent plot that follows the target program's adjacent plot in a historical trojan horse program event sequence K is taken as the adjacent plot suffix event of the target program; this suffix event is the target program's subsequent attack behavior.
CN201610131314.4A 2016-03-09 2016-03-09 A network attack detection method based on big data association Active CN105704136B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610131314.4A CN105704136B (en) 2016-03-09 2016-03-09 A network attack detection method based on big data association

Publications (2)

Publication Number Publication Date
CN105704136A (en) 2016-06-22
CN105704136B CN105704136B (en) 2019-04-05

Family

ID=56221053

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610131314.4A Active CN105704136B (en) 2016-03-09 2016-03-09 A network attack detection method based on big data association

Country Status (1)

Country Link
CN (1) CN105704136B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102118245A (en) * 2009-12-31 2011-07-06 中国人民解放军国防科学技术大学 Scale prediction knowledge training method and prediction method for large-scale network security events
WO2015128609A1 (en) * 2014-02-28 2015-09-03 British Telecommunications Public Limited Company Profiling for malicious encrypted network traffic identification

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
XIANG LIAN,LEI CHEN: ""Efficient Similarity Search over Future Stream Time Series"", 《IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING》 *
商海波: ""木马的行为分析及新型反木马策略的研究"", 《浙江工业大学硕士学位论文》 *
杨尹,韩伟红,程文聪: ""基于时序分析的木马控制行为识别方法"", 《信息网络安全》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108173884A (en) * 2018-03-20 2018-06-15 国家计算机网络与信息安全管理中心 DDoS attack group analysis method based on network attack accompanying behaviors
CN108173884B (en) * 2018-03-20 2021-05-04 国家计算机网络与信息安全管理中心 DDoS attack group analysis method based on network attack accompanying behaviors

Also Published As

Publication number Publication date
CN105704136B (en) 2019-04-05

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant