CN107370752A - An efficient remote control Trojan detection method

Publication number
CN107370752A
Authority
CN
China
Legal status
Granted
Application number
CN201710719001.5A
Other versions
CN107370752B (en)
Inventor
姜伟
吴贤达
庄俊玺
潘邵芹
田原
Current Assignee
Beijing University of Technology
Original Assignee
Beijing University of Technology

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/14: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441: Countermeasures against malicious traffic
    • H04L63/145: Countermeasures against malicious traffic the attack involving the propagation of malware through the network, e.g. viruses, trojans or worms
    • H04L63/1408: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425: Traffic logging, e.g. anomaly detection

Abstract

The invention discloses an efficient remote control Trojan detection method that determines from network behavior features whether a remote control Trojan is present in a network. The method can be applied to the detection of real network traffic, and its false alarm rate is close to 0. The method comprises four stages. First stage: traffic collection. Second stage: behavior feature extraction. Third stage: implementation of the method, which combines SMOTE over-sampling with the XGBoost classification method. The SMOTE over-sampling algorithm addresses the classification problem of the imbalanced data set at the data level. XGBoost, a recent high-accuracy classification algorithm from the machine learning field, is applied to Trojan detection for the first time; it achieves high accuracy and also addresses the imbalanced classification problem at the algorithm level. Fourth stage: optimization and evaluation of the method. The method focuses on discovering regularities by mining mixed network traffic; it is suited to identifying known Trojans and can also detect unknown remote control Trojans.

Description

An efficient remote control Trojan detection method
Technical field
The invention belongs to the field of information technology and relates in particular to an efficient remote control Trojan detection method. It accurately detects known remote control Trojans in mixed traffic and can also identify unknown remote control Trojans, which is of great significance for safeguarding network security and reducing losses to the state, enterprises, and individuals.
Background
In recent years, attackers have increasingly used remote control Trojans for remote control and information theft, posing a serious threat to network security and causing serious harm and heavy losses to the state, enterprises, and individuals. A remote control Trojan consists of two parts: the controlling end (client) and the controlled end (server). Typically, the attacker uses spear phishing and social engineering attacks to find machines that can be infected, and then uses the standard TCP/IP or UDP protocols to establish real-time communication between the controlling end and the controlled end. The attacker sends control instructions from the controlling end; after receiving an instruction, the controlled end performs the corresponding action on the victim host and returns the result to the controlling end over the network. Unlike traditional security threats such as viruses and ordinary Trojans, this kind of Trojan is full-featured, is commonly used for data theft and privacy snooping in APT attacks, is stealthy and persistent, and is very harmful. More than 30 kinds of well-known remote control Trojans exist, for example Grey Pigeon, Gh0st, PcShare, Nuclear, DarkComet, Xtreme RAT, Glacial Epoch, and PlugX. Unknown Trojans and variants of these known Trojans quietly compromise our privacy in the network. More seriously, once a host system is compromised, the intruder can use that host to distribute the remote control Trojan to other vulnerable computers and build a botnet. Current intrusion detection systems are designed mainly for the various security problems in a local area network and may overlook the particular characteristics of remote control Trojans, so a remote control Trojan is likely to bypass their detection mechanisms. How to detect and further defend against remote control Trojans quickly and effectively has become a major challenge in the security field.
Depending on the environment in which detection is performed, remote control Trojan detection can be divided into host-based detection, network-based detection, and detection that fuses host and network features. As Trojans mutate faster and faster, detection based on host behavior features becomes much less efficient, whereas detection based on network behavior is better suited to detecting new, unknown threats in the network. Host-based detection and fused host-network detection both require behavior features to be extracted on the host. To make the method more portable, we do not extract behavior features from the host; we consider only how to select effective network features and pair them with a suitable detection algorithm to produce an efficient remote control Trojan detection method. In recent years, most researchers have applied machine learning to Trojan detection based on communication behavior, but most existing methods have a high false alarm rate and are not suited to detecting special-purpose remote control Trojans.
Dan Jiang et al. aim to detect a remote control Trojan in the early stage of its communication and extract seven network features from network-layer and transport-layer data whose packet interval is less than 1 s. The detection method is implemented with the random forest algorithm. Although the experiment achieves high accuracy, its false alarm rate is high, the samples consist only of remote control Trojan samples and 10 normal application samples, and the method is not applicable to mixed traffic.
Li Wei et al. analyze the communication characteristics of Trojans in detail and select 7 network behavior features, including periodic DNS, the uplink/downlink byte ratio, the uplink/downlink packet ratio, and the proportion of small packets. They use the KNN and C4.5 algorithms, but this method also suffers from a high false alarm rate.
Shicong Li et al. detect Trojans with a clustering algorithm. The algorithm selects features of the network layer and the IP layer and yields a detection method for mixed traffic, but the features it selects are not necessarily effective for detecting remote control Trojans. The accuracy and false alarm rate of the detection method proposed here are better than those of that method.
The basic concepts of the present invention are explained below.
Stream: The collected traffic is filtered to keep only TCP traffic, and different "streams" are extracted according to distinct [source IP address, destination IP address] pairs. In this method, each stream k begins with the three-way handshake whose first packet carries the "SYN" flag and contains the traffic up to the time threshold T (T = 300 s); this segment of traffic is denoted $F_k$ ($k = 1, 2, 3, \ldots$).
Session: Sessions are formed by reassembling and filtering a stream. Each stream is decomposed into 1 to n communication "sessions" with distinct [source IP address, source port, destination IP address, destination port] tuples.
Periodic stream: The time interval between each pair of adjacent packets is defined as t. $TL_{internal}$ stores the intervals of all packets in a stream: $TL_{internal} = \{t_0, t_1, t_2, \ldots, t_{N-1}\}$. The sum of all elements of $TL_{internal}$ is the total time, denoted SUMT. All streams within the range T are called "periodic streams", and the set of all time intervals of a periodic stream is $TL_{internal}$.
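As an illustration of these three definitions, the following Python sketch groups packets into streams and sessions and computes the interval list and SUMT. The packet tuple layout, the sample addresses, and the choice to treat both directions of an address pair as one stream are assumptions made for the example, not details given in the text.

```python
from collections import defaultdict

# Hypothetical packet record: (timestamp_s, src_ip, src_port, dst_ip, dst_port, size)
packets = [
    (0.0, "10.0.0.5", 50000, "203.0.113.9", 443, 60),
    (0.5, "203.0.113.9", 443, "10.0.0.5", 50000, 60),
    (2.0, "10.0.0.5", 50001, "203.0.113.9", 80, 120),
    (3.5, "10.0.0.5", 50000, "203.0.113.9", 443, 300),
]

def stream_key(pkt):
    """[source IP, destination IP] key; direction-independent by assumption."""
    _, src_ip, _, dst_ip, _, _ = pkt
    return tuple(sorted((src_ip, dst_ip)))

def session_key(pkt):
    """4-tuple key identifying one session; direction-independent by assumption."""
    _, src_ip, src_port, dst_ip, dst_port, _ = pkt
    return tuple(sorted(((src_ip, src_port), (dst_ip, dst_port))))

def group_by(pkts, key):
    """Group packets (in time order) under the value returned by key."""
    groups = defaultdict(list)
    for pkt in sorted(pkts, key=lambda p: p[0]):
        groups[key(pkt)].append(pkt)
    return dict(groups)

def inter_packet_intervals(pkts):
    """TL_internal: gaps between adjacent packets, in seconds."""
    times = sorted(p[0] for p in pkts)
    return [later - earlier for earlier, later in zip(times, times[1:])]

streams = group_by(packets, stream_key)          # one entry per IP pair
only_stream = next(iter(streams.values()))
sessions = group_by(only_stream, session_key)    # one entry per 4-tuple
tl_internal = inter_packet_intervals(only_stream)
sumt = sum(tl_internal)                          # SUMT, total time of the stream
```

Here the four sample packets share one address pair, so they form a single stream that splits into two sessions because two different source ports are used.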
Summary of the invention
The technical problem to be solved by the invention is the detection of remote control Trojans. The invention provides a detection method that effectively and accurately detects both known and unknown remote control Trojans. The method comprises:
1. Collect network communication packets and extract different "streams" according to distinct [source IP address, destination IP address] pairs.
2. For each captured stream, analyze the segment of traffic that begins with the three-way handshake started by a "SYN" packet and ends at the time threshold T (T = 300 s), and extract the following features:
f0: the number of packets in the periodic stream whose flags are [FIN, ACK] or [RST, ACK];
f1: the number of sessions contained in the periodic stream;
f2: the variance of the sequence formed by all uplink packets of the longest session in the periodic stream;
f3: the sign of the difference between the average size of uplink [PUSH, ACK] packets and the average size of downlink [PUSH, ACK] packets in the periodic stream: 1 if the difference is greater than 0, 0 if it equals 0, and -1 if it is less than 0. Let $P_{bup}$ and $C_{bup}$ be the total bytes and the number of uplink packets flagged [PUSH, ACK] within time T, and let $P_{bdown}$ and $C_{bdown}$ be the total bytes and the number of downlink packets flagged [PUSH, ACK]. Then:
$$f_3 = \begin{cases} 1, & \dfrac{P_{bup}}{C_{bup}} > \dfrac{P_{bdown}}{C_{bdown}} \\ 0, & \dfrac{P_{bup}}{C_{bup}} = \dfrac{P_{bdown}}{C_{bdown}} \\ -1, & \dfrac{P_{bup}}{C_{bup}} < \dfrac{P_{bdown}}{C_{bdown}} \end{cases}$$
f4: the average number of downlink bytes sent per second in the periodic stream. Let $P_{down}$ be the total bytes of all downlink packets within time T, and let $T_{down}$ be the total time used to send downlink packets within time T, obtained from $TL_{internal}$. Then:
$$f_4 = \frac{P_{down}}{T_{down}}$$
f5: the average uplink bytes per second divided by the average downlink bytes per second in the periodic stream. Let $T_{up}$ be the total time used to send uplink packets within time T, obtained from $TL_{internal}$, and let $P_{up}$ be the total bytes of all uplink packets within time T. Then:
$$f_5 = \frac{P_{up} \cdot T_{down}}{T_{up} \cdot P_{down}}$$
f6: the number of packets in the periodic stream whose size is greater than 90;
f7: the number of downlink packets sent per second in the periodic stream, that is, the total number of downlink packets divided by the time used to send them. Let $C_{down}$ be the number of all downlink packets within time T. Then:
$$f_7 = \frac{C_{down}}{T_{down}}$$
3. Label each captured stream: remote control Trojan traffic is labeled 1 and normal traffic is labeled 0. Store the labels and the corresponding 8 behavior features in a database to generate the training set.
4. Combine the SMOTE sampling algorithm with the XGBoost classification algorithm so that the imbalanced data set classification problem is addressed at both the data level and the algorithm level. First process the training set with the SMOTE sampling algorithm to obtain a new, synthesized training set; then perform classification learning on the new synthesized training set with the XGBoost algorithm to obtain the original classifier.
5. Use grid search to traverse the combinations of classifier parameters systematically, determine the optimal parameters by cross-validation, and then optimize the original classifier with these parameters on the whole training set.
6. Use real network traffic as the detection object and analyze the detection results of the method.
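The over-sampling idea in step 4 can be sketched as follows. This is a minimal standard-library rendering of the usual SMOTE interpolation scheme, not the implementation used by the inventors; the function name `smote`, its parameters, and the sample vectors are hypothetical.

```python
import math
import random

def smote(minority, n_per_sample, k=5, seed=0):
    """Create synthetic minority samples by interpolating toward neighbours.

    minority: list of feature vectors (lists of floats) of the minority class;
              at least two samples are needed.
    n_per_sample: synthetic samples generated per original sample.
    k: number of nearest minority neighbours considered.
    """
    rng = random.Random(seed)
    synthetic = []
    for i, x in enumerate(minority):
        others = [m for j, m in enumerate(minority) if j != i]
        # The k nearest minority-class neighbours of x (Euclidean distance).
        neighbours = sorted(others, key=lambda m: math.dist(x, m))[:k]
        for _ in range(n_per_sample):
            nb = rng.choice(neighbours)
            gap = rng.random()  # position on the segment between x and nb
            synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic

# Toy minority class with two features per stream (illustrative values only).
trojan_streams = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
new_samples = smote(trojan_streams, n_per_sample=2, k=2)
```

Each synthetic vector lies on the line segment between a real minority sample and one of its nearest minority neighbours, so the minority class grows without duplicating rows. As a rough check of scale, generating 13 synthetic samples for each of 89 minority streams would give 89 × 14 = 1246 minority streams in total; the text does not state which setting was actually used.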
The beneficial effects of the invention are as follows. The invention selects features based on the size, number, flags, and timing of network packets and thereby realizes an effective remote control Trojan detection method. The main contribution of the method is the combination of SMOTE over-sampling and XGBoost classification, which addresses the imbalanced data set classification problem at both the data level and the algorithm level. The resulting method is not limited to detecting traffic at the host side; it can also be used to detect whether known or unknown remote control Trojans are present at key network nodes.
Brief description of the drawings
Fig. 1 is a schematic flow chart of the method.
Embodiment
S1. The generation of this remote control Trojan detection method mainly comprises the following four modules: the traffic collection module, the behavior feature extraction module, the classifier creation module, and the classifier optimization and evaluation module.
S2. The traffic collection module is responsible for collecting the data sets required for creating the method and for detection.
S21. Traffic collection: Using the NetAnalyzer and Wireshark software, the communication traffic of seven computers (two of which had Trojan programs implanted) was captured in a controlled environment. This traffic falls into three categories: first, communication traffic of 24 kinds of remote control Trojan samples collected at home and abroad; second, communication traffic of 10 known normal application programs; third, mixed network traffic. In total we collected 291.17 hours of communication traffic, stored in .pcap file format.
S22. Traffic screening: The communication traffic is filtered. TCP traffic is selected from the saved .pcap files, and different "streams" are extracted according to distinct [source IP address, destination IP address] pairs.
S23. The reassembled communication traffic satisfies the following two conditions: (1) each stream is the segment that begins with the three-way handshake started by a "SYN" packet and ends at the time threshold T (T = 300 s), and each stream may consist of 1 to N communication "sessions" with distinct [source IP address, source port, destination IP address, destination port] tuples; (2) the duration of the whole stream is greater than 1 s, that is, streams that end in less than 1 s are not considered.
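The two conditions in S23 can be sketched as a small windowing function. The record layout (timestamp plus a set of flag names) and the function name are hypothetical, and the text does not say how traffic after the first 300 s window is handled, so this sketch simply returns the first window.

```python
T_SECONDS = 300.0     # time threshold T from the text
MIN_DURATION = 1.0    # streams that end within 1 s are discarded

def cut_periodic_stream(pkts, t=T_SECONDS, min_duration=MIN_DURATION):
    """Return the packets from the first SYN up to t seconds later, or None.

    pkts: list of (timestamp_s, flags) tuples for one IP pair, where flags is
    a set such as {"SYN"} or {"PUSH", "ACK"} (hypothetical record layout).
    """
    ordered = sorted(pkts, key=lambda p: p[0])
    # Locate the packet that opens the three-way handshake: SYN without ACK.
    start = next((ts for ts, flags in ordered
                  if "SYN" in flags and "ACK" not in flags), None)
    if start is None:
        return None
    window = [(ts, flags) for ts, flags in ordered if start <= ts <= start + t]
    duration = window[-1][0] - window[0][0]
    return window if duration > min_duration else None

demo = [(5.0, {"ACK"}), (10.0, {"SYN"}), (10.1, {"SYN", "ACK"}), (10.2, {"ACK"}),
        (200.0, {"PUSH", "ACK"}), (400.0, {"FIN", "ACK"})]
kept = cut_periodic_stream(demo)
too_short = cut_periodic_stream([(0.0, {"SYN"}), (0.4, {"FIN", "ACK"})])
```

In the demo, the packet at 5.0 s precedes the handshake and the packet at 400.0 s falls outside the 300 s window, so both are dropped; the second call returns `None` because the stream lasts only 0.4 s.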
S3. The behavior feature extraction module is responsible for analyzing the differences between remote control Trojan communication streams and normal host network communication streams and for finding network communication features suited to this kind of detection. Each processed periodic stream is denoted $F_k$ ($k = 1, 2, 3, \ldots$). The behavior feature extraction module comprises the following steps:
S31. Count the total number of packets in the bidirectional stream whose flags are [FIN, ACK] or [RST, ACK], denoted $f_0$.
S32. Count the number of packets in $F_k$ whose size is greater than 90 (feature $f_6$). Filter and reassemble the periodic stream, decomposing it into 1 to n communication "sessions" with distinct [source IP address, source port, destination IP address, destination port] tuples; denote the longest session $M_s$; count the number of sessions, denoted $f_1$. Form a new sequence from all uplink packets in $M_s$ and compute the variance of this sequence, denoted $f_2$.
S33. In the periodic stream $F_k$, take the average size of uplink [PUSH, ACK] packets minus the average size of downlink [PUSH, ACK] packets, denoted $f_3$: it is assigned 1 if the difference is greater than 0, 0 if it equals 0, and -1 if it is less than 0. Let $P_{bup}$ and $C_{bup}$ be the total bytes and the number of uplink packets flagged [PUSH, ACK] within time T, and let $P_{bdown}$ and $C_{bdown}$ be the total bytes and the number of downlink packets flagged [PUSH, ACK]. Then:
$$f_3 = \begin{cases} 1, & \dfrac{P_{bup}}{C_{bup}} > \dfrac{P_{bdown}}{C_{bdown}} \\ 0, & \dfrac{P_{bup}}{C_{bup}} = \dfrac{P_{bdown}}{C_{bdown}} \\ -1, & \dfrac{P_{bup}}{C_{bup}} < \dfrac{P_{bdown}}{C_{bdown}} \end{cases}$$
S35. Compute the average number of downlink bytes sent per second in the periodic stream, denoted $f_4$. Let $P_{down}$ be the total bytes of all downlink packets within time T, and let $T_{down}$ be the total time used to send downlink packets within time T, obtained from $TL_{internal}$. Then:
$$f_4 = \frac{P_{down}}{T_{down}}$$
S36. Compute the average uplink bytes per second divided by the average downlink bytes per second in the periodic stream, denoted $f_5$. Let $T_{up}$ be the total time used to send uplink packets within time T, obtained from $TL_{internal}$, and let $P_{up}$ be the total bytes of all uplink packets within time T. Then:
$$f_5 = \frac{P_{up} \cdot T_{down}}{T_{up} \cdot P_{down}}$$
S37. Compute the number of downlink packets sent per second in the periodic stream, denoted $f_7$, that is, the total number of downlink packets divided by the time used to send them. Let $C_{down}$ be the number of all downlink packets within time T. Then:
$$f_7 = \frac{C_{down}}{T_{down}}$$
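A hedged Python sketch of the counting and rate features from the steps above follows. The record layout and the function name `rate_features` are hypothetical; the reset flag is written RST; and the per-direction sending times are passed in by the caller, because the text says only that they are obtained from the interval list without giving the rule.

```python
def sign(x):
    """Return 1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)

def rate_features(pkts, t_up, t_down):
    """Compute f0, f3, f4, f5, f6 and f7 for one periodic stream.

    pkts: list of (direction, size, flags); direction is "up" or "down" and
          flags is a frozenset of TCP flag names (hypothetical record layout).
    t_up, t_down: total sending time per direction in seconds (must be > 0,
          and the stream must contain downlink bytes, or the ratios fail).
    """
    push_ack = frozenset({"PUSH", "ACK"})
    closing = (frozenset({"FIN", "ACK"}), frozenset({"RST", "ACK"}))

    def total(direction, only_push_ack=False):
        sizes = [size for d, size, flags in pkts
                 if d == direction and (not only_push_ack or flags == push_ack)]
        return sum(sizes), len(sizes)   # (bytes, packet count)

    p_up, _ = total("up")
    p_down, c_down = total("down")
    pb_up, cb_up = total("up", only_push_ack=True)
    pb_down, cb_down = total("down", only_push_ack=True)

    # Treating a missing direction as mean 0 is an assumption of this sketch.
    mean_up = pb_up / cb_up if cb_up else 0.0
    mean_down = pb_down / cb_down if cb_down else 0.0
    return {
        "f0": sum(1 for _, _, flags in pkts if flags in closing),
        "f3": sign(mean_up - mean_down),
        "f4": p_down / t_down,
        "f5": (p_up * t_down) / (t_up * p_down),
        "f6": sum(1 for _, size, _ in pkts if size > 90),
        "f7": c_down / t_down,
    }

PA = frozenset({"PUSH", "ACK"})
example_pkts = [("up", 100, PA), ("down", 400, PA),
                ("down", 60, frozenset({"ACK"})),
                ("up", 60, frozenset({"FIN", "ACK"}))]
example = rate_features(example_pkts, t_up=2.0, t_down=4.0)
```

For the four sample packets, the downlink [PUSH, ACK] packets are larger on average than the uplink ones, so `example["f3"]` is -1, and 460 downlink bytes over 4 s give `example["f4"]` = 115.0.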
S4. The classifier creation module uses the XGBoost algorithm to perform classification learning on the training set newly synthesized by the SMOTE algorithm and generates an original classifier.
S41. Label every captured stream: remote control Trojan traffic is labeled 1 and normal traffic is labeled 0. Store the labels and the corresponding 8 behavior features in the database to generate the training set of the method. Data set T1 consists of 1862 streams obtained by screening and filtering the 291.17 hours of traffic from the seven machines; 119 of these streams were produced by remote control Trojans. 70% of the data in T1 is used for training, denoted TR1, and the remaining 30% is used for testing, denoted TE1. The TR1 data is then processed with the SMOTE algorithm; in the initial TR1 the ratio of normal streams to remote control Trojan streams is 1214:89.
S42. To address the class imbalance in this training set, the SMOTE sampling algorithm is applied. After the SMOTE algorithm processes the original data set, the ratio of normal streams to remote control Trojan streams in the new synthesized sample set is 1214:1246. The newly synthesized training set is denoted $T_{synthesis}$. In addition, test set TE2 consists of 1342 streams obtained by screening and filtering a total of 145.83 hours of traffic from another five machines; 86 of these streams were produced by remote control Trojans.
S43. Classification learning is performed with the XGBoost algorithm. To avoid over-fitting and under-fitting, we performed K-fold cross-validation. K-fold cross-validation divides the original data into K groups; each subset serves once as the validation set while the remaining K-1 subsets serve as the training set, which yields K models. The average classification accuracy of these K models on their validation sets is used as the performance index of the classifier under K-fold cross-validation. In this method, K is set to 6. A detection method is finally generated.
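The K-fold procedure described in S43 can be sketched in plain Python as below. The `fit` and `predict` callables stand in for whatever learner is used (XGBoost in the text); the majority-class stand-in and the toy labels are hypothetical and serve only to make the sketch executable.

```python
def k_fold_indices(n_samples, k=6):
    """Yield (train_idx, val_idx) pairs; each fold is the validation set once."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size

def cross_val_accuracy(features, labels, fit, predict, k=6):
    """Average validation accuracy of the k models, as described in S43."""
    scores = []
    for train, val in k_fold_indices(len(labels), k):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(model, features[i]) == labels[i] for i in val)
        scores.append(hits / len(val))
    return sum(scores) / len(scores)

# Stand-in learner: always predicts the majority label of its training data.
def fit_majority(xs, ys):
    return max(set(ys), key=ys.count)

def predict_majority(model, x):
    return model

toy_features = [[float(i)] for i in range(12)]
toy_labels = [0] * 10 + [1] * 2          # imbalanced on purpose
folds = list(k_fold_indices(13, k=6))
toy_accuracy = cross_val_accuracy(toy_features, toy_labels,
                                  fit_majority, predict_majority, k=6)
```

With the toy labels, the majority-class stand-in scores 1.0 on five folds and 0.0 on the fold that holds both minority samples, so the averaged accuracy is 5/6. In practice the rows would usually be shuffled before splitting; the sketch keeps them in order so the result is easy to follow.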
S5. The classifier optimization and evaluation module selects the important parameters of the original classifier and evaluates the detection performance of the optimal classifier.
S51. Grid search is used to traverse multiple parameter combinations systematically; the optimal parameters are determined by cross-validation and are then used to optimize the method on the whole training set. The determined parameters are: number of estimators 72, minimum sum of sample weights in a leaf node 1, maximum tree depth 6, row subsampling ratio for each tree 0.9, column subsampling ratio for each tree 0.8, and minimum loss reduction required for a node split 0.2.
S52. The optimal parameters are substituted into the original classifier to generate the optimal detection classifier.
S53. The test set is fed into the detection classifier, which classifies the data in the test set: if remote control Trojan communication is present, the output for the corresponding traffic is 1; otherwise it is 0. The test results show that the classifier generated by the above method effectively detects all remote control Trojan communications in the test set, with a false alarm rate of almost 0 (as shown in Table 3).
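The grid search in S51 amounts to enumerating every combination of candidate values and keeping the best-scoring one. The sketch below uses only the standard library. The parameter names follow the XGBoost scikit-learn-style names that appear to correspond to the descriptions in S51 (this mapping is an assumption), the candidate value lists are hypothetical because the searched ranges are not given, and `toy_score` is a placeholder for a cross-validated score.

```python
from itertools import product

# Reported optimum from S51; the key names are an assumed mapping.
reported_best = {
    "n_estimators": 72,       # number of estimators
    "min_child_weight": 1,    # minimum sum of sample weights in a leaf
    "max_depth": 6,           # maximum tree depth
    "subsample": 0.9,         # row sampling ratio per tree
    "colsample_bytree": 0.8,  # column sampling ratio per tree
    "gamma": 0.2,             # minimum loss reduction for a split
}

# Hypothetical candidate values for three of the parameters.
grid = {
    "max_depth": [4, 6, 8],
    "subsample": [0.8, 0.9, 1.0],
    "gamma": [0.0, 0.2],
}

def iter_grid(grid):
    """Yield one dict per combination of candidate values."""
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))

def grid_search(grid, score):
    """Return the combination with the highest score."""
    return max(iter_grid(grid), key=score)

def toy_score(params):
    """Placeholder score: higher the closer params are to the reported values."""
    return -sum(abs(params[name] - reported_best[name]) for name in params)

combinations = list(iter_grid(grid))
best = grid_search(grid, toy_score)
```

The three candidate lists yield 3 × 3 × 2 = 18 combinations. Replacing `toy_score` with a function that trains a classifier on each combination and returns its cross-validated accuracy gives the search described in S51.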
Table 1. Trojans selected for the model and the corresponding number of versions
RAT sample          Versions   RAT sample                      Versions
Nuclear             3          Gh0st                           2
Bandook             1          Upper emerging remote control   1
Great white shark   1          DarkComet                       2
Grey pigeon         1          remote                          1
Bozok               1          Taidoor                         1
CyberGate RAT       1          PoisonIvy                       2
Pandora RAT         1          SpyNet                          1
Comet Rat           1          Kong Juyuan is controlled       1
Star RAT            1          Xtreme RAT                      2
Pcshare             1          njRAT                           3
VanToM RAT          1          Plugx                           2
X RAT               1          HAKOPS RAT                      1
Table 2. Comparison of cross-validation detection results for four detection methods
Table 3. Comparison of detection results for three methods

Claims (2)

1. A remote control Trojan detection method, characterized in that the generation of the method comprises four main modules; module one, module two, module three, and module four are, respectively, the traffic collection module, the behavior feature extraction module, the classifier creation module, and the classifier optimization and evaluation module; the traffic collection module is responsible for collecting the data sets required for creating the classifier and for detection: it selects the communication traffic based on the transport-layer TCP protocol and divides the communication traffic according to [source IP address, destination IP address] to obtain multiple streams; the behavior feature extraction module is responsible for analyzing the differences between remote control Trojan communication streams and normal host network communication streams and for finding network communication features suited to this kind of detection; the classifier creation module generates an original classifier from the generated training set; the classifier optimization and evaluation module matches parameters to the generated classifier, optimizes the original classifier to obtain a new classifier, and then uses the new classifier to classify the test set and evaluate the detection results;
Module one is realized as follows: for each stream k obtained by the division, the segment of traffic that begins with the three-way handshake started by a "SYN" packet and ends at the time threshold T is analyzed; this segment of traffic is denoted the periodic stream $F_k$ ($k = 1, 2, 3, \ldots$); the time threshold T is a set value;
For each periodic stream $F_k$ ($k = 1, 2, 3, \ldots$) produced by module one, the behavior feature extraction module extracts features according to the following steps:
Step 1: Count the total number of packets in the bidirectional stream whose flags are [FIN, ACK] or [RST, ACK], denoted $f_0$;
Step 2: Count the number of packets in $F_k$ whose size is greater than 90; filter and reassemble the periodic stream, decomposing it into 1 to n communication "sessions" with distinct [source IP address, source port, destination IP address, destination port] tuples; denote the longest session $M_s$; count the number of sessions, denoted $f_1$; form a new sequence from all uplink packets in $M_s$ and compute the variance of this sequence, denoted $f_2$;
Step 3: In the periodic stream $F_k$, take the average size of uplink [PUSH, ACK] packets minus the average size of downlink [PUSH, ACK] packets, denoted $f_3$: it is assigned 1 if the difference is greater than 0, 0 if it equals 0, and -1 if it is less than 0; let $P_{bup}$ and $C_{bup}$ be the total bytes and the number of uplink packets flagged [PUSH, ACK] within time T, and let $P_{bdown}$ and $C_{bdown}$ be the total bytes and the number of downlink packets flagged [PUSH, ACK]; then:
$$f_3 = \begin{cases} 1, & \dfrac{P_{bup}}{C_{bup}} > \dfrac{P_{bdown}}{C_{bdown}} \\ 0, & \dfrac{P_{bup}}{C_{bup}} = \dfrac{P_{bdown}}{C_{bdown}} \\ -1, & \dfrac{P_{bup}}{C_{bup}} < \dfrac{P_{bdown}}{C_{bdown}} \end{cases}$$
Step 4: Compute the average number of downlink bytes sent per second in the periodic stream, denoted $f_4$; let $P_{down}$ be the total bytes of all downlink packets within time T, and let $T_{down}$ be the total time used to send downlink packets within time T, obtained from $TL_{internal}$; then:
$$f_4 = \frac{P_{down}}{T_{down}}$$
Step 5: Compute the average uplink bytes per second divided by the average downlink bytes per second in the periodic stream, denoted $f_5$; let $T_{up}$ be the total time used to send uplink packets within time T, obtained from $TL_{internal}$, and let $P_{up}$ be the total bytes of all uplink packets within time T; then:
$$f_5 = \frac{P_{up} \cdot T_{down}}{T_{up} \cdot P_{down}}$$
Step 6: Count the number of packets in the periodic stream whose size is greater than 90, denoted $f_6$;
Step 7: Compute the number of downlink packets sent per second in the periodic stream, denoted $f_7$, that is, the total number of downlink packets divided by the time used to send them; let $C_{down}$ be the number of all downlink packets within time T; then:
$$f_7 = \frac{C_{down}}{T_{down}}$$
Module three is realized as follows:
Every captured stream is labeled: remote control Trojan traffic is labeled 1 and normal traffic is labeled 0; the labels and the corresponding 8 behavior feature values are stored in a database as the training set of the method; to address the class imbalance in this training set, the SMOTE sampling algorithm is applied to generate a new synthesized training set; classification learning is performed on the new synthesized training set with the XGBoost algorithm to generate an original classifier.
2. The method according to claim 1, characterized in that module four is realized as follows:
Step 1: Grid search is used to traverse multiple parameter combinations systematically; the optimal parameters are determined by cross-validation and are then used to optimize the original classifier on the whole training set; the determined parameters are: number of estimators 72, minimum sum of sample weights in a leaf node 1, maximum tree depth 6, row subsampling ratio for each tree 0.9, column subsampling ratio for each tree 0.8, and minimum loss reduction required for a node split 0.2;
Step 2: The optimal parameters obtained in step 1 are substituted into the generated original classifier to generate the optimal classifier;
Step 3: Test samples are processed with module one and module two; the test set of data to be detected, generated after behavior feature extraction, is fed into the optimal classifier, which classifies the data in the test set and outputs the result: if remote control Trojan communication is present, the output for the corresponding traffic is 1; otherwise it is 0.
Application number: CN201710719001.5A
Title: Efficient remote control Trojan detection method
Priority date: 2017-08-21
Filing date: 2017-08-21
Status: Active
Publications: CN107370752A (2017-11-21), CN107370752B (2020-09-25)
Country: CN

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108809989A (en) * 2018-06-14 2018-11-13 北京中油瑞飞信息技术有限责任公司 A kind of detection method and device of Botnet
CN109104437A (en) * 2018-10-22 2018-12-28 盛科网络(苏州)有限公司 Routed domain, the method and apparatus for handling IP packet in routed domain
CN109684834A (en) * 2018-12-21 2019-04-26 福州大学 A kind of gate leve hardware Trojan horse recognition method based on XGBoost
CN110929301A (en) * 2019-11-20 2020-03-27 海宁利伊电子科技有限公司 Hardware Trojan horse detection method based on lifting algorithm
CN111967343A (en) * 2020-07-27 2020-11-20 广东工业大学 Detection method based on simple neural network and extreme gradient lifting model fusion
CN111983429A (en) * 2020-08-19 2020-11-24 Oppo广东移动通信有限公司 Chip verification system, chip verification method, terminal and storage medium
CN112818344A (en) * 2020-08-17 2021-05-18 北京辰信领创信息技术有限公司 Method for improving virus killing rate by applying artificial intelligence algorithm
CN113806338A (en) * 2021-11-18 2021-12-17 深圳索信达数据技术有限公司 Data discrimination method and system based on data sample imaging

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060089994A1 (en) * 2002-03-05 2006-04-27 Hayes John W Concealing a network connected device
CN103475663A (en) * 2013-09-13 2013-12-25 无锡华御信息技术有限公司 Trojan recognition method based on network communication behavior characteristics
CN104168272A (en) * 2014-08-04 2014-11-26 国家电网公司 Trojan horse detection method based on communication behavior clustering
CN105227408A (en) * 2015-10-22 2016-01-06 蓝盾信息安全技术股份有限公司 A kind of intelligent wooden horse recognition device and method
CN106790193A (en) * 2016-12-30 2017-05-31 山石网科通信技术有限公司 The method for detecting abnormality and device of Intrusion Detection based on host network behavior

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DAN JIANG等: ""An Approach to Detect Remote Access Trojan in the Early Stage of Communication"", 《2015 IEEE 29TH INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS》 *
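李巍 (Li Wei) et al.: "Analysis of traffic behaviour characteristics in the three stages of remote control Trojan communication", 《信息网络安全》 (Netinfo Security) *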

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108809989A (en) * 2018-06-14 2018-11-13 北京中油瑞飞信息技术有限责任公司 A kind of detection method and device of Botnet
CN108809989B (en) * 2018-06-14 2021-04-23 北京中油瑞飞信息技术有限责任公司 Botnet detection method and device
CN109104437A (en) * 2018-10-22 2018-12-28 盛科网络(苏州)有限公司 Routed domain, the method and apparatus for handling IP packet in routed domain
CN109684834A (en) * 2018-12-21 2019-04-26 福州大学 A kind of gate leve hardware Trojan horse recognition method based on XGBoost
CN109684834B (en) * 2018-12-21 2022-10-25 福州大学 XGboost-based gate-level hardware Trojan horse identification method
CN110929301A (en) * 2019-11-20 2020-03-27 海宁利伊电子科技有限公司 Hardware Trojan horse detection method based on lifting algorithm
CN110929301B (en) * 2019-11-20 2022-07-26 海宁利伊电子科技有限公司 Hardware Trojan horse detection method based on lifting algorithm
CN111967343A (en) * 2020-07-27 2020-11-20 广东工业大学 Detection method based on simple neural network and extreme gradient lifting model fusion
CN112818344A (en) * 2020-08-17 2021-05-18 北京辰信领创信息技术有限公司 Method for improving virus killing rate by applying artificial intelligence algorithm
CN111983429A (en) * 2020-08-19 2020-11-24 Oppo广东移动通信有限公司 Chip verification system, chip verification method, terminal and storage medium
CN113806338A (en) * 2021-11-18 2021-12-17 深圳索信达数据技术有限公司 Data discrimination method and system based on data sample imaging

Also Published As

Publication number Publication date
CN107370752B (en) 2020-09-25

Similar Documents

Publication Publication Date Title
CN107370752A (en) A kind of efficient remote control Trojan detection method
Hwang et al. An unsupervised deep learning model for early network traffic anomaly detection
CN104270392B (en) A kind of network protocol identification method learnt based on three grader coorinated trainings and system
CN109120630B (en) SDN network DDoS attack detection method based on BP neural network optimization
CN107733851A (en) DNS tunnels Trojan detecting method based on communication behavior analysis
Bilge et al. Disclosure: detecting botnet command and control servers through large-scale netflow analysis
CN103023725B (en) Anomaly detection method based on network flow analysis
Gogoi et al. MLH-IDS: a multi-level hybrid intrusion detection method
CN111817982A (en) Encrypted flow identification method for category imbalance
CN106657141A (en) Android malware real-time detection method based on network flow analysis
Soe et al. Rule generation for signature based detection systems of cyber attacks in iot environments
CN104660464B (en) A kind of network anomaly detection method based on non-extension entropy
EP1907940A2 (en) Method and apparatus for whole-network anomaly diagnosis and method to detect and classify network anomalies using traffic feature distributions
CN111224994A (en) Botnet detection method based on feature selection
Alshammari et al. Investigating two different approaches for encrypted traffic classification
Adams et al. Data analysis for network cyber-security
Sun et al. Detection and classification of malicious patterns in network traffic using Benford's law
CN103840983A (en) WEB tunnel detection method based on protocol behavior analysis
CN104283897A (en) Trojan horse communication feature fast extraction method based on clustering analysis of multiple data streams
CN111385145A (en) Encryption flow identification method based on ensemble learning
CN106330611A (en) Anonymous protocol classification method based on statistical feature classification
CN111200600B (en) Internet of things equipment flow sequence fingerprint feature extraction method
CN112800424A (en) Botnet malicious traffic monitoring method based on random forest
CN106101071A (en) The method that defence link drain type CC that a kind of Behavior-based control triggers is attacked
CN116684877A (en) GYAC-LSTM-based 5G network traffic anomaly detection method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant