CN102201937B - Method for detecting Trojan quickly based on heartbeat behavior analysis - Google Patents

Method for detecting Trojan quickly based on heartbeat behavior analysis

Info

Publication number
CN102201937B
Authority
CN
China
Prior art keywords
heartbeat
packet
session
wooden horse
trojan
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 201110157821
Other languages
Chinese (zh)
Other versions
CN102201937A (en
Inventor
刘胜利
杨杰
陈嘉勇
孟磊
吴林锦
曾诚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information Engineering University Strategic Support Force of PLA
Original Assignee
刘胜利
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 刘胜利
Priority to CN 201110157821
Publication of CN102201937A
Application granted
Publication of CN102201937B

Abstract

The invention discloses a method for quickly detecting Trojans based on heartbeat behavior analysis. The method detects suspected Trojans by analyzing whether the heartbeat interval between two adjacent heartbeat processes is regular and whether the ratio of packets received to packets sent by the controlled terminal is equal across heartbeat processes. It analyzes the differences between Trojan communication behavior and normal network communication behavior at this stage, mines the essential distinctions between the two, and extracts behavioral features. By analyzing Trojan heartbeat behavior, the method effectively detects Trojan communication in a network, so that the connection between the controlled terminal and the attacker can be cut off in time and information theft prevented.

Description

Quick Trojan detecting method based on the heartbeat behavioural analysis
Technical field
The present invention relates to Trojan detection techniques based on communication behavior analysis, and in particular to a quick Trojan detection method based on heartbeat behavior analysis.
Background technology
Most current information-theft attacks are carried out with Trojans, whose most prominent characteristic is that their behavior is highly covert. After a Trojan has been successfully implanted on a target computer, the Trojan's control terminal must communicate with the controlled terminal, either to issue control commands or to have the controlled terminal send back the information it has obtained. The covertness of this communication largely determines a Trojan's ability to survive. Network covert channel technology, which has emerged in recent years and embeds communication data in normal network protocols for transmission, meets the needs of Trojan communication very well. Communicating over network covert channels has become the main way Trojans communicate: attackers often build covert channels over common protocols such as HTTP and HTTPS to remotely control compromised hosts and steal information. The rapid development of Trojan communication technology poses a serious threat to national security and stability. How to effectively detect the network communication behavior of Trojans has therefore become an important theoretical and technical problem in the field of information security.
At present there are many Trojan detection methods based on communication behavior. Most of them focus on detecting the interactive operations between the attacker and the controlled terminal; no method that targets Trojan heartbeat behavior has yet appeared. These methods all have certain defects and lack good generality.
Borders et al. construct various filters from features such as the time interval between HTTP requests, request packet size, header format, bandwidth usage, and request regularity to detect Trojan communication. However, a Trojan can bypass these filters simply by modifying some communication details. For example, a Trojan only needs to keep its request packet size within a certain threshold to defeat the request-size filter.
Pack et al. propose detecting HTTP covert channels by using behavior profiles of data flows. A behavior profile is built from a large number of metrics, such as average packet size, the ratio of small to large packets, changes in packet patterns, the total number of packets sent and received, and connection time. If the observed characteristics of a data flow deviate from the behavior profile of normal HTTP traffic, the flow is very likely an HTTP covert channel. This method mainly targets HTTP tunnels and has poor generality.
Tumoian et al. train an Elman network on consecutive TCP initial sequence numbers (ISNs) produced by a normal protocol stack, and then compare the ISN predicted by the neural network with the actual ISN; when the difference between the actual and predicted values exceeds a preset threshold, a covert channel is deemed to exist. The authors used this method to detect the NUSHU covert channel. However, the method can only detect specific Trojan communication and likewise lacks generality.
Zhang and Paxson use packet inter-arrival times and packet sizes to describe a Trojan communication interaction model for detecting malicious programs such as Trojans and backdoors. The model describes Trojan communication behavior as follows: 1. the inter-arrival times of adjacent packets in Trojan communication follow a Pareto distribution; 2. because Trojan communication involves command interaction, small packets should account for a certain proportion of the traffic. In practice, however, a Trojan can use different algorithms to make inter-arrival times satisfy whatever distribution is required, and inter-arrival times are also strongly affected by network topology, so using them to describe behavior has drawbacks. Moreover, short commands in Trojan communication can be hidden in larger HTML page content, so emphasizing the proportion of small packets does not achieve effective detection.
The basic concepts involved in the present invention are explained below.
Trojan heartbeat: to indicate that it is still alive, a Trojan establishes and maintains a session between its client and its server until the Trojan program at either end is closed or the network connection is broken. The session is maintained by sending packets to the other side. Because most of these packets are sent at timed intervals, and their form and purpose resemble an animal's heartbeat, they are called "heartbeat packets".
Heartbeat interval: there is a certain time interval between two adjacent "heartbeat" processes, called the "heartbeat interval". Depending on whether the heartbeat interval is constant, Trojan heartbeat modes fall into two types: 1. fixed-duration heartbeat, in which the heartbeat interval is constant; 2. variable-duration heartbeat. Because a fixed-duration heartbeat has an obvious pattern, it is hard for it to withstand statistical analysis, so attackers often use various algorithms to randomize the heartbeat interval so that it no longer has obvious statistical features and can resist detection. In particular, the fixed-duration heartbeat can be regarded as a trivial case of the variable-duration heartbeat.
Heartbeat process: each time the Trojan sends a "heartbeat packet", the controlled-terminal and control-terminal programs may also send some other packets to each other to acknowledge the packets received. A "heartbeat packet" together with the group of acknowledgment packets that accompanies it is called a "heartbeat process".
Trojan communication process: the Trojan communication process can be divided into two stages, the keep-alive (no-operation) stage and the operation stage. After a Trojan is implanted on the target system, the attacker operates it only within limited time periods (during which Trojan communication is in the operation stage); most of the rest of the time the Trojan is idle. The process by which some Trojans, while idle, maintain contact with the attacker is called the keep-alive (no-operation) stage.
Four-tuple: {source IP address, source port, destination IP address, destination port} is called a four-tuple.
Equivalent four-tuple: if the four-tuples $\{a_1, b_1, c_1, d_1\}$ and $\{a_2, b_2, c_2, d_2\}$ satisfy $a_1 = c_2$, $b_1 = d_2$, $c_1 = a_2$, and $d_1 = b_2$, then $\{a_1, b_1, c_1, d_1\}$ and $\{a_2, b_2, c_2, d_2\}$ are called equivalent four-tuples.
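The equivalence definition above can be sketched in a few lines of Python. This is only an illustration: the names `FourTuple`, `is_equivalent`, and `session_key`, and the `monitored_ips` parameter, are hypothetical and do not come from the patent. The sketch orients each tuple so that the monitored host is the source, which matches the convention described later in the text, and falls back to a fixed ordering when that rule does not decide.

```python
from typing import NamedTuple, Set


class FourTuple(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def is_equivalent(a: FourTuple, b: FourTuple) -> bool:
    """True when b is a with source and destination swapped (a1=c2, b1=d2, c1=a2, d1=b2)."""
    return (a.src_ip == b.dst_ip and a.src_port == b.dst_port
            and a.dst_ip == b.src_ip and a.dst_port == b.src_port)


def session_key(t: FourTuple, monitored_ips: Set[str]) -> FourTuple:
    """Map a four-tuple and its equivalent to one key, with the monitored host as source."""
    rev = FourTuple(t.dst_ip, t.dst_port, t.src_ip, t.src_port)
    src_in = t.src_ip in monitored_ips
    dst_in = t.dst_ip in monitored_ips
    if src_in != dst_in:
        return t if src_in else rev
    # Assumption: when both or neither endpoint is monitored, pick the smaller tuple
    # so that both directions still share one key.
    return min(t, rev)
```

With such a key, packets travelling in both directions of a connection land in the same session record.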
Summary of the invention
The object of the present invention is to effectively detect Trojan communication in a network by analyzing Trojan heartbeat behavior, so that the connection between the Trojan's controlled terminal and the attacker can be cut off in time and information-theft incidents effectively prevented. Specifically, a quick Trojan detection method based on heartbeat behavior analysis is provided.
Technical scheme: a quick Trojan detection method based on heartbeat behavior analysis, which detects suspected Trojans by analyzing whether the "heartbeat interval" between two adjacent heartbeat processes is regular and whether the ratio of packets received to packets sent by the controlled terminal is equal across heartbeat processes.
To make it easier to extract heartbeat behavior features, the network data must be organized into session linked lists. The efficiency of building the session lists directly affects the efficiency of extracting heartbeat behavior features, so an algorithm for quickly building session lists is proposed.
The captured network data is organized by network session, with the IP address and port of the monitored object taken as the source IP address and source port. Packets are grouped into sessions by equivalent four-tuple, that is, each session is uniquely identified by its equivalent four-tuple (so each session list contains traffic in both directions), and a session linked list is chosen as the data structure for storing sessions. The reason for this choice is that network communication is a dynamic process: the packets in a session keep increasing as communication proceeds, so the data structure holding the session must change dynamically as well. While the session list is being built, the position for each packet must be located by the equivalent four-tuple of the list nodes, and the packet is inserted at that position. The way sessions are recorded and the lookup speed therefore directly affect session reassembly efficiency.
Sessions can be stored in a multidimensional array or in a multi-level linked list. A multidimensional array offers high storage efficiency, convenient lookup, and fast access, but its storage must be allocated in advance and its size cannot be changed once it is created, which easily wastes space; since the number of network sessions is not fixed, space cannot be pre-allocated for it. A linked list can grow and shrink dynamically without pre-allocation, but its lookup speed is slow.
The present invention reassembles sessions using an array linked list structure that combines a hash table with a multi-level linked list. An array linked list is a data structure that combines an array with linked lists; it effectively improves lookup efficiency at the cost of a small amount of extra storage. The linking order of the array linked list can be set according to the characteristics of each element of the equivalent four-tuple: the element whose value range is moderate and whose corresponding session counts are most evenly distributed is used as the first level, and the remaining levels are set in turn, so as to obtain higher session reassembly efficiency. The analysis is as follows:
Let the number of sessions be $S$. If all sessions are kept in a conventional singly linked list, the session list must be searched sequentially every time a packet is received, and the average computational complexity of the sequential search is $O(S/2)$.
If the sessions are instead organized as an array linked list, let the array have $n$ indices and let the number of session lists chained to the $i$-th index be $\alpha_i$. The probability that a received packet falls into the $i$-th index of the array is

$$\frac{\alpha_i}{S}$$

so the average time complexity of the list lookup is:

$$O\left(\frac{\alpha_i}{S}\cdot\frac{\alpha_i}{2}\right)=\frac{O(\alpha_i^2)}{2S}$$

Summed over all indices, the expected lookup cost is therefore proportional to $\sum_{i=1}^{n}\alpha_i^2$. By the theorem that the root mean square is greater than or equal to the arithmetic mean:

$$\sqrt{\frac{\sum_{i=1}^{n}\alpha_i^2}{n}}\ \ge\ \frac{\sum_{i=1}^{n}\alpha_i}{n}=\frac{S}{n}$$

Squaring both sides of the inequality gives:

$$\sum_{i=1}^{n}\alpha_i^2\ \ge\ \frac{S^2}{n}$$

Equality holds if and only if $\alpha_1=\alpha_2=\dots=\alpha_n$, where

$$\sum_{i=1}^{n}\alpha_i=S$$

that is, when $\alpha_i=S/n$,

$$\sum_{i=1}^{n}\alpha_i^2$$

is minimal.
It follows that when all list nodes are evenly distributed across the indices of the array, the time complexity of packet lookup is minimal and lower than that of a singly linked list. Therefore, when building the session list, the order of the elements should be chosen according to the value range of each element of the equivalent four-tuple and the distribution of the corresponding session counts.
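The effect of the distribution of sessions on lookup cost can be checked numerically. The following is a small sketch, assuming the cost model derived above (a packet hits index i with probability α_i/S and then scans α_i/2 nodes on average); the function name `expected_lookup_cost` and the example bucket counts are illustrative only.

```python
from typing import Sequence


def expected_lookup_cost(alphas: Sequence[int]) -> float:
    """Sum over indices of (alpha_i / S) * (alpha_i / 2), i.e. sum(alpha_i^2) / (2 * S)."""
    total = sum(alphas)
    if total == 0:
        return 0.0
    return sum(a * a for a in alphas) / (2 * total)


# 100 sessions in one chain (a plain singly linked list): cost S / 2.
single_list = expected_lookup_cost([100])
# The same 100 sessions spread evenly over 4 indices: cost S / (2n).
even_split = expected_lookup_cost([25, 25, 25, 25])
# A badly skewed split over 4 indices gains almost nothing.
skewed = expected_lookup_cost([97, 1, 1, 1])
```

Under this model the even split costs a quarter of the single list, while the skewed split stays close to the single-list cost, which is why the first-level element should have evenly distributed session counts.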
The value range of each element of the equivalent four-tuple and the distribution of the corresponding session counts are as follows:
(1) Source IP address: this usually refers to the IP address of an intranet host. The value ranges of the source IP address are 10.0.0.0~10.255.255.255, 172.16.0.0~172.31.255.255, and 192.168.0.0~192.168.255.255. Relative to the IP space of the Internet, the source IP address space is small and the corresponding sessions are evenly distributed.
(2) Source port: according to the RFC protocol specifications, the source port number is generally any number between 1024 and 65535. The value space of the source port is fairly large, and the corresponding session counts are unevenly distributed.
(3) Destination IP address: the value range of the destination IP address is the whole IPv4 address space. The value space is huge, and the corresponding session counts are unevenly distributed.
(4) Destination port: the destination port is generally the port assigned to the protocol, and mostly falls between 1 and 1023. Because current network communication is dominated by protocols such as HTTP and HTTPS, the destination port of most network communication is 80, 443, 8080, or the like, and the corresponding session counts are very unevenly distributed.
In summary, the source IP address has a small value range and an even distribution, and its corresponding session counts are also fairly evenly distributed, so it is suitable as the first level of the array linked list. Taking a class C LAN as the monitored target, the array linked list is constructed as follows: because the last byte of the source IP address is the most evenly distributed, it is used as the hash value of the source IP address to build the hash table, and the source IP address becomes the first level of the array linked list. By analogy, the source port, destination IP address, and destination port serve as the second, third, and fourth levels of the array linked list respectively. The session list structure based on the array linked list is shown in Fig. 1.
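The first level of such a structure can be sketched as follows. This is a simplified illustration, not the structure of Fig. 1: it keeps only the first-level array indexed by the last byte of the source IP address and collapses the second to fourth levels into a single chain per index that is searched by comparing the full key. The class name `SessionTable` and its methods are hypothetical, and IPv4 dotted-decimal addresses are assumed.

```python
from typing import List, Optional, Tuple

# (src_ip, src_port, dst_ip, dst_port), with the monitored host as the source.
Key = Tuple[str, int, str, int]


class SessionTable:
    """First-level array of 256 indices, each holding a chain of sessions."""

    def __init__(self) -> None:
        self.buckets: List[List[Tuple[Key, List[float]]]] = [[] for _ in range(256)]

    @staticmethod
    def bucket_index(src_ip: str) -> int:
        # Hash value = last byte of the source IPv4 address.
        return int(src_ip.rsplit(".", 1)[1])

    def add_packet(self, key: Key, timestamp: float) -> None:
        chain = self.buckets[self.bucket_index(key[0])]
        for existing, stamps in chain:      # sequential search inside one chain only
            if existing == key:
                stamps.append(timestamp)
                return
        chain.append((key, [timestamp]))    # first packet of a new session

    def timestamps(self, key: Key) -> Optional[List[float]]:
        for existing, stamps in self.buckets[self.bucket_index(key[0])]:
            if existing == key:
                return stamps
        return None
```

A lookup then scans only the sessions that share the same last source-address byte instead of every session in the capture.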
Trojan communication behavior can be detected by detecting "heartbeat packets". Heartbeat packets have an obvious statistical pattern, and their features are extracted by combining traditional statistical analysis with time-frequency analysis. Based on the characteristics of Trojan communication behavior, the method first judges whether "heartbeat-like behavior" exists in a session and calculates the "heartbeat interval" of that behavior.
Denote the sequence of arrival timestamps of the packets in a session (unit: seconds) as:

$$T=\{t_1,t_2,\dots,t_n\}$$

and take $z_i=t_{i+1}-t_i$. Assume that the minimum "heartbeat interval" of a Trojan heartbeat is $\min\{\Delta t\}$ (for example, $\min\{\Delta t\}=5$). Then for

$$z_i,\quad i=1,2,\dots,n-1$$

if $z_i\ge\min\{\Delta t\}$, the session is considered to contain "heartbeat-like behavior". By contrast, given the error control mechanism of TCP/IP, "heartbeat-like behavior" does not usually occur in a normal session.
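The interval check can be sketched as below. The sketch assumes that a single gap of at least min{Δt} seconds is enough to flag the session, which is one reading of the condition above; the function names and the default `min_dt=5.0` (taken from the example value in the text) are illustrative.

```python
from typing import List, Sequence


def inter_arrival_gaps(timestamps: Sequence[float]) -> List[float]:
    """z_i = t_(i+1) - t_i over the time-ordered packet timestamps (seconds)."""
    ts = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ts, ts[1:])]


def has_heartbeat_like_behavior(timestamps: Sequence[float], min_dt: float = 5.0) -> bool:
    """Assumption: the session is flagged when at least one gap reaches min_dt."""
    return any(z >= min_dt for z in inter_arrival_gaps(timestamps))
```

For example, timestamps `[0.0, 0.1, 30.0, 30.1, 60.0]` contain gaps of about 30 seconds and would be flagged, while a burst such as `[0.0, 0.5, 1.0, 1.2]` would not.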
Given that "heartbeat-like behavior" exists in the session, the method extracts two statistical features of the session to detect the communication behavior of a Trojan during the keep-alive (no-operation) stage.
(1) The ratio of packets received to packets sent is equal across "heartbeat processes".
Because the heartbeat behavior of most Trojans is self-similar during communication, the ratio of packets received to packets sent by the sender (or receiver) is the same in each "heartbeat process".
For $m$ consecutive time periods of length $t$ ($t>\min\{\Delta t\}$), compute the ratio $\beta_i$ of packets received to packets sent in each period, $i=1,2,\dots,m$. If

$$\{\beta_1,\beta_2,\dots,\beta_m\}$$

does not contain at least $m'$ equal values, it is judged that no suspected Trojan heartbeat behavior exists in the session. Typical values are $m=10$, $t=900$, and $m'=5$.
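Feature (1) can be sketched as follows, under several assumptions that are not in the patent: packets are given as `(timestamp, direction)` pairs with the hypothetical labels `"rx"` and `"tx"`, the m periods are aligned to a caller-supplied `start` time, periods with no sent packets are ignored, and ratios are compared for exact equality using `Fraction`.

```python
from collections import Counter
from fractions import Fraction
from typing import Iterable, List, Optional, Tuple


def window_ratios(packets: Iterable[Tuple[float, str]], start: float,
                  m: int = 10, t: float = 900.0) -> List[Optional[Fraction]]:
    """beta_i = received / sent for each of m consecutive periods of length t."""
    rx = [0] * m
    tx = [0] * m
    for ts, direction in packets:
        i = int((ts - start) // t)
        if 0 <= i < m:
            if direction == "rx":
                rx[i] += 1
            else:
                tx[i] += 1
    # Assumption: a period with no sent packets has no defined ratio.
    return [Fraction(r, s) if s else None for r, s in zip(rx, tx)]


def ratios_consistent(ratios: Iterable[Optional[Fraction]], m_prime: int = 5) -> bool:
    """True when at least m_prime of the defined ratios share the same value."""
    counts = Counter(r for r in ratios if r is not None)
    return bool(counts) and max(counts.values()) >= m_prime
```

A session for which `ratios_consistent` returns `False` would be judged, per the rule above, to contain no suspected Trojan heartbeat behavior.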
(2) stationarity of " heartbeat interval " is less than threshold value.
" heartbeat interval " of wooden horse is not invariable, in order to hide statistical analysis, the part wooden horse has designed special algorithm and has been used for producing variable " heartbeat interval ", its objective is with " heartbeat interval " that change to hide constant heartbeat process, makes the wooden horse heartbeat become irregular and follows.
Adopt time frequency analysis to judge whether network communication data flow contains wooden horse heartbeat rule.After the time interval of proper network communication data packet was transformed into frequency domain, corresponding intermediate frequency and high frequency coefficient were all larger, and this shows that the time interval of proper network communication data packet shows the characteristic of non-stationary signal.This with during proper network is communicated by letter since the randomness that manual operation causes conform to.Wooden horse " heartbeat interval " is then opposite, because the wooden horse heartbeat has certain rule, causes it to show the characteristic of relative stationary signal.The wooden horse that wherein adopts regularly rectangular formula to carry out heartbeat is because the heartbeat rule is very obvious, so that the medium-high frequency coefficient of its signal is almost 0; And to becoming the wooden horse of duration heartbeat, although adopted the mode of various camouflages, but still fail to simulate random behavior, so though its characteristics of signals shows certain fluctuation, with proper network communicate by letter compare comparatively steady.Therefore the detected characteristics of utilizing time-frequency analysis technology to extract is not only effective to the wooden horse of regularly long " heartbeat interval ", effective equally to the wooden horse of the change duration " heartbeat interval " of introducing the pseudorandom fluctuation.
Let the sampled time intervals of a one-way session data flow (unit: seconds) be $X=\{x_1,x_2,\dots,x_n\}$, where $X$ is the sample set of packet time intervals, $x_i$ is the $i$-th sample value, and $n$ is the sample size. Let

$$Y=\{y_1,y_2,\dots,y_n\}$$

be the feature vector obtained by applying the discrete Fourier transform (DFT) to $X$, where $y_i$ is the $i$-th coefficient after the DFT.
The stationarity of the "heartbeat interval" is defined as:

$$\text{Stability}=\frac{\sum_{i=2}^{n}|y_i|}{n-1}\le\omega$$

where Stability is the stationarity of the "heartbeat interval" and $\omega$ is a threshold (typically $\omega=15$). When Stability is less than or equal to $\omega$, the "heartbeat interval" is stationary.
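The DFT-based statistic can be sketched with the standard library alone. The sketch assumes an unnormalized DFT and treats the first coefficient (the sum of the samples) as y_1, so the statistic averages the magnitudes of all remaining coefficients; the patent does not state the normalization or sample size behind ω = 15, so the default threshold here is only the value quoted in the text. The function names are illustrative.

```python
import cmath
from typing import List, Sequence


def dft(x: Sequence[float]) -> List[complex]:
    """Naive O(n^2) unnormalized discrete Fourier transform."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def dft_stability(x: Sequence[float]) -> float:
    """Mean magnitude of the coefficients y_2..y_n (all but the first)."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two interval samples")
    y = dft(x)
    return sum(abs(c) for c in y[1:]) / (n - 1)


def is_stationary(x: Sequence[float], omega: float = 15.0) -> bool:
    """Assumption: omega must be tuned to the DFT scaling and sample size in use."""
    return dft_stability(x) <= omega
```

A perfectly constant interval sequence yields a statistic of essentially zero, while intervals that swing between values push energy into the non-constant coefficients and raise it.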
Because the DFT is computationally expensive, a statistic with lower computational complexity for detecting Trojan heartbeat behavior can also be constructed based on the essence of a two-level Haar wavelet decomposition. Let $X$ again denote the sample set of packet time intervals of the one-way data flow, and let

$$W=\{w_4,w_5,\dots,w_n\}$$

be the feature vector after the transform, where

$$t_i=\frac{x_i-x_{i-1}}{2};$$

$$w_i=\frac{t_i-t_{i-2}}{2}=\frac{x_i-x_{i-1}-x_{i-2}+x_{i-3}}{4},$$

and $w_i$ is equivalent to the value obtained by taking a second-order difference of the original data. The stationarity of the "heartbeat interval" is then defined as:

$$\text{Stability}=\frac{\sum_{i=4}^{n}w_i}{n-3}\le\omega$$

where Stability is the stationarity of the "heartbeat interval" and $\omega$ is a threshold (typically $\omega=0.005$). If the stationarity of the "heartbeat interval" is greater than $\omega$, it is judged that no suspected Trojan heartbeat behavior exists in the session.
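The lower-cost statistic needs only one pass over the samples. The sketch below follows the formula above and sums the signed values w_i; the `absolute` flag, which sums magnitudes instead, is a hypothetical variant added for experimentation and is not part of the source. The function name is illustrative.

```python
from typing import Sequence


def haar_stability(x: Sequence[float], absolute: bool = False) -> float:
    """Mean of w_i = (x_i - x_(i-1) - x_(i-2) + x_(i-3)) / 4 for i = 4..n (1-based)."""
    n = len(x)
    if n < 4:
        raise ValueError("need at least four interval samples")
    total = 0.0
    for i in range(3, n):  # 0-based index 3 corresponds to the 1-based index 4
        w = (x[i] - x[i - 1] - x[i - 2] + x[i - 3]) / 4.0
        total += abs(w) if absolute else w
    return total / (n - 3)
```

For a fixed-duration heartbeat every w_i is 0, so the statistic is 0 and falls below any positive threshold ω.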
Beneficial effects: the present invention proposes a quick Trojan detection method based on heartbeat behavior analysis. The method targets the fact that, during the keep-alive (no-operation) stage, a Trojan uses heartbeat packets to maintain the connection between its client and its server. Combining traditional statistical analysis with time-frequency analysis, it analyzes the differences between Trojan communication behavior and normal network communication behavior at this stage, mines the essential distinctions between the two, and extracts behavioral features. The extracted features are then used to detect Trojans.
The proposed method can detect Trojan heartbeat behavior during the keep-alive (no-operation) stage. Combined with an existing domain-name whitelist filtering mechanism, it can detect Trojans in a network, which helps cut off the connection between the Trojan program and the attacker in time and effectively prevents information theft by Trojans.
Description of drawings
Fig. 1 is a diagram of the session list structure;
Fig. 2 is a schematic diagram of the "heartbeat process" and "heartbeat interval" between the Trojan's controlled terminal and control terminal during the keep-alive (no-operation) stage;
Fig. 3 is a sample plot of the DFT of the Trojan "heartbeat interval" and of the packet time intervals of normal network communication.
Embodiment
Embodiment one: the quick Trojan detection method based on heartbeat behavior analysis is as follows:
The captured network data is organized by network session, with the IP address and port of the monitored object taken as the source IP address and source port. Packets are grouped into sessions by equivalent four-tuple, that is, each session is uniquely identified by its equivalent four-tuple (so each session list contains traffic in both directions), and a session linked list is chosen as the data structure for storing sessions. The reason for this choice is that network communication is a dynamic process: the packets in a session keep increasing as communication proceeds, so the data structure holding the session must change dynamically as well. While the session list is being built, the position for each packet must be located by the equivalent four-tuple of the list nodes, and the packet is inserted at that position. The way sessions are recorded and the lookup speed therefore directly affect session reassembly efficiency.
Sessions can be stored in a multidimensional array or in a multi-level linked list. A multidimensional array offers high storage efficiency, convenient lookup, and fast access, but its storage must be allocated in advance and its size cannot be changed once it is created, which easily wastes space; since the number of network sessions is not fixed, space cannot be pre-allocated for it. A linked list can grow and shrink dynamically without pre-allocation, but its lookup speed is slow.
The present invention reassembles sessions using an array linked list structure that combines a hash table with a multi-level linked list. An array linked list is a data structure that combines an array with linked lists; it effectively improves lookup efficiency at the cost of a small amount of extra storage. The linking order of the array linked list can be set according to the characteristics of each element of the equivalent four-tuple: the element whose value range is moderate and whose corresponding session counts are most evenly distributed is used as the first level, and the remaining levels are set in turn, so as to obtain higher session reassembly efficiency. The analysis is as follows:
Let the number of sessions be $S$. If all sessions are kept in a conventional singly linked list, the session list must be searched sequentially every time a packet is received, and the average computational complexity of the sequential search is $O(S/2)$. If the sessions are instead organized as an array linked list, let the array have $n$ indices and let the number of session lists chained to the $i$-th index be $\alpha_i$. The probability that a received packet falls into the $i$-th index of the array is

$$\frac{\alpha_i}{S}$$

so the average time complexity of the list lookup is:

$$O\left(\frac{\alpha_i}{S}\cdot\frac{\alpha_i}{2}\right)=\frac{O(\alpha_i^2)}{2S}$$

Summed over all indices, the expected lookup cost is therefore proportional to $\sum_{i=1}^{n}\alpha_i^2$. By the theorem that the root mean square is greater than or equal to the arithmetic mean:

$$\sqrt{\frac{\sum_{i=1}^{n}\alpha_i^2}{n}}\ \ge\ \frac{\sum_{i=1}^{n}\alpha_i}{n}=\frac{S}{n}$$

Squaring both sides of the inequality gives:

$$\sum_{i=1}^{n}\alpha_i^2\ \ge\ \frac{S^2}{n}$$

Equality holds if and only if $\alpha_1=\alpha_2=\dots=\alpha_n$, where

$$\sum_{i=1}^{n}\alpha_i=S$$

that is, when $\alpha_i=S/n$,

$$\sum_{i=1}^{n}\alpha_i^2$$

is minimal.
It follows that when all list nodes are evenly distributed across the indices of the array, the time complexity of packet lookup is minimal and lower than that of a singly linked list. Therefore, when building the session list, the order of the elements should be chosen according to the value range of each element of the equivalent four-tuple and the distribution of the corresponding session counts. The value range of each element and the distribution of the corresponding session counts are as follows:
(1) Source IP address: this usually refers to the IP address of an intranet host. The value ranges of the source IP address are 10.0.0.0~10.255.255.255, 172.16.0.0~172.31.255.255, and 192.168.0.0~192.168.255.255. Relative to the IP space of the Internet, the source IP address space is small and the corresponding sessions are evenly distributed.
(2) Source port: according to the RFC protocol specifications, the source port number is generally any number between 1024 and 65535. The value space of the source port is fairly large, and the corresponding session counts are unevenly distributed.
(3) Destination IP address: the value range of the destination IP address is the whole IPv4 address space. The value space is huge, and the corresponding session counts are unevenly distributed.
(4) Destination port: the destination port is generally the port assigned to the protocol, and mostly falls between 1 and 1023. Because current network communication is dominated by protocols such as HTTP and HTTPS, the destination port of most network communication is 80, 443, 8080, or the like, and the corresponding session counts are very unevenly distributed.
In summary, the source IP address has a small value range and an even distribution, and its corresponding session counts are also fairly evenly distributed, so it is suitable as the first level of the array linked list. Taking a class C LAN as the monitored target, the array linked list is constructed as follows: because the last byte of the source IP address is the most evenly distributed, it is used as the hash value of the source IP address to build the hash table, and the source IP address becomes the first level of the array linked list. By analogy, the source port, destination IP address, and destination port serve as the second, third, and fourth levels of the array linked list respectively. The session list structure based on the array linked list is shown in Fig. 1.
Trojan communication behavior can be detected by detecting "heartbeat packets". Heartbeat packets have an obvious statistical pattern, and their features are extracted by combining traditional statistical analysis with time-frequency analysis. Based on the characteristics of Trojan communication behavior, the method first judges whether "heartbeat-like behavior" exists in a session and calculates the "heartbeat interval" of that behavior.
Denote the sequence of arrival timestamps of the packets in a session (unit: seconds) as:

$$T=\{t_1,t_2,\dots,t_n\}$$

and take

$$z_i=t_{i+1}-t_i$$

Assume that the minimum "heartbeat interval" of a Trojan heartbeat is $\min\{\Delta t\}$ (for example, $\min\{\Delta t\}=5$). Then for $z_i$, $i=1,2,\dots,n-1$, if $z_i\ge\min\{\Delta t\}$, the session is considered to contain "heartbeat-like behavior". By contrast, given the error control mechanism of TCP/IP, "heartbeat-like behavior" does not usually occur in a normal session.
Exist in session under " class heartbeat behavior " prerequisite, this method is extracted 2 session statistical natures and is being connected the communication behavior that keeps without the operational phase for detection of wooden horse.
(1) " heartbeat process " receives equal with the packet ratio of transmission.
Because the heartbeat behavior of most of wooden horse has self-similarity in communication process, so the packet ratio that transmit leg (or recipient) receives in " heartbeat process " and sends equates.
For m consecutive long time periods of length t (t > min{Δt}), compute the ratio β_i of received to sent packets in each period, i = 1, 2, …, m. If, among the values
$$\{\beta_1, \beta_2, \ldots, \beta_m\}$$
no group of at least m′ values is equal, the session is judged to contain no suspected Trojan heartbeat behavior. Typical values are m = 10, t = 900, and m′ = 5.
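The ratio test can be illustrated with the short sketch below. The packet representation (`(timestamp, direction)` pairs), the function names, and the handling of periods with no sent packets (they are skipped) are assumptions made for the example; exact fractions are used so that "equal" means exactly equal.

```python
from collections import Counter
from fractions import Fraction


def window_ratios(packets, start, t=900, m=10):
    """Ratio of received to sent packets in each of m consecutive periods of length t.

    packets: iterable of (timestamp, direction), direction being 'recv' or 'sent'.
    A period with no sent packets yields None (an assumption; the text does not cover it).
    """
    recv = [0] * m
    sent = [0] * m
    for ts, direction in packets:
        idx = int((ts - start) // t)
        if 0 <= idx < m:
            if direction == "recv":
                recv[idx] += 1
            else:
                sent[idx] += 1
    return [Fraction(r, s) if s else None for r, s in zip(recv, sent)]


def has_equal_ratios(ratios, m_prime=5):
    """True if at least m_prime of the ratios share the same value."""
    counts = Counter(r for r in ratios if r is not None)
    return any(c >= m_prime for c in counts.values())


# Ten periods, each with one sent packet answered by two received packets.
regular = [(i * 900 + k, d) for i in range(10)
           for k, d in ((1, "sent"), (2, "recv"), (3, "recv"))]
# Ten periods whose received count grows from 1 to 10 while one packet is sent.
irregular = [(i * 900 + 1, "sent") for i in range(10)]
irregular += [(i * 900 + 2 + k, "recv") for i in range(10) for k in range(i + 1)]
```

With the typical values m = 10 and m′ = 5, `has_equal_ratios(window_ratios(regular, 0))` keeps the session as a suspect, while the irregular session is ruled out.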
(2) stationarity of " heartbeat interval " is less than threshold value.。
" heartbeat interval " of wooden horse is not invariable, in order to hide statistical analysis, the part wooden horse has designed special algorithm and has been used for producing variable " heartbeat interval ", its objective is with " heartbeat interval " that change to hide constant heartbeat process, makes the wooden horse heartbeat become irregular and follows.
Adopt time frequency analysis to judge whether network communication data flow contains wooden horse heartbeat rule.After the time interval of proper network communication data packet was transformed into frequency domain, corresponding intermediate frequency and high frequency coefficient were all larger, and this shows that the time interval of proper network communication data packet shows the characteristic of non-stationary signal.This with during proper network is communicated by letter since the randomness that manual operation causes conform to.Wooden horse " heartbeat interval " is then opposite, because the wooden horse heartbeat has certain rule, causes it to show the characteristic of relative stationary signal.The wooden horse that wherein adopts regularly rectangular formula to carry out heartbeat is because the heartbeat rule is very obvious, so that the medium-high frequency coefficient of its signal is almost 0; And to becoming the wooden horse of duration heartbeat, although adopted the mode of various camouflages, but still fail to simulate random behavior, so though its characteristics of signals shows certain fluctuation, with proper network communicate by letter compare comparatively steady.Therefore the detected characteristics of utilizing time-frequency analysis technology to extract is not only effective to the wooden horse of regularly long " heartbeat interval ", effective equally to the wooden horse of the change duration " heartbeat interval " of introducing the pseudorandom fluctuation.
Let the sampled time intervals of the one-way session data stream (unit: seconds) be:
$$X = \{x_1, x_2, \ldots, x_n\}$$
where X denotes the sample set of packet time intervals, x_i denotes the i-th sample value, and n denotes the sample size.
$$\{y_1, y_2, \ldots, y_n\}$$
is the feature vector obtained by applying the discrete Fourier transform (DFT) to X, where y_i denotes the i-th coefficient after the DFT.
The stationarity of the "heartbeat interval" is defined as:
$$\mathrm{Stability} = \frac{\sum_{i=2}^{n} |y_i|}{n-1} \le \omega$$
where Stability is the stationarity of the "heartbeat interval" and ω is the threshold (typically ω = 15). When Stability is less than or equal to ω, the "heartbeat interval" is stationary.
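The DFT-based statistic can be sketched as below. The sketch assumes an unnormalized DFT, since the text does not state a normalization convention and the threshold ω = 15 depends on it, and uses a naive O(n²) transform from the standard library to stay self-contained. The function names are hypothetical.

```python
import cmath


def dft(samples):
    """Naive unnormalized discrete Fourier transform (assumed convention)."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * idx / n) for idx, x in enumerate(samples))
        for k in range(n)
    ]


def dft_stability(intervals):
    """Mean magnitude of the non-DC coefficients y_2..y_n (1-based indexing as in the text)."""
    n = len(intervals)
    if n < 2:
        raise ValueError("need at least two interval samples")
    coeffs = dft(intervals)
    return sum(abs(y) for y in coeffs[1:]) / (n - 1)


fixed_heartbeat = [30.0] * 16                    # constant 30 s interval
human_like = [1, 50, 3, 120, 7, 300, 2, 90]      # irregular, made-up intervals
```

For the constant sequence every non-DC coefficient vanishes up to rounding error, so the statistic is close to 0 and falls under any positive threshold; the irregular sequence has large alternating components and exceeds ω = 15.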
Embodiment two: the parts of this embodiment that are the same as embodiment one are not repeated. The difference is as follows: because the computational complexity of the DFT is relatively high, a statistic with lower computational complexity that can detect Trojan heartbeat behavior can also be constructed based on the essence of two-level Haar wavelet decomposition. Again let X denote the sample set of packet time intervals of the one-way data stream, and let
Figure BDA0000067971700000134
be the feature vector after the transform. Take
$$t_i = \frac{x_i - x_{i-1}}{2};$$
$$w_i = \frac{t_i - t_{i-2}}{2} = \frac{x_i - x_{i-1} - x_{i-2} + x_{i-3}}{4},$$
where w_i is equivalent to the value obtained by taking a second-order difference of the original data. The stationarity of the "heartbeat interval" is then defined as:
$$\mathrm{Stability} = \frac{\sum_{i=4}^{n} w_i}{n-3} \le \omega$$
where Stability is the stationarity of the "heartbeat interval" and ω is the threshold (typically ω = 0.005). If the stationarity of the "heartbeat interval" is greater than ω, the session is judged to contain no suspected Trojan heartbeat behavior.
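A minimal sketch of this lower-cost statistic follows. It implements the formula exactly as printed, that is, a signed sum of the w_i divided by n − 3; whether the original intends absolute values is not stated in this text, so that reading is not assumed. The function name is hypothetical.

```python
def haar_stability(intervals):
    """Stability = sum_{i=4}^{n} w_i / (n - 3), with
    w_i = (x_i - x_{i-1} - x_{i-2} + x_{i-3}) / 4 (1-based indices as in the text)."""
    n = len(intervals)
    if n < 4:
        raise ValueError("need at least four interval samples")
    x = intervals
    # 0-based index i = 3..n-1 corresponds to the 1-based range 4..n.
    w = [(x[i] - x[i - 1] - x[i - 2] + x[i - 3]) / 4 for i in range(3, n)]
    return sum(w) / (n - 3)


constant = [5, 5, 5, 5, 5, 5]   # fixed heartbeat interval
growing = [1, 2, 4, 7, 11]      # made-up intervals with growing gaps
```

Each w_i needs only three additions and one division, so the whole statistic is linear in n, compared with the O(n log n) cost of a fast Fourier transform.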

Claims (9)

1. A fast Trojan detection method based on heartbeat behavior analysis, characterized in that: first, the captured network data is organized by network session: packets are grouped into sessions according to equivalent four-tuples, so that each session is uniquely identified by an equivalent four-tuple, and a session linked list is chosen as the data structure for storing sessions; each session linked list is identified by an equivalent four-tuple, the corresponding session is looked up according to the equivalent four-tuple in a packet, and the packet information is added to the corresponding session linked list; second, in extracting Trojan communication features for the connection-keeping, no-operation phase, two session statistical features are extracted to detect Trojan communication behavior in that phase, namely the ratio of packets received to packets sent in the "heartbeat process" and the stationarity of the "heartbeat interval"; for the packet ratio feature, if the ratios of packets received to packets sent are equal, the behavior is suspected Trojan heartbeat behavior; for the stationarity feature, if the stationarity of the "heartbeat interval" is less than or equal to a threshold, the behavior is suspected Trojan heartbeat behavior; definition of the equivalent four-tuple: with {source IP address, source port, destination IP address, destination port} defined as a four-tuple, if four-tuples {a_1, b_1, c_1, d_1} and {a_2, b_2, c_2, d_2} satisfy a_1 = c_2, b_1 = d_2, c_1 = a_2, and d_1 = b_2, then {a_1, b_1, c_1, d_1} and {a_2, b_2, c_2, d_2} are called equivalent four-tuples; definition of "heartbeat": a Trojan establishes and maintains a session between the client and the server, the session is maintained by sending packets to the other side, and such a packet, sent on a timer, is called a "heartbeat packet" or "heartbeat"; definition of the heartbeat interval: there is a certain time interval between two adjacent "heartbeat" processes, which is called the "heartbeat interval"; definition of the heartbeat process: each time the Trojan sends a "heartbeat packet", the Trojan's controlled-end and control-end programs may also send some other packets to each other to acknowledge received packets, and the "heartbeat packet" together with the group of acknowledgment packets that accompany it is called a "heartbeat process".
2. The Trojan detection method according to claim 1, characterized in that: regarding the determination that the ratios of packets received to packets sent in the "heartbeat process" are equal, for m consecutive long time periods of length Δt, with Δt in seconds, the ratio β_i of received to sent packets is computed, i = 1, 2, …, m′, …, m; if, among
$$\{\beta_1, \beta_2, \ldots, \beta_m\}$$
no group of at least m′ values is equal, it is determined that no suspected Trojan heartbeat behavior exists in the session.
3. The Trojan detection method according to claim 2, characterized in that: m = 10, Δt = 900, and m′ = 5.
4. The Trojan detection method according to claim 1, characterized in that: regarding the determination that the stationarity of the "heartbeat interval" is less than or equal to the threshold, the sampled time intervals of the one-way session data stream are:
$$X = \{x_1, x_2, \ldots, x_n\}$$
where X denotes the sample set of packet time intervals, x_i denotes the i-th sample value, and n denotes the sample size;
$$\{y_1, y_2, \ldots, y_n\}$$
is the feature vector obtained by applying the discrete Fourier transform (DFT) to X, where y_i denotes the i-th coefficient after the DFT; the stationarity of the "heartbeat interval" is defined as Stability: when
$$\mathrm{Stability} = \frac{\sum_{i=2}^{n} |y_i|}{n-1} \le \omega$$
is satisfied, the stationarity of the "heartbeat interval" is less than or equal to the threshold and the traffic is judged to come from a Trojan program; otherwise it is normal network communication, where ω is the stationarity threshold of the "heartbeat interval".
5. The Trojan detection method according to claim 4, characterized in that: ω = 15.
6. The Trojan detection method according to claim 4, characterized in that: the stationarity of the heartbeat interval is defined as Stability; when
$$\mathrm{Stability} = \frac{\sum_{i=4}^{n} w_i}{n-3} \le \omega$$
is satisfied, the stationarity of the "heartbeat interval" is less than or equal to the threshold and the traffic is judged to come from a Trojan program; otherwise it is normal network communication, where
$$w_i = \frac{t_i - t_{i-2}}{2} = \frac{x_i - x_{i-1} - x_{i-2} + x_{i-3}}{4}.$$
7. The Trojan detection method according to claim 6, characterized in that: ω = 0.005.
8. The Trojan detection method according to claim 1, characterized in that: the link order of the array linked list is set according to the different characteristics of the elements of the equivalent four-tuple; the element whose value range is moderate and whose corresponding session counts are the most evenly distributed is set as the first level of the array linked list, and the remaining link order is set in turn.
9. The Trojan detection method according to claim 1, characterized in that: the four-tuple consists of the source IP address, source port, destination IP address, and destination port.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110157821 CN102201937B (en) 2011-06-13 2011-06-13 Method for detecting Trojan quickly based on heartbeat behavior analysis

Publications (2)

Publication Number Publication Date
CN102201937A CN102201937A (en) 2011-09-28
CN102201937B true CN102201937B (en) 2013-10-23

Family

ID=44662342

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110157821 Active CN102201937B (en) 2011-06-13 2011-06-13 Method for detecting Trojan quickly based on heartbeat behavior analysis

Country Status (1)

Country Link
CN (1) CN102201937B (en)



Also Published As

Publication number Publication date
CN102201937A (en) 2011-09-28


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C53 Correction of patent for invention or patent application
CB03 Change of inventor or designer information

Inventor after: Liu Shengli

Inventor after: Yang Jie

Inventor after: Chen Jiayong

Inventor after: Meng Lei

Inventor after: Wu Linjin

Inventor after: Zeng Cheng

Inventor before: Liu Shengli

Inventor before: Chen Jiayong

Inventor before: Meng Lei

Inventor before: Wu Linjin

Inventor before: Zeng Cheng

COR Change of bibliographic data

Free format text: CORRECT: INVENTOR; FROM: LIU SHENGLI CHEN JIAYONG MENG LEI WU LINJIN CENG CHENG TO: LIU SHENGLI YANG JIE CHEN JIAYONG MENG LEI WU LINJIN CENG CHENG

C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20160926

Address after: 450000 B, building 8, No. 1, No. 18-19, welfare Road, Jinshui District, Henan, Zhengzhou

Patentee after: Henan Jindun information security level Technical Evaluation Center Co. Ltd.

Address before: Located in Henan city of Zhengzhou Province Kim street 450002 No. 7 No. 19 Building 1 unit 302

Patentee before: Liu Shengli

TR01 Transfer of patent right

Effective date of registration: 20190103

Address after: 610000 Chengdu High-tech Zone, Sichuan Province, 2 buildings and 3 floors, No. 4, Xinhang Road

Patentee after: Sichuan Yuxin'an Electronic Technology Co., Ltd.

Address before: 450000 Floor 18-19, Block B, Office Building No. 1, Fucai Road, Jinshui District, Zhengzhou City, Henan Province

Patentee before: Henan Jindun information security level Technical Evaluation Center Co. Ltd.

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20200716

Address after: Room 302, unit 1, building 19, No.7, Jianxue street, Jinshui District, Zhengzhou City, Henan Province

Patentee after: Liu Shengli

Address before: 610000 Chengdu High-tech Zone, Sichuan Province, 2 buildings and 3 floors, No. 4, Xinhang Road

Patentee before: Sichuan Yuxin'an Electronic Technology Co.,Ltd.

TR01 Transfer of patent right
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20210111

Address after: 450000 Science Avenue 62, Zhengzhou High-tech Zone, Henan Province

Patentee after: Information Engineering University of the Chinese People's Liberation Army Strategic Support Force

Address before: Unit 302, unit 1, building 19, No.7 Jianxue street, Jinshui District, Zhengzhou City, Henan Province, 450000

Patentee before: Liu Shengli