CN106789961A - Reverse-processing method for complex network applications based on a hidden Markov model - Google Patents
Reverse-processing method for complex network applications based on a hidden Markov model
- Publication number
- CN106789961A
- Authority
- CN
- China
- Prior art keywords
- hidden markov
- markov model
- network
- flow
- node
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/20—Network architectures or network communication protocols for network security for managing network security; network security policies in general
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The present invention relates to a reverse-processing method for complex network applications based on a hidden Markov model, characterized in that it comprises the following steps: 1. collect and classify network traffic, identifying known traffic and separating out unknown traffic; 2. perform cluster analysis on the unknown traffic and assign a class label to the traffic in each cluster; 3. perform frequent-item analysis on each class of identified known traffic and unknown traffic and extract frequent itemsets; 4. compute the position variance of each frequent item within the traffic and filter protocol keywords according to the position variance; 5. perform spatial connection analysis and temporal sequence analysis on the network traffic to obtain the topology connection graph and the connection sequences of the traffic; 6. partition the network traffic into communities; 7. estimate the coupling-coefficient parameters based on a multi-chain coupled hidden Markov model. By introducing a multi-chain coupled hidden Markov model, the invention recovers the interaction rules of complex protocols and provides a basic supporting technology for effectively monitoring and controlling complex network applications.
Description
Technical field
The present invention relates to a reverse-processing method for complex network applications based on a hidden Markov model.
Background art
Network traffic on today's Internet is large in scale and application services are numerous. The complexity of communication protocols greatly increases the challenge facing network security technologies such as anomaly detection, intrusion detection systems, and firewalls, and leaves the Internet with increasingly severe security problems. Attackers exploit vulnerabilities in communication protocols or software systems and carry out intrusions and attacks by remote control through specific command-and-control protocols. Defenders must therefore find potential protocol vulnerabilities and patch them before attackers do, become familiar with the specifications of the command-and-control protocols that attackers use, and promptly and fully understand the remote-control commands in those protocols. Only then can they defend effectively against network threats, or detect and block an attack in time when it occurs.
Reaching these defensive goals requires the complete specification of the protocol. Specifications of common protocols can be obtained from the standards documents published by international standards bodies such as the International Telecommunication Union (ITU), the International Organization for Standardization (ISO), and the Internet Engineering Task Force (IETF). Developers of proprietary applications or protocols, however, often refuse to provide the relevant specification documents for reasons such as trade secrets and confidentiality, and the authors of network attacks or malware are likewise unwilling to disclose their command-and-control protocol specifications. In such cases, network administrators and security experts have to rely on protocol reverse engineering to reconstruct the protocol specification.
Protocol reverse engineering can automatically provide rich and accurate protocol features and traffic interaction patterns for applications such as protocol traffic identification and intrusion detection. Almost all existing work on protocol reverse engineering targets simple protocols or applications with one-to-one interaction. However, a growing number of network applications and network attacks use communication protocols in which multiple parties cooperate over multiple channels in parallel to complete an interactive task. Existing protocol reverse-engineering techniques are not suited to the reverse analysis of such complex network applications.
Summary of the invention
To address the shortcomings of the prior art, the present invention provides a reverse-processing method for complex network applications based on a hidden Markov model. The method introduces a multi-chain coupled hidden Markov model to recover the interaction rules of complex protocols, providing a basic supporting technology for effectively monitoring and controlling complex network applications.
To achieve the above object, the reverse-processing method for complex network applications based on a hidden Markov model according to the present invention mainly comprises the following steps:
First step: collect and classify network traffic, identifying known traffic and separating out unknown traffic;
Second step: perform cluster analysis on the unknown traffic and assign a class label to the traffic in each cluster;
Third step: perform frequent-item analysis on each class of identified known traffic and unknown traffic, and extract the frequent itemsets in the traffic;
Fourth step: compute the position variance of each frequent item within the traffic and filter protocol keywords according to the position variance;
Fifth step: perform spatial connection analysis and temporal sequence analysis on the network traffic to obtain the topology connection graph and the connection sequences of the traffic;
Sixth step: partition the network traffic into communities;
Seventh step: estimate the coupling-coefficient parameters based on the multi-chain coupled hidden Markov model.
Preferably, the connection sequence is the message sequence of each connection, and each message in a connection sequence is represented by its message type.
Preferably, the sixth step partitions the network traffic into communities using the Newman fast algorithm based on the similarity of the nodes' application-layer features.
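As an illustration of the fifth step and of the connection sequences described above, the following Python sketch builds a topology connection graph and per-connection message-type sequences from a list of observed messages. The record layout `(timestamp, source, destination, message_type)` and the function name `analyze_connections` are assumptions made for this example; the patent does not specify a data format.

```python
from collections import defaultdict


def analyze_connections(messages):
    """Return (topology, sequences) for a list of message records.

    Each record is assumed to be (timestamp, source, destination, message_type).
    topology is a set of undirected node pairs (spatial connection analysis);
    sequences maps each pair to its time-ordered message types (temporal analysis).
    """
    topology = set()
    sequences = defaultdict(list)
    # Order by timestamp so that each connection sequence follows time order.
    for timestamp, src, dst, msg_type in sorted(messages, key=lambda m: m[0]):
        connection = tuple(sorted((src, dst)))  # treat A->B and B->A as one connection
        topology.add(connection)
        sequences[connection].append(msg_type)
    return topology, dict(sequences)


# Hypothetical example: three messages between nodes A, B and C.
messages = [(3, "B", "A", 2), (1, "A", "B", 1), (2, "A", "C", 1)]
topology, sequences = analyze_connections(messages)
```

Treating a connection as an undirected node pair is a simplification; a directed variant would key on `(src, dst)` instead.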
In the complex network application model based on the multi-chain coupled hidden Markov model (CHMM), the state process of each participant in the complex network application is assumed to follow a hidden Markov process, so each node in the network can be described by one hidden Markov model (HMM). The number $C$ of coupled HMMs in the CHMM therefore equals the number of network nodes, that is, the number of participants in the complex network application. In the present invention, all HMMs have the same length $T$. The state space of every HMM in the CHMM is assumed to be $S = \{1, 2, 3\}$, representing the three states of a network node: sending data, receiving data, and idle. The invention numbers the message formats of the complex network application, with different message formats representing different message types. The observation space of the HMM for each node is $V = \{1, 2, \ldots, M\}$, the set of protocol message types exchanged between nodes. The initial state distribution of the network nodes is written here as $\pi = \{\pi^{(c)}_j\}$, where $\pi^{(c)}_j$ is the probability that the initial state of the $c$-th node is $j$, with $\sum_{j \in S} \pi^{(c)}_j = 1$. The initial state distribution describes the state each network node is in when the interaction starts.
The state transition probabilities between network nodes are written as $A = \{a^{(c',c)}_{ij}\}$, where $a^{(c',c)}_{ij}$ is the probability of a transition from state $i$ of node $c'$ to state $j$ of node $c$, with $\sum_{j \in S} a^{(c',c)}_{ij} = 1$. In particular, when $c' = c$, $a^{(c,c)}_{ij}$ is the probability with which a node's own state changes over time. The state transition probability matrix $A$ describes how the behavioral states of the network nodes evolve.
The observation probabilities of the network nodes are written as $B = \{b^{(c)}_j(k)\}$. Let $o^{(c',c)}_t$ denote the type of message that the $c$-th node receives from the $c'$-th node at time $t$; then $b^{(c)}_j(k)$ is the probability that the $c$-th HMM, in state $j$ at time $t$, observes message type $k \in V$, with $\sum_{k \in V} b^{(c)}_j(k) = 1$. The observation probability matrix describes the distribution of the content or type of a node's interactions under its different activity states.
Thus $\lambda = (\pi, A, B, \Theta)$ constitutes a protocol interaction model of the complex network application, where $\Theta = \{\theta_{c'c}\}$ denotes the coupling coefficients between the chains.
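To make the shape of the model λ = (π, A, B, Θ) concrete, the following Python sketch builds randomly initialized parameters for a small CHMM and normalizes them as the constraints above require. It is a minimal sketch under stated assumptions: the function names, the nested-list layout, and the choice to normalize the coupling coefficients so that the weights into each node sum to 1 are not taken from the patent.

```python
import random


def random_distribution(size, rng):
    """Return `size` positive weights that sum to 1."""
    weights = [rng.random() + 1e-9 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def init_chmm(num_nodes, num_states=3, num_message_types=4, seed=0):
    """Build randomly initialized CHMM parameters (pi, A, B, theta)."""
    rng = random.Random(seed)
    C, N, M = num_nodes, num_states, num_message_types
    # pi[c][j]: probability that node c starts in state j.
    pi = [random_distribution(N, rng) for _ in range(C)]
    # A[c_src][c_dst][i][j]: transition from state i of node c_src
    # to state j of node c_dst; each row over j sums to 1.
    A = [[[random_distribution(N, rng) for _ in range(N)]
          for _ in range(C)] for _ in range(C)]
    # B[c][j][k]: probability that node c in state j observes message type k.
    B = [[random_distribution(M, rng) for _ in range(N)] for _ in range(C)]
    # theta[c_src][c_dst]: coupling coefficient from chain c_src to chain c_dst.
    # Assumption: uniform start, with weights into each c_dst summing to 1.
    theta = [[1.0 / C for _ in range(C)] for _ in range(C)]
    return {"pi": pi, "A": A, "B": B, "theta": theta}


model = init_chmm(num_nodes=3)
```

With three states per node (sending, receiving, idle), `model["A"][0][1][2]` holds the distribution over the next state of node 1 given that node 0 is currently idle.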
For the node application-layer feature similarity, the whole network is represented as $G = (V, E)$, where $V = \{1, \ldots, N\}$ is the set of network nodes and $E = \{e_{vv'}\}$ ($v, v' \in V$) is the set of edges connecting them. The invention defines a Jaccard similarity of nodes, based on the nodes' application-layer message features and the connectivity between nodes, to measure the similarity of message features between nodes. The application-layer message features of a node can be represented by the set of application-layer protocol keywords that appear in its messages. The larger the intersection of the protocol keyword sets used by two adjacent nodes joined by an edge, the greater the similarity of their application-layer messages. Letting $K_v$ denote the protocol keyword set of node $v$, the message feature similarity between community $i$ and community $j$ is defined by an equation that is not reproduced in this text.
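For reference, the standard Jaccard similarity of two keyword sets, on which the node similarity above is based, can be computed as in the following Python sketch. The function name `jaccard` and the example keyword sets are illustrative assumptions; the patent's community-level formula is not shown in the text, so only the node-level set similarity is sketched here.

```python
def jaccard(keywords_a, keywords_b):
    """Jaccard similarity |A & B| / |A | B| of two keyword sets (0.0 if both are empty)."""
    a, b = set(keywords_a), set(keywords_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical keyword sets of two adjacent nodes.
K_u = {"GET", "HOST"}
K_v = {"GET", "POST"}
similarity = jaccard(K_u, K_v)  # one shared keyword out of three distinct keywords
```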
By introducing a multi-chain coupled hidden Markov model, the invention recovers the interaction rules of complex protocols and provides a basic supporting technology for effectively monitoring and controlling complex network applications.
Brief description of the drawings
Fig. 1 is a flow diagram of the implementation of the invention;
Fig. 2 is a schematic diagram of the Newman fast algorithm based on application-layer feature similarity;
Fig. 3 is a flow chart of the coupling-coefficient parameter estimation.
Detailed description of the embodiments
The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings, so that the advantages and features of the invention can be more readily understood by those skilled in the art and the scope of protection of the invention can be defined more clearly.
Referring to Figs. 1 to 3, the reverse-processing method for complex network applications based on a hidden Markov model according to this embodiment mainly comprises the following steps:
First step: collect and classify network traffic, identifying known traffic and separating out unknown traffic;
Second step: perform cluster analysis on the unknown traffic and assign a class label to the traffic in each cluster;
Third step: perform frequent-item analysis on each class of identified known traffic and unknown traffic, and extract the frequent itemsets in the traffic;
Fourth step: compute the position variance of each frequent item within the traffic and filter protocol keywords according to the position variance. One way to do this is to set a variance threshold $\sigma_o$; frequent items whose position variance is less than $\sigma_o$ are taken as protocol keywords;
Fifth step: perform spatial connection analysis and temporal sequence analysis on the network traffic to obtain the topology connection graph and the connection sequences of the traffic. The connection sequence is the message sequence of each connection, and each message in a connection sequence is represented by its message type;
Sixth step: partition the network traffic into communities using the Newman fast algorithm based on node application-layer feature similarity;
Seventh step: estimate the coupling-coefficient parameters based on the multi-chain coupled hidden Markov model.
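The third and fourth steps above can be sketched in Python as follows. This is a minimal illustration rather than the patent's implementation: it assumes that frequent items are fixed-length byte n-grams, that support is the fraction of flows containing an item, and that an item's position is its first byte offset in each flow. The function names and the parameters `n`, `min_support`, and `variance_threshold` (playing the role of the variance threshold in the fourth step) are hypothetical.

```python
from statistics import pvariance


def frequent_items(payloads, n=4, min_support=0.6):
    """Return the n-grams that occur in at least min_support of the payloads."""
    counts = {}
    for payload in payloads:
        # Count each n-gram at most once per payload.
        grams = {payload[i:i + n] for i in range(len(payload) - n + 1)}
        for gram in grams:
            counts[gram] = counts.get(gram, 0) + 1
    threshold = min_support * len(payloads)
    return {gram for gram, count in counts.items() if count >= threshold}


def position_variance(item, payloads):
    """Population variance of the first offset of item over payloads containing it."""
    offsets = [p.find(item) for p in payloads if item in p]
    return pvariance(offsets) if len(offsets) > 1 else 0.0


def filter_keywords(payloads, n=4, min_support=0.6, variance_threshold=1.0):
    """Keep frequent items whose position variance is below the threshold."""
    return {item for item in frequent_items(payloads, n, min_support)
            if position_variance(item, payloads) < variance_threshold}


# Hypothetical flows: "HELO" sits at a fixed offset, "data" drifts with payload length.
payloads = [b"HELO a data", b"HELO bbbbbb data", b"HELO ccc data"]
keywords = filter_keywords(payloads, n=4, min_support=1.0, variance_threshold=1.0)
```

In this toy input the n-grams covering `data` are frequent but appear at offsets that vary widely, so the variance filter discards them and keeps only the n-grams anchored at the start of each flow.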
In the sixth step, the network traffic is partitioned into communities using the Newman fast algorithm based on node application-layer feature similarity. The algorithm is shown in Fig. 2 and proceeds as follows:
Step 6.1: Input the network topology graph $G = \{V, E\}$ and the protocol keyword set of each node, $\{K_1, K_2, \ldots, K_N\}$.
Step 6.2: Initialize the algorithm: treat each node as its own community, $G_{opt} \leftarrow \{G_1, G_2, \ldots, G_N\}$, and set $Q_{opt} \leftarrow 0$.
Step 6.3: Compute the similarity between every pair of communities in $G$.
Step 6.4: Merge the two communities with the greatest similarity.
Step 6.5: Compute the modularity $Q$ of $G$.
Step 6.6: If $Q > Q_{opt}$, set $G_{opt} \leftarrow G$ and $Q_{opt} \leftarrow Q$.
Step 6.7: If $|G| > 1$, return to step 6.3.
Step 6.8: Output the optimal community partition $G_{opt}$ of the network and the modularity $Q_{opt}$ of that partition.
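Steps 6.1 to 6.8 can be sketched in Python as below. Two parts are assumptions rather than the patent's definitions: the similarity between two communities is taken as the mean Jaccard similarity over the edges that join them (the patent's own formula is not shown in the text), and the modularity is the standard Newman modularity for an unweighted, undirected graph. All function names and the example graph are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two keyword sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def community_similarity(ci, cj, edges, keywords):
    """Assumed definition: mean Jaccard similarity over edges joining ci and cj."""
    scores = [jaccard(keywords[u], keywords[v]) for u, v in edges
              if (u in ci and v in cj) or (u in cj and v in ci)]
    return sum(scores) / len(scores) if scores else 0.0


def modularity(communities, edges):
    """Standard Newman modularity Q of a partition of an undirected graph."""
    m = len(edges)
    if m == 0:
        return 0.0
    q = 0.0
    for community in communities:
        internal = sum(1 for u, v in edges if u in community and v in community)
        degree = sum((u in community) + (v in community) for u, v in edges)
        q += internal / m - (degree / (2 * m)) ** 2
    return q


def detect_communities(nodes, edges, keywords):
    # Step 6.2: every node starts as its own community; Q_opt starts at 0.
    communities = [frozenset([v]) for v in nodes]
    best, best_q = list(communities), 0.0
    while len(communities) > 1:                      # step 6.7
        # Steps 6.3 and 6.4: find and merge the most similar pair.
        ci, cj = max(combinations(communities, 2),
                     key=lambda pair: community_similarity(pair[0], pair[1], edges, keywords))
        communities = [c for c in communities if c not in (ci, cj)] + [ci | cj]
        q = modularity(communities, edges)           # step 6.5
        if q > best_q:                               # step 6.6
            best, best_q = list(communities), q
    return best, best_q                              # step 6.8


# Hypothetical network: two triangles of like-minded nodes joined by one bridge edge.
nodes = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
web, mail = {"GET", "HOST", "HTTP"}, {"HELO", "MAIL", "RCPT"}
keywords = {1: web, 2: web, 3: web, 4: mail, 5: mail, 6: mail}
partition, q_opt = detect_communities(nodes, edges, keywords)
```

This version recomputes all pairwise similarities on every pass, which is simple but quadratic in the number of communities; it is meant to show the control flow of Fig. 2, not an efficient implementation.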
The procedure for estimating the coupling coefficients of the CHMM in the seventh step is shown in Fig. 3 and proceeds as follows:
Step 7.1: Initialize the forward-backward algorithm.
Step 7.2: Perform the iterative recursion.
Step 7.3: Update the coupling coefficients.
Step 7.4: Check whether the termination condition of the iteration is met. If so, stop; otherwise return to step 7.2.
The equations for steps 7.1 to 7.3 are not reproduced in this text.
The iteration terminates when the number of iterations exceeds a given maximum $N_{max}$, or when the increment $|\Delta\theta_{c'c}|$ of $\theta_{c'c}$ falls below a given threshold $\tau_o$.
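The control flow of steps 7.2 to 7.4 and the termination condition can be sketched in Python as follows. Because the forward-backward recursion and the update formula are not shown in the text, the sketch takes the update as a caller-supplied function; `estimate_coupling`, `toy_update`, and the use of the largest change over all coefficient pairs as the convergence measure are assumptions made for illustration only.

```python
def estimate_coupling(theta, update, max_iterations=100, tolerance=1e-6):
    """Iterate theta <- update(theta) until convergence or the iteration limit.

    theta maps a pair (c_src, c_dst) to its coupling coefficient.
    update stands in for steps 7.2 and 7.3 and returns a new mapping.
    Performs at most max_iterations updates (the role of N_max) and stops
    early when the largest coefficient change is below tolerance (the role
    of the threshold on the increment of theta).
    """
    iterations = 0
    while iterations < max_iterations:
        new_theta = update(theta)
        largest_change = max(abs(new_theta[k] - theta[k]) for k in theta)
        theta = new_theta
        iterations += 1
        if largest_change < tolerance:
            break
    return theta, iterations


# Toy stand-in for the real update: move halfway toward fixed target values.
target = {(0, 0): 0.7, (1, 0): 0.3}


def toy_update(theta):
    return {k: 0.5 * (theta[k] + target[k]) for k in theta}


theta, iterations = estimate_coupling({(0, 0): 0.5, (1, 0): 0.5}, toy_update)
```

In a real estimator the `update` callable would run the forward-backward pass over the observed connection sequences and re-estimate each coefficient from the resulting statistics.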
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Those skilled in the art may make various modifications and variations to the invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.
Claims (3)
1. A reverse-processing method for complex network applications based on a hidden Markov model, characterized in that it mainly comprises the following steps:
First step: collect and classify network traffic, identifying known traffic and separating out unknown traffic;
Second step: perform cluster analysis on the unknown traffic and assign a class label to the traffic in each cluster;
Third step: perform frequent-item analysis on each class of identified known traffic and unknown traffic, and extract the frequent itemsets in the traffic;
Fourth step: compute the position variance of each frequent item within the traffic and filter protocol keywords according to the position variance;
Fifth step: perform spatial connection analysis and temporal sequence analysis on the network traffic to obtain the topology connection graph and the connection sequences of the traffic;
Sixth step: partition the network traffic into communities;
Seventh step: estimate the coupling-coefficient parameters based on a multi-chain coupled hidden Markov model.
2. The reverse-processing method for complex network applications based on a hidden Markov model according to claim 1, characterized in that the connection sequence is the message sequence of each connection, and each message in a connection sequence is represented by its message type.
3. The reverse-processing method for complex network applications based on a hidden Markov model according to claim 1, characterized in that the sixth step partitions the network traffic into communities using the Newman fast algorithm based on node application-layer feature similarity.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611094418.9A CN106789961A (en) | 2016-12-01 | 2016-12-01 | Reverse-processing method for complex network applications based on a hidden Markov model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611094418.9A CN106789961A (en) | 2016-12-01 | 2016-12-01 | Reverse-processing method for complex network applications based on a hidden Markov model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106789961A (en) | 2017-05-31 |
Family
ID=58883406
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611094418.9A Pending CN106789961A (en) | 2016-12-01 | 2016-12-01 | Reverse-processing method for complex network applications based on a hidden Markov model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106789961A (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1111872A2 (en) * | 1999-12-21 | 2001-06-27 | Nortel Networks Limited | Utilizing internet protocol mobility messages and authentication, authorization and accounting messages in a communication system |
CN101707532A (en) * | 2009-10-30 | 2010-05-12 | 中山大学 | Automatic analysis method for unknown application layer protocol |
Non-Patent Citations (5)
Title |
---|
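Leandros A. Maglaras et al.: "Social Clustering of Vehicles Based on Semi-Markov Process", IEEE Transactions on Vehicular Technology * |
An Na (安娜) et al.: "A text clustering method based on an improved Newman fast algorithm", Science Technology and Engineering (《科学技术与工程》) * |
Zhu Shuyong (朱树永) et al.: "A protocol identification technique based on a Markov model", Modern Electronics Technique (《现代电子技术》) * |
Luo Jianzhen (罗建桢) et al.: "A method for determining protocol keyword length based on maximum likelihood probability", Journal on Communications (《通信学报》) * |
Jia Zongwei (贾宗维) et al.: "A fast agglomerative clustering algorithm for discovering community structure", Natural Science Journal of Xiangtan University (《湘潭大学自然科学学报》) * |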
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110059770A (en) * | 2019-04-30 | 2019-07-26 | 苏州大学 | Adaptive task distribution method, device and associated component based on position prediction |
CN110378103A (en) * | 2019-07-22 | 2019-10-25 | 电子科技大学 | Micro-isolation protection method and system based on OpenFlow protocol |
CN110378103B (en) * | 2019-07-22 | 2022-11-25 | 电子科技大学 | Micro-isolation protection method and system based on OpenFlow protocol |
CN115348158A (en) * | 2022-07-05 | 2022-11-15 | 南京银行股份有限公司 | Transaction full link analysis method and system based on banking non-standardized transaction message |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20170531 |