CN108923975A - A traffic behavior analysis method based on a distributed network
- Publication number
- CN108923975A, CN201810728186.0A
- Authority
- CN
- China
- Prior art keywords
- network
- state
- node
- observation
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/145—Network analysis or design involving simulating, designing, planning or modelling of a network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0876—Network utilisation, e.g. volume of load or congestion level
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Environmental & Geological Engineering (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The present invention provides a traffic behavior analysis method based on a distributed network. The method comprises: deploying a network traffic collection scheme; collecting historical traffic data; training a model; obtaining a traffic behavior model; collecting real-time traffic data; and estimating the global traffic behavior of the network. The invention treats the entire distributed network as a whole. By collecting traffic information at network nodes and exploiting the spatio-temporal context of node traffic behavior, it analyzes the intrinsic behavior states of network traffic, monitors the global traffic behavior of the network, and can assist network management tasks such as resource scheduling and anomaly detection.
Description
Technical field
The present invention relates to the field of network management applications, and more particularly to a traffic behavior analysis method based on a distributed network.
Background art
With the rapid development of network information technology, the unprecedented growth in network scale, and the widespread use of network applications, networks have become deeply integrated into politics, the economy, culture, and other fields. The diversity of networks brings great convenience to people's work and life, but their complexity makes network management and maintenance harder and confronts network administrators with many difficulties. The advance of IPv6 is driving a transition from IPv4 to IPv6, so IPv4 and IPv6 stacks often run in parallel in the same network, which makes network faults harder to investigate. Various networks, including wireless local area networks, wireless metropolitan area networks, and public mobile communication networks, now connect to the Internet; this enlarges the network, and the resulting heterogeneity complicates operation and maintenance. The emergence of cloud computing, the rise of social networks, and the development of multimedia technology make application traffic complex and variable; in severe cases it consumes the bandwidth of regular services, which makes it challenging to allocate network bandwidth optimally. In addition, a steady stream of network security problems disrupts the normal operation of carrier and enterprise networks and often causes economic losses, so network security management has become an important task for network administrators.
To solve these problems in complex network environments, strengthen network management, and build stable and secure networks, academia and industry have proposed many methods for analyzing network behavior. They include single-point-oriented and multi-point-oriented traffic analysis methods. Among single-point-oriented methods, the paper "Zhao D, Traore I, Sayed B, et al. Botnet detection based on traffic behavior analysis and flow intervals [J]. Computers & Security, 2013, 39(4): 2-16." proposes a network traffic analysis method that mainly analyzes flow features in a communication network, including source and destination IP addresses, source and destination ports, protocol, and packet length, in order to detect bot (zombie) hosts. The method is deployed at key network nodes, analyzes the traffic features of each node, and detects bot hosts with a decision tree algorithm. A published anomaly detection method based on network traffic analysis performs in-depth analysis of IP packets to propose a fairly complete initial feature set for network traffic, dynamically selects the feature subset used for anomaly detection according to the type of network anomaly, and finally uses a Bayes classifier to predict the class of unknown samples from that feature subset. NSFOCUS has released a network traffic analysis product (NSFOCUS Network Traffic Analysis System, http://www.nsfocus.com.cn/products/details_22_2.html). It collects traffic information from routing devices through the Simple Network Management Protocol (SNMP), the NetFlow protocol, and similar mechanisms, and analyzes it along multiple dimensions, including traffic conditions, traffic composition, and traffic trends in the network.
Among multi-point-oriented methods, the paper "Jiang D, Xu Z, Zhang P, et al. A transform domain-based anomaly detection approach to network-wide traffic [J]. Journal of Network & Computer Applications, 2014, 40(C): 292-306." proposes a transform-domain anomaly detection method for studying network-wide traffic behavior. It uses the traffic of origin-destination (OD) pairs that share the same destination node, treats network traffic as a time series, obtains the time-frequency signal of the series with the S-transform, and detects network-wide abnormal traffic by comparing how normal and abnormal traffic differ in their high-frequency components. The paper "Li Y, Luo X, Qian Y, et al. Network-Wide Traffic Anomaly Detection and Localization Based on Robust Multivariate Probabilistic Calibration Model [J]. Mathematical Problems in Engineering, 2015, 2015(1): 1-26." proposes a method for detecting and localizing network-wide abnormal traffic. It builds a traffic matrix from OD-pair measurements such as packet count, byte count, and flow count, constructs a model of normal traffic behavior with a latent-variable probabilistic method based on the multivariate t distribution, and detects and localizes anomalies by evaluating the Mahalanobis distance of each sample. A patent by Li Zhipeng discloses a network traffic analysis system and method. The system first collects the raw traffic information of each node through a traffic collection module, then extracts the application-layer traffic information from it, and finally compares statistics of the application-layer traffic to determine whether the application system carries abnormal traffic, achieving application-layer analysis based on network traffic. A patent by Guo Zulong discloses a distributed network traffic analysis system and method. The system first collects traffic information through a traffic collection module, then extracts the network-layer, transport-layer, and application-layer information from the raw traffic, and processes that information to analyze overall traffic, IP-to-IP traffic data, IP-layer network data, and application-layer protocol information. Colasoft has released a full-traffic security analysis product (Colasoft Network Full-Traffic Security Analysis System, https://app.huaweicloud.com/product/00301-55020-0--0). It collects and stores the full traffic on network links and analyzes the complete data set, giving it a keen ability to sense abnormal network behavior.
These methods solve various network problems to some extent, but they have limitations:
(1) A single-point-oriented method sees only the traffic that passes through its node, so it can analyze only local traffic information and can hardly yield an understanding of overall traffic behavior. Network management cannot rely on local information alone or address only local problems; globally optimal schemes and strategies must be formulated from a global perspective.
(2) Multi-point-oriented methods usually correlate traffic data from multiple nodes, but they do not exploit network topology information or the temporal and spatial context relationships created by network interconnection. They therefore struggle to characterize the intrinsic traffic behavior states between network nodes and the overall traffic behavior state of the whole network. For a network with the characteristics of a complex system, superimposing local views rarely reflects the global behavior of the whole network.
Summary of the invention
To overcome the limitations of the prior art, the present invention proposes a traffic behavior analysis method based on a distributed network. The method treats the network as a whole and uses the spatio-temporal context of node traffic behavior to analyze global traffic behavior. It can reveal the intrinsic traffic behavior states between network nodes and the overall traffic behavior state of the whole network, giving network administrators a global understanding of the networks they manage.
To achieve this objective, the following technical solution is adopted:
A traffic behavior analysis method based on a distributed network, capable of analyzing the global traffic behavior of a network, specifically comprises:
Model training stage: collect historical network traffic data as training data and train a network traffic behavior model;
Learning stage: input the collected real-time traffic data into the trained network traffic behavior model, and obtain the global traffic behavior of the network by iterative calculation under the maximum a posteriori estimation criterion.
Preferably, before traffic data is collected, network probes are deployed in the managed network to collect traffic data. Specifically, probes are deployed at network nodes, collect traffic data of different granularities and protocol layers at those nodes, and transmit it to a traffic analysis center for analysis.
Preferably, the model training stage specifically comprises determining the structure of the network traffic behavior model and estimating the model parameters.
Determining the structure of the network traffic behavior model:
Distributed network traffic behavior information is divided into two layers: a hidden state layer and an observation data layer. The observation data layer consists of the node traffic data measured by the network probes. The hidden state layer consists of the behavior patterns of the network nodes; it represents the intrinsic driving factors in the network that directly drive the external manifestation of node traffic. Hidden states and observations are represented by random variables, so the hidden state layer and the observation data layer form two random fields: the hidden state field and the observation field.
Mathematical notation: In a network with N nodes, let $\mathcal{N} = \{1, 2, \dots, N\}$ denote the set of network nodes and let $x_{t,n}$ denote the n-th node in the t-th time slot, with $x_{t,n} \in \mathcal{X}$, where $\mathcal{X}$ denotes the set of all spatio-temporal node positions, $t \in [1, T]$, $n \in \mathcal{N}$, and T is the number of time slots. Let $S_{t,n}$ denote the hidden state variable of node $x_{t,n}$ and let $s_{t,n} \in \mathcal{M}$ denote an instance of the random variable $S_{t,n}$, where $\mathcal{M}$ is the hidden state set; then $S = \{S_{t,n} : x_{t,n} \in \mathcal{X}\}$ is the family of hidden state random variables defined on $\mathcal{X}$. S therefore denotes the hidden state field on [1, T], and $s \in \mathcal{S}$ denotes one configuration of S, where $\mathcal{S}$ is the set of all possible configurations of the hidden state field. Similarly, let $O_{t,n}$ denote the observation variable of node $x_{t,n}$ and let $o_{t,n} \in \mathcal{K}$ denote an instance of the random variable $O_{t,n}$, where $\mathcal{K}$ is the set of observation values; then $O = \{O_{t,n} : x_{t,n} \in \mathcal{X}\}$ is the family of observation random variables defined on $\mathcal{X}$. O therefore denotes the observation field on [1, T], and $o \in \mathcal{O}$ denotes one configuration of O, where $\mathcal{O}$ is the set of all possible configurations of the observation field.
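The HMRF model is used to characterize the spatio-temporal evolution relationship between the hidden state field and the observation field.
For the hidden state field, one assumption is introduced: spatially, the state of a node depends only on the states of its one-hop neighbor nodes; temporally, it depends only on its own state at the previous time slot. Following a statistical learning approach, the probability of the hidden state field is obtained as:
$$\Pr[s \mid \lambda] \approx \prod_{(t,n)} \Pr\big[S_{t,n} = s_{t,n} \mid s_{\mathcal{X} \setminus x_{t,n}}, \lambda\big] = \prod_{(t,n)} \Pr\big[S_{t,n} = s_{t,n} \mid s_{\mathcal{N}^{\mathrm{s}}_{t,n}}, s_{\mathcal{N}^{\mathrm{t}}_{t,n}}, \lambda\big] \qquad (1)$$
where $\mathcal{X} \setminus x_{t,n}$ denotes the set of spatio-temporal node positions excluding node $x_{t,n}$, $s_{\mathcal{N}^{\mathrm{s}}_{t,n}}$ and $s_{\mathcal{N}^{\mathrm{t}}_{t,n}}$ denote the spatial neighbor states and the temporal neighbor state of node $x_{t,n}$, respectively, and λ denotes the hidden state field parameters;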
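The local probability in formula (1) is obtained as:
$$\Pr\big[S_{t,n} = m \mid s_{\mathcal{N}^{\mathrm{s}}_{t,n}}, s_{\mathcal{N}^{\mathrm{t}}_{t,n}}, \lambda\big] = \Pr\big[S_{t,n} = m \mid s_{t-1,n}, A\big] \cdot \Pr\big[S_{t,n} = m \mid s_{\mathcal{N}^{\mathrm{s}}_{t,n}}, \alpha\big]$$
where m denotes a node state. The temporal transition probability $\Pr[S_{t,n} = m \mid s_{t-1,n}, A]$ is computed from the temporal hidden state transition probability matrix A, which gives the probabilities of moving from one hidden state at time t to another at time t+1; that is, the temporal hidden state transitions form a first-order Markov chain. For M hidden states, the matrix A is:
$$A = \begin{bmatrix} P_{11} & P_{12} & \cdots & P_{1M} \\ P_{21} & P_{22} & \cdots & P_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ P_{M1} & P_{M2} & \cdots & P_{MM} \end{bmatrix}$$
where the subscripts i and j of $P_{ij}$ denote the hidden states of a node at times t and t+1, respectively. The spatial transition probability is obtained as:
$$\Pr\big[S_{t,n} = m \mid s_{\mathcal{N}^{\mathrm{s}}_{t,n}}, \alpha\big] = \frac{\exp\big(-U_{t,n}(m)\big)}{\sum_{m'} \exp\big(-U_{t,n}(m')\big)}$$
where $U_{t,n}(m)$ denotes the edge energy function [defining formula missing in the source], $\mathcal{N}^{\mathrm{s}}_{t,n}$ denotes the spatial neighbor nodes of node $x_{t,n}$, and $|\mathcal{N}^{\mathrm{s}}_{t,n}|$ denotes the number of spatial neighbor nodes. The potential function is defined as $V_{t,n}(m) = num \cdot \alpha$, where the parameter α characterizes the strength of the mutual influence between the current node and its spatial neighbors, and num is the number of spatial neighbor nodes whose state differs from the state of the current node;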
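As a rough illustration of how such a local prior could be evaluated, the Python sketch below multiplies the temporal transition probability taken from A by a spatial probability in Gibbs form. Several details are assumptions rather than part of the described method: the energy is taken to be the potential num·α itself, the two factors are combined by a plain product (so the values are not normalized over m), and the names `spatial_prob` and `local_prior` are hypothetical.

```python
import math


def spatial_prob(m, neighbor_states, alpha, num_states):
    """Gibbs-form probability of state m given the one-hop neighbor states."""
    def energy(state):
        # num = number of spatial neighbors whose state differs from `state`
        num = sum(1 for s in neighbor_states if s != state)
        # Assumption: the potential V = num * alpha is used directly as the energy.
        return num * alpha

    weights = [math.exp(-energy(k)) for k in range(num_states)]
    return weights[m] / sum(weights)


def local_prior(m, prev_state, neighbor_states, A, alpha):
    """Temporal transition probability times spatial probability (unnormalized)."""
    return A[prev_state][m] * spatial_prob(m, neighbor_states, alpha, len(A))


# Two hidden states; rows of A are "from" states, columns are "to" states.
A = [[0.9, 0.1],
     [0.2, 0.8]]
neighbors = [0, 0, 1]  # states of the node's three one-hop neighbors

p_stay = local_prior(0, prev_state=0, neighbor_states=neighbors, A=A, alpha=1.0)
p_switch = local_prior(1, prev_state=0, neighbor_states=neighbors, A=A, alpha=1.0)
spatial_total = sum(spatial_prob(k, neighbors, 1.0, 2) for k in range(2))
```

A larger `alpha` makes disagreement with neighbors more costly, so the state shared by most neighbors receives a larger share of the spatial probability.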
For the observation field, node observations are obtained by the network probes, so the observation field of the network, $o = \{o_{t,n}\}$, is known data. Assuming that the observation of a node depends only on the state of that node, the output probability of the observation field driven by the hidden states is obtained as:
$$\Pr[o \mid s, \theta] = \prod_{(t,n)} \Pr\big[O_{t,n} = o_{t,n} \mid S_{t,n} = s_{t,n}, \theta_{s_{t,n}}\big] \qquad (2)$$
where the product index (t, n) ranges over all time slots t and all nodes n, and $\Pr[O_{t,n} = k \mid S_{t,n} = m, \theta_m]$ denotes the probability that node n outputs observation k at time t given that its state is m. For ease of computation, the observations $O_{t,n}$ are discretized and frequencies are used in place of probabilities; that is, the conditional probability is approximated by the frequency distribution of the observations under state m. The parameter $\theta_m$ denotes the distribution parameters of the observations under a specific state. It is represented here by the output probability matrix B, called the observation field parameter. For M hidden states and K discretized observation values, the matrix B is:
$$B = \begin{bmatrix} P_{11} & P_{12} & \cdots & P_{1K} \\ P_{21} & P_{22} & \cdots & P_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ P_{M1} & P_{M2} & \cdots & P_{MK} \end{bmatrix}$$
where $P_{mk}$ denotes the probability that a node in state m outputs observation k;
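The discretization and the output probability can be sketched in Python as follows. Equal-width binning, the bin range, and the function names `discretize` and `log_likelihood` are illustrative assumptions; the text only requires that observations be discretized and that B hold per-state output frequencies. Logarithms are used so that the product over all (t, n) does not underflow.

```python
import math


def discretize(value, lo, hi, num_bins):
    """Map a raw observation to an integer bin index in [0, num_bins)."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    ratio = (value - lo) / (hi - lo)
    # Clamp so that out-of-range values fall into the first or last bin.
    return min(num_bins - 1, max(0, int(ratio * num_bins)))


def log_likelihood(states, obs, B, eps=1e-12):
    """log Pr[o | s, B]: sum over all (t, n) of log B[state][observation]."""
    total = 0.0
    for state_row, obs_row in zip(states, obs):
        for m, k in zip(state_row, obs_row):
            total += math.log(max(B[m][k], eps))  # eps guards against log(0)
    return total


# Output probability matrix for 2 hidden states and 3 observation bins.
B = [[0.7, 0.2, 0.1],
     [0.1, 0.3, 0.6]]

# Raw observations (for example packets per second) for T = 2 slots, N = 2 nodes.
raw = [[5.0, 95.0],
       [10.0, 60.0]]
obs = [[discretize(v, 0.0, 100.0, 3) for v in row] for row in raw]
states = [[0, 1],
          [0, 1]]
ll = log_likelihood(states, obs, B)
```

With a fixed observation field, a hidden state configuration that explains the observations better yields a larger `ll`.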
The structure of the network traffic behavior model is thus determined. The model is characterized by the HMRF model, so the model parameters are Ω = {A, α, B}.
Estimating the model parameters:
After the historical traffic data has been collected and the structure of the traffic behavior model determined, the historical traffic data is used to train the model parameters Ω = {A, α, B}. To ease practical engineering use, frequencies approximate probabilities during calculation, so the observations $O_{t,n}$ must be discretized beforehand.
The training process takes the historical traffic data o (the node observations) as input and outputs the model parameters Ω = {A, α, B}. The steps for estimating the model parameters are as follows:
(3-1) Initialize the iteration counter i, the iteration stop condition Iter, and the initial hidden state field $s^{(1)}$.
The iteration counter i is initialized to 1. The stop condition is set as a maximum number of iterations Iter, preferably 5 to 8 based on experience. Alternatively, the stop condition can be a threshold on the change in the parameters between two successive iterations: iteration stops when the change falls below the given threshold. The initial hidden state field $s^{(1)}$ is initialized by applying a clustering algorithm to the historical traffic observations. The number of clusters is chosen according to actual network monitoring needs and corresponds to the number of node behavior states, so it sets the granularity at which traffic behavior is characterized: the more behavior states, the finer the granularity.
(3-2) Update the model parameters according to the current configuration of the hidden state field. The temporal hidden state transition probability matrix A is updated from the frequencies of state transitions over time: if $A_{ij}$ denotes the number of times a node is in state i at time t and in state j at time t+1, the estimate of the state transition probability $P_{ij}$ in A is:
$$\hat{P}_{ij} = \frac{A_{ij}}{\sum_{j'} A_{ij'}}$$
The value of α is determined empirically, preferably between 0.5 and 10. A larger α means stronger interaction between nodes, so the states of neighbor nodes have more influence on the state of the current node, and vice versa. The output probability matrix B is updated from the frequency distribution of the observations under each state: if $B_{mk}$ denotes the number of samples with state m and observation k, the estimate of the output probability $P_{mk}$ in B is:
$$\hat{P}_{mk} = \frac{B_{mk}}{\sum_{k'} B_{mk'}}$$
The iteration counter is then incremented: i = i + 1;
(3-3) Check whether the stop condition is met, that is, whether i > Iter;
3-3-1) If not, update the hidden state field $s^{(i)}$ using the behavior state estimation process. The input is the historical traffic data o and the current model parameters; the output is the updated hidden state field $s^{(i)}$. The behavior state estimation process uses the current hidden state field $s^{(i-1)}$ as its initial hidden state field.
Return to step (3-2);
3-3-2) If so, output the final model parameters Ω = {A, α, B};
Following these steps, the model parameters Ω = {A, α, B} obtained by training on the historical traffic data serve as the traffic behavior model.
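The frequency-based updates of A and B in step (3-2) can be sketched in Python as below. This is a minimal sketch, assuming the hidden state field and the discretized observations are stored as T×N lists of integers; the function names and the uniform fallback for states that never occur are assumptions, not part of the described method.

```python
def _normalize(row):
    """Turn a row of counts into a row of relative frequencies."""
    total = sum(row)
    if total == 0:
        # Assumption: a state that never occurs gets a uniform row.
        return [1.0 / len(row)] * len(row)
    return [count / total for count in row]


def estimate_A(states, num_states):
    """Estimate P_ij as (count of i -> j transitions) / (count of transitions from i)."""
    counts = [[0] * num_states for _ in range(num_states)]
    for t in range(len(states) - 1):
        # Pair each node's state at slot t with its state at slot t + 1.
        for i, j in zip(states[t], states[t + 1]):
            counts[i][j] += 1
    return [_normalize(row) for row in counts]


def estimate_B(states, obs, num_states, num_obs):
    """Estimate P_mk as (count of state m with observation k) / (count of state m)."""
    counts = [[0] * num_obs for _ in range(num_states)]
    for state_row, obs_row in zip(states, obs):
        for m, k in zip(state_row, obs_row):
            counts[m][k] += 1
    return [_normalize(row) for row in counts]


# Hidden state field and discretized observations for T = 3 slots, N = 2 nodes.
states = [[0, 0],
          [0, 1],
          [1, 1]]
obs = [[0, 0],
       [0, 1],
       [1, 1]]

A_hat = estimate_A(states, num_states=2)
B_hat = estimate_B(states, obs, num_states=2, num_obs=2)
```

In a full training loop these two updates would alternate with a re-estimation of the hidden state field until the stop condition of step (3-3) is met.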
Preferably, the learning stage proceeds as follows:
From the traffic behavior model and the real-time traffic data collected at the network nodes, the intrinsic traffic behavior states of the nodes can be estimated.
This is equivalent to estimating a configuration $\hat{s}$ of the hidden state field given the model parameters Ω and the observation field o. Under the maximum a posteriori (MAP) estimation criterion, finding the optimal hidden state field estimate $\hat{s}$ is equivalent to solving:
$$\hat{s} = \arg\max_{s} \Pr[s \mid o, \Omega]$$
By Bayes' theorem, $\Pr[s \mid o, \Omega] = \Pr[o \mid s, \Omega] \cdot \Pr[s \mid \Omega] / \Pr[o]$. Since Pr[o] is a constant, $\Pr[s \mid o, \Omega] \propto \Pr[o \mid s, \Omega] \cdot \Pr[s \mid \Omega]$, where the prior probability Pr[s | Ω] and the likelihood Pr[o | s, Ω] are computed with formulas (1) and (2), respectively. That is:
$$\hat{s} = \arg\max_{s} \Pr[o \mid s, \Omega] \cdot \Pr[s \mid \Omega]$$
The globally optimal hidden state field of the network is obtained by iterative calculation. The behavior state estimation process takes the real-time traffic data o and the model parameters Ω = {A, α, B} as input and outputs the hidden state field estimate $\hat{s}$ of the network. Its steps are as follows:
(4-1) Initialize the iteration counter i, the initial hidden state field $s^{(0)}$, and the iteration stop condition Iter;
The iteration counter i is initialized to 1. The state field is initialized from prior knowledge of the relationship between the state field and the observation field, or by applying a clustering algorithm to the observation field. The stop condition is set as a maximum number of iterations Iter, preferably 3 to 5 based on experience;
(4-2) For each node in the network, traverse all possible state values and, under the MAP estimation criterion, select the state value with the highest probability as the node's estimate for the current iteration, which is equivalent to solving:
$$\hat{s}_{t,n} = \arg\max_{m} \Pr\big[O_{t,n} = o_{t,n} \mid S_{t,n} = m, \Omega\big] \cdot \Pr\big[S_{t,n} = m \mid \text{current spatial and temporal neighbor states}, \Omega\big]$$
The iteration counter is then incremented: i = i + 1;
(4-3) Check whether the stop condition is met, that is, whether i > Iter;
4-3-1) If not, return to step (4-2) and update the state of every node again;
4-3-2) If so, output the final state field estimate $\hat{s}$. This yields the global traffic behavior state of the network, from which network administrators can monitor the behavior of the whole network.
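Steps (4-1) to (4-3) resemble the iterated conditional modes (ICM) scheme often used for MAP estimation in Markov random fields. The sketch below is one possible Python reading of that loop, not a definitive implementation. It assumes a Gibbs-form spatial term with energy num·α, a fixed topology given as an adjacency list, in-place updates within a sweep, and no temporal term in the first time slot; `map_estimate` and its parameters are hypothetical names.

```python
import math


def map_estimate(obs, neighbors, A, B, alpha, init_states, num_iters=3, eps=1e-12):
    """Repeatedly set each node's state to the value with the highest local posterior."""
    num_slots, num_nodes, num_states = len(obs), len(obs[0]), len(A)
    s = [row[:] for row in init_states]  # work on a copy of the initial field
    for _ in range(num_iters):
        for t in range(num_slots):
            for n in range(num_nodes):
                nbr_states = [s[t][j] for j in neighbors[n]]
                # Spatial Gibbs weights; energy = alpha * (number of differing neighbors).
                weights = [math.exp(-alpha * sum(1 for x in nbr_states if x != m))
                           for m in range(num_states)]
                z = sum(weights)
                best_m, best_score = s[t][n], -math.inf
                for m in range(num_states):
                    score = math.log(weights[m] / z)                    # spatial prior
                    score += math.log(max(B[m][obs[t][n]], eps))        # likelihood
                    if t > 0:
                        score += math.log(max(A[s[t - 1][n]][m], eps))  # temporal prior
                    if score > best_score:
                        best_m, best_score = m, score
                s[t][n] = best_m
    return s


# Three nodes on a line (0 - 1 - 2), two hidden states, two observation bins.
neighbors = [[1], [0, 2], [1]]
A = [[0.9, 0.1], [0.1, 0.9]]
B = [[0.8, 0.2], [0.2, 0.8]]
obs = [[0, 0, 0],
       [0, 1, 0],   # the middle node shows one isolated deviating observation
       [0, 0, 0]]

# Initialize the state field directly from the observations.
smoothed = map_estimate(obs, neighbors, A, B, alpha=1.0, init_states=obs, num_iters=3)
```

In this toy run the isolated deviation at the middle node is outweighed by its neighbors and its own previous state, so the estimated field assigns it the same state as its surroundings.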
Compared with the prior art, the technical solution of the present invention has the following beneficial effects. The invention discloses a traffic behavior analysis method based on a distributed network. Network probes deployed in the managed network collect node traffic data, and a network traffic behavior model is built from the spatio-temporal context of node traffic behavior. The method can estimate the intrinsic behavior states of network traffic, gives network administrators a global understanding of the networks they manage, and guides network management tasks such as resource scheduling and anomaly detection.
Brief description of the drawings
Fig. 1 is a schematic diagram of the overall flow of the method;
Fig. 2 is a schematic diagram of the deployment framework of the method;
Fig. 3 is a schematic diagram of node traffic behavior information in the method;
Fig. 4 is a flowchart of model parameter estimation in the method;
Fig. 5 is a flowchart of behavior state estimation in the method;
Fig. 6 is a state diagram of the packet arrival patterns of network nodes at different times in the embodiment;
Fig. 7 shows the fitted probability density functions of the normalized observations corresponding to the packet arrival patterns in the embodiment.
Detailed description of the embodiments
The accompanying drawings are for illustration only and shall not be construed as limiting this patent. To better illustrate the embodiment, some components in the drawings may be omitted, enlarged, or reduced, and do not represent the size of an actual product.
Those skilled in the art will understand that certain well-known structures and their descriptions may be omitted from the drawings. The technical solution of the present invention is further described below with reference to the drawings and embodiments.
The present invention overcomes the limitations of the prior art by treating the network as a whole and exploiting the spatio-temporal contextual correlation of node traffic behavior. This correlation stems from the interconnection of the network itself and from the correlation of network traffic across consecutive time slots: adjacent network nodes interact, so their traffic behaviors are similar, and the traffic behavior of a node is similar from one time slot to the next. The method can reveal the intrinsic traffic behavior states between network nodes and the overall traffic behavior state of the whole network, gives network administrators a global understanding of the networks they manage, and guides network management tasks such as resource scheduling and anomaly detection.
Overall framework
The traffic behavior analysis method based on a distributed network belongs to network management applications and analyzes the global traffic behavior of a network. Its overall flow, shown in Fig. 1, comprises six steps: step S1, deploy the network traffic collection scheme; step S2, collect historical traffic data; step S3, train the model; step S4, obtain the traffic behavior model; step S5, collect real-time traffic data; step S6, estimate the global traffic behavior of the network.
Step S1 is implemented by deploying network probes at network nodes; the captured node traffic data is transmitted to the traffic analysis center for further analysis;
Step S2 means that the network probes collect traffic data at the network nodes as training data for the traffic behavior model;
Step S3 means that a model able to characterize the spatio-temporal behavior of network traffic is trained from the collected historical traffic data. The method models the dynamics of distributed network traffic with the Hidden Markov Random Field (HMRF) mathematical model;
Step S4 means that the traffic analysis center obtains the traffic behavior model;
Step S5 means that, in practical use, the network probes collect in real time the network traffic data to be analyzed;
Step S6 means that the traffic behavior model is used to estimate the global traffic behavior state of the network from the real-time traffic data. Network administrators can thereby monitor the behavior of the whole network and direct network management tasks such as resource scheduling and anomaly detection.
The invention is carried out as follows. Network probes deployed in the managed network collect historical traffic data, which is fed into the model training process as training data to obtain the corresponding traffic behavior model. In practical use, the collected real-time traffic data is input into the trained traffic behavior model, and the global traffic behavior of the network is obtained by iterative calculation under the maximum a posteriori (MAP) estimation criterion. This enables monitoring of whole-network behavior and assists further network management work.
Each step of the method is described in detail below with reference to Fig. 1.
Step S1, deploy the network traffic collection scheme
To analyze network traffic behavior, a network traffic collection scheme must first be deployed. As shown in Fig. 2, the method deploys network probes on the nodes of the managed network to collect network traffic and transmits the collected traffic data to the traffic analysis center for subsequent traffic behavior analysis. Deploying the collection scheme comprises the following sub-steps: step S1-1, deploy probes at network nodes; step S1-2, the probes collect network traffic data; step S1-3, the probes communicate with the traffic analysis center.
Step S1-1, deploy probes at network nodes. The scheme applies to different network scenarios, including the Internet built from traditional routers and switches, SDN-based networks, NFV-based networks, and hybrids of these. A deployed network probe is a functional entity. It can be a physical device, such as a dedicated server or a hardware probe, or a software function integrated into a network device, such as the NetFlow or sFlow function or an SNMP agent service on a router or switch, or a virtual probe implemented through NFV.
Step S1-2, the probes collect network traffic data. The method can collect traffic data of different granularities at network nodes, including packet-level records, flow-level records, and traffic statistics records. Different collection schemes are used for different granularities.
To capture complete packet-level records at a network node, a server is deployed at the network device, and complete packet information is captured through port mirroring, including the capture timestamp, IP addresses, ports, protocol, and packet length. The server acts as the network probe and is responsible for storing traffic data, preprocessing it, and communicating with the traffic analysis center. To collect flow-level traffic at a network node, the NetFlow or sFlow function is enabled on a network device that supports it, and IP flow information is collected on all ports of the device. The flow information carried in NetFlow or sFlow messages mainly includes packet size, traffic per second, and total traffic. The NetFlow or sFlow function is integrated into the network device as the network probe and exports traffic messages to the traffic analysis center. To collect traffic statistics at a network node, an SNMP-based collection method is used: an SNMP agent service is started on the network device as the node's probe, the device stores its traffic statistics in a specified format in a local Management Information Base (MIB), and the traffic analysis center requests the MIB data from the device to collect the node's traffic data.
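When traffic statistics are polled from a MIB, the analysis center typically sees cumulative counters rather than rates. The sketch below is an illustration and not part of the described method: it converts two successive counter readings into an average rate and tolerates one wrap-around of a fixed-width counter. The function name `counter_rate`, the 32-bit default, and the sample values are assumptions.

```python
def counter_rate(prev_value, curr_value, interval_s, counter_bits=32):
    """Average rate (units per second) between two readings of a cumulative counter."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    modulus = 1 << counter_bits
    # The modulo handles a single wrap-around between the two readings.
    delta = (curr_value - prev_value) % modulus
    return delta / interval_s


# Normal case: 5000 octets arrived during a 5-second polling interval.
rate_normal = counter_rate(1000, 6000, interval_s=5)

# Wrap-around case: the 32-bit counter rolled over between the two polls.
rate_wrapped = counter_rate(4294967000, 704, interval_s=10)
```

If a counter can wrap more than once per polling interval, a shorter interval or a wider counter is needed, because the modulo can recover only a single wrap.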
Step S1-3, the probes communicate with the traffic analysis center. For probes that capture complete packet information, communication is client-server, and each probe transmits its local traffic data to the traffic analysis center. In the NetFlow- or sFlow-based scheme, the probe actively sends the collected traffic data to the traffic analysis center, which acts as the collector. In the SNMP-based scheme, the traffic analysis center acts as the SNMP manager and actively requests traffic statistics from the probes, which act as SNMP agents.
With this scheme, the network traffic collection scheme is deployed, traffic information of different granularities and protocol layers can be collected at network nodes, and traffic can be processed according to actual network management requirements.
Step S2, collect historical traffic data
Using the collection scheme deployed in step S1, the traffic analysis center collects historical traffic data at the network nodes through the probes as training data for model training. Historical traffic data includes traffic data of different granularities and protocol layers, such as byte counts, packet arrival rates, IP addresses, and application protocols. It can also be traffic information that has been processed further, for example by applying a Fourier or wavelet transform to the basic time-domain traffic data or by computing the information entropy of traffic statistics.
In order to reduce the traffic from network probe to flow analysis center, frequency domain is such as calculated for some preliminary processing
Calculating task can be deployed on network probe by signal or comentropy, and only the processing result of flow is sent in flow analysis
The heart.
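As a minimal sketch of the kind of preliminary processing that could run on a probe, the following Python function computes the Shannon entropy of a traffic statistic over one collection window. The function name, the choice of destination ports as the statistic, and the sample values are illustrative assumptions and are not taken from the method itself.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical window of destination ports observed by one probe.
window = [80, 80, 443, 443, 53, 80, 443, 22]

# Only this small summary, not the raw packets, would be sent to the
# flow analysis center.
summary = {"packets": len(window), "dst_port_entropy": shannon_entropy(window)}
```

Sending only `summary` instead of the raw window is what reduces the probe-to-center traffic; the price is that the center can no longer recompute other statistics from the same window.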
In this way, the flow analysis center obtains historical traffic data of the network nodes, from which the network traffic behavior model can be trained.
Step S3: train the model
Step S3 comprises two sub-steps: step S3-1, determine the structure of the traffic behavior model; and step S3-2, estimate the model parameters.
In step S3-1, the modeling approach this method takes to network traffic behavior is introduced first. As shown in Fig. 3, for a node in the network (such as a switch or router), this method divides the traffic behavior information of the node into two parts: an observable part and an unobservable part. The observable part is the traffic data that network probes can measure directly, such as byte counts, packet arrival rates, IP addresses and application protocols; these measurements reflect the external manifestation of the node's traffic behavior and are hereinafter called "observations". The unobservable part is the set of internal factors that drive the external manifestation of node traffic, such as behavior patterns and internal mechanisms; these factors cannot be measured by network probes and can only be estimated from the observations of the node, and are hereinafter called "hidden states".
Extending this to a distributed network, as shown in Fig. 2, the modeling method divides the traffic behavior information of the distributed network into two layers: a hidden state layer and an observation data layer. The observation data layer consists of the network node traffic data measured by the network probes. The hidden state layer consists of the behavior patterns of the network nodes and represents the internal driving factors in the network that directly drive the external manifestation of node traffic. Hidden states and observations are represented here by random variables, so the hidden state layer and the observation data layer constitute two random fields, namely the hidden state field and the observation field (hereinafter, "state" is equivalent to "hidden state", and "state field" is equivalent to "hidden state field").
Second, in step S3-1, the mathematical notation used in this method is defined. In a network with $N$ nodes, $\mathcal{N}=\{1,\dots,N\}$ denotes the set of network nodes and $x_{t,n}$ denotes the $n$-th node in the $t$-th time slot, where $x_{t,n}\in\mathcal{X}$, $\mathcal{X}$ denotes the set of all spatio-temporal node positions with $|\mathcal{X}| = N \times T$, and $T$ is the number of time slots. $S_{t,n}$ denotes the hidden state variable of node $x_{t,n}$ and $s_{t,n}\in\mathcal{M}$ denotes a realization of the random variable $S_{t,n}$, where $\mathcal{M}$ is the set of hidden states; then $S=\{S_{t,n} : x_{t,n}\in\mathcal{X}\}$ is the family of hidden-state random variables defined on $\mathcal{X}$. $S$ can therefore be used to denote the hidden state field on $[1,T]$, and $s\in\mathcal{S}$ denotes a configuration of $S$, where $\mathcal{S}$ is the set of all possible configurations of the hidden state field.
Similarly, $O_{t,n}$ denotes the observation variable of node $x_{t,n}$ and $o_{t,n}\in\mathcal{K}$ denotes a realization of the random variable $O_{t,n}$, where $\mathcal{K}$ is the set of observation values; then $O=\{O_{t,n} : x_{t,n}\in\mathcal{X}\}$ is the family of observation random variables defined on $\mathcal{X}$. $O$ can therefore be used to denote the observation field on $[1,T]$, and $o\in\mathcal{O}$ denotes a configuration of $O$, where $\mathcal{O}$ is the set of all possible configurations of the observation field.
Finally, in step S3-1, the HMRF model is used to characterize the spatio-temporal evolution relationship between the hidden state field and the observation field.
For the hidden state field, this modeling method introduces an important assumption: spatially, the state of a node depends only on the states of its one-hop neighbor nodes; temporally, it depends only on its own state at the previous moment. Based on statistical learning, the probability of the state field can be obtained from the following formula:
$$\Pr[S=s \mid \lambda] \approx \prod_{(t,n)} \Pr\left[s_{t,n} \mid s_{\mathcal{X}\setminus x_{t,n}}, \lambda\right] = \prod_{(t,n)} \Pr\left[s_{t,n} \mid s_{\mathcal{N}^{S}_{t,n}}, s_{\mathcal{N}^{T}_{t,n}}, \lambda\right] \quad (1)$$
where $\mathcal{X}\setminus x_{t,n}$ denotes the set of spatio-temporal node positions of the network excluding node $x_{t,n}$, $s_{\mathcal{N}^{S}_{t,n}}$ and $s_{\mathcal{N}^{T}_{t,n}}$ denote the spatial-neighbor states and the temporal-neighbor states of node $x_{t,n}$ respectively, and $\lambda$ denotes the state field parameters.
The local probability in formula (1) is expressed in terms of a temporal transition probability and a spatial transition probability of node $x_{t,n}$ taking state $m$, where $m$ denotes a node state. The temporal transition probability is computed from the temporal hidden-state transition probability matrix $A$. $A$ is the transition probability matrix of the hidden state from time $t$ to time $t+1$, i.e., the temporal hidden-state transitions form a first-order Markov chain. With $M$ the number of hidden states, $A$ is given by:
$$A = \begin{bmatrix} P_{11} & \cdots & P_{1M} \\ \vdots & \ddots & \vdots \\ P_{M1} & \cdots & P_{MM} \end{bmatrix}$$
where the subscripts $i$ and $j$ of $P_{ij}$ denote the hidden states of the node at times $t$ and $t+1$ respectively. The spatial transition probability is defined through an energy function: $U_{t,n}(m)$ denotes the edge energy function, which is defined over the spatial neighbor nodes of node $x_{t,n}$ and the number of these neighbors. The potential function is defined as $V_{t,n}(m) = \mathrm{num}\cdot\alpha$, where the parameter $\alpha$ characterizes the strength of the mutual influence between the current node and its spatial neighbor nodes, and $\mathrm{num}$ is the number of spatial neighbor nodes whose state differs from the state of the current node.
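The potential function above can be illustrated with a short Python sketch. The exact expressions linking $V_{t,n}(m)$, $U_{t,n}(m)$ and the spatial transition probability are not reproduced in this text, so the sketch makes two loudly labeled assumptions: the energy is the potential averaged over the spatial neighbors, and the probability takes the usual Gibbs form proportional to $\exp(-U)$. The function names are hypothetical.

```python
import math


def spatial_potential(m, neighbor_states, alpha):
    """V(m) = num * alpha, where num counts neighbors whose state differs from m."""
    num = sum(1 for s in neighbor_states if s != m)
    return num * alpha


def spatial_probabilities(states, neighbor_states, alpha):
    """Probability of each candidate state given the spatial neighbors.

    Assumption 1: energy U(m) = V(m) / (number of spatial neighbors).
    Assumption 2: Gibbs form, Pr[m] proportional to exp(-U(m)).
    """
    k = max(len(neighbor_states), 1)
    energy = {m: spatial_potential(m, neighbor_states, alpha) / k for m in states}
    z = sum(math.exp(-u) for u in energy.values())  # normalizing constant
    return {m: math.exp(-u) / z for m, u in energy.items()}


# Three candidate states; two neighbors are in state 0 and one is in state 1.
probs = spatial_probabilities([0, 1, 2], [0, 0, 1], alpha=1.0)
```

Under these assumptions a larger `alpha` concentrates the probability on the state shared by most neighbors, which matches the role the text gives to $\alpha$.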
For the observation field, the network node observations are obtained directly through the traffic collection scheme deployed in step S1, i.e., the observation field $o$ of the network is known data. This method assumes that the observation of a node depends only on the state of that node, so the output probability of the observation field driven by the hidden states is obtained from the following formula:
$$\Pr[O=o \mid S=s, \theta] = \prod_{(t,n)} \Pr\left[O_{t,n}=o_{t,n} \mid S_{t,n}=s_{t,n}, \theta_{s_{t,n}}\right] \quad (2)$$
where the subscript $(t,n)$ of the product symbol ranges over all spatio-temporal node positions $x_{t,n}\in\mathcal{X}$, and $\Pr[O_{t,n}=k \mid S_{t,n}=m, \theta_m]$ denotes the probability that, at time $t$, node $n$ outputs observation $k$ given that its state is $m$. For ease of computation, the observations $O_{t,n}$ are discretized and frequencies are used to approximate probabilities, i.e., the conditional probability is approximated by the frequency distribution of the observations under state $m$. The parameter $\theta_m$ denotes the distribution parameters of the observations in a specific state; it is represented here by the output probability matrix $B$, called the observation field parameters. With $M$ the number of hidden states and $K$ the number of discrete observation values, $B$ is given by:
$$B = \begin{bmatrix} P_{11} & \cdots & P_{1K} \\ \vdots & \ddots & \vdots \\ P_{M1} & \cdots & P_{MK} \end{bmatrix}$$
where $P_{mk}$ denotes the probability that a node in state $m$ outputs observation $k$.
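Because each observation is assumed to depend only on the state of its own node, the output probability of the whole observation field is a product over all node positions, which is usually evaluated as a sum of logarithms. The following Python sketch shows this; the list-of-lists layout (`obs[t][n]`, `states[t][n]`, `B[m][k]`) and the function name are illustrative assumptions.

```python
import math


def observation_log_likelihood(obs, states, B):
    """log Pr[o | s, B] = sum over (t, n) of log B[s[t][n]][o[t][n]].

    obs[t][n]    discretized observation of node n in time slot t
    states[t][n] hidden state of node n in time slot t
    B[m][k]      probability that state m outputs observation k
    """
    total = 0.0
    for obs_row, state_row in zip(obs, states):
        for k, m in zip(obs_row, state_row):
            p = B[m][k]
            if p == 0.0:
                return float("-inf")  # configuration impossible under B
            total += math.log(p)
    return total


B = [[0.5, 0.5], [0.25, 0.75]]
value = observation_log_likelihood([[0, 1]], [[0, 1]], B)
```

Working in the log domain avoids numerical underflow when the network has many nodes and time slots.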
This completes step S3-1: the structure of the traffic behavior model is determined. The traffic behavior model is characterized by the HMRF model, so the model parameters are Ω={A, α, B}. Step S3-2, estimating the model parameters, is introduced next.
In step S3-2, after the historical traffic data have been collected and the structure of the traffic behavior model has been determined, the model parameters Ω={A, α, B} are trained on the historical traffic data. To ease practical engineering application, this method uses frequencies to approximate probabilities during the computation, so the observations $O_{t,n}$ must be discretized before the computation. The parameter estimation process is shown in Fig. 4: the training process takes the historical traffic data o, i.e., the network node observations, as input and outputs the model parameters Ω={A, α, B}. The steps of the parameter estimation process are as follows:
(1) Initialize the iteration round counter i, the iteration stopping condition Iter, and the initial state field $s^{(1)}$.
The iteration round counter i is initialized to 1. The stopping condition is set as a number of iterations Iter, preferably 5-8 based on experience; alternatively, the stopping condition can be a threshold on the change in the parameters between two consecutive iterations, with iteration stopping when the change falls below the given threshold. The initial state field $s^{(1)}$ is initialized by applying a clustering algorithm, such as K-means, to the historical traffic observations. The number of clusters is determined by actual network monitoring requirements and corresponds to the number of behavior states of a network node; it therefore reflects the granularity at which network traffic behavior is characterized: the more behavior states there are, the finer the traffic behavior granularity that can be characterized.
(2) Update the model parameters according to the current configuration of the state field. The temporal hidden-state transition probability matrix A is updated from the frequencies of temporal state transitions: if the number of times a node is in state i at time t and moves to state j at time t+1 is denoted $A_{ij}$, the estimate of the state transition probability $P_{ij}$ in A is obtained from:
$$\hat{P}_{ij} = \frac{A_{ij}}{\sum_{j'} A_{ij'}}$$
The value of α is determined empirically, preferably in the range 0.5-10. The larger α is, the stronger the interaction between nodes and the greater the influence of the states of neighbor nodes on the state of the current node, and vice versa. The output probability matrix B is updated from the frequency distribution of the observations output in each state: if the number of samples with state m and observation k is denoted $B_{mk}$, the estimate of the output probability $P_{mk}$ in B is obtained from:
$$\hat{P}_{mk} = \frac{B_{mk}}{\sum_{k'} B_{mk'}}$$
At the same time, the iteration round counter is incremented by 1, i.e., i=i+1.
(3) Judge whether the stopping condition is satisfied, i.e., whether i > Iter.
1) If not, update the state field $s^{(i)}$ according to the behavior state estimation process of step S6. The input data are the historical traffic data o and the current model parameters $\hat{\Omega}$; the output is the updated state field $s^{(i)}$. In this case, the initial state field in step (1) of the behavior state estimation process of step S6 is the current state field $s^{(i-1)}$. Return to step (2).
2) If so, output the final model parameters Ω={A, α, B}.
Following the above steps, the model parameters Ω={A, α, B} are obtained by training on the historical traffic data and serve as the traffic behavior model.
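The frequency-based update of A and B in step (2) amounts to counting and row normalization, as the following Python sketch shows. The data layout (`states[t][n]`, `obs[t][n]`), the function names, and the uniform fallback for a state that never appears in the sample are illustrative assumptions; the parameter α is set empirically in the method and is therefore not estimated here.

```python
def normalise(counts):
    """Turn a row of counts into a row of relative frequencies."""
    total = sum(counts)
    if total == 0:
        # Assumption: a state never seen in the sample gets a uniform row.
        return [1.0 / len(counts)] * len(counts)
    return [c / total for c in counts]


def estimate_A(states, num_states):
    """Estimate P_ij from transition counts A_ij (state i at t, state j at t+1)."""
    counts = [[0] * num_states for _ in range(num_states)]
    for t in range(len(states) - 1):
        for n, i in enumerate(states[t]):
            j = states[t + 1][n]
            counts[i][j] += 1
    return [normalise(row) for row in counts]


def estimate_B(states, obs, num_states, num_symbols):
    """Estimate P_mk from co-occurrence counts B_mk (state m, observation k)."""
    counts = [[0] * num_symbols for _ in range(num_states)]
    for state_row, obs_row in zip(states, obs):
        for m, k in zip(state_row, obs_row):
            counts[m][k] += 1
    return [normalise(row) for row in counts]


# Toy state field and observation field: 3 time slots, 2 nodes.
states = [[0, 1], [0, 0], [1, 0]]
obs = [[0, 1], [0, 0], [1, 1]]
A = estimate_A(states, num_states=2)
B = estimate_B(states, obs, num_states=2, num_symbols=2)
```

In the full training loop these two functions would be called once per iteration, after the state field has been re-estimated with the current parameters.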
Step S4: obtain the traffic behavior model
The HMRF model proposed by this method characterizes the spatio-temporal evolution of network traffic behavior and reveals the underlying causes of the external manifestation of network traffic. The model parameters are Ω={A, α, B}: the variation of network traffic behavior patterns along the time dimension is characterized by the temporal hidden-state transition matrix A, the interaction between nodes in space is characterized by the spatial state field parameter α, and the relationship between observations and hidden states is characterized by the output probability matrix B.
The traffic behavior model runs at the flow analysis center and can be applied flexibly according to actual network management requirements. For the different types of historical traffic data obtained in step S2, such as byte counts, packet arrival rates, IP addresses and application protocols, different models can be trained at the flow analysis center according to step S3, for example a packet arrival rate model, a traffic IP address model, an application protocol model, or a model that fuses several kinds of traffic data.
In this way, the flow analysis center obtains different traffic behavior models for use in actual traffic behavior analysis and can provide users with multi-dimensional traffic behavior analysis.
Step S5: acquire real-time traffic data
According to the network traffic collection scheme deployed in step S1, real-time traffic data can be collected in practical applications.
At the flow analysis center, depending on the deployment strategy of the network administrator, the network probe data can be requested in real time or by periodic polling to obtain the network traffic data to be analyzed, including byte counts, packet arrival rates, IP addresses, application protocols and similar information. These data reflect the external manifestation of traffic in the current network environment.
According to actual monitoring requirements, a particular type of traffic data is selected and, after discretization, fed as observations into the corresponding traffic behavior model in order to estimate the global traffic behavior of the network.
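The text does not fix a particular discretization, so the following Python sketch uses equal-width binning as one simple possibility. The function name, the bin count and the value range are illustrative assumptions; in practice the same bins would have to be used for the historical and the real-time data so that the observation indices match the columns of B.

```python
def discretise(values, num_bins, lo, hi):
    """Map continuous measurements to bin indices 0 .. num_bins - 1.

    Equal-width bins over [lo, hi] (requires hi > lo); values outside
    the range are clamped to the first or last bin.
    """
    width = (hi - lo) / num_bins
    indices = []
    for v in values:
        k = int((v - lo) / width)
        indices.append(min(max(k, 0), num_bins - 1))
    return indices


# Hypothetical packet arrival rates (packets per second) from one node.
rates = [0, 49.9, 50, 100, 120, -5]
observations = discretise(rates, num_bins=4, lo=0.0, hi=100.0)
```

Equal-width bins are easy to reproduce on every probe; quantile-based bins would be an alternative when the arrival rates are heavily skewed.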
In this way, the flow analysis center collects the real-time traffic data.
Step S6: estimate the global traffic behavior of the network
At the flow analysis center, the internal behavior states of network node traffic can be estimated from the traffic behavior model obtained in step S4 and the real-time traffic data of the network nodes collected in step S5. On this basis, the network administrator can monitor the behavior of the whole network, which guides network management work such as resource scheduling and anomaly detection.
This process is equivalent to estimating a configuration $\hat{s}$ of the hidden state field given the model parameters Ω and the observation field o. According to the maximum a posteriori (MAP) estimation criterion, finding the optimal hidden state field estimate $\hat{s}$ is equivalent to solving:
$$\hat{s} = \arg\max_{s} \Pr[s \mid o, \Omega]$$
By Bayes' theorem, $\Pr[s \mid o, \Omega] = \Pr[o \mid s, \Omega]\cdot\Pr[s \mid \Omega] / \Pr[o]$. Since Pr[o] is a constant, Pr[s|o,Ω]∝Pr[o|s,Ω]·Pr[s|Ω], where the prior probability Pr[s|Ω] and the likelihood Pr[o|s,Ω] are computed by formulas (1) and (2) respectively. The estimate is therefore obtained from:
$$\hat{s} = \arg\max_{s} \Pr[o \mid s, \Omega]\cdot\Pr[s \mid \Omega]$$
This method obtains the globally optimal state field of the network by iterative computation; the behavior state estimation process is shown in Fig. 5. The input data are the real-time traffic data o and the model parameters Ω={A, α, B}, and the output is the estimate $\hat{s}$ of the network state field. The steps of the behavior state estimation process are as follows:
(1) Initialize the iteration round counter i, the initial state field $s^{(0)}$, and the iteration stopping condition Iter.
The iteration round counter i is initialized to 1. The state field is initialized either from prior knowledge of the relationship between the state field and the observation field, or by applying a clustering algorithm to the observation field. The stopping condition is set as a number of iterations Iter, preferably 3-5 based on experience.
(2) For each node in the network, traverse all possible state values and, according to the MAP estimation criterion, select the state value with the maximum probability as the estimate of the node for the current iteration round. This is equivalent to solving:
$$\hat{s}_{t,n} = \arg\max_{m} \Pr\left[O_{t,n}=o_{t,n} \mid S_{t,n}=m, \Omega\right]\cdot\Pr\left[S_{t,n}=m \mid s_{\mathcal{N}^{S}_{t,n}}, s_{\mathcal{N}^{T}_{t,n}}, \Omega\right]$$
where $s_{\mathcal{N}^{S}_{t,n}}$ and $s_{\mathcal{N}^{T}_{t,n}}$ are the current states of the spatial neighbors and the temporal neighbor of node $x_{t,n}$.
At the same time, the iteration round counter is incremented by 1, i.e., i=i+1.
(3) Judge whether the stopping condition is satisfied, i.e., whether i > Iter.
1) If not, return to step (2) and update the state of each node again.
2) If so, output the final state field estimate $\hat{s}$. This yields the global traffic behavior state of the network, on the basis of which the network administrator can monitor the behavior of the whole network.
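The node-by-node update in steps (1) to (3) can be sketched in Python as follows. The local score combines the output probability from B, the temporal transition probability from A, and a spatial penalty; because the exact local probability expression is not reproduced in this text, the spatial term `alpha * num / (number of neighbors)` and the log-domain combination are assumptions, and all names are hypothetical.

```python
import math


def safe_log(p):
    return math.log(p) if p > 0 else float("-inf")


def estimate_states(obs, neighbors, A, B, alpha, init, iters=3):
    """Iteratively pick, for every node position, the state with the best score.

    obs[t][n]    discretized observation of node n in time slot t
    neighbors[n] list of one-hop spatial neighbors of node n
    init[t][n]   initial state field (e.g. from clustering); not modified
    """
    num_states = len(A)
    s = [row[:] for row in init]
    for _ in range(iters):
        for t in range(len(obs)):
            for n in range(len(obs[t])):
                nbr_states = [s[t][j] for j in neighbors[n]]
                best_m, best_score = s[t][n], float("-inf")
                for m in range(num_states):
                    score = safe_log(B[m][obs[t][n]])        # output term
                    if t > 0:
                        score += safe_log(A[s[t - 1][n]][m])  # temporal term
                    if nbr_states:
                        # Assumed spatial term: alpha * num / |neighbors|.
                        num = sum(1 for x in nbr_states if x != m)
                        score -= alpha * num / len(nbr_states)
                    if score > best_score:
                        best_m, best_score = m, score
                s[t][n] = best_m
    return s


# Three nodes in a line (0 - 1 - 2), two states, two time slots.
neighbors = [[1], [0, 2], [1]]
A = [[0.9, 0.1], [0.1, 0.9]]
B = [[0.8, 0.2], [0.2, 0.8]]
obs = [[0, 0, 0], [0, 1, 0]]
smoothed = estimate_states(obs, neighbors, A, B, alpha=2.0, init=obs)
```

In this toy run the isolated observation 1 at the middle node is attributed to state 0, because both its neighbors and its own previous state are in state 0; this is the kind of disambiguation of overlapping observations that the spatio-temporal context is meant to provide.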
In particular, because this method treats the network as a whole, the estimated network behavior state is a globally optimal result that represents the most probable states of the network nodes; tasks such as resource scheduling and anomaly detection carried out on this basis therefore follow globally optimal plans and strategies.
The meaning of a behavior state depends on the type of traffic data collected, so this traffic behavior analysis method can analyze network traffic along multiple dimensions. For example, when the packet arrival rate of the network nodes is monitored, the traffic behavior state reflects the packet arrival pattern of the current node; once the global behavior state of the network has been estimated, the packet arrival patterns of different regions of the network at different times are known. When the arrival pattern of a network node is in a high-traffic state, the network administrator can avoid starting services that consume large amounts of bandwidth and give priority to high-priority bandwidth demands; when the arrival pattern is in a low-traffic state, the restrictions can be relaxed appropriately. As another example, when the application protocol traffic of the network nodes is monitored, the traffic behavior state reflects the application protocol composition of the current node; once the global behavior state of the network has been estimated, the application protocol composition of different regions at different times is known. For nodes where certain P2P traffic accounts for a large proportion, the network administrator can limit its rate where appropriate to ensure that regular services operate normally.
The global traffic behavior state estimated by this method can be used to schedule traffic across the whole network, for example for load balancing or green networking applications. Based on the estimated traffic volume states of the different nodes, a scheduling algorithm can be designed from a global perspective that moves traffic from high-traffic nodes to low-traffic nodes, achieving load balancing across the network. When nodes in the network are in a low-traffic state, some nodes can be shut down, provided that the network connectivity and link utilization constraints are still satisfied, so as to minimize the energy consumption of the network while guaranteeing basic network performance, thereby realizing a green networking application.
Because this method uses the spatio-temporal context of network node traffic, it can estimate the global traffic behavior state of the network and reveal the connection between the network behavior state and the external manifestation of traffic. This helps the network administrator establish a network-wide view, survey the situation and trends of the network, and understand information such as the network load and the usage of network application resources. Modeling based on historical traffic data establishes the distribution and trend characteristics of the network in time, space and traffic direction, helping the network administrator explore service demands, hot spots and trends in depth and assisting network planning and design. For a model built with multi-dimensional traffic data as observations, the spatio-temporal variation of the traffic behavior state reflects the spatio-temporal evolution of network traffic; the spatio-temporal variation of the network node states can be used to build a model of normal global traffic behavior, which assists anomaly detection.
Embodiment
This embodiment illustrates the advantages of the method by analyzing the packet arrival patterns of network nodes. As shown in Fig. 6, the example network contains 50 network nodes and 88 links; the topology is taken from a German research network (topology information: http://sndlib.zib.de/home.action). Traffic data are captured at the different nodes of the managed network. The packet arrival rate is used as the node observation and the packet arrival pattern of a node is the traffic behavior state. The traffic behavior analysis method based on a distributed network is used to perceive the packet arrival patterns of different regions of the network at different times, thereby monitoring the packet arrival patterns of the whole network. Following implementation steps S1-S6: first, the traffic collection scheme is deployed across the network; second, historical packet arrival rate data are collected at the network nodes; third, the model is trained on the historical packet arrival rate data; fourth, the flow analysis center obtains the packet arrival rate model; fifth, real-time packet arrival rate data are collected at the network nodes and fed into the packet arrival rate model; finally, a global state map characterizing the packet arrival patterns of the network is obtained by iterative computation. On this basis, the network administrator can survey the situation and trends of the network, understand the network load, and realize applications such as green networking and resource scheduling.
The packet arrival pattern states of the network nodes at different times are shown in Fig. 6, where different colors represent the different packet arrival patterns of the network nodes. Only two of the arrival patterns are shown in the figure: gray nodes correspond to pattern 1 in Fig. 7 and dark nodes correspond to pattern 2 in Fig. 7. Fig. 7 shows the fitted probability density functions of the normalized observations for these two packet arrival patterns, reflecting the output distribution of the observations in the different states. From the estimated global packet arrival patterns, the network administrator can perceive in real time the packet arrival patterns of the different nodes in the network. While perceiving the global traffic behavior state, the administrator also learns the relationship between the packet arrival pattern of a node and its packet arrival rate, and can therefore derive the node load of the whole network from the behavior states, which assists applications such as green networking and resource scheduling.
Since this method is an unsupervised learning process and acts as a multi-class classifier, it is compared with the typical clustering method K-means. The arrival patterns of the current network nodes are estimated from the actual packet arrival rates. With 5 packet arrival patterns, the overall accuracy and macro-F1 of K-means and this method are compared in Table 1, and the precision, recall and F1 under the different arrival patterns (states) are compared in Table 2. For performance evaluation, overall accuracy, macro-F1, precision, recall and F1 are chosen as evaluation metrics. Precision, recall and F1 are metrics commonly used in binary classification: precision P is the proportion of correctly estimated positive samples among all samples estimated as positive, recall R is the proportion of correctly estimated positive samples among all truly positive samples, and F1 is the harmonic mean of precision and recall, i.e., F1=2PR/(P+R). These three metrics measure the performance of this method in estimating each individual state value; a higher value indicates better performance. Overall accuracy is the proportion of correctly estimated samples among all samples, and macro-F1 is the arithmetic mean of the F1 values of all states; these two metrics represent the overall performance of the model and, likewise, a higher value indicates better performance.
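The metric definitions above translate directly into code. The following Python sketch computes them from lists of true and estimated states; the function names and the toy labels are illustrative assumptions, and the convention of returning 0 when a denominator is zero is a choice made here, not something the text specifies.

```python
def per_state_metrics(y_true, y_pred, state):
    """Precision, recall and F1 for one state treated as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == state and p == state)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != state and p == state)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == state and p != state)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def overall_accuracy(y_true, y_pred):
    """Share of samples whose state was estimated correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def macro_f1(y_true, y_pred, states):
    """Arithmetic mean of the per-state F1 values."""
    return sum(per_state_metrics(y_true, y_pred, s)[2] for s in states) / len(states)


# Toy example with three states.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
accuracy = overall_accuracy(y_true, y_pred)
macro = macro_f1(y_true, y_pred, states=[0, 1, 2])
```

Macro-F1 weights every state equally regardless of how many samples it has, so a rarely occurring arrival pattern that is estimated poorly lowers it more than it lowers overall accuracy.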
Table 1: Overall accuracy and macro-F1 of K-means and this method
The experimental results show that this method outperforms K-means on every metric. The reason is that network nodes in different arrival patterns can produce the same arrival rate, i.e., the observations under different patterns overlap. Because the behavior of network nodes exhibits temporal continuity and spatial correlation, the HMRF model of this method takes into account the state information of the spatio-temporal neighbors of each node and can therefore better determine which state a node belongs to. K-means, in contrast, partitions the states directly from the observation data and therefore cannot distinguish network node states as well; the difference is the gain that the spatio-temporal behavior information introduced by the HMRF brings to estimating the network behavior state. The state values estimated by this method are therefore more accurate and describe the state of the network as a whole; using the behavior states estimated by this method to monitor network traffic data in each dimension is more accurate, and the network management schemes and strategies formulated on this basis are also globally optimal.
Table 2: Precision, recall and F1 of K-means and this method with 5 states
It should be noted that this embodiment is only one example of the method; the method is not limited to analyzing network traffic data of a single dimension, such as the packet arrival rate. The traffic behavior analysis method based on a distributed network can provide multi-dimensional network traffic analysis: with traffic data of several dimensions as observations, it estimates the distribution of the traffic behavior states of the whole network. The network state represents the internal operating mode of the current network and reveals both the internal traffic behavior states of the individual network nodes and the overall traffic behavior state of the whole network. The network administrator can thus learn the traffic data of the whole network and the corresponding operating modes, analyze the spatio-temporal distribution of the behavior states, and monitor network traffic behavior along multiple dimensions.
Obviously, the above embodiment of the present invention is merely an example given to illustrate the invention clearly and does not limit the embodiments of the invention. Those of ordinary skill in the art can make other changes or variations in different forms on the basis of the above description. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within the scope of protection of the claims of the present invention.
Claims (4)
1. A traffic behavior analysis method based on a distributed network, characterized in that it comprises:
a model training stage: collecting historical network traffic data as training data for training a model, and obtaining a network traffic behavior model;
a learning stage: inputting collected real-time traffic data into the trained network traffic behavior model, and obtaining the global traffic behavior of the network by iterative computation using the maximum a posteriori estimation criterion.
2. The method according to claim 1, characterized in that, before the traffic data are collected, the method further comprises deploying network probes in the managed network to collect traffic data; specifically, probes are deployed on the network nodes to collect traffic data of different granularities and different protocol layers at the network nodes, and the data are transmitted to the flow analysis center for data analysis.
3. The method according to claim 1, characterized in that the model training stage is implemented by determining the structure of the network traffic behavior model and estimating the model parameters;
determining the structure of the network traffic behavior model:
the traffic behavior information of the distributed network is divided into two layers: a hidden state layer and an observation data layer; the observation data layer consists of the network node traffic data measured by the network probes; the hidden state layer consists of the behavior patterns of the network nodes and represents the internal driving factors in the network that directly drive the external manifestation of network node traffic; hidden states and observations are represented by random variables, so the hidden state layer and the observation data layer constitute two random fields, namely the hidden state field and the observation field;
defining the mathematical notation: in a network with $N$ nodes, $\mathcal{N}=\{1,\dots,N\}$ denotes the set of network nodes and $x_{t,n}$ denotes the $n$-th node in the $t$-th time slot, $x_{t,n}\in\mathcal{X}$, where $\mathcal{X}$ denotes the set of all spatio-temporal node positions with $|\mathcal{X}| = N \times T$, and $T$ is the number of time slots; $S_{t,n}$ denotes the hidden state variable of node $x_{t,n}$ and $s_{t,n}\in\mathcal{M}$ denotes a realization of the random variable $S_{t,n}$, where $\mathcal{M}$ is the set of hidden states; then $S=\{S_{t,n} : x_{t,n}\in\mathcal{X}\}$ is the family of hidden-state random variables defined on $\mathcal{X}$; $S$ can therefore be used to denote the hidden state field on $[1,T]$, and $s\in\mathcal{S}$ denotes a configuration of $S$, where $\mathcal{S}$ is the set of all possible configurations of the hidden state field; similarly, $O_{t,n}$ denotes the observation variable of node $x_{t,n}$ and $o_{t,n}\in\mathcal{K}$ denotes a realization of the random variable $O_{t,n}$, where $\mathcal{K}$ is the set of observation values; then $O=\{O_{t,n} : x_{t,n}\in\mathcal{X}\}$ is the family of observation random variables defined on $\mathcal{X}$; $O$ can therefore be used to denote the observation field on $[1,T]$, and $o\in\mathcal{O}$ denotes a configuration of $O$, where $\mathcal{O}$ is the set of all possible configurations of the observation field;
the HMRF model is used to characterize the spatio-temporal evolution relationship between the hidden state field and the observation field;
for the hidden state field, an assumption is introduced: spatially, the state of a node depends only on the states of its one-hop neighbor nodes; temporally, it depends only on its own state at the previous moment; based on statistical learning, the probability of the hidden state field can be obtained from the following formula:
$$\Pr[S=s \mid \lambda] \approx \prod_{(t,n)} \Pr\left[s_{t,n} \mid s_{\mathcal{X}\setminus x_{t,n}}, \lambda\right] = \prod_{(t,n)} \Pr\left[s_{t,n} \mid s_{\mathcal{N}^{S}_{t,n}}, s_{\mathcal{N}^{T}_{t,n}}, \lambda\right] \quad (1)$$
where $\mathcal{X}\setminus x_{t,n}$ denotes the set of spatio-temporal node positions of the network excluding node $x_{t,n}$, $s_{\mathcal{N}^{S}_{t,n}}$ and $s_{\mathcal{N}^{T}_{t,n}}$ denote the spatial-neighbor states and the temporal-neighbor states of node $x_{t,n}$ respectively, and $\lambda$ denotes the hidden state field parameters;
the local probability in formula (1) is expressed in terms of a temporal transition probability and a spatial transition probability of node $x_{t,n}$ taking state $m$, where $m$ denotes a node state; the temporal transition probability is computed from the temporal hidden-state transition probability matrix $A$; $A$ is the transition probability matrix of the hidden state from time $t$ to time $t+1$, i.e., the temporal hidden-state transitions form a first-order Markov chain; with $M$ the number of hidden states, $A$ is given by:
$$A = \begin{bmatrix} P_{11} & \cdots & P_{1M} \\ \vdots & \ddots & \vdots \\ P_{M1} & \cdots & P_{MM} \end{bmatrix}$$
where the subscripts $i$ and $j$ of $P_{ij}$ denote the hidden states of the node at times $t$ and $t+1$ respectively; the spatial transition probability is defined through an energy function: $U_{t,n}(m)$ denotes the edge energy function, which is defined over the spatial neighbor nodes of node $x_{t,n}$ and the number of these neighbors; the potential function is defined as $V_{t,n}(m) = \mathrm{num}\cdot\alpha$, where the parameter $\alpha$ characterizes the strength of the mutual influence between the current node and its spatial neighbor nodes, and $\mathrm{num}$ is the number of spatial neighbor nodes whose state differs from the state of the current node;
for the observation field, the network node observations are obtained by the network probes, i.e., the observation field $o$ of the network is known data; assuming that the observation of a node depends only on the state of that node, the output probability of the observation field driven by the hidden states is obtained from the following formula:
$$\Pr[O=o \mid S=s, \theta] = \prod_{(t,n)} \Pr\left[O_{t,n}=o_{t,n} \mid S_{t,n}=s_{t,n}, \theta_{s_{t,n}}\right] \quad (2)$$
where the subscript $(t,n)$ of the product symbol ranges over all spatio-temporal node positions $x_{t,n}\in\mathcal{X}$, and $\Pr[O_{t,n}=k \mid S_{t,n}=m, \theta_m]$ denotes the probability that, at time $t$, node $n$ outputs observation $k$ given that its state is $m$; for ease of computation, the observations $O_{t,n}$ are discretized and frequencies are used to approximate probabilities, i.e., the conditional probability is approximated by the frequency distribution of the observations under state $m$; the parameter $\theta_m$ denotes the distribution parameters of the observations in a specific state and is represented here by the output probability matrix $B$, called the observation field parameters; with $M$ the number of hidden states and $K$ the number of discrete observation values, $B$ is given by:
$$B = \begin{bmatrix} P_{11} & \cdots & P_{1K} \\ \vdots & \ddots & \vdots \\ P_{M1} & \cdots & P_{MK} \end{bmatrix}$$
where $P_{mk}$ denotes the probability that a node in state $m$ outputs observation $k$;
the structure of the network traffic behavior model is thereby determined; the network traffic behavior model is characterized by the HMRF model, so the model parameters are Ω={A, α, B};
estimating the model parameters:
after the historical traffic data have been collected and the structure of the traffic behavior model has been determined, the model parameters Ω={A, α, B} are trained on the historical traffic data; to ease practical engineering application, frequencies are used to approximate probabilities during the computation, so the observations $O_{t,n}$ must be discretized before the computation;
the training process takes the historical traffic data o, i.e., the network node observations, as input and outputs the model parameters Ω={A, α, B}; the steps of the parameter estimation process are as follows:
(3-1) Initialize the iteration round counter i, the iteration stop condition Iter and the initial hidden state field $s^{(1)}$;
The iteration round counter i is initialized to 1. The iteration stop condition is set as a number of iterations Iter, preferably 5-8 based on experience. Alternatively, the stop condition can be a threshold on the change of the parameters between two consecutive iterations, in which case iteration stops when the change falls below the given threshold. The initial hidden state field $s^{(1)}$ is initialized by applying a clustering algorithm to the historical traffic observations. The number of clusters is chosen according to the actual network monitoring requirements and corresponds to the number of network node behavior states; it therefore reflects the granularity with which network traffic behavior is described: the more behavior states, the finer the traffic behavior granularity that can be described;
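The patent leaves the choice of clustering algorithm open. As a hedged sketch, the initial hidden state field could be produced by a one-dimensional k-means over all observations; the function names, the deterministic choice of starting centres and the example data below are assumptions of this sketch.

```python
def kmeans_1d(values, num_states, rounds=10):
    """Cluster scalar values into num_states groups; return one label per value."""
    lo, hi = min(values), max(values)
    # Deterministic start (assumption): centres spread evenly over the range.
    centres = [lo + (hi - lo) * (j + 0.5) / num_states for j in range(num_states)]
    labels = [0] * len(values)
    for _ in range(rounds):
        # Assign each value to its nearest centre.
        labels = [min(range(num_states), key=lambda j: abs(v - centres[j]))
                  for v in values]
        # Move each centre to the mean of its members (empty clusters stay put).
        for j in range(num_states):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centres[j] = sum(members) / len(members)
    return labels

def init_state_field(obs, num_states):
    """Build the initial hidden state field with the same shape as obs[t][n]."""
    flat = [v for row in obs for v in row]
    labels = kmeans_1d(flat, num_states)
    width = len(obs[0])
    return [labels[t * width:(t + 1) * width] for t in range(len(obs))]

# Hypothetical observations: 2 time steps, 3 nodes, two clear traffic regimes.
obs = [[1.0, 1.2, 9.8],
       [0.9, 10.1, 10.0]]
s1 = init_state_field(obs, num_states=2)
```

Here `num_states` plays the role of the number of clusters, i.e. the number of behavior states chosen from the monitoring requirements.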
(3-2) Update the model parameters according to the current configuration of the hidden state field. The temporal hidden state transition probability matrix A is updated from the frequencies of state transitions over time: if the number of times a node is in state i at time t and moves to state j at time t+1 is denoted $A_{ij}$, the estimate of the state transition probability $P_{ij}$ in A is obtained by the following formula:
$$P_{ij} = \frac{A_{ij}}{\sum_{j'} A_{ij'}}$$
The value of α is determined empirically, preferably in the range 0.5-10; the larger α is, the stronger the interaction between nodes and the greater the influence of the neighbor nodes' states on the state of the current node, and vice versa. The output probability matrix B is updated from the frequency distribution of the observations output in each state: if the number of samples in which the state is m and the observation is k is $B_{mk}$, the estimate of the output probability $P_{mk}$ in B is obtained by the following formula:
$$P_{mk} = \frac{B_{mk}}{\sum_{k'} B_{mk'}}$$
At the same time, the iteration round counter is incremented by 1, i.e. i = i+1;
(3-3) Judge whether the stop condition is met, i.e. whether i > Iter;
3-3-1) If not, update the hidden state field $s^{(i)}$ using the behavior state estimation process: the input data are the historical traffic data o and the current model parameters, and the output data are the updated hidden state field $s^{(i)}$, where the initial hidden state field of the behavior state estimation process is the current hidden state field $s^{(i-1)}$;
Return to step (3-2);
3-3-2) If so, output the final model parameters Ω = {A, α, B};
Through the above steps, the model parameters Ω = {A, α, B} obtained by training on the historical traffic data serve as the traffic behavior model.
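The control flow of steps (3-1) to (3-3) can be sketched as the loop below. Only the iteration logic is shown: `update_parameters` and `update_states` are hypothetical callables standing in for the parameter update of step (3-2) and the behavior state estimation process, and the stub functions in the example exist only to make the sketch runnable.

```python
def train(obs, initial_states, update_parameters, update_states, max_rounds=5):
    """Alternate parameter updates and hidden-state updates until i > max_rounds."""
    states = initial_states          # step (3-1): initial hidden state field
    i = 1                            # step (3-1): iteration round counter
    while True:
        params = update_parameters(states, obs)        # step (3-2)
        i += 1
        if i > max_rounds:                             # step (3-3)
            return params, states                      # step 3-3-2)
        states = update_states(obs, params, states)    # step 3-3-1)

# Stub callables (assumptions) that only record how often they are invoked.
calls = {"params": 0, "states": 0}

def fake_update_parameters(states, obs):
    calls["params"] += 1
    return {"round": calls["params"]}

def fake_update_states(obs, params, states):
    calls["states"] += 1
    return states

params, states = train([[0]], [[0]], fake_update_parameters,
                       fake_update_states, max_rounds=5)
```

With Iter = 5 the parameters are updated five times and the hidden state field four times, because the final parameter update is followed directly by the stop check.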
4. The method according to claim 3, characterized in that the process of the study stage is:
From the obtained traffic behavior model and the collected real-time traffic data of the network nodes, the internal behavior states of the network node traffic can be estimated;
This process is equivalent to estimating a configuration $\hat{s}$ of the hidden state field given the model parameters Ω and the observation field o. According to the MAP estimation criterion, finding the optimal hidden state field estimate $\hat{s}$ is equivalent to solving the following formula:
$$\hat{s} = \arg\max_{s} \Pr[s \mid o, \Omega]$$
According to Bayes' theorem,
$$\Pr[s \mid o, \Omega] = \frac{\Pr[o \mid s, \Omega] \cdot \Pr[s \mid \Omega]}{\Pr[o]}$$
Since Pr[o] is a constant, Pr[s | o, Ω] ∝ Pr[o | s, Ω]·Pr[s | Ω], where the prior probability Pr[s | Ω] and the likelihood Pr[o | s, Ω] are calculated by formulas (1) and (2) respectively, that is, the estimate is obtained by the following formula:
$$\hat{s} = \arg\max_{s} \Pr[o \mid s, \Omega] \cdot \Pr[s \mid \Omega]$$
The globally optimal hidden state field of the network is obtained by iterative calculation. The behavior state estimation process takes the real-time traffic data o and the model parameters Ω = {A, α, B} as input and outputs the estimate $\hat{s}$ of the network's hidden state field. The steps of the behavior state estimation process are as follows:
(4-1) Initialize the iteration round counter i, the initial hidden state field $s^{(0)}$ and the iteration stop condition Iter;
The iteration round counter i is initialized to 1. The state field is initialized either from prior knowledge of the relationship between the state field and the observation field, or by applying a clustering algorithm to the observation field. The iteration stop condition is set as a number of iterations Iter, preferably 3-5 based on experience;
(4-2) For each node in the network, traverse all possible state values and, according to the MAP estimation criterion, select the state value with the highest probability as the node's estimate for the current iteration round, which is equivalent to solving the following formula:
$$s_{t,n}^{(i)} = \arg\max_{m} \Pr[S_{t,n} = m \mid o, s^{(i-1)}, \Omega]$$
At the same time, the iteration round counter is incremented by 1, i.e. i = i+1;
(4-3) Judge whether the stop condition is met, i.e. whether i > Iter;
4-3-1) If not, return to step (4-2) and update the state of each node again;
4-3-2) If so, output the final state field estimate $\hat{s}$. The global traffic behavior state of the network is thereby obtained, from which the network administrator can monitor the behavior of the whole network.
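The node-by-node update of steps (4-1) to (4-3) resembles an iterated conditional modes scheme. The sketch below is an illustration under stated assumptions, not the patent's exact formula: the local score combines the output probability B, a temporal term taken from the previous time step via A, and a spatial penalty of α per differing neighbor. The precise prior of formula (1) is not reproduced in this section, so this combination, the function names and the `neighbours` mapping are all hypothetical.

```python
import math

EPS = 1e-12  # avoids log(0) for zero-probability entries (assumption)

def local_score(t, n, m, obs, states, A, alpha, B, neighbours):
    """Log-score of giving node n at time t the candidate state m."""
    score = math.log(B[m][obs[t][n]] + EPS)             # output probability
    if t > 0:                                           # temporal term (assumed form)
        score += math.log(A[states[t - 1][n]][m] + EPS)
    # Spatial term (assumed form): num = neighbours whose state differs from m.
    num = sum(1 for j in neighbours[n] if states[t][j] != m)
    return score - alpha * num

def estimate_states(obs, init_states, A, alpha, B, neighbours, max_rounds=3):
    states = [row[:] for row in init_states]            # step (4-1)
    num_states = len(B)
    i = 1
    while True:
        for t in range(len(obs)):                       # step (4-2)
            for n in range(len(obs[t])):
                states[t][n] = max(
                    range(num_states),
                    key=lambda m: local_score(t, n, m, obs, states,
                                              A, alpha, B, neighbours))
        i += 1
        if i > max_rounds:                              # step (4-3)
            return states

# Hypothetical example: 3 nodes in a line, 2 states, 2 observation levels.
neighbours = {0: [1], 1: [0, 2], 2: [1]}
A = [[0.9, 0.1], [0.1, 0.9]]
B = [[0.8, 0.2], [0.2, 0.8]]
obs = [[0, 0, 0], [0, 1, 0]]
estimate = estimate_states(obs, obs, A, alpha=1.0, B=B, neighbours=neighbours)
```

In this example the isolated observation 1 at the middle node is outweighed by its temporal history and by both neighbors, so the node is assigned state 0. Updating nodes in place means later nodes in a sweep already see the new states of earlier ones; a synchronous variant would compute all updates from the previous round's field instead.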
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810728186.0A CN108923975B (en) | 2018-07-05 | 2018-07-05 | Traffic behavior analysis method for distributed network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810728186.0A CN108923975B (en) | 2018-07-05 | 2018-07-05 | Traffic behavior analysis method for distributed network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108923975A (en) | 2018-11-30 |
CN108923975B (en) | 2021-08-10 |
Family
ID=64424625
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810728186.0A Active CN108923975B (en) | 2018-07-05 | 2018-07-05 | Traffic behavior analysis method for distributed network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108923975B (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104753732A (en) * | 2013-12-27 | 2015-07-01 | 郭祖龙 | Distribution based network traffic analysis system and method |
CN106612289A (en) * | 2017-01-18 | 2017-05-03 | 中山大学 | Network collaborative abnormality detection method based on SDN |
Non-Patent Citations (1)
Title |
---|
谢逸 et al.: "A General Collaborative Framework for Modeling and Perceiving Distributed Network Behavior", IEEE/ACM Transactions on Networking * |
Cited By (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111294284A (en) * | 2018-12-10 | 2020-06-16 | 华为技术有限公司 | Traffic scheduling method and device |
CN111294284B (en) * | 2018-12-10 | 2022-04-26 | 华为技术有限公司 | Traffic scheduling method and device |
CN109951462A (en) * | 2019-03-07 | 2019-06-28 | 中国科学院信息工程研究所 | A kind of application software Traffic anomaly detection system and method based on holographic modeling |
CN109831386B (en) * | 2019-03-08 | 2020-07-28 | 西安交通大学 | Optimal path selection algorithm based on machine learning under SDN |
CN109831386A (en) * | 2019-03-08 | 2019-05-31 | 西安交通大学 | Optimal route selection algorithm based on machine learning under a kind of SDN |
CN110139125A (en) * | 2019-06-18 | 2019-08-16 | 洛阳师范学院 | Video sharing method based on demand perception and caching resource under mobile radio network |
CN110691003A (en) * | 2019-09-04 | 2020-01-14 | 北京天融信网络安全技术有限公司 | Network traffic classification method, device and storage medium |
CN111224940B (en) * | 2019-11-15 | 2021-03-09 | 中国科学院信息工程研究所 | Anonymous service traffic correlation identification method and system nested in encrypted tunnel |
CN111224940A (en) * | 2019-11-15 | 2020-06-02 | 中国科学院信息工程研究所 | Anonymous service traffic correlation identification method and system nested in encrypted tunnel |
CN111698269A (en) * | 2020-04-07 | 2020-09-22 | 中博信息技术研究院有限公司 | Network intrusion detection method based on Plackett-Luce model |
CN112653588A (en) * | 2020-07-10 | 2021-04-13 | 深圳市唯特视科技有限公司 | Adaptive network traffic collection method, system, electronic device and storage medium |
CN112202593A (en) * | 2020-09-03 | 2021-01-08 | 深圳前海微众银行股份有限公司 | Data acquisition method, device, network management system and computer storage medium |
CN112202593B (en) * | 2020-09-03 | 2024-05-31 | 深圳前海微众银行股份有限公司 | Data acquisition method, device, network management system and computer storage medium |
CN112039906B (en) * | 2020-09-03 | 2022-03-18 | 华侨大学 | Cloud computing-oriented network flow anomaly detection system and method |
CN112039906A (en) * | 2020-09-03 | 2020-12-04 | 华侨大学 | Cloud computing-oriented network flow anomaly detection system and method |
CN112134738A (en) * | 2020-09-24 | 2020-12-25 | 中电科仪器仪表有限公司 | Network multidimensional data flow simulation device based on composite two-dimensional Sketch |
CN112134738B (en) * | 2020-09-24 | 2023-03-24 | 中电科思仪科技股份有限公司 | Network multidimensional data flow simulation device based on composite two-dimensional Sketch |
CN114598904A (en) * | 2020-11-20 | 2022-06-07 | 中国移动通信集团广东有限公司 | Fault positioning method and device for IPTV service |
CN114598904B (en) * | 2020-11-20 | 2023-06-30 | 中国移动通信集团广东有限公司 | Fault positioning method and device for IPTV service |
CN112769972A (en) * | 2020-12-22 | 2021-05-07 | 赛尔网络有限公司 | Flow analysis method and device for IPv6 network, electronic equipment and storage medium |
CN112769972B (en) * | 2020-12-22 | 2023-02-28 | 赛尔网络有限公司 | Flow analysis method and device for IPv6 network, electronic equipment and storage medium |
CN112788066A (en) * | 2021-02-26 | 2021-05-11 | 中南大学 | Abnormal flow detection method and system for Internet of things equipment and storage medium |
CN112788066B (en) * | 2021-02-26 | 2022-01-14 | 中南大学 | Abnormal flow detection method and system for Internet of things equipment and storage medium |
CN113783788A (en) * | 2021-09-16 | 2021-12-10 | 航天新通科技有限公司 | Network optimization system and method based on flow prediction |
CN113783788B (en) * | 2021-09-16 | 2022-06-17 | 航天新通科技有限公司 | Network optimization system and method based on flow prediction |
CN113569368B (en) * | 2021-09-17 | 2022-01-11 | 支付宝(杭州)信息技术有限公司 | Protocol-based modeling method and device |
CN113569368A (en) * | 2021-09-17 | 2021-10-29 | 支付宝(杭州)信息技术有限公司 | Protocol-based modeling method and device |
CN114039758A (en) * | 2021-11-02 | 2022-02-11 | 中邮科通信技术股份有限公司 | Network security threat identification method based on event detection mode |
CN114338419A (en) * | 2021-12-15 | 2022-04-12 | 中电信数智科技有限公司 | IPv6 global networking edge node monitoring and early warning method and system |
CN114338419B (en) * | 2021-12-15 | 2024-04-16 | 中电信数智科技有限公司 | IPv6 global networking edge node monitoring and early warning method and system |
CN115277249A (en) * | 2022-09-22 | 2022-11-01 | 山东省计算中心(国家超级计算济南中心) | Network security situation perception method based on cooperation of multi-layer heterogeneous network |
CN115277249B (en) * | 2022-09-22 | 2022-12-20 | 山东省计算中心(国家超级计算济南中心) | Network security situation perception method based on cooperation of multi-layer heterogeneous network |
CN116471066A (en) * | 2023-04-06 | 2023-07-21 | 华能信息技术有限公司 | Flow analysis method based on flow probe |
US11956117B1 (en) | 2023-05-22 | 2024-04-09 | Google Llc | Network monitoring and healing based on a behavior model |
Also Published As
Publication number | Publication date |
---|---|
CN108923975B (en) | 2021-08-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||