CN1658576A

CN1658576A - Detection and defence method for data flows of large websites

Info

Publication number: CN1658576A
Application number: CN 200510033423
Authority: CN
Inventors: 余顺争
Original assignee: Sun Yat Sen University
Current assignee: Sun Yat Sen University
Priority date: 2005-03-09
Filing date: 2005-03-09
Publication date: 2005-08-24
Anticipated expiration: 2025-03-09
Also published as: CN100352208C

Abstract

This invention discloses a statistical anomaly detection and attack defence method for large websites. It comprises a method of modelling bursty traffic with a hidden semi-Markov model, efficient algorithms for training the model and for computing how normal a flow is, and corresponding priority queuing and traffic control measures. The invention is suitable for building a statistical anomaly detection and defence system that gives normal traffic normal service while filtering out DDoS flood attack traffic, and it fits large websites such as sports, news, entertainment and business sites.

Description

A detection and defence method for data flows of large websites

Technical field

The invention belongs to the field of network security technology, and in particular relates to a detection and defence method for data flows of large websites.

Technical background

The security of a large website has characteristics that differ from those of an ordinary website. The peak traffic of a large website is huge, and the site is most vulnerable to flood attacks during peak periods. The most direct and effective attack on a large website is therefore a DDoS (Distributed Denial-of-Service) flood attack. The traffic of a large website is also bursty and, compared with that of an ordinary website, is more easily confused with a flood attack. Common statistical anomaly detection methods built for intrusion defence at ordinary websites may misjudge normal burst traffic as attack traffic, and so become inapplicable. The security of large websites is therefore a new technical challenge.

The security strategy adopted by existing large websites relies mainly on very large server arrays, very high-bandwidth network access and a distributed, multi-level, multi-site architecture, which improve reliability and absorb flood traffic. The main problem with this strategy is that system capacity cannot be increased without limit to withstand a DDoS attack of any size, and excess capacity may never be put to real use. It is therefore very important to take effective measures to detect and resist DDoS attacks, so that the security of a large website is improved under reasonable investment and configuration.

The traffic of a large website is a non-stationary random process. Visits to an ordinary website are strongly influenced by users' daily schedules, so the traffic on its servers varies with pronounced daily and weekly cycles. This macroscopic variation is used for dynamic load balancing among servers in different time zones, for classifying websites and for predicting traffic. For an ordinary website this macroscopic behaviour can remain unchanged for several hours, so traffic models for ordinary websites are often limited to the peak period and assume a stationary process. User behaviour at a large website is strongly influenced by the schedule of large events: traffic peaks shortly before and after an event and while it is in progress (lasting from ten-odd to several tens of minutes). This macroscopic variation makes the peak traffic of a large website non-stationary.

Many studies over the past ten years have shown that real traffic exhibits second-order self-similarity and long-range dependence. The random variation of a large website's traffic should therefore still be described by a random process with second-order self-similarity or long-range dependence. Considering the broad applicability of the model and the effectiveness of its parameter estimation algorithms, the present invention adopts the hidden semi-Markov model (HSMM) to describe the random variation of the traffic. The hidden Markov model (HMM) has been applied widely and successfully in many important fields, such as speech recognition, handwriting and character recognition, coding and decoding in digital communications, and DNA sequence classification. Compared with the HMM, the HSMM is better suited to describing non-stationary and non-Markovian distributions. The HSMM can (and the HMM cannot) describe the second-order self-similarity, long-range dependence and time-varying dynamics of real traffic, and it allows the Hurst parameter, which measures self-similarity, to be estimated. The HSMM can therefore be used to detect statistical anomalies in the traffic, determine the traffic volume, and so on.

Summary of the invention

The object of the invention is to overcome the deficiencies of the prior art by providing a detection and defence method for data flows of large websites that can reliably distinguish a sudden surge of normal traffic arriving at a large website from attack traffic, and can shield the website from the attack traffic.

The technical solution adopted by the invention is as follows:

In this detection and defence method for data flows of large websites, a detection model is built with a hidden semi-Markov model and trained on the normal traffic of the large website. The trained model is then applied in real time to the data flows arriving at the website. Specifically, the likelihood of each flow's observation sequence with respect to the detection model is computed, and flows are placed in priority queues according to the distribution and magnitude of their likelihoods: the higher a flow's priority, the sooner it is served, and the lower its priority, the later it is served.

The modelling method and training method for the hidden semi-Markov detection model are as follows:

(1) Building the model: assume that the data flow being monitored at the large website has M discrete states, denoted 1, 2, ..., M, and let S be the set of these states. The state transitions are described by a Markov chain with M states. The matrix A holds the state transition probabilities; its element a_{mn} is the probability of a transition from state m to state n. Transitions between states proceed step by step from low to high or from high to low, that is, a_{mn} = 0 when |m - n| > 1.

Let b_m(k) denote the probability that k entities arrive in one unit of time given state m. It follows a Poisson distribution, namely

b_{m} (k) = P (X = k | state m) = \frac{{μ_{m}}^{k - 1}}{(k - 1)!} e^{- μ_{m}},

where k = 1, 2, ..., ∞, μ_m > 0, m ∈ S, and μ_1 ≤ μ_2 ≤ ... ≤ μ_M.

Let p_m(d) denote the discrete probability distribution of the duration of state m, that is, the probability that the time difference between two successive states is d. It follows a Pareto distribution, namely

p_{m} (d) = d^{- λ_{m}} - {(d + 1)}^{- λ_{m}},

where d = 1, 2, ..., ∞, λ_m > 0, and m ∈ S.

The hidden semi-Markov model is then represented by the parameter set Ω = {A, π, λ, μ}, where π = (π_1, π_2, ..., π_M) is the initial state probability vector, λ = (λ_1, λ_2, ..., λ_M) and μ = (μ_1, μ_2, ..., μ_M).
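The two distributions defined above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the described method: the function names `emission_prob` and `duration_prob` are hypothetical, and evaluating the emission probability in log space is only a numerical convenience assumed here.

```python
import math

def emission_prob(k, mu):
    """Shifted Poisson b_m(k) = mu^(k-1) * exp(-mu) / (k-1)!, for k = 1, 2, ..."""
    if k < 1:
        return 0.0
    # lgamma(k) equals ln((k-1)!), so large k does not overflow a factorial.
    return math.exp((k - 1) * math.log(mu) - mu - math.lgamma(k))

def duration_prob(d, lam):
    """Discrete Pareto p_m(d) = d^(-lam) - (d+1)^(-lam), for d = 1, 2, ..."""
    if d < 1:
        return 0.0
    return d ** (-lam) - (d + 1) ** (-lam)
```

Summing `duration_prob(d, lam)` over d = 1, ..., N telescopes to 1 - (N + 1)^(-lam), so p_m(d) is a proper distribution whose tail decays as a power law.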

(2) Training the model: the forward algorithm, the backward algorithm and the parameter estimation formulas given below are applied iteratively until the model parameters converge to a fixed set of values, which yields the finished detection model.

The forward-backward algorithm is as follows:

Let o_t denote the t-th observation vector, which consists of the number of entities r_t arriving in the t-th batch and the time interval q_{t-1} from the start of the (t-1)-th batch to the start of the t-th batch, that is, o_t = (q_{t-1}, r_t). Let o_a^b denote the sequence of observation vectors from the a-th to the b-th, so that o_1^T is the whole observation sequence, of length T. Let s_t denote the state of the flow when the t-th batch arrives, 1 ≤ t ≤ T. Define the following variables:

α_{t} (m) = \Pr [o_{1}^{t}, s_{t} = m | Ω],

β_{t} (m) = \Pr [o_{t + 1}^{T} | s_{t} = m, Ω],

γ_{t} (m) = \Pr [s_{t} = m | o_{1}^{T}, Ω] = \frac{α_{t} (m) β_{t} (m)}{\Pr [o_{1}^{T} | Ω]},

ξ_{t} (m, n) = \Pr [s_{t} = m, s_{t + 1} = n | o_{1}^{T}, Ω] = \frac{α_{t} (m) p_{m} (q_{t}) a_{mn} b_{n} (r_{t + 1}) β_{t + 1} (n)}{\Pr [o_{1}^{T} | Ω]},

The forward algorithm is as follows:

α_{1} (m) = π_{m} b_{m} (r_{1}),

α_{t} (m) = (\underset{M \ge n = m - 1, m, m + 1 \ge 1}{Σ} α_{t - 1} (n) p_{n} (q_{t - 1}) a_{nm}) b_{m} (r_{t}), t = 2, . . ., T, m \in S,

The backward algorithm is as follows:

β_{T} (m) = 1,

β_{t} (m) = \underset{M \ge n = m - 1, m, m + 1 \ge 1}{Σ} p_{m} (q_{t}) a_{mn} β_{t + 1} (n) b_{n} (r_{t + 1}), t = T - 1, T - 2, . . ., 1, m \in S,
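The forward and backward recursions above can be sketched as follows. This is an illustrative, unscaled version under stated assumptions: lists are 0-indexed, `r[t]` is the entity count of batch t, `q[t]` is the interval between batch t and batch t+1 (so `q` has one element fewer than `r`), and all function names are hypothetical. Without rescaling, the values underflow on long sequences.

```python
import math

def emission_prob(k, mu):
    """Shifted Poisson b_m(k)."""
    return math.exp((k - 1) * math.log(mu) - mu - math.lgamma(k)) if k >= 1 else 0.0

def duration_prob(d, lam):
    """Discrete Pareto p_m(d)."""
    return d ** (-lam) - (d + 1) ** (-lam) if d >= 1 else 0.0

def forward(r, q, A, pi, lam, mu):
    """alpha[t][m] for every batch t and state m."""
    M, T = len(pi), len(r)
    alpha = [[0.0] * M for _ in range(T)]
    for m in range(M):
        alpha[0][m] = pi[m] * emission_prob(r[0], mu[m])
    for t in range(1, T):
        for m in range(M):
            # Only neighbouring states contribute, because a_nm = 0 when |n - m| > 1.
            total = sum(alpha[t - 1][n] * duration_prob(q[t - 1], lam[n]) * A[n][m]
                        for n in (m - 1, m, m + 1) if 0 <= n < M)
            alpha[t][m] = total * emission_prob(r[t], mu[m])
    return alpha

def backward(r, q, A, lam, mu):
    """beta[t][m] for every batch t and state m, with beta[T-1][m] = 1."""
    M, T = len(mu), len(r)
    beta = [[1.0] * M for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for m in range(M):
            total = sum(A[m][n] * beta[t + 1][n] * emission_prob(r[t + 1], mu[n])
                        for n in (m - 1, m, m + 1) if 0 <= n < M)
            beta[t][m] = duration_prob(q[t], lam[m]) * total
    return beta
```

For every t, the sum over m of α_t(m) β_t(m) equals Pr[o_1^T | Ω], which gives a convenient consistency check on an implementation.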

The parameter estimation algorithm is as follows:

Maximum likelihood estimate of the parameter λ_m:

{\hat{λ}}_{m} = \arg \max_{λ_{m}} \underset{d \ge 1}{Σ} {\hat{p}}_{m} (d) \ln (d^{- λ_{m}} - {(d + 1)}^{- λ_{m}}),

or, approximately,

{\hat{λ}}_{m} \approx \frac{Σ_{t = 1}^{T} γ_{t} (m)}{Σ_{t = 1}^{T} γ_{t} (m) (\ln q_{t} + \frac{1}{2} \ln \frac{q_{t} + 1}{q_{t}})} = \frac{2 Σ_{t = 1}^{T} γ_{t} (m)}{Σ_{t = 1}^{T} γ_{t} (m) \ln (q_{t} (q_{t} + 1))},

Maximum likelihood estimate of the parameter μ_m:

{\hat{μ}}_{m} = \frac{Σ_{t = 1}^{T} γ_{t} (m) (r_{t} - 1)}{Σ_{t = 1}^{T} γ_{t} (m)},

Maximum likelihood estimate of the initial state probability π_m:

{\hat{π}}_{m} = \frac{γ_{1} (m)}{Σ_{m = 1}^{M} γ_{1} (m)},

Maximum likelihood estimate of the state transition probability a_{mn}:

{\hat{a}}_{mn} = \frac{Σ_{t = 1}^{T - 1} ξ_{t} (m, n)}{Σ_{n = 1}^{M} Σ_{t = 1}^{T - 1} ξ_{t} (m, n)},
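The re-estimation formulas can be sketched as one function that takes the posteriors γ and ξ as inputs. This is a hedged illustration: the name `reestimate` and the 0-indexed list layout are assumptions, the approximate closed form is used for λ_m, and the λ sums run only over the T - 1 batches that have a following interval q_t, which is an assumption made here because q_T is not observed.

```python
import math

def reestimate(r, q, gamma, xi):
    """One parameter update from gamma[t][m] and xi[t][m][n].

    r has T entries, q has T - 1 entries, xi has T - 1 entries.
    Returns (lam, mu, pi, A).
    """
    T, M = len(r), len(gamma[0])
    lam, mu, pi, A = [], [], [], []
    first_total = sum(gamma[0])
    for m in range(M):
        # Approximate estimate of the Pareto exponent (assumption: t = 1 .. T-1 only).
        num = 2.0 * sum(gamma[t][m] for t in range(T - 1))
        den = sum(gamma[t][m] * math.log(q[t] * (q[t] + 1)) for t in range(T - 1))
        lam.append(num / den)
        # Poisson rate: the "- 1" matches the shifted Poisson b_m(k).
        occupancy = sum(gamma[t][m] for t in range(T))
        mu.append(sum(gamma[t][m] * (r[t] - 1) for t in range(T)) / occupancy)
        # Initial state probability.
        pi.append(gamma[0][m] / first_total)
        # Transition probabilities, normalised over the destination state n.
        row = [sum(xi[t][m][n] for t in range(T - 1)) for n in range(M)]
        row_total = sum(row)
        A.append([x / row_total for x in row])
    return lam, mu, pi, A
```

The sketch assumes every state has non-zero posterior mass; a production version would need to guard the divisions.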

The detection model can be trained offline or online. In offline training, the detection model is trained on normal traffic while the system is offline, which ensures that the trained model computes the likelihood of normal traffic accurately. Online training runs alongside normal operation: the current values of the parameters of a detection model that has been trained offline and put into use serve as initial values, the data flows arriving at the website are collected in real time, and whenever a flow is detected as normal, the forward-backward algorithm and the parameter estimation algorithm are applied repeatedly until the model parameters converge to a fixed set of values.

Applying the detection model in real time to the data flows arriving at the large website means computing the likelihood of each flow's observation sequence with respect to the detection model. The likelihood is computed by first running the forward algorithm,

α_{1} (m) = π_{m} b_{m} (r_{1}),

α_{t} (m) = (\underset{M \ge n = m - 1, m, m + 1 \ge 1}{Σ} α_{t - 1} (n) p_{n} (q_{t - 1}) a_{nm}) b_{m} (r_{t}), t = 2, . . ., T, m \in S,

and then evaluating:

\Pr [o_{1}^{t} | Ω] = Σ_{m = 1}^{M} α_{t} (m) .
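In practice the sum of the forward variables becomes too small to represent after a few hundred batches. A common remedy, which the source does not describe and which is swapped in here as an assumption, is to renormalise the forward variables at every step and accumulate the logarithms of the normalising constants. The function name `log_likelihood` is hypothetical.

```python
import math

def emission_prob(k, mu):
    """Shifted Poisson b_m(k)."""
    return math.exp((k - 1) * math.log(mu) - mu - math.lgamma(k)) if k >= 1 else 0.0

def duration_prob(d, lam):
    """Discrete Pareto p_m(d)."""
    return d ** (-lam) - (d + 1) ** (-lam) if d >= 1 else 0.0

def log_likelihood(r, q, A, pi, lam, mu):
    """Return ln Pr[o_1^T | Omega] from a forward pass rescaled at every step.

    Assumes every observation has non-zero probability under the model.
    """
    M = len(pi)
    alpha = [pi[m] * emission_prob(r[0], mu[m]) for m in range(M)]
    scale = sum(alpha)
    total = math.log(scale)
    alpha = [a / scale for a in alpha]
    for t in range(1, len(r)):
        nxt = []
        for m in range(M):
            s = sum(alpha[n] * duration_prob(q[t - 1], lam[n]) * A[n][m]
                    for n in (m - 1, m, m + 1) if 0 <= n < M)
            nxt.append(s * emission_prob(r[t], mu[m]))
        scale = sum(nxt)           # equals Pr[o_t | o_1^(t-1), Omega]
        total += math.log(scale)
        alpha = [a / scale for a in nxt]
    return total
```

Dividing the returned value by the sequence length gives the per-batch average log-likelihood.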

In the above technical solution, the data flow refers to the number of requests, packets, bytes, connections, sessions, pages or users arriving at the website per second, or any combination of these quantities. The data flow may be a flow from a single user, an aggregate flow of the group of users represented by a proxy server, the aggregate of all flows arriving at the large website, or the aggregate flow of all new users.

The invention builds a detection model with a hidden semi-Markov model, trains the model on normal traffic, and then uses it to test the data flows arriving at the large website in real time. Flows are queued by priority according to the magnitude and distribution of their likelihoods with respect to the detection model. A normal flow has a relatively high likelihood that fits the likelihood distribution of normal traffic, and so is given a higher priority. An attack flow does not fit the normal traffic model: its likelihood is lower or falls outside the likelihood distribution of normal traffic, and so it is given a lower priority. Normal flows therefore receive the website's normal service at high priority, while the flows with the lowest priority are discarded when network resources run short. In this way a sudden surge of normal traffic is reliably distinguished from attack traffic, normal flows receive normal service, and attack flows are prevented from attacking the large website.

Description of drawings

Fig. 1 is a structural diagram of the present invention.

Embodiment

The invention is described further below with reference to the accompanying drawing.

The structure of the invention is shown in Fig. 1. First, traffic is collected while the network is operating normally; after the necessary processing, such as format conversion and removal of unneeded information, it is saved in the normal user data set (1). Module (2) is the HSMM parameter estimation module, which contains the HSMM forward-backward algorithm and the iterative parameter estimation formulas. When the model is trained for the first time, this module initializes the model parameters with default values, namely a_{1,1} = a_{1,2} = a_{M,M} = a_{M,M-1} = 1/2, a_{m,m} = a_{m,m-1} = a_{m,m+1} = 1/3 (1 < m < M), π_m = 1/M, λ_m = 1.5 (so that 1 < λ_m < 2), μ_m = max(r_t) * m/M and M = 10. It then performs the forward-backward iteration and computes estimates of all model parameters, repeating this process until the likelihood Pr[o_1^T | Ω] no longer increases or increases only very slightly. The trained model parameters, together with the distribution of the entropy of normal traffic with respect to this model, are saved in the HSMM model parameter store (3) for later use. The forward algorithm module (4) contains the HSMM forward algorithm, takes the model parameters it needs from the HSMM model parameter store (3), and is used for online, real-time statistical anomaly detection of flows.

When the application requires it, the invention can also update the model parameters online. In that case the data in the normal user data set (1) come from traffic collected in real time: whenever the collected data are detected as normal, they are entered into the normal user data set (1) and used to update the model parameters in real time. The length of the data sequence used for real-time updating can be limited to between tens of minutes and hours, so that the model adapts to the dynamic variation of the traffic and the training time is reduced. During a real-time update, the HSMM parameter estimation module (2) uses the current values of the model parameters as initial values (rather than the defaults) and, after several iterations of the HSMM forward-backward algorithm, obtains the updated parameter values. The result is saved in the HSMM model parameter store (3) for use by the forward algorithm module (4).

Module (5) is the flow collection and classification module. When it receives a packet, it identifies the packet's flow by its destination and source IP addresses, protocol, port, cookies and so on, and then increments r_t, the number of entities of that flow that have arrived in the current unit of time. At the end of the current unit of time it computes the time difference q_{t-1} since the previous batch arrived, retrieves the flow's forward variable values {α_{t-1}(m)} from the database, and passes them to the forward algorithm module (4), which computes the forward variables {α_t(m)} and then the entropy ln(Pr[o_1^t | Ω])/t. The probability with which this entropy value occurs in the entropy distribution of normal traffic gives the flow's degree of normality with respect to the given model parameters. According to this degree of normality, the packets of the flow that arrive in the next unit of time are sent to the corresponding queue of the priority queuing control module (6): the greater the degree of normality, the higher the priority, and vice versa. Packets with the lowest priority are filtered out when network resources are insufficient. Normal flows are thereby protected and attack flows filtered out. When the monitored flow is the total traffic or the total number of new users, the priority queuing control module (6) only raises an alarm on anomalies.

In this embodiment, the model is first trained on a set of observation sequences:

A) Give initial values to the model parameter set Ω. Any suitable initialization may be used. One simple and reasonable choice is to make the state transition probabilities equiprobable, that is, a_{1,1} = a_{1,2} = a_{M,M} = a_{M,M-1} = 1/2, a_{m,m} = a_{m,m-1} = a_{m,m+1} = 1/3 (1 < m < M), and π_m = 1/M; to make the state duration distribution p_m(d) a heavy-tailed Pareto distribution, that is, λ_m = 1.5 (1 < λ_m < 2); and to give different states different entity arrival rates, that is, μ_m = max(r_t) * m/M. M can be any integer between 10 and 30.

B) Run the forward and backward algorithms to obtain the forward variables {α_t(m)} and the backward variables {β_t(m)}.

C) Use the parameter estimation algorithm to obtain estimates of the model parameter set Ω.

D) Repeat steps B) and C) until the likelihood Pr[o_1^T | Ω] converges to a fixed value.

E) Take the frequency distribution of the entropy (that is, ln Pr[o_1^T | Ω]/T) of this set of observation sequences as the entropy distribution of normal traffic.
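The initialization in step A) can be sketched as below. The function name `initial_parameters` and the 0-indexed list layout are assumptions made for illustration; the values themselves follow the defaults stated in step A).

```python
def initial_parameters(r, M=10):
    """Default starting values for Omega = {A, pi, lam, mu} given the counts r."""
    # Equiprobable transitions to the state itself and its immediate neighbours:
    # 1/2 each in the two boundary states, 1/3 each in the interior states.
    A = [[0.0] * M for _ in range(M)]
    for m in range(M):
        neighbours = [n for n in (m - 1, m, m + 1) if 0 <= n < M]
        for n in neighbours:
            A[m][n] = 1.0 / len(neighbours)
    pi = [1.0 / M] * M          # uniform initial state distribution
    lam = [1.5] * M             # heavy-tailed durations, 1 < lam < 2
    peak = max(r)
    # State m (1-based) gets arrival rate max(r_t) * m / M, so rates increase with m.
    mu = [peak * (m + 1) / M for m in range(M)]
    return A, pi, lam, mu
```

Spreading the rates evenly up to the observed peak respects the ordering μ_1 ≤ μ_2 ≤ ... ≤ μ_M required by the model.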

Because the model can be trained offline, training does not affect the online performance of the system, and in practice the training time is not long. If the application requires it, the model can also be trained online; the sequence used for training can be limited to between tens of minutes and hours, so that the model adapts to the dynamic variation of the traffic and the training time is reduced. The difficulty online training may face is how to guarantee that the traffic used for training is normal and contains no attack traffic.

The trained model is then applied to the statistical anomaly detection of flows, that is, the likelihood of each flow is computed for the given model parameters:

A) When the detection system first receives an entity belonging to a flow (flows are distinguished by destination and source IP addresses, protocol, port or cookies), count the entities of that flow arriving in this unit of time (for example, 1 second) and compute the initial forward variables α_1(m), m ∈ S. Set t = 1 and let τ_0 be the start time of this unit of time.

B) In the current unit of time (whose start time is τ seconds), if entities of this flow are received, set t = t + 1, count the number r_t of entities of the flow arriving in this unit of time, and compute the time difference from the previous batch, q_{t-1} = τ - τ_0. Then set τ_0 = τ.

C) Compute the forward variables α_t(m), m ∈ S, and then the entropy ln(Pr[o_1^t | Ω])/t.

D) The probability with which this entropy value occurs in the entropy distribution of normal traffic is the flow's degree of normality with respect to the given model parameters (which represent the common characteristics of all normal users).

E) Repeat steps B) through D).
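Steps A) to C) can be sketched as a small per-flow tracker. The class name `FlowDetector`, the dictionary used as the flow database and the per-step renormalisation of the forward variables are assumptions made for this illustration; step D), the lookup in the normal entropy distribution, is left to the caller.

```python
import math

def emission_prob(k, mu):
    """Shifted Poisson b_m(k); returns 0.0 when the value underflows."""
    return math.exp((k - 1) * math.log(mu) - mu - math.lgamma(k)) if k >= 1 else 0.0

def duration_prob(d, lam):
    """Discrete Pareto p_m(d)."""
    return d ** (-lam) - (d + 1) ** (-lam) if d >= 1 else 0.0

class FlowDetector:
    def __init__(self, A, pi, lam, mu):
        self.A, self.pi, self.lam, self.mu = A, pi, lam, mu
        self.flows = {}  # flow key -> (normalised alpha, log-likelihood, t, tau0)

    def observe(self, key, tau, count):
        """Process `count` entities of flow `key` in the unit interval starting at `tau`.

        Returns the entropy ln(Pr[o_1^t | Omega]) / t, or -inf if the batch has
        zero probability under the model (the stored state is then left unchanged).
        """
        M = len(self.pi)
        state = self.flows.get(key)
        if state is None:
            # Step A: first batch of this flow.
            raw = [self.pi[m] * emission_prob(count, self.mu[m]) for m in range(M)]
            log_lik, t = 0.0, 1
        else:
            # Step B: later batch; the gap to the previous batch is q_(t-1).
            alpha, log_lik, t, tau0 = state
            gap = tau - tau0
            raw = []
            for m in range(M):
                s = sum(alpha[n] * duration_prob(gap, self.lam[n]) * self.A[n][m]
                        for n in (m - 1, m, m + 1) if 0 <= n < M)
                raw.append(s * emission_prob(count, self.mu[m]))
            t += 1
        # Step C: update the running log-likelihood and return the entropy.
        scale = sum(raw)
        if scale <= 0.0:
            return float("-inf")
        log_lik += math.log(scale)
        self.flows[key] = ([a / scale for a in raw], log_lik, t, tau)
        return log_lik / t
```

A flow whose batch sizes match the trained rates yields an entropy close to that of normal traffic, while a flow with far larger batches yields a much more negative value.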

In computing the likelihood, most of the time may be spent searching memory for each flow's forward variable values {α_{t-1}(m), m ∈ S} from the previous batch. This search time can be reduced significantly by hashing the IP address, building a search tree, splitting the traffic for separate processing, and so on. In addition, the temporal locality of the IP packets arriving at a website (a source IP address that appeared recently is very likely to appear again) can be exploited with a stack that keeps recently seen IP addresses near the top, which reduces the average time needed to search the address list. In fact, it is not necessary to examine every flow separately; several flows can be aggregated and examined together. For example, if each 32-bit IP address is abbreviated to a 10-bit code, only 1024 aggregate flows need to be examined. When the detection system finds that one of these aggregate flows is abnormal, it can analyse that aggregate in finer detail, examining each flow within it to find the flows that cause the anomaly.
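The aggregation of flows into 1024 buckets can be sketched as follows. The source does not say how a 32-bit address is abbreviated to a 10-bit code; XOR-folding is used here purely as an assumed example, and the function names are hypothetical.

```python
import ipaddress
from collections import Counter

def bucket_of(ip, bits=10):
    """Fold an IPv4 address into a `bits`-bit code by XOR-ing successive chunks."""
    value = int(ipaddress.IPv4Address(ip))
    mask = (1 << bits) - 1
    code = 0
    while value:
        code ^= value & mask
        value >>= bits
    return code

def aggregate_counts(source_ips, bits=10):
    """Count the entities per aggregate flow for one unit of time."""
    return Counter(bucket_of(ip, bits) for ip in source_ips)
```

Each bucket's count then plays the role of r_t for one aggregate flow, and an abnormal bucket can be split back into its individual source addresses for finer analysis.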

Once the entropy ln(Pr[o_1^t | Ω])/t of a flow has been computed, the subsequent packets of that flow can be sent to the corresponding service queue according to the probability with which this value occurs in the entropy distribution of normal traffic. The larger that probability, the higher the priority, and vice versa. Entities with the lowest priority are filtered out when network resources are insufficient. Normal flows are thereby protected and DDoS attack flows absorbed.
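One possible way to turn an entropy value into a queue index is sketched below. The bin width, the number of priority levels and the linear mapping relative to the most frequent bin are all assumptions made for illustration; the source only requires that a more probable entropy value receives a higher priority.

```python
import math
from collections import Counter

def entropy_distribution(normal_entropies, bin_width=0.5):
    """Relative frequency of each entropy bin observed in normal traffic."""
    counts = Counter(math.floor(e / bin_width) for e in normal_entropies)
    total = len(normal_entropies)
    return {b: c / total for b, c in counts.items()}

def priority(entropy, distribution, bin_width=0.5, levels=4):
    """Map an entropy value to a queue index: 0 is served first, levels - 1 last."""
    if not math.isfinite(entropy):
        return levels - 1
    p = distribution.get(math.floor(entropy / bin_width), 0.0)
    peak = max(distribution.values())
    # The rarer the value is among normal flows, the lower the priority.
    return min(int((1.0 - p / peak) * levels), levels - 1)
```

Flows that land in the last queue are the ones to drop first when resources run short.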

In addition, the traffic attacking a large website may be arbitrarily generated traffic, traffic disguised as normal, or replayed normal traffic. One effective way of dealing with arbitrarily generated traffic, traffic with spoofed source addresses, traffic reflected off other servers and replayed normal traffic is to use cookies. Packets that do not carry a cookie freshly generated by the server are easy to identify and can be treated differently. For example, a packet without a cookie can simply be filtered out if its source IP address or port has visited the website recently. Otherwise it can be treated as a possible new user and given rate-limited service by a dedicated queue or even a dedicated server (the arrival rate of new users is usually only a small fraction of the total user arrival rate). The traffic that is hard to detect and filter is therefore attack traffic with normal addresses, ports and cookies. To mount a flood attack on the website, such traffic must either be extremely large or coordinate a great many attack flows, and in either case it is statistically abnormal. This abnormality can be detected by the detection module proposed by the invention and filtered out by the corresponding defence module.
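The cookie-based pre-filter described above amounts to a three-way decision, sketched here. The function name `triage` and the returned labels are hypothetical; how cookies are validated and how recent visits are tracked is outside this sketch.

```python
def triage(has_valid_cookie, source_seen_recently):
    """Decide how to handle a packet before statistical detection."""
    if has_valid_cookie:
        # Known user: pass on to the HSMM-based anomaly detection.
        return "detect"
    if source_seen_recently:
        # No cookie although the source visited recently: filter out.
        return "drop"
    # No cookie and an unknown source: possible new user, rate-limited queue.
    return "new_user_queue"
```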

Claims

1. A detection and defence method for data flows of large websites, characterized in that a detection model is built with a hidden semi-Markov model and trained on the normal traffic of the large website; the detection model is then applied in real time to the data flows arriving at the large website, the specific detection method being to compute the likelihood of each flow's observation sequence with respect to the detection model; and the flows are then queued by priority according to the distribution and magnitude of their likelihoods, a flow with lower priority being served later.

2. The detection and defence method for data flows of large websites according to claim 1, characterized in that the modelling method and training method for the hidden semi-Markov model comprise:

(1) building the model: assume that the data flow being monitored at the large website has M discrete states, denoted 1, 2, ..., M, and let S be the set of these states; the state transitions are described by a Markov chain with M states; the matrix A holds the state transition probabilities, and its element a_{mn} is the probability of a transition from state m to state n; transitions between states proceed step by step from low to high or from high to low, that is, a_{mn} = 0 when |m - n| > 1;

b_{m} (k) = P (X = k | state m) = {μ_{m}}^{k - 1} e^{- μ_{m}} / (k - 1)!,

where k = 1, 2, ..., ∞, μ_m > 0, m ∈ S, and μ_1 ≤ μ_2 ≤ ... ≤ μ_M;

p_{m} (d) = d^{- λ_{m}} - {(d + 1)}^{- λ_{m}},

where d = 1, 2, ..., ∞, λ_m > 0, and m ∈ S;

(2) training the model, in which the main forward-backward algorithm is as follows:

Let o_t denote the t-th observation vector, which consists of the number of entities r_t arriving in the t-th batch and the time interval q_{t-1} from the start of the (t-1)-th batch to the start of the t-th batch, that is, o_t = (q_{t-1}, r_t). Let o_a^b denote the sequence of observation vectors from the a-th to the b-th, so that o_1^T is the whole observation sequence, of length T. Let s_t denote the state of the flow when the t-th batch arrives, 1 ≤ t ≤ T. Define the following variables:

α_{t} (m) = \Pr [o_{1}^{t}, s_{t} = m | Ω],

β_{t} (m) = \Pr [o_{t + 1}^{T} | s_{t} = m, Ω],

γ_{t} (m) = \Pr [s_{t} = m | o_{1}^{T}, Ω] = α_{t} (m) β_{t} (m) / \Pr [o_{1}^{T} | Ω],

ξ_{t} (m, n) = \Pr [s_{t} = m, s_{t + 1} = n | o_{1}^{T}, Ω] = α_{t} (m) p_{m} (q_{t}) a_{mn} b_{n} (r_{t + 1}) β_{t + 1} (n) / \Pr [o_{1}^{T} | Ω],

The forward algorithm is as follows: α_{1} (m) = π_{m} b_{m} (r_{1}),

α_{t} (m) = (\underset{M \ge n = m - 1, m, m + 1 \ge 1}{Σ} α_{t - 1} (n) p_{n} (q_{t - 1}) a_{nm}) b_{m} (r_{t}), t = 2, . . ., T, m \in S,

The backward algorithm is as follows: β_{T} (m) = 1,

β_{t} (m) = \underset{M \ge n = m - 1, m, m + 1 \ge 1}{Σ} p_{m} (q_{t}) a_{mn} β_{t + 1} (n) b_{n} (r_{t + 1}), t = T - 1, T - 2, . . ., 1, m \in S,

The estimates of the model parameters are then computed with the following parameter estimation algorithm: maximum likelihood estimate of the parameter λ_m:

{\hat{λ}}_{m} = \arg \max_{λ_{m}} \underset{d \ge 1}{Σ} {\hat{p}}_{m} (d) \ln (d^{- λ_{m}} - {(d + 1)}^{- λ_{m}}),

or, approximately,

{\hat{λ}}_{m} \approx \frac{Σ_{t = 1}^{T} γ_{t} (m)}{Σ_{t = 1}^{T} γ_{t} (m) (\ln q_{t} + \frac{1}{2} \ln \frac{q_{t} + 1}{q_{t}})} = \frac{2 Σ_{t = 1}^{T} γ_{t} (m)}{Σ_{t = 1}^{T} γ_{t} (m) \ln (q_{t} (q_{t} + 1))},

Maximum likelihood estimate of the parameter μ_m:

{\hat{μ}}_{m} = \frac{Σ_{t = 1}^{T} γ_{t} (m) (r_{t} - 1)}{Σ_{t = 1}^{T} γ_{t} (m)},

Maximum likelihood estimate of the initial state probability π_m:

{\hat{π}}_{m} = \frac{γ_{1} (m)}{Σ_{m = 1}^{M} γ_{1} (m)},

Maximum likelihood estimate of the state transition probability a_{mn}:

{\hat{a}}_{mn} = \frac{Σ_{t = 1}^{T - 1} ξ_{t} (m, n)}{Σ_{n = 1}^{M} Σ_{t = 1}^{T - 1} ξ_{t} (m, n)},

Finally, the forward algorithm, the backward algorithm and the parameter estimation formulas are applied iteratively until the model parameters converge to a fixed set of values, yielding the finished detection model.

3. The detection and defence method for data flows of large websites according to claim 2, characterized in that the likelihood is computed by first running the forward algorithm and then evaluating:

\Pr [o_{1}^{t} | Ω] = Σ_{m = 1}^{M} α_{t} (m) .

4. The detection and defence method for data flows of large websites according to claim 1, 2 or 3, characterized in that the data flow refers to the number of requests, packets, bytes, connections, sessions, pages or users arriving at the website per second, or any combination of these quantities, and that the data flow may be a flow from a single user, an aggregate flow of the group of users represented by a proxy server, the aggregate of all flows arriving at the large website, or the aggregate flow of all new users.

5. The detection and defence method for data flows of large websites according to claim 4, characterized in that the detection model can be trained offline or online; in online training, the current values of the detection model parameters are used as initial values, the data flows arriving at the website are collected in real time, and whenever a flow is detected as normal, the forward-backward algorithm and the parameter estimation formulas are applied repeatedly until the model parameters converge to a fixed set of values.

6. The detection and defence method for data flows of large websites according to claim 5, characterized in that flows are queued by priority according to the distribution and magnitude of their likelihoods, flows with high priority are served normally, and flows with the lowest priority are filtered out when network resources are insufficient.