CN112564945B - IP network flow estimation method based on time sequence prior and sparse representation
- Publication number
- CN112564945B (application CN202011318745.4A)
- Authority
- CN
- China
- Prior art keywords
- matrix
- flow
- sparse representation
- traffic
- time sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/147—Network analysis or design for predicting network behaviour
- H04L41/142—Network analysis or design using statistical or mathematical methods
- H04L41/145—Network analysis or design involving simulating, designing, planning or modelling of a network
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0876—Network utilisation, e.g. volume of load or congestion level
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Pure & Applied Mathematics (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Algebra (AREA)
- Environmental & Geological Engineering (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention discloses an IP network traffic estimation method based on time sequence prior and sparse representation. First, the traffic values transmitted between all source-destination node pairs in a network are collected to construct an incomplete traffic matrix. Next, the spatio-temporal correlation present in the incomplete traffic matrix is modeled using sparse representation theory and regularization to form a traffic matrix estimation model. The complex original problem is then decomposed by the alternating direction method of multipliers (ADMM) into several sub-problems that are easy to solve, and a locally optimal solution of the original problem is found by iteratively computing the globally optimal solutions of the sub-problems. Finally, the complete traffic matrix is estimated. By exploiting both the temporal and the spatial correlation in the traffic matrix, the invention uses the spatial correlation of the traffic matrix while also accounting for the temporal correlation of adjacent network nodes, and provides theoretical support for optimizing traffic estimation methods.
Description
Technical Field
The invention relates to a network traffic estimation method, in particular to an IP network traffic estimation method based on time sequence prior and sparse representation.
Background
A traffic matrix (TM) is a common form of network-wide traffic data. It records the traffic values transmitted between all source-destination (OD) node pairs of a measured network and is widely used in traffic engineering, network-wide anomaly detection, and other applications. However, because the traffic matrix must capture global state information about the network traffic, directly measuring all of its entries is too costly to be feasible in practice. Estimating the traffic matrix from indirect observations reduces the cost and overhead of direct measurement and has become an active research field.
Many effective methods have been applied to traffic matrix estimation, but their estimation accuracy is insufficient because they do not exploit the inherent spatio-temporal correlation of the traffic matrix.
Disclosure of Invention
Purpose of the invention: to address the shortcomings of the prior art, the invention provides an IP network traffic estimation method based on time sequence prior and sparse representation that improves the accuracy of traffic matrix estimation.
Technical scheme: the IP network traffic estimation method based on time sequence prior and sparse representation according to the invention comprises the following steps:
S1, collecting the traffic values transmitted between all source-destination node pairs in a network to construct an incomplete traffic matrix;
S2, establishing a traffic matrix estimation model for the spatio-temporal correlation present in the incomplete traffic matrix, using sparse representation theory and regularization;
S3, converting the original traffic matrix estimation problem into several easily solved sub-problems by the alternating direction method of multipliers;
S4, iteratively optimizing the globally optimal solutions of the sub-problems to find a locally optimal solution of the original problem, and estimating the complete traffic matrix.
The incomplete traffic matrix in step S1 is constructed as follows:
Assuming that the number of time intervals in one day is $T$ and the total number of OD pairs is $N$, the traffic matrix can be represented as a $T \times N$ matrix $M = [m_{ij}]$,
where $m_{ij}$ denotes the traffic value of the $j$-th OD pair in the $i$-th time interval. Known and missing traffic values are marked with distinct symbols, with missing values marked "?". In the traffic matrix, one column represents one sample, namely the traffic values of one OD pair over all time intervals of the day, and one row represents the traffic values of all OD pairs in one time interval.
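As a minimal sketch of step S1, assuming measurements arrive as (time interval, OD pair, value) triples with 0-based indices (a hypothetical input format that the patent does not specify), the incomplete matrix and the index set of known entries could be assembled as follows. Missing entries are stored as `None`, and plain Python lists are used to keep the sketch dependency-free.

```python
from typing import Iterable, List, Optional, Set, Tuple

Matrix = List[List[Optional[float]]]


def build_incomplete_matrix(
    records: Iterable[Tuple[int, int, float]], T: int, N: int
) -> Tuple[Matrix, Set[Tuple[int, int]]]:
    """Return (M, omega) for T time intervals and N OD pairs.

    M[i][j] is the traffic of OD pair j in interval i, or None if missing.
    omega is the set of (i, j) indices whose values are known.
    The (interval, od_pair, value) record format is an assumption.
    """
    M: Matrix = [[None] * N for _ in range(T)]
    omega: Set[Tuple[int, int]] = set()
    for i, j, value in records:
        if not (0 <= i < T and 0 <= j < N):
            raise IndexError(f"index ({i}, {j}) outside a {T}x{N} matrix")
        if value < 0:
            raise ValueError("traffic values must be non-negative")
        M[i][j] = float(value)
        omega.add((i, j))
    return M, omega


# Two time intervals (rows), three OD pairs (columns), three known entries.
M, omega = build_incomplete_matrix(
    [(0, 0, 12.0), (0, 2, 3.5), (1, 1, 7.0)], T=2, N=3
)
```

Each row of `M` is one time interval and each column is one OD pair, matching the layout described above.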
Step S2 comprises the following:
First, a traffic estimation model based on sparse representation theory is established according to the spatial correlation present in the incomplete traffic matrix. Its expression is as follows:
where $M$ is the known incomplete traffic matrix, $X$ is the complete traffic matrix to be solved, $W$ is the sparse representation coefficient matrix to be solved, $\Omega$ is the set of indices of the known traffic values in the traffic matrix, and $\lambda_1$ is an adjustable parameter. $P_\Omega(\cdot)$ is a projection operator that retains the element at position $(i, j)$ when $(i, j) \in \Omega$:
Because the element values in the traffic matrix are non-negative, $X \ge 0$. To avoid the trivial solution, the diagonal elements of the sparse representation matrix $W$ are constrained to be 0, i.e., $\operatorname{diag}(W) = 0$.
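The projection operator $P_\Omega$ can be illustrated with a short sketch. It assumes the common convention that entries outside $\Omega$ are set to zero; the text only states what happens for indices inside $\Omega$, so the zero fill is an assumption.

```python
from typing import List, Set, Tuple


def project_omega(
    X: List[List[float]], omega: Set[Tuple[int, int]]
) -> List[List[float]]:
    """Keep X[i][j] where (i, j) is in omega; set every other entry to 0.

    The zero fill outside omega is an assumed convention.
    """
    return [
        [x if (i, j) in omega else 0.0 for j, x in enumerate(row)]
        for i, row in enumerate(X)
    ]


X = [[1.0, 2.0], [3.0, 4.0]]
omega = {(0, 0), (1, 1)}
P = project_omega(X, omega)
```

Under this convention, the constraint $P_\Omega(M) = P_\Omega(X)$ simply says that the estimate must agree with every measured value.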
Then, for the temporal correlation present in the incomplete traffic matrix, a traffic estimation model based on time sequence prior and sparse representation is established by extending the model based on sparse representation theory. Its expression is as follows:
where $\lambda_2$ is an adjustable parameter and $R$ is a Toeplitz(0, 1, -1) matrix.
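As a hedged sketch only: assuming the model takes the usual self-representation form, with a Frobenius-norm fitting term, an $\ell_1$ penalty on $W$, and the temporal term $\|RX\|_1$ named in the description, one reading consistent with the stated symbols and constraints is

$$
\min_{X,\,W}\; \|X - XW\|_F^2 + \lambda_1 \|W\|_1 + \lambda_2 \|RX\|_1
\quad \text{s.t.}\quad X \ge 0,\;\; \operatorname{diag}(W) = 0,\;\; P_\Omega(M) = P_\Omega(X).
$$

Here $X \in \mathbb{R}^{T \times N}$ and $W \in \mathbb{R}^{N \times N}$, so that each column of $X$ is approximated by a combination of the other columns. The exact weighting of the terms in the patent may differ, and setting $\lambda_2 = 0$ in this form would give the purely spatial model.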
In step S3, the sub-problems are those of the traffic matrix $X$, the sparse representation coefficient matrix $W$, and the error variable $C$, which represents the error between the incomplete traffic matrix $M$ and the complete traffic matrix $X$ outside the set $\Omega$ during the iteration.
Step S4 comprises the following specific steps:
S41, to simplify the solution, an error variable $C$ is introduced and the traffic estimation model is rewritten in the following form:
s.t. $X \ge 0$, $\operatorname{diag}(W) = 0$, $M = X + C$, $P_\Omega(C) = 0$
S42, the constraint terms are moved into the objective function by defining indicator functions $g(X)$ and $f(C)$, which converts the optimization problem into an equivalent penalty function form. The indicator functions are expressed as follows:
The penalty function is expressed as follows:
S43, from the penalty function, the expressions for the unknowns $X$, $W$, $C$, and $\beta$ are obtained as follows:
where $\rho$ is a fixed parameter, preferably 1.1 or 1.2; $\beta$ is the parameter of the penalty function; $\beta_k$ and $\beta_{k-1}$ are the values of $\beta$ in the $k$-th and $(k-1)$-th iterations, respectively; $\beta_{\max}$ is a fixed parameter representing the maximum value of $\beta$, preferably $10^6$; and $\|\cdot\|_F$ denotes the Frobenius norm. The expressions above are solved by alternating optimization, and the traffic matrix $X$ obtained when the preset maximum number of iterations is reached is the estimated complete traffic matrix. The maximum number of iterations is a constant no greater than 50, with the specific value chosen according to experimental results.
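The penalty parameter schedule described above, $\beta_k = \min(\rho\,\beta_{k-1}, \beta_{\max})$, can be sketched as a small generator. The starting value `beta0` is a hypothetical parameter, since the text does not state an initial value; the defaults for `rho`, `beta_max`, and `max_iter` follow the preferred values given above.

```python
from typing import Iterator


def penalty_schedule(
    beta0: float, rho: float = 1.1, beta_max: float = 1e6, max_iter: int = 50
) -> Iterator[float]:
    """Yield beta_k = min(rho * beta_{k-1}, beta_max) for k = 1..max_iter.

    beta0 (the value before the first iteration) is an assumed input.
    """
    beta = beta0
    for _ in range(max_iter):
        beta = min(rho * beta, beta_max)
        yield beta


betas = list(penalty_schedule(beta0=1.0, rho=1.2, beta_max=1e6, max_iter=50))
capped = list(penalty_schedule(beta0=1e5, rho=1.2, beta_max=1e6, max_iter=50))
```

The sequence grows geometrically and then stays at `beta_max` once the cap is reached, which is what keeps the penalty from growing without bound.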
When solving for the sparse representation matrix $W$, each column is solved separately, and the solution of each column is treated as a LASSO problem. Let $W_i$ denote the $i$-th column of $W$, $X_i$ the $i$-th column of $X$, and $X_{-i}$ the matrix obtained by removing the $i$-th column from $X$. To satisfy the constraint $\operatorname{diag}(W) = 0$, $X_i$ does not participate in the calculation when solving for $W_i$. The expression for $W_i$ is as follows:
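The column-wise LASSO step can be sketched as follows. This is an illustration under assumptions, not the patent's formula: it takes the per-column objective to be $\tfrac{1}{2}\|X_i - X_{-i} w\|_2^2 + \lambda \|w\|_1$ (the factor $\tfrac{1}{2}$ and the way $\lambda_1$ and $\beta$ scale the terms are assumed), and it uses cyclic coordinate descent, which is one standard LASSO solver. The function name `lasso_column` is hypothetical.

```python
from typing import List


def soft_threshold(v: float, tau: float) -> float:
    """Scalar soft-thresholding: sign(v) * max(|v| - tau, 0)."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def lasso_column(
    X: List[List[float]], i: int, lam: float, sweeps: int = 200
) -> List[float]:
    """Approximate W_i = argmin_w 0.5*||X_i - X_{-i} w||^2 + lam*||w||_1.

    Returns a length-N vector whose i-th entry is fixed at 0, so that
    diag(W) = 0 holds by construction. Uses cyclic coordinate descent.
    """
    T, N = len(X), len(X[0])
    w = [0.0] * N
    residual = [X[t][i] for t in range(T)]  # X_i - X w, with w = 0 initially
    for _ in range(sweeps):
        for j in range(N):
            if j == i:
                continue  # column i of X never enters its own representation
            col_sq = sum(X[t][j] ** 2 for t in range(T))
            if col_sq == 0.0:
                continue
            # Correlate column j with the residual that excludes j's own term.
            rho_j = sum(
                X[t][j] * (residual[t] + X[t][j] * w[j]) for t in range(T)
            )
            new_wj = soft_threshold(rho_j, lam) / col_sq
            delta = new_wj - w[j]
            if delta != 0.0:
                for t in range(T):
                    residual[t] -= X[t][j] * delta
                w[j] = new_wj
    return w


# Column 2 is exactly twice column 0; column 1 is unrelated.
X = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [1.0, 0.0, 2.0]]
w = lasso_column(X, i=2, lam=0.01)
```

In this toy example nearly all the weight lands on column 0 and none on column 1, which illustrates why the resulting $W$ is sparse.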
Beneficial effects: compared with the prior art, the invention has the following notable advantages. It effectively overcomes the inaccurate estimation of the traditional KNN estimation algorithm, and its advantages are more pronounced in high-dimensional settings. It uses the spatial correlation of the traffic matrix while also accounting for the temporal correlation of adjacent network nodes, so the estimated traffic matrix is more accurate, and it provides theoretical support for optimizing traffic estimation methods.
Drawings
FIG. 1 is a schematic view of a model structure according to the present invention;
FIG. 2 is a flow chart of the method of the present invention.
Detailed Description
The technical scheme of the invention is further explained below with reference to the drawings.
As shown in FIG. 1, the basic idea of the IP network traffic estimation method based on time sequence prior and sparse representation is to collect the traffic values at t time points in the network, construct an incomplete traffic matrix from the known traffic values, build a time sequence prior and sparse representation model based on the temporal and spatial correlation, solve the model by the alternating direction method of multipliers, and estimate the complete traffic matrix. The specific steps are as follows:
Step 1), constructing an incomplete traffic matrix $M$:
The traffic values transmitted between all source-destination (OD) node pairs in the network are collected to obtain an incomplete traffic matrix $M$. In $M$, one column represents one sample, namely the traffic values of one OD pair over all time intervals of a day, and one row represents the traffic values of all OD pairs in one time interval. If the number of time intervals in one day is $T$ and the total number of OD pairs is $N$, the traffic matrix can be represented as a $T \times N$ matrix $M = [m_{ij}]$,
where $m_{ij}$ denotes the traffic value of the $j$-th OD pair in the $i$-th time interval. Known and missing traffic values are marked with distinct symbols, with missing values marked "?". Owing to user communication behavior, the traffic values of adjacent nodes in the traffic matrix are correlated, i.e., spatially correlated. Because a large number of values are missing, the traffic matrix is sparse.
Step 2), establishing a flow estimation model based on a sparse representation theory:
In a sparse traffic matrix, each sample can be represented as a linear combination of the other samples: the closer another sample is to the given sample, the higher its weight coefficient, and the farther away it is, the lower its weight coefficient, which may be close to 0. The traffic matrix estimation problem can therefore be addressed with sparse representation theory. Denoting the weight coefficient matrix by $W$, the traffic matrix estimation problem can be modeled as:
where $M$ is the known incomplete traffic matrix, $X$ is the complete traffic matrix to be solved, $W$ is the sparse representation coefficient matrix to be solved, $\Omega$ is the set of indices of the known traffic values in the traffic matrix, and $\lambda_1$ is an adjustable parameter. $P_\Omega(\cdot)$ is a projection operator that retains the element at position $(i, j)$ when $(i, j) \in \Omega$:
Because the element values in the traffic matrix are non-negative, $X \ge 0$. To avoid the trivial solution, the diagonal elements of the sparse representation matrix $W$ are constrained to be 0, i.e., $\operatorname{diag}(W) = 0$.
Step 3), establishing a flow estimation model based on time sequence prior and sparse representation:
Traffic data is found to exhibit temporal structure. For each OD flow, its own time-series characteristics can be described in terms of temporal correlation. To characterize the temporal correlation of the traffic matrix, the term $\|RX\|_1$ is minimized in the objective function, which makes the elements of the traffic matrix stable along the time dimension and thus better captures the temporal correlation in the traffic matrix. The traffic estimation model based on time sequence prior and sparse representation is then:
where $\lambda_2$ is an adjustable parameter and $R$ is a Toeplitz(0, 1, -1) matrix.
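To make the temporal term concrete, the sketch below builds a difference matrix and evaluates $\|RX\|_1$. It assumes that Toeplitz(0, 1, -1) means 1 on the main diagonal and -1 on the first superdiagonal, and that $R$ has shape $(T-1) \times T$; both the interpretation and the shape are assumptions, and the helper names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def toeplitz_difference(T: int) -> Matrix:
    """(T-1) x T matrix with 1 on the main diagonal, -1 on the superdiagonal.

    This reading of Toeplitz(0, 1, -1) is an assumption.
    """
    return [
        [1.0 if c == r else -1.0 if c == r + 1 else 0.0 for c in range(T)]
        for r in range(T - 1)
    ]


def matmul(A: Matrix, B: Matrix) -> Matrix:
    """Plain matrix product of A and B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def l1_norm(A: Matrix) -> float:
    """Entry-wise l1 norm: the sum of absolute values."""
    return sum(abs(v) for row in A for v in row)


X = [[1.0, 5.0], [2.0, 5.0], [4.0, 5.0]]  # T = 3 intervals, N = 2 OD pairs
R = toeplitz_difference(3)
RX = matmul(R, X)          # row r holds X[r] - X[r + 1]
penalty = l1_norm(RX)
```

The second OD pair is constant over time and contributes nothing to the penalty, while the first contributes the sum of its step-to-step changes, so minimizing this term favors traffic that varies smoothly between adjacent intervals.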
Step 4), based on the traffic estimation model established in step 3), the more complex model is decomposed by the alternating direction method of multipliers (ADMM) into three easily solved sub-problems during the iteration, namely those of the traffic matrix $X$, the sparse representation matrix $W$, and the error variable $C$. A locally optimal solution of the original problem is found by iteratively solving for the globally optimal solutions of the sub-problems, and the complete traffic matrix $X$ is finally estimated.
Further, the specific steps of step 4) are as follows:
Step 4.1), to simplify the optimization problem, an error variable $C$ is first introduced and the problem is rewritten as:
s.t. $X \ge 0$, $\operatorname{diag}(W) = 0$, $M = X + C$, $P_\Omega(C) = 0$ (5)
Step 4.2), to solve the optimization problem of step 4.1), two indicator functions are defined as follows:
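Assuming the standard convex-analysis definition of an indicator function (zero on the feasible set, $+\infty$ elsewhere), which matches the stated purpose of enforcing $X \ge 0$ and $P_\Omega(C) = 0$, the two functions would read:

$$
g(X) =
\begin{cases}
0, & X \ge 0 \\
+\infty, & \text{otherwise}
\end{cases}
\qquad
f(C) =
\begin{cases}
0, & P_\Omega(C) = 0 \\
+\infty, & \text{otherwise}
\end{cases}
$$

Adding $g(X) + f(C)$ to the objective leaves the minimizer unchanged, because any infeasible point receives an infinite cost.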
where $g(X)$ and $f(C)$ serve to move the constraint terms into the objective function so that the variables satisfy the constraints. The optimization problem above can thus be converted into an equivalent penalty function form:
Step 4.3), alternating optimization:
Step 4.3.1), to solve for $X$ efficiently, auxiliary variables $D$ and $S$ are introduced with $D = X$ and $S = RX$, so that the sub-problem can be transformed into the following equivalent constrained optimization problem (for brevity, and without affecting understanding, $W_{k-1}$ is abbreviated as $W$ and $\beta_k$ as $\beta$):
The Lagrangian function corresponding to this sub-problem can be defined as:
(1) Updating $X$:
(2) Updating $D$:
(3) Updating $S$:
(4)-(5) Updating $U_1$, $U_2$:
(6) Updating $\mu_t$:
$\mu_t = \min(\rho \mu_{t-1}, \mu_{\max})$ (16)
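As a hedged illustration of the closed-form updates that a splitting of this kind usually yields: an auxiliary variable tied to an $\ell_1$ term is typically updated by element-wise soft-thresholding, and one tied to a non-negativity indicator by clipping at zero. The sketch assumes that $S$ and $D$ play those two roles here. The input `V` stands for the corresponding target ($RX$ or $X$) plus a scaled multiplier term, whose sign convention is deliberately left out because it is not given in the text.

```python
from typing import List

Matrix = List[List[float]]


def soft_threshold_matrix(V: Matrix, tau: float) -> Matrix:
    """Element-wise sign(v) * max(|v| - tau, 0); the usual l1 proximal step."""
    def shrink(v: float) -> float:
        if v > tau:
            return v - tau
        if v < -tau:
            return v + tau
        return 0.0

    return [[shrink(v) for v in row] for row in V]


def clip_nonnegative(V: Matrix) -> Matrix:
    """Project onto the non-negative orthant by replacing negatives with 0."""
    return [[max(v, 0.0) for v in row] for row in V]


S_new = soft_threshold_matrix([[3.0, -0.5], [-2.0, 1.0]], tau=1.0)
D_new = clip_nonnegative([[-1.0, 2.0], [0.5, -3.0]])
```

In a typical ADMM iteration the threshold `tau` would be the ratio of the regularization weight to the current penalty parameter, so the shrinkage weakens as $\mu_t$ grows.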
Step 4.3.2), solving for $W$:
Each column of the matrix $W$ is independent and can be solved separately, and each resulting sub-problem can be treated as a LASSO problem. Let $W_i$ denote the $i$-th column of $W$, $X_i$ the $i$-th column of $X$, and $X_{-i}$ the matrix obtained by removing the $i$-th column from $X$. To satisfy the constraint $\operatorname{diag}(W) = 0$, the $i$-th column of $X$ does not participate in the calculation when solving for $W_i$.
Step 4.3.3), solving for $C$:
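A sketch of what the $C$ update could look like, under two assumptions: the penalty form couples the variables through a quadratic term proportional to $\|M - X - C\|_F^2$, and missing entries of $M$ are stored as 0. With $P_\Omega(C) = 0$ enforced, minimizing that term over $C$ gives zero inside $\Omega$ and $M - X$ outside it. The function name `update_C` is hypothetical.

```python
from typing import List, Set, Tuple

Matrix = List[List[float]]


def update_C(M: Matrix, X: Matrix, omega: Set[Tuple[int, int]]) -> Matrix:
    """Return C with C[i][j] = 0 on omega and M[i][j] - X[i][j] elsewhere.

    Assumes a quadratic penalty on M - X - C and zero-filled missing
    entries in M; both are assumptions of this sketch.
    """
    return [
        [0.0 if (i, j) in omega else M[i][j] - X[i][j] for j in range(len(row))]
        for i, row in enumerate(M)
    ]


M = [[5.0, 0.0], [0.0, 2.0]]   # known entries at (0, 0) and (1, 1)
X = [[5.0, 1.5], [0.5, 2.0]]   # current estimate
omega = {(0, 0), (1, 1)}
C = update_C(M, X, omega)
```

Under these assumptions $C$ absorbs the mismatch only at unobserved positions, so the observed entries remain the sole source of data fidelity.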
Step 4.3.4), updating $\beta$:
$\beta_k = \min(\rho \beta_{k-1}, \beta_{\max})$ (20)
Step 4.4), when the maximum number of iterations is reached, the solution ends and the estimated complete traffic matrix $X$ is obtained.
The steps of the traffic estimation algorithm based on time sequence prior and sparse representation are summarized in FIG. 2.
In the field of traffic estimation, each missing-value estimation algorithm has its own advantages and disadvantages. The IP network traffic estimation method based on time sequence prior and sparse representation effectively uses time sequence prior information and sparse representation theory to mine the inherent spatio-temporal correlation of the traffic matrix and improve the accuracy of traffic matrix estimation.
Claims (2)
1. An IP network traffic estimation method based on time sequence prior and sparse representation, characterized by comprising the following steps:
S1, collecting the traffic values transmitted between all source-destination node pairs in a network to construct an incomplete traffic matrix;
S2, establishing a traffic matrix estimation model for the spatio-temporal correlation present in the incomplete traffic matrix, using sparse representation theory and regularization;
S3, converting the original traffic matrix estimation problem into several easily solved sub-problems by the alternating direction method of multipliers;
S4, iteratively optimizing the globally optimal solutions of the sub-problems to find a locally optimal solution of the original problem, and estimating the complete traffic matrix;
the incomplete traffic matrix in step S1 is constructed as follows:
assuming that the number of time intervals in one day is $T$ and the total number of OD pairs is $N$, the traffic matrix can be represented as a $T \times N$ matrix $M = [m_{ij}]$,
where $m_{ij}$ denotes the traffic value of the $j$-th OD pair in the $i$-th time interval; known and missing traffic values are marked with distinct symbols, with missing values marked "?"; in the traffic matrix, one column represents one sample, namely the traffic values of one OD pair over all time intervals of the day, and one row represents the traffic values of all OD pairs in one time interval;
step S2 comprises the following:
first, a traffic matrix estimation model based on sparse representation theory is established, whose expression is as follows:
where $M$ is the known incomplete traffic matrix, $X$ is the complete traffic matrix to be solved, $W$ is the sparse representation coefficient matrix to be solved, $\Omega$ is the set of indices of the known traffic values in the traffic matrix, and $\lambda_1$ is an adjustable parameter; $P_\Omega(\cdot)$ is a projection operator that retains the element at position $(i, j)$ when $(i, j) \in \Omega$:
because the element values in the traffic matrix are non-negative, $X \ge 0$; to avoid the trivial solution, the diagonal elements of the sparse representation coefficient matrix $W$ are constrained to be 0, i.e., $\operatorname{diag}(W) = 0$;
a traffic matrix estimation model based on time sequence prior and sparse representation is then established, whose expression is as follows: s.t. $X \ge 0$, $\operatorname{diag}(W) = 0$, $P_\Omega(M) = P_\Omega(X)$
in step S3, the sub-problems are those of the traffic matrix $X$, the sparse representation coefficient matrix $W$, and the error variable $C$, which represents the error between the incomplete traffic matrix $M$ and the complete traffic matrix $X$ outside the set $\Omega$ during the iteration;
step S4 comprises the following specific steps:
S41, to simplify the solution, an error variable $C$ is introduced and the traffic matrix estimation model is rewritten in the following form:
s.t. $X \ge 0$, $\operatorname{diag}(W) = 0$, $M = X + C$, $P_\Omega(C) = 0$
S42, the constraint terms are moved into the objective function by defining indicator functions $g(X)$ and $f(C)$, which converts the optimization problem into an equivalent penalty function form; the indicator functions are expressed as follows:
the penalty function is expressed as follows:
S43, from the penalty function, the expressions for the unknowns $X$, $W$, $C$, and $\beta$ are obtained as follows:
where $\rho$ is a fixed parameter; $\beta$ is the parameter of the penalty function; $\beta_k$ and $\beta_{k-1}$ are the values of $\beta$ in the $k$-th and $(k-1)$-th iterations, respectively; $\beta_{\max}$ is a fixed parameter representing the maximum value of $\beta$; and $\|\cdot\|_F$ denotes the Frobenius norm; the expressions above are solved by alternating optimization, and the traffic matrix $X$ obtained when the preset maximum number of iterations is reached is the estimated complete traffic matrix.
2. The IP network traffic estimation method based on time sequence prior and sparse representation according to claim 1, characterized in that, when solving for the sparse representation coefficient matrix $W$, each column is solved separately and the solution of each column is treated as a LASSO problem; let $W_i$ denote the $i$-th column of $W$, $X_i$ the $i$-th column of $X$, and $X_{-i}$ the matrix obtained by removing the $i$-th column from $X$; to satisfy the constraint $\operatorname{diag}(W) = 0$, $X_i$ does not participate in the calculation when solving for $W_i$; the expression for $W_i$ is as follows:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011318745.4A CN112564945B (en) | 2020-11-23 | 2020-11-23 | IP network flow estimation method based on time sequence prior and sparse representation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011318745.4A CN112564945B (en) | 2020-11-23 | 2020-11-23 | IP network flow estimation method based on time sequence prior and sparse representation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112564945A (en) | 2021-03-26 |
CN112564945B (en) | 2023-03-24 |
Family ID: 75044689
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011318745.4A (Active, published as CN112564945B) | 2020-11-23 | 2020-11-23 | IP network flow estimation method based on time sequence prior and sparse representation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112564945B (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107133930A (en) * | 2017-04-30 | 2017-09-05 | Tianjin University (天津大学) | Ranks missing image fill method with rarefaction representation is rebuild based on low-rank matrix |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106844518B (en) * | 2016-12-29 | 2019-02-12 | Tianjin Zhongke Intelligent Identification Industry Technology Research Institute Co., Ltd. (天津中科智能识别产业技术研究院有限公司) | A kind of imperfect cross-module state search method based on sub-space learning |
Also Published As
Publication number | Publication date |
---|---|
CN112564945A (en) | 2021-03-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||