CN106301950B

CN106301950B - OD flow analysis method and analysis device

Info

Publication number: CN106301950B
Application number: CN201610809587.XA
Authority: CN
Inventors: 徐博华; 杨艳松; 王泽林; 郭晓琳; 马田丰
Original assignee: China United Network Communications Group Co Ltd
Current assignee: China United Network Communications Group Co Ltd
Priority date: 2016-09-07
Filing date: 2016-09-07
Publication date: 2019-08-09
Anticipated expiration: 2036-09-07
Also published as: CN106301950A

Abstract

The present invention provides an OD flow analysis method and analysis device, relating to the field of network technologies, which solve the problem in the prior art that OD flow analysis gives unsatisfactory results when the OD flows contain impulse noise with random values, such as abnormal traffic, burst traffic, and measurement noise. The OD flow analysis method includes: periodically sampling the whole-network OD flows and converting them into a traffic matrix; modeling the traffic matrix with a robust principal component analysis model to obtain the robust principal component analysis model of the traffic matrix; performing convex optimization relaxation on that model to obtain a convex optimization model of robust principal component analysis; solving the convex optimization model with the augmented Lagrange multiplier method to decompose the traffic matrix into a low-rank matrix and a sparse matrix; and outputting the resulting low-rank matrix and sparse matrix. The embodiments of the present invention are used for analyzing OD flows.

Description

OD flow analysis method and analysis device

Technical Field

The present invention relates to the field of network technologies, and in particular, to an OD traffic analysis method and an OD traffic analysis device.

Background

As networks grow in scale, the parameters of the technologies used in them vary in increasingly complicated ways. A network operator needs to know clearly how traffic traverses its network; on that basis it can design and plan the network and carry out tasks such as traffic engineering, traffic anomaly detection, and congestion control. However, much network traffic analysis is limited to isolated single links, and collecting statistics on and analyzing global traffic across all links of a network remains very challenging. To observe the characteristics and direction of network traffic from a whole-network perspective, origin-destination (OD) flows are needed to measure and describe the traffic. OD flows reflect the actual traffic between each pair of nodes in the network: an OD flow is the set of all traffic that enters the network at the same ingress node and leaves it at the same egress node. Compared with local link traffic, OD flows reflect the basic properties of the network more directly; they are the basis of global traffic analysis and the main input to traffic engineering. The measurement and analysis of OD flows has therefore become a research focus in networking both in China and abroad, and is of considerable practical significance.

Although OD flows reflect the actual state of the network, processing and analyzing them is difficult. The most significant problem is their high-dimensional, multivariate structure. Even a medium-scale network may contain hundreds of OD flows, and analyzing them requires extending the data along the time dimension, which further increases the dimensionality of the data to be analyzed. On the one hand, high-dimensional data contains rich information that can be mined and exploited; on the other hand, the sheer data volume greatly increases the difficulty of analysis and processing, leading to the so-called curse of dimensionality. The high dimensionality of OD flows is thus a central difficulty in whole-network traffic analysis.

The common approach to high-dimensional data is to find a low-dimensional linear subspace that retains the important properties of the original data, and to approximate the high-dimensional data by its projection onto that subspace. Dimensionality reduction exposes the redundancy in high-dimensional data and the correlation among the data, and is an effective tool for understanding the structure of the original data. The most common method is principal component analysis (PCA): given a set of high-dimensional data, PCA finds a new low-dimensional subspace such that the projection of the original data onto it has the smallest error, thereby reducing the dimensionality. When high-dimensional data can be approximated by a low-dimensional subspace, the dimension of that subspace is called the feature dimension.

PCA is very sensitive to the type of noise contained in the raw data. It estimates well under additive Gaussian noise, but its estimates degrade severely under impulse noise with random values. In a network environment, short-lived abnormal traffic, burst traffic, and network anomalies occur frequently and appear as impulse noise with random values, so dimensionality reduction and analysis of OD flows with PCA gives unsatisfactory results.
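The sensitivity described above can be illustrated with a small synthetic experiment. The following is only a sketch: the matrix sizes, the rank, the noise levels, and the helper name `pca_rank_r` are hypothetical choices made for illustration and do not come from the patent. It compares a rank-r PCA approximation (truncated SVD) of a low-rank matrix under small Gaussian noise with the same approximation under sparse, large-valued spikes.

```python
import numpy as np

def pca_rank_r(D, r):
    """Best rank-r least-squares approximation of D (truncated SVD)."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]

rng = np.random.default_rng(0)
n_flows, n_periods, r = 50, 200, 3   # hypothetical sizes, not from the source

# Low-rank matrix standing in for noise-free OD traffic.
A0 = rng.normal(size=(n_flows, r)) @ rng.normal(size=(r, n_periods))

# Case 1: small additive Gaussian noise on every entry.
D_gauss = A0 + 0.01 * rng.normal(size=A0.shape)

# Case 2: random-valued spikes on roughly 5% of the entries (impulse noise).
mask = rng.random(A0.shape) < 0.05
D_spike = A0 + mask * rng.uniform(-50.0, 50.0, size=A0.shape)

def rel_err(X):
    """Relative Frobenius error of X with respect to the clean matrix A0."""
    return np.linalg.norm(X - A0) / np.linalg.norm(A0)

err_gauss = rel_err(pca_rank_r(D_gauss, r))
err_spike = rel_err(pca_rank_r(D_spike, r))
print(f"relative error under Gaussian noise: {err_gauss:.4f}")
print(f"relative error under impulse noise:  {err_spike:.4f}")
```

In this setting the spike case yields a much larger reconstruction error than the Gaussian case, which is the behavior the background section attributes to PCA.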

In summary, the prior art gives unsatisfactory analysis results for OD flows that contain impulse noise with random values, such as abnormal traffic, burst traffic, and measurement noise.

Disclosure of Invention

Embodiments of the present invention provide an OD traffic analysis method and an OD traffic analysis device, which solve the problem in the prior art that the analysis results are unsatisfactory for OD flows containing impulse noise with random values, such as abnormal traffic, burst traffic, and measurement noise.

In order to achieve the above purpose, the embodiment of the invention adopts the following technical scheme:

in a first aspect, an embodiment of the present invention provides a method for analyzing an OD flow, including:

periodically sampling to obtain the OD flow of the whole network, and converting the OD flow of the whole network into a flow matrix;

modeling the flow matrix by adopting a robust principal component analysis model to obtain the robust principal component analysis model of the flow matrix;

performing convex optimization relaxation on the robust principal component analysis model of the flow matrix to obtain a convex optimization model of the robust principal component analysis;

solving the convex optimization model of robust principal component analysis by adopting an augmented Lagrange multiplier method, and decomposing the flow matrix into a low-rank matrix and a sparse matrix; the low-rank matrix corresponds to the OD flow free of impulse noise with random values and represents the inherent low-rank structure of the flow matrix; the sparse matrix corresponds to the impulse noise with random values contained in the OD flow;

and outputting the obtained low-rank matrix and sparse matrix.

Specifically, the modeling of the traffic matrix by using the robust principal component analysis model to obtain the robust principal component analysis model of the traffic matrix includes:

modeling the flow matrix by adopting a robust principal component analysis model to obtain the robust principal component analysis model of the flow matrix:

$$\min_{A,E}\ \operatorname{rank}(A) + \gamma\|E\|_0 \quad \text{s.t.}\quad D = A + E$$

wherein $\min_{A,E}$ denotes minimization over the matrix A and the matrix E, A is the low-rank matrix, D is the traffic matrix, E is the sparse matrix, rank(A) is the rank of the matrix A, $\|E\|_0$ is the number of non-zero elements in the matrix E, γ is a trade-off factor greater than 0, and s.t. introduces the constraint condition.

Specifically, performing convex optimization relaxation on the robust principal component analysis model of the flow matrix to obtain a convex optimization model of the robust principal component analysis, including:

performing convex optimization relaxation on the robust principal component analysis model of the flow matrix to obtain a convex optimization model of the robust principal component analysis:

$$\min_{A,E}\ \|A\|_* + \lambda\|E\|_1 \quad \text{s.t.}\quad D = A + E$$

wherein $\min_{A,E}$ denotes minimization over the matrix A and the matrix E, A is the low-rank matrix, D is the flow matrix, E is the sparse matrix, $\|A\|_*$ is the sum of all singular values of the matrix A, $\|E\|_1$ is the sum of the absolute values of all elements in the matrix E, λ is the relaxation factor, and s.t. introduces the constraint condition.

Specifically, solving a convex optimization model of robust principal component analysis by adopting an augmented Lagrange multiplier method, and decomposing a flow matrix into a low-rank matrix and a sparse matrix, wherein the method comprises the following steps:

constructing an augmented Lagrangian function of the convex optimization model:

$$L(A,E,Y,\mu) = \|A\|_* + \lambda\|E\|_1 + \langle Y, D-A-E\rangle + \frac{\mu}{2}\|D-A-E\|_F^2$$

wherein $\|A\|_*$ is the sum of all singular values of the low-rank matrix A, $\|E\|_1$ is the sum of the absolute values of all elements in the sparse matrix E, λ is the relaxation factor, Y is the Lagrange multiplier matrix, $\frac{\mu}{2}\|D-A-E\|_F^2$ is the penalty term that augments the Lagrangian function, D-A-E is the constraint condition, $\langle Y, D-A-E\rangle$ is the inner product of the Lagrange multiplier matrix Y and the constraint condition D-A-E, μ is a positive parameter, and $\|\cdot\|_F$ is the Frobenius norm of a matrix;

presetting a soft threshold shrinking operator and a singular value soft threshold operator, wherein the soft threshold shrinking operator is:

$$S_\varepsilon(x) = \begin{cases} x-\varepsilon, & x > \varepsilon \\ x+\varepsilon, & x < -\varepsilon \\ 0, & \text{otherwise} \end{cases} \qquad \text{(Formula two)}$$

wherein $S_\varepsilon(x)$ equals x - ε when x > ε, equals x + ε when x < -ε, and equals zero otherwise;

the singular value soft threshold operator is:

$$D_\tau(X) = U\,S_\tau(\Sigma)\,V^T \qquad \text{(Formula three)}$$

wherein $X = U\Sigma V^T$ is the singular value decomposition of the matrix X;
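The two operators defined above can be sketched in NumPy as follows. The function names `soft_threshold` and `singular_value_threshold` are chosen here for illustration and do not appear in the patent; the second function applies the shrinkage to the singular values only, as formula three describes.

```python
import numpy as np

def soft_threshold(x, eps):
    """Soft threshold shrinking operator S_eps, applied element-wise.

    Returns x - eps where x > eps, x + eps where x < -eps, and 0 elsewhere.
    """
    return np.sign(x) * np.maximum(np.abs(x) - eps, 0.0)

def singular_value_threshold(X, tau):
    """Singular value soft threshold operator D_tau(X) = U S_tau(Sigma) V^T."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Scale each column of U by the shrunken singular value, then recombine.
    return (U * soft_threshold(s, tau)) @ Vt

shrunk = soft_threshold(np.array([3.0, -3.0, 0.5, -0.5]), 1.0)
low_rank = singular_value_threshold(np.diag([3.0, 1.0]), 2.0)
```

Shrinking the singular values sets the small ones to zero, which is how the operator lowers the rank of its argument.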

solving by using an augmented Lagrange multiplier algorithm:

setting the initial parameters $Y_0 = 0$, $E_0 = 0$, $\mu_0 > 0$, $\rho > 1$, and $k = 0$;

wherein $Y_0$ is the initial Lagrange multiplier matrix, $E_0$ is the initial sparse matrix, k is the iteration index and is an integer greater than or equal to 0, $\mu_0$ is the initial value of the contraction factor μ, and ρ is a parameter greater than 1 used to update the contraction factor μ;

using

$$(U, S, V) = \operatorname{svd}\left(D - E_k + \mu_k^{-1} Y_k\right) \qquad \text{(Formula four)}$$

to perform singular value decomposition;

wherein svd denotes singular value decomposition; here the initialized parameters k = 0, $Y_0 = 0$, $E_0 = 0$, and $\mu_0$ are substituted into formula four, and the matrix U, the matrix S, and the matrix V are obtained through singular value decomposition;

using

$$A_{k+1} = U\,S_{\mu_k^{-1}}(S)\,V^T \qquad \text{(Formula five)}$$

to solve for $A_{k+1}$;

k = 0 is substituted into formula five, and the matrix S obtained from formula four is substituted into the singular value soft threshold operator of formula three to obtain $A_1$;

using

$$E_{k+1} = S_{\lambda\mu_k^{-1}}\left(D - A_{k+1} + \mu_k^{-1} Y_k\right) \qquad \text{(Formula six)}$$

to solve for $E_{k+1}$;

k = 0 is substituted into formula six, and the matrix $A_1$ obtained from formula five is substituted into the soft threshold shrinking operator of formula two to obtain $E_1$;

using

$$Y_{k+1} = Y_k + \mu_k\left(D - A_{k+1} - E_{k+1}\right) \qquad \text{(Formula seven)}$$

to solve for $Y_{k+1}$;

k = 0 is substituted into formula seven, and the Lagrange multiplier matrix Y is updated according to the residual D-A-E to obtain $Y_1$;

this completes the first iteration of the algorithm; the contraction factor is then updated as $\mu_{k+1} = \rho\mu_k$, and k = k + 1;

through iterative calculation, the augmented Lagrange multiplier algorithm converges to an optimal solution, yielding the low-rank matrix A and the sparse matrix E into which the flow matrix is decomposed.
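The iteration above can be sketched in NumPy as follows. This is a minimal illustration rather than the patented implementation. The update rules follow the standard form of the inexact augmented Lagrange multiplier method for this model. The default values of `lam`, `mu0`, and `rho`, the residual-based stopping rule with `tol`, and the function name `rpca_ialm` are assumptions made for the example; the source only requires μ₀ > 0 and ρ > 1. The sketch also assumes D is not the zero matrix.

```python
import numpy as np

def soft_threshold(x, eps):
    """Element-wise soft threshold shrinking operator (formula two)."""
    return np.sign(x) * np.maximum(np.abs(x) - eps, 0.0)

def rpca_ialm(D, lam=None, mu0=None, rho=1.2, tol=1e-7, max_iter=500):
    """Decompose D into low-rank A and sparse E with D = A + E (sketch)."""
    D = np.asarray(D, dtype=float)
    if lam is None:
        lam = 1.0 / np.sqrt(max(D.shape))      # assumed default, not from the source
    if mu0 is None:
        mu0 = 1.25 / np.linalg.norm(D, 2)      # assumed default, not from the source
    Y = np.zeros_like(D)                       # Y_0 = 0
    E = np.zeros_like(D)                       # E_0 = 0
    A = np.zeros_like(D)
    mu = mu0
    d_norm = np.linalg.norm(D, "fro")
    for _ in range(max_iter):
        # Formula four: singular value decomposition.
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        # Formula five: singular value soft thresholding gives A_{k+1}.
        A = (U * soft_threshold(s, 1.0 / mu)) @ Vt
        # Formula six: element-wise soft thresholding gives E_{k+1}.
        E = soft_threshold(D - A + Y / mu, lam / mu)
        # Formula seven: multiplier update from the residual D - A - E.
        R = D - A - E
        Y = Y + mu * R
        # Contraction factor update: mu_{k+1} = rho * mu_k.
        mu = rho * mu
        if np.linalg.norm(R, "fro") <= tol * d_norm:   # assumed stopping rule
            break
    return A, E

# Synthetic usage example: rank-2 matrix plus spikes on about 5% of the entries.
rng = np.random.default_rng(1)
A0 = rng.normal(size=(100, 2)) @ rng.normal(size=(2, 100))
E0 = (rng.random((100, 100)) < 0.05) * rng.uniform(-10.0, 10.0, size=(100, 100))
A_hat, E_hat = rpca_ialm(A0 + E0)
```

On synthetic data of this kind the returned `A_hat` is expected to be close to `A0`, with the spikes collected in `E_hat`.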

In a second aspect, an embodiment of the present invention provides an OD flow rate analyzing apparatus, including:

the data acquisition unit is used for periodically sampling and acquiring the OD flow of the whole network and converting the OD flow of the whole network into a flow matrix;

the modeling unit is used for modeling the traffic matrix converted by the data acquisition unit by adopting a robust principal component analysis model to obtain the robust principal component analysis model of the traffic matrix;

the optimization unit is used for performing convex optimization relaxation on the robust principal component analysis model of the traffic matrix obtained by the modeling unit to obtain a convex optimization model of the robust principal component analysis;

the solving unit is used for solving the convex optimization model of the robust principal component analysis obtained by the optimizing unit by adopting an inexact augmented Lagrange multiplier method to obtain the low-rank matrix and the sparse matrix recovered from the flow matrix; the low-rank matrix corresponds to the OD (origin-destination) flow free of impulse noise with random values and represents the inherent low-rank structure of the flow matrix; the sparse matrix corresponds to the impulse noise with random values contained in the OD flow;

and the data output unit is used for outputting the low-rank matrix and the sparse matrix obtained by the solving unit.

Specifically, the modeling unit is specifically configured to model the traffic matrix by using a robust principal component analysis model to obtain the robust principal component analysis model of the traffic matrix:

$$\min_{A,E}\ \operatorname{rank}(A) + \gamma\|E\|_0 \quad \text{s.t.}\quad D = A + E$$

wherein $\min_{A,E}$ denotes minimization over the matrix A and the matrix E, A is the low-rank matrix, D is the traffic matrix, E is the sparse matrix, rank(A) is the rank of the matrix A, $\|E\|_0$ is the number of non-zero elements in the matrix E, γ is a trade-off factor greater than 0, and s.t. introduces the constraint condition.

Specifically, the optimization unit is configured to perform convex optimization relaxation on the robust principal component analysis model of the traffic matrix to obtain a convex optimization model of the robust principal component analysis:

$$\min_{A,E}\ \|A\|_* + \lambda\|E\|_1 \quad \text{s.t.}\quad D = A + E$$

wherein $\min_{A,E}$ denotes minimization over the matrix A and the matrix E, A is the low-rank matrix, D is the traffic matrix, E is the sparse matrix, $\|A\|_*$ is the sum of all singular values of the matrix A, $\|E\|_1$ is the sum of the absolute values of all elements in the matrix E, λ is the relaxation factor, and s.t. introduces the constraint condition.

Specifically, the solving unit is specifically configured to construct an augmented lagrangian function of the convex optimization model:

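$$L(A,E,Y,\mu) = \|A\|_* + \lambda\|E\|_1 + \langle Y, D-A-E\rangle + \frac{\mu}{2}\|D-A-E\|_F^2$$

and to preset a soft threshold shrinking operator and a singular value soft threshold operator, wherein the soft threshold shrinking operator is:

$$S_\varepsilon(x) = \begin{cases} x-\varepsilon, & x > \varepsilon \\ x+\varepsilon, & x < -\varepsilon \\ 0, & \text{otherwise} \end{cases} \qquad \text{(Formula two)}$$

the singular value soft threshold operator is:

$$D_\tau(X) = U\,S_\tau(\Sigma)\,V^T \qquad \text{(Formula three)}$$

wherein $X = U\Sigma V^T$ is the singular value decomposition of the matrix X;

solving by using an augmented Lagrange multiplier algorithm:

using

$$(U, S, V) = \operatorname{svd}\left(D - E_k + \mu_k^{-1} Y_k\right) \qquad \text{(Formula four)}$$

to perform singular value decomposition;

using

$$A_{k+1} = U\,S_{\mu_k^{-1}}(S)\,V^T \qquad \text{(Formula five)}$$

to solve for $A_{k+1}$;

using

$$E_{k+1} = S_{\lambda\mu_k^{-1}}\left(D - A_{k+1} + \mu_k^{-1} Y_k\right) \qquad \text{(Formula six)}$$

to solve for $E_{k+1}$;

using

$$Y_{k+1} = Y_k + \mu_k\left(D - A_{k+1} - E_{k+1}\right) \qquad \text{(Formula seven)}$$

to solve for $Y_{k+1}$.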

The embodiment of the present invention provides an OD flow analysis method. A robust principal component analysis model is used to model the traffic matrix of the whole-network OD flows obtained by periodic sampling; convex optimization relaxation converts this model into a convex optimization model of robust principal component analysis; and the augmented Lagrange multiplier method solves the convex optimization model and decomposes the traffic matrix into a low-rank matrix and a sparse matrix, so that the traffic matrix analysis problem becomes the problem of decomposing the matrix into a low-rank matrix and a sparse matrix. The low-rank matrix corresponds to the OD flow free of impulse noise with random values, and the sparse matrix corresponds to the impulse noise with random values contained in the OD flow. Compared with PCA, the OD traffic analysis method provided by the embodiment of the present invention can better recover the underlying low-rank matrix of the whole-network OD traffic matrix when the OD flows contain impulse noise with random values, such as abnormal traffic, burst traffic, and measurement noise, and thereby solves the prior-art problem that the results computed for such OD flows are unsatisfactory.

Drawings

To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of an OD flow analysis method according to an embodiment of the present invention;

Fig. 2 is a schematic structural diagram of an OD flow analysis apparatus according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the OD traffic analysis method provided by the embodiment of the present invention may be implemented in hardware or as software instructions executed by a processor. The software instructions may consist of corresponding software modules, which may be stored in random access memory (RAM), flash memory, read-only memory (ROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), registers, a hard disk, a removable disk, a compact disc read-only memory (CD-ROM), or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. The storage medium may also be integral to the processor. The processor and the storage medium may reside in an ASIC. The OD flow analysis method and the OD flow analysis device provided by the embodiment of the present invention are located in IP backbone network equipment; OD flows can be captured in real time, or captured periodically by setting a capture period in the IP backbone network equipment, and then analyzed. Either the OD flow analysis device provided by the embodiment of the present invention or an electronic device that uses the OD flow analysis method provided by the embodiment of the present invention may be used to analyze OD flows.

The first embodiment of the present invention provides a method for analyzing OD traffic, as shown in fig. 1, including:

and periodically sampling to obtain the OD flow of the whole network, and converting the OD flow of the whole network into a flow matrix.

And modeling the flow matrix by adopting a robust principal component analysis model to obtain the robust principal component analysis model of the flow matrix.

And performing convex optimization relaxation on the robust principal component analysis model of the flow matrix to obtain a convex optimization model of the robust principal component analysis.

Solving the convex optimization model of robust principal component analysis by adopting an augmented Lagrange multiplier method, and decomposing the flow matrix into a low-rank matrix and a sparse matrix; the low-rank matrix corresponds to the OD flow free of impulse noise with random values and represents the inherent low-rank structure of the flow matrix; the sparse matrix corresponds to the impulse noise with random values contained in the OD flow.

And outputting the obtained low-rank matrix and sparse matrix.

The second embodiment of the present invention provides a method for analyzing OD traffic, as shown in fig. 1, including:

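It should be noted that the OD traffic analysis method provided by the embodiment of the present invention first converts the whole-network OD traffic data obtained by periodic sampling into a traffic matrix for analysis. For a network with n nodes, the total number of OD flows is $N = n^2$. OD traffic data is then collected over T periods; the OD traffic of period t forms a column vector $d_t = [d_{1,t}, d_{2,t}, \ldots, d_{N,t}]^T$, and arranging these vectors in period order, the OD traffic matrix D can be represented as:

$$D = [d_1, d_2, \ldots, d_T] = \begin{bmatrix} d_{1,1} & d_{1,2} & \cdots & d_{1,T} \\ d_{2,1} & d_{2,2} & \cdots & d_{2,T} \\ \vdots & \vdots & \ddots & \vdots \\ d_{N,1} & d_{N,2} & \cdots & d_{N,T} \end{bmatrix}$$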

the obtained OD traffic matrix D is a traffic matrix corresponding to the OD traffic of the whole network, wherein the traffic matrix is a matrix of dimension N × T.
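The conversion from periodic samples to the N × T traffic matrix can be sketched as follows. The input layout (one dictionary per period, mapping an (origin, destination) pair to a traffic volume) and the function name `build_traffic_matrix` are assumptions made for this example; the patent does not specify a data format.

```python
import numpy as np

def build_traffic_matrix(samples, nodes):
    """Stack per-period OD measurements into an N x T matrix, N = len(nodes)**2.

    samples: list with one entry per period; each entry is a dict
             {(origin, destination): volume} (hypothetical layout).
    nodes:   ordered list of node identifiers.
    """
    # Fix a row index for every ordered (origin, destination) pair.
    index = {}
    for origin in nodes:
        for dest in nodes:
            index[(origin, dest)] = len(index)
    D = np.zeros((len(index), len(samples)))
    for t, period in enumerate(samples):
        for pair, volume in period.items():
            D[index[pair], t] = volume   # column t holds the OD vector of period t
    return D

nodes = ["a", "b"]
samples = [{("a", "b"): 5.0}, {("a", "b"): 7.0, ("b", "a"): 2.0}]
D = build_traffic_matrix(samples, nodes)
```

OD pairs that are missing from a period are treated as zero traffic in this sketch, which is a simplifying assumption.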

It should be noted that the topology of the core nodes of a network and the volume of traffic flowing through the network are relatively stable, but this stability can be disrupted by impulse noise with random values, such as abnormal traffic, burst traffic, and measurement noise. Expressed mathematically, the stability means that the column vectors of the traffic matrix D are correlated, so the rank of D is far smaller than its dimensions; that is, D is low-rank. When D is contaminated by impulse noise with random values, this low-rank structure is destroyed; at the same time, abnormal traffic, burst traffic, measurement noise, and the like are sparse relative to the scale of the OD traffic matrix. The traffic matrix D can therefore be modeled by robust principal component analysis, whose purpose is to recover the underlying low-rank structure of D and to separate out the impulse noise with random values caused by abnormal traffic, burst traffic, measurement noise, and the like; that is, D = A + E, where the matrix A and the matrix E are the true low-rank matrix and the true sparse matrix, respectively.

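The mathematical notation used in the second embodiment is explained as follows:

1. The p-norm of a vector. For a vector $a = (a_1, a_2, \ldots, a_n)^T \in R^{n\times 1}$, its p-norm is $\|a\|_p = \left(\sum_{i=1}^{n} |a_i|^p\right)^{1/p}$, where p > 0.

In particular, when p = 1, $\|a\|_1 = \sum_{i=1}^{n} |a_i|$; when p = 2, $\|a\|_2 = \sqrt{\sum_{i=1}^{n} a_i^2}$.

$\|a\|_0$ is the zero norm of the vector, i.e., the number of non-zero elements in the vector. The infinity norm is $\|a\|_\infty = \max_i |a_i|$.

2. The inner product of matrices. For two real matrices A and B of the same size m × n, their inner product is $\langle A, B\rangle = \operatorname{trace}(A^T B) = \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij} b_{ij}$.

3. The Frobenius norm of a matrix. For a matrix $A = (a_{ij})_{m\times n} \in R^{m\times n}$, its Frobenius norm is defined as $\|A\|_F = \sqrt{\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}^2}$. The zero norm $\|A\|_0$ is the number of non-zero elements in the matrix. The infinity norm is $\|A\|_\infty = \max_{i,j} |a_{ij}|$, the (1,1) norm is $\|A\|_{1,1} = \sum_{i=1}^{m}\sum_{j=1}^{n} |a_{ij}|$, and the (2,1) norm is $\|A\|_{2,1} = \sum_{j=1}^{n} \sqrt{\sum_{i=1}^{m} a_{ij}^2}$.

4. Singular value decomposition of a matrix. For a matrix $A \in R^{m\times n}$, its singular value decomposition (SVD) is $A = U \begin{pmatrix} \Sigma_r & 0 \\ 0 & 0 \end{pmatrix} V^T$, where $U \in R^{m\times m}$ and $V \in R^{n\times n}$ are both orthogonal matrices, the diagonal matrix is $\Sigma_r = \operatorname{diag}(\sigma_1, \sigma_2, \ldots, \sigma_r) \in R^{r\times r}$, and the diagonal elements satisfy $\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_r > 0$. The rank of the matrix A is rank(A) = r. With $U = (u_1, u_2, \ldots, u_m)$ and $V = (v_1, v_2, \ldots, v_n)$, the singular value decomposition of the matrix can be written as $A = \sum_{i=1}^{r} \sigma_i u_i v_i^T$.

5. The nuclear norm of a matrix. The nuclear norm $\|A\|_*$ is defined through the trace operator trace(·) of a matrix, with $I_m$ denoting the identity matrix of order m. The nuclear norm of the matrix A can be expressed by its singular values, i.e., $\|A\|_* = \sum_{i=1}^{r} \sigma_i$.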
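For readers who want to check these quantities numerically, the following sketch computes them with NumPy. The helper name `matrix_quantities` is hypothetical, the infinity norm is taken entry-wise as it is used later in this document, and the (2,1) norm is computed as the sum of the column 2-norms, which is an assumed convention.

```python
import numpy as np

def matrix_quantities(A, B):
    """Return the inner product <A, B> and several norms of A."""
    return {
        "inner": float(np.trace(A.T @ B)),                    # sum_ij a_ij * b_ij
        "frobenius": float(np.linalg.norm(A, "fro")),
        "zero": int(np.count_nonzero(A)),                     # number of non-zero entries
        "infinity": float(np.max(np.abs(A))),                 # entry-wise maximum
        "norm_11": float(np.sum(np.abs(A))),
        "norm_21": float(np.sum(np.linalg.norm(A, axis=0))),  # assumed column-wise convention
        "nuclear": float(np.linalg.norm(A, "nuc")),           # sum of singular values
    }

A = np.array([[3.0, 0.0], [0.0, -4.0]])
B = np.array([[1.0, 2.0], [3.0, 4.0]])
q = matrix_quantities(A, B)
```

For this diagonal example the singular values are 4 and 3, so the nuclear norm is 7 and the Frobenius norm is 5.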

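The convex optimization model of robust principal component analysis is:

$$\min_{A,E}\ \|A\|_* + \lambda\|E\|_1 \quad \text{s.t.}\quad D = A + E$$

wherein $\min_{A,E}$ denotes minimization over the matrix A and the matrix E, A is the low-rank matrix, D is the flow matrix, E is the sparse matrix, $\|A\|_*$ is the sum of all singular values of the matrix A, $\|E\|_1$ is the sum of the absolute values of all elements in the matrix E, λ is the relaxation factor, and s.t. introduces the constraint condition.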

It should be noted that the robust principal component analysis model converts the traffic matrix analysis problem into the problem of decomposing the traffic matrix into a low-rank matrix and a sparse matrix. However, in the robust principal component analysis model of the traffic matrix, the objective terms for both the matrix A and the matrix E, namely rank(A) and $\|E\|_0$, are nonlinear and non-convex, and the problem is NP-hard (non-deterministic polynomial-time hard) and difficult to solve directly. To solve for A and E, convex optimization relaxation is therefore performed on the robust principal component analysis model of the traffic matrix, which yields the convex optimization model of robust principal component analysis.

In a specific calculation, assume that $A_0 \in R^{n_1\times n_2}$ ($n_1 \ge n_2$) satisfies the incoherence condition with parameter μ:

$$\max_i \|U^T e_i\|^2 \le \frac{\mu r}{n_1}, \qquad \max_i \|V^T e_i\|^2 \le \frac{\mu r}{n_2}, \qquad \|UV^T\|_\infty \le \sqrt{\frac{\mu r}{n_1 n_2}}$$

wherein $e_i$ is a unit vector, $A_0 = U\Sigma V^T$ is the singular value decomposition of the matrix, r is the rank of the matrix $A_0$, and the infinity norm is $\|D\|_\infty = \max_{i,j} |D_{i,j}|$.

Further assume that the support of $E_0$, of cardinality m, is uniformly distributed over all coordinates. Then, as long as:

$$\operatorname{rank}(A_0) \le \rho_r n_2 \mu^{-1} (\log n_1)^{-2}, \qquad m \le \rho_s n_1 n_2$$

there is a numerical constant c such that principal component pursuit (with $\lambda = 1/\sqrt{n_1}$) recovers the original matrices with probability at least $1 - c\,n_1^{-10}$, where $\rho_r$ and $\rho_s$ are positive numerical constants.

Therefore, when the low-rank matrix $A_0$ is reasonably distributed in the above sense and the non-zero elements of the sparse matrix $E_0$ are uniformly distributed, solving the principal component pursuit problem recovers the original low-rank matrix $A_0$ from unknown and arbitrary errors with a probability close to 1.

Solving a convex optimization model of robust principal component analysis by adopting an augmented Lagrange multiplier method, and decomposing a flow matrix into a low-rank matrix and a sparse matrix, wherein the method comprises the following steps:

constructing an augmented Lagrangian function of the convex optimization model:

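$$L(A,E,Y,\mu) = \|A\|_* + \lambda\|E\|_1 + \langle Y, D-A-E\rangle + \frac{\mu}{2}\|D-A-E\|_F^2$$

wherein $\|A\|_*$ is the sum of all singular values of the low-rank matrix A, $\|E\|_1$ is the sum of the absolute values of all elements in the sparse matrix E, λ is the relaxation factor, Y is the Lagrange multiplier matrix, $\frac{\mu}{2}\|D-A-E\|_F^2$ is the penalty term that augments the Lagrangian function, D-A-E is the constraint condition, $\langle Y, D-A-E\rangle$ is the inner product of the Lagrange multiplier matrix Y and the constraint condition D-A-E, μ is a positive parameter, and $\|\cdot\|_F$ is the Frobenius norm of a matrix;

presetting a soft threshold shrinking operator and a singular value soft threshold operator, wherein the soft threshold shrinking operator is:

$$S_\varepsilon(x) = \begin{cases} x-\varepsilon, & x > \varepsilon \\ x+\varepsilon, & x < -\varepsilon \\ 0, & \text{otherwise} \end{cases} \qquad \text{(Formula two)}$$

the singular value soft threshold operator is:

$$D_\tau(X) = U\,S_\tau(\Sigma)\,V^T \qquad \text{(Formula three)}$$

wherein $X = U\Sigma V^T$ is the singular value decomposition of the matrix X;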

solving by using an augmented Lagrange multiplier algorithm:

It should be noted that the augmented Lagrange multiplier algorithm here refers to the inexact augmented Lagrange multiplier (IALM) algorithm.

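using

$$(U, S, V) = \operatorname{svd}\left(D - E_k + \mu_k^{-1} Y_k\right) \qquad \text{(Formula four)}$$

to perform singular value decomposition;

wherein svd denotes singular value decomposition; here the initialized parameters k = 0, $Y_0 = 0$, $E_0 = 0$, and $\mu_0$ are substituted into formula four, and the matrix U, the matrix S, and the matrix V are obtained through singular value decomposition;

using

$$A_{k+1} = U\,S_{\mu_k^{-1}}(S)\,V^T \qquad \text{(Formula five)}$$

to solve for $A_{k+1}$;

using

$$E_{k+1} = S_{\lambda\mu_k^{-1}}\left(D - A_{k+1} + \mu_k^{-1} Y_k\right) \qquad \text{(Formula six)}$$

to solve for $E_{k+1}$;

using

$$Y_{k+1} = Y_k + \mu_k\left(D - A_{k+1} - E_{k+1}\right) \qquad \text{(Formula seven)}$$

to solve for $Y_{k+1}$;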

It should be noted that k is the iteration index, and the number of iterations can be chosen according to the accuracy required of the data in practical applications. Assume that the number of iterations is N, where N is an integer greater than or equal to 1. In the calculation, the augmented Lagrange multiplier algorithm is first solved with k = 0, which completes the first iteration and gives $A_1$, $E_1$, and $Y_1$. The contraction factor is then updated as $\mu_{k+1} = \rho\mu_k$, and k is set to k + 1. The algorithm is solved in the same way for k = 1 to N, so that the augmented Lagrange multiplier algorithm converges to an optimal solution. The optimal solution is reached when the values of the low-rank matrix A and the sparse matrix E no longer change, however many further iterations are performed.
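In floating-point arithmetic, "no longer change" has to be judged against a tolerance. The following is one possible check, offered as a sketch: the function name `has_converged`, the tolerance `tol`, and the choice to scale by the Frobenius norm of D are assumptions, not part of the patent.

```python
import numpy as np

def has_converged(A_prev, E_prev, A, E, D, tol=1e-7):
    """Return True when A and E have stopped changing and D = A + E holds.

    Both the change between consecutive iterates and the residual D - A - E
    are compared with tol times the Frobenius norm of D (assumed scaling).
    """
    scale = np.linalg.norm(D, "fro")
    change = max(np.linalg.norm(A - A_prev, "fro"),
                 np.linalg.norm(E - E_prev, "fro"))
    residual = np.linalg.norm(D - A - E, "fro")
    return bool(change <= tol * scale and residual <= tol * scale)

D_demo = np.eye(2)
zeros = np.zeros((2, 2))
stable = has_converged(D_demo, zeros, D_demo, zeros, D_demo)     # iterates unchanged
moving = has_converged(zeros, zeros, D_demo, zeros, D_demo)      # A still changing
```

Such a check can replace a fixed iteration count N when the required accuracy is known in advance.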

It should be noted that, in its general form, the augmented Lagrange multiplier (ALM) method solves the following constrained convex optimization problem:

min f(X), s.t. h(X) = 0;

wherein $f: R^n \to R$ and $h: R^n \to R^m$. The augmented Lagrangian function of this convex optimization problem is defined as:

$$L(X, Y, \mu) = f(X) + \langle Y, h(X)\rangle + \frac{\mu}{2}\|h(X)\|_F^2$$

where μ is a positive parameter.

Augmented Lagrangian functions differ from Lagrangian functions in that they carry a penalty term for the constraint.

By comparison with the PCP model, let:

X＝(A,E),f(X)＝||A||_*+λ||E||₁,h(X)＝D-A-E；

Substituting these into the general form of the augmented Lagrangian function gives the augmented Lagrangian function of the convex optimization model:

$$L(A,E,Y,\mu) = \|A\|_* + \lambda\|E\|_1 + \langle Y, D-A-E\rangle + \frac{\mu}{2}\|D-A-E\|_F^2$$
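As a worked step, the updates for A and E follow from the augmented Lagrangian $L(A,E,Y,\mu) = \|A\|_* + \lambda\|E\|_1 + \langle Y, D-A-E\rangle + \frac{\mu}{2}\|D-A-E\|_F^2$ by completing the square and minimizing over one matrix at a time. The derivation below is a sketch that relies on the standard closed-form minimizers for the nuclear norm and the 1-norm, which the source does not spell out.

```latex
% Complete the square, with Y and \mu held fixed:
L(A,E,Y,\mu) = \|A\|_* + \lambda\|E\|_1
  + \frac{\mu}{2}\bigl\|D - A - E + \mu^{-1}Y\bigr\|_F^2
  - \frac{1}{2\mu}\|Y\|_F^2

% Minimize over A with E = E_k fixed: singular value soft threshold operator.
A_{k+1} = \arg\min_A \; \|A\|_*
  + \frac{\mu_k}{2}\bigl\|A - (D - E_k + \mu_k^{-1}Y_k)\bigr\|_F^2
  = D_{1/\mu_k}\bigl(D - E_k + \mu_k^{-1}Y_k\bigr)

% Minimize over E with A = A_{k+1} fixed: soft threshold shrinking operator.
E_{k+1} = \arg\min_E \; \lambda\|E\|_1
  + \frac{\mu_k}{2}\bigl\|E - (D - A_{k+1} + \mu_k^{-1}Y_k)\bigr\|_F^2
  = S_{\lambda/\mu_k}\bigl(D - A_{k+1} + \mu_k^{-1}Y_k\bigr)
```

The term $-\frac{1}{2\mu}\|Y\|_F^2$ does not depend on A or E, so it plays no role in either minimization.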

The embodiment of the present invention provides an OD flow analysis method. A robust principal component analysis model is used to model the traffic matrix of the periodically acquired whole-network OD flows; convex optimization relaxation converts this model into a convex optimization model of robust principal component analysis; and the augmented Lagrange multiplier method solves the convex optimization model and decomposes the traffic matrix into a low-rank matrix and a sparse matrix, so that the traffic matrix analysis problem becomes the problem of decomposing the matrix into a low-rank matrix and a sparse matrix. The low-rank matrix corresponds to the OD flow free of impulse noise with random values, and the sparse matrix corresponds to the impulse noise with random values contained in the OD flow. Compared with PCA, the OD traffic analysis method provided by the embodiment of the present invention can better recover the underlying low-rank matrix of the whole-network OD traffic matrix when the OD flows contain impulse noise with random values, such as abnormal traffic, burst traffic, and measurement noise, and thereby solves the prior-art problem that the results computed for such OD flows are unsatisfactory.

In a third embodiment, the embodiment of the present invention provides an OD flow analysis device 10, as shown in Fig. 2, including:

a data obtaining unit 1010, configured to periodically sample and obtain a full-network OD traffic, and convert the full-network OD traffic into a traffic matrix;

a modeling unit 1020, configured to model the traffic matrix converted by the data obtaining unit by using a robust principal component analysis model to obtain a robust principal component analysis model of the traffic matrix;

an optimizing unit 1021, configured to perform convex optimization relaxation on the robust principal component analysis model of the traffic matrix obtained by the modeling unit to obtain a convex optimization model of the robust principal component analysis;

a solving unit 1022, configured to solve the convex optimization model of robust principal component analysis obtained by the optimizing unit by using an inexact augmented Lagrange multiplier method, so as to obtain the low-rank matrix and the sparse matrix recovered from the traffic matrix; the low-rank matrix corresponds to the OD flow free of impulse noise with random values; the sparse matrix corresponds to the impulse noise with random values contained in the OD flow;

the data output unit 1030 is configured to output the low-rank matrix and the sparse matrix obtained by the solving unit.
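As an illustration of what the data obtaining unit produces, the sketch below turns periodically sampled OD volumes into a traffic matrix. The layout (one row per sampling period, one column per OD pair), the input format, and the name `build_traffic_matrix` are assumptions made for this example; the embodiment does not prescribe them.

```python
import numpy as np

def build_traffic_matrix(samples, nodes):
    """Convert periodic OD samples into a traffic matrix.

    samples: list with one dict per sampling period, mapping
             (origin, destination) -> traffic volume in that period.
    nodes:   list of node identifiers.
    Returns (D, od_pairs), where D has one row per period and one column
    per OD pair (assumed layout), and od_pairs gives the column order.
    """
    od_pairs = [(o, d) for o in nodes for d in nodes]
    column = {pair: j for j, pair in enumerate(od_pairs)}
    D = np.zeros((len(samples), len(od_pairs)))
    for t, period in enumerate(samples):
        for pair, volume in period.items():
            D[t, column[pair]] += volume
    return D, od_pairs

# Hypothetical samples for a two-node network over two periods.
samples = [
    {("a", "b"): 5.0, ("b", "a"): 2.0},
    {("a", "b"): 6.0},
]
D, od_pairs = build_traffic_matrix(samples, ["a", "b"])
```

OD pairs that are absent from a period are recorded as zero traffic, which keeps every row the same length so that the matrix can be passed directly to the modeling unit.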

This embodiment of the invention provides an OD traffic analysis device. The data processing unit uses a robust principal component analysis model to model the traffic matrix of the full-network OD traffic periodically acquired by the data acquisition unit; the data processing unit then applies convex optimization relaxation to convert the model into a convex optimization model of robust principal component analysis; finally, the data processing unit solves the convex optimization model with the augmented Lagrange multiplier method and decomposes the traffic matrix into a low-rank matrix and a sparse matrix, so that the analysis of the traffic matrix is converted into the problem of decomposing it into a low-rank matrix and a sparse matrix. The low-rank matrix corresponds to the OD traffic that does not contain random-valued impact noise. When the OD traffic contains impact noise that appears as random values, such as abnormal traffic, burst traffic, and measurement noise, the device can better recover the underlying low-rank matrix of the full-network traffic matrix, and thus addresses the unsatisfactory results of the prior art on such traffic.

In a fourth embodiment, the present invention provides an OD traffic analysis device 10 which, as shown in fig. 2, includes:

a data obtaining unit 1010, configured to periodically collect the full-network OD traffic and convert the full-network OD traffic into a traffic matrix;

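the modeling unit 1020 is specifically configured to model the traffic matrix by using a robust principal component analysis model to obtain the robust principal component analysis model of the traffic matrix:

$$\min_{A,E}\ \operatorname{rank}(A) + \gamma \|E\|_0 \quad \text{s.t.}\quad D = A + E$$

wherein A is the low-rank matrix, D is the traffic matrix, E is the sparse matrix, rank(A) is the rank of the matrix A, $\|E\|_0$ is the number of non-zero elements in the matrix E, γ is a compromise factor greater than 0, and s.t. introduces the constraint condition;

the optimization unit 1021 is configured to perform convex optimization relaxation on the robust principal component analysis model of the traffic matrix to obtain a convex optimization model of the robust principal component analysis:

$$\min_{A,E}\ \|A\|_* + \lambda \|E\|_1 \quad \text{s.t.}\quad D = A + E$$

wherein $\|A\|_*$ is the sum of all singular values of the matrix A, $\|E\|_1$ is the sum of the absolute values of all elements in the matrix E, and λ is a relaxation factor;

the solving unit 1022 is specifically configured to construct an augmented Lagrangian function of the convex optimization model:

$$L(A, E, Y, \mu) = \|A\|_* + \lambda \|E\|_1 + \langle Y, D - A - E \rangle + \frac{\mu}{2} \|D - A - E\|_F^2$$

wherein Y is the Lagrange multiplier matrix, μ is a positive parameter, and $\|\cdot\|_F$ is the Frobenius norm of a matrix;

the soft threshold operator $S_\varepsilon(x)$ is defined as follows: when $x > \varepsilon$, $S_\varepsilon(x) = x - \varepsilon$; when $x < -\varepsilon$, $S_\varepsilon(x) = x + \varepsilon$; otherwise, $S_\varepsilon(x) = 0$;

the singular value soft threshold operator is:

$$D_\tau(X) = U\, S_\tau(\Sigma)\, V^T \qquad \text{(Formula 3)}$$

wherein $X = U \Sigma V^T$ is the singular value decomposition of the matrix X;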

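the convex optimization model is solved by using an augmented Lagrange multiplier algorithm. It should be noted that the augmented Lagrange multiplier algorithm here refers to the inexact augmented Lagrange multiplier (IALM) algorithm. In iteration k:

singular value decomposition is performed by using

$$(U, \Sigma, V) = \operatorname{svd}\left(D - E_k + \mu_k^{-1} Y_k\right) \qquad \text{(Formula 4)}$$

$A_{k+1}$ is solved by using

$$A_{k+1} = U\, S_{\mu_k^{-1}}(\Sigma)\, V^T \qquad \text{(Formula 5)}$$

$E_{k+1}$ is solved by using

$$E_{k+1} = S_{\lambda \mu_k^{-1}}\left(D - A_{k+1} + \mu_k^{-1} Y_k\right) \qquad \text{(Formula 6)}$$

$Y_{k+1}$ is solved by using

$$Y_{k+1} = Y_k + \mu_k \left(D - A_{k+1} - E_{k+1}\right) \qquad \text{(Formula 7)}$$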

the data output unit 1030 is configured to output the low-rank matrix A and the sparse matrix E obtained by the solving unit.
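The work of the solving unit can be sketched in Python as follows. This is a minimal illustration of an inexact augmented Lagrange multiplier iteration, assuming NumPy. The initial values of Y and μ, the growth factor `rho`, the default λ, and the stopping rule are common choices from the robust principal component analysis literature and are assumptions here, since the embodiment does not state them. All function names are illustrative.

```python
import numpy as np

def soft_threshold(x, eps):
    """Soft threshold operator S_eps, applied elementwise."""
    return np.sign(x) * np.maximum(np.abs(x) - eps, 0.0)

def singular_value_threshold(X, tau):
    """Singular value soft threshold operator D_tau(X) = U S_tau(Sigma) V^T."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * soft_threshold(s, tau)) @ Vt

def rpca_ialm(D, lam=None, rho=1.5, tol=1e-7, max_iter=500):
    """Split D into a low-rank A and a sparse E with D ~= A + E."""
    D = np.asarray(D, dtype=float)
    m, n = D.shape
    norm_D = np.linalg.norm(D, "fro")
    if norm_D == 0.0:
        return np.zeros_like(D), np.zeros_like(D)
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))        # assumed default, not from the source
    spectral = np.linalg.norm(D, 2)           # largest singular value of D
    Y = D / max(spectral, np.abs(D).max() / lam)  # assumed initial multiplier
    mu = 1.25 / spectral                      # assumed initial penalty parameter
    A = np.zeros_like(D)
    E = np.zeros_like(D)
    for _ in range(max_iter):
        # Update A: singular value thresholding of D - E_k + Y_k / mu_k.
        A = singular_value_threshold(D - E + Y / mu, 1.0 / mu)
        # Update E: elementwise soft thresholding of D - A_{k+1} + Y_k / mu_k.
        E = soft_threshold(D - A + Y / mu, lam / mu)
        # Update Y: Y_{k+1} = Y_k + mu_k (D - A_{k+1} - E_{k+1}).
        residual = D - A - E
        Y = Y + mu * residual
        mu = rho * mu                         # assumed geometric growth of mu
        if np.linalg.norm(residual, "fro") <= tol * norm_D:
            break
    return A, E
```

Calling `A, E = rpca_ialm(D)` on a traffic matrix D returns the two matrices that the data output unit would emit. Each iteration costs one singular value decomposition, so for large traffic matrices a partial SVD may be worth considering.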

According to the OD traffic analysis device provided by this embodiment of the invention, the data processing unit uses a robust principal component analysis model to model the traffic matrix of the full-network OD traffic periodically acquired by the data acquisition unit; the data processing unit then applies convex optimization relaxation to convert the model into a convex optimization model of robust principal component analysis; finally, the data processing unit solves the convex optimization model with the augmented Lagrange multiplier method and decomposes the traffic matrix into a low-rank matrix and a sparse matrix, so that the analysis of the traffic matrix is converted into the problem of decomposing it into a low-rank matrix and a sparse matrix. The low-rank matrix corresponds to the OD traffic that does not contain random-valued impact noise. When the OD traffic contains impact noise that appears as random values, such as abnormal traffic, burst traffic, and measurement noise, the device can better recover the underlying low-rank matrix of the full-network traffic matrix, and thus addresses the unsatisfactory results of the prior art on such traffic.
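Once the two matrices have been output, the non-zero entries of the sparse matrix indicate where impact noise was separated from the regular traffic. The sketch below lists those entries. It assumes NumPy, a sparse matrix E whose rows are sampling periods and whose columns are OD pairs, and a numerical tolerance `tol`; the layout, the tolerance, and the name `list_spikes` are assumptions for illustration only.

```python
import numpy as np

def list_spikes(E, od_pairs, tol=1e-6):
    """Return (period, od_pair, value) for every entry of E with |value| > tol.

    Assumes rows of E are sampling periods and columns are OD pairs,
    with od_pairs giving the column order.
    """
    rows, cols = np.nonzero(np.abs(E) > tol)
    return [(int(t), od_pairs[j], float(E[t, j])) for t, j in zip(rows, cols)]

# Hypothetical sparse matrix: one burst on OD pair ("a", "b") in period 1.
E = np.zeros((3, 2))
E[1, 0] = 40.0
spikes = list_spikes(E, [("a", "b"), ("b", "a")])
```

A tolerance is used instead of an exact comparison with zero because an iterative solver may leave negligible residue in entries that are conceptually empty.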

The data acquisition unit and the data output unit may be communication units of the OD traffic analysis device. The modeling unit, the optimization unit, and the solving unit may be separately provided processors, may be integrated into one of the processors of the OD traffic analysis device, or may be stored in a memory of the OD traffic analysis device in the form of program code, in which case one of the processors of the OD traffic analysis device invokes the code and executes the functions of the modeling unit, the optimization unit, and the solving unit. The processor described herein may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention.

Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in this invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on, or transmitted as one or more instructions or code on, a computer-readable medium. Computer-readable media include both computer storage media and communication media, the latter including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.

The foregoing embodiments describe the objects, technical solutions and advantages of the present invention in further detail. It should be understood that the foregoing are only exemplary embodiments of the present invention and are not intended to limit its scope of protection; any modification, equivalent substitution, or improvement made on the basis of the technical solutions of the present invention shall fall within the scope of protection of the present invention.

Claims

1. An OD flow analysis method, characterized by comprising:

solving the convex optimization model of the robust principal component analysis by adopting an augmented Lagrange multiplier method, and decomposing the traffic matrix into a low-rank matrix and a sparse matrix; wherein the low-rank matrix corresponds to the OD traffic that does not contain random-valued impact noise and represents the inherent low-rank structure of the traffic matrix; the sparse matrix corresponds to the OD traffic that contains random-valued impact noise;

outputting the obtained low-rank matrix and the sparse matrix;

the robust principal component analysis model is adopted to model the traffic matrix to obtain the robust principal component analysis model of the traffic matrix:

$$\min_{A,E}\ \operatorname{rank}(A) + \gamma \|E\|_0 \quad \text{s.t.}\quad D = A + E$$

wherein $\min_{A,E}$ denotes minimization over the target matrices A and E, A is the low-rank matrix, D is the traffic matrix, E is the sparse matrix, rank(A) is the rank of the matrix A, $\|E\|_0$ is the number of non-zero elements in the matrix E, γ is a compromise factor greater than 0, and s.t. introduces the constraint condition;

the convex optimization model of the robust principal component analysis is:

$$\min_{A,E}\ \|A\|_* + \lambda \|E\|_1 \quad \text{s.t.}\quad D = A + E$$

wherein $\min_{A,E}$ denotes minimization over the target matrices A and E, A is the low-rank matrix, D is the traffic matrix, E is the sparse matrix, $\|A\|_*$ is the sum of all singular values of the matrix A, $\|E\|_1$ is the sum of the absolute values of all elements in the matrix E, λ is a relaxation factor, and s.t. introduces the constraint condition;

solving the convex optimization model of the robust principal component analysis by adopting an augmented Lagrange multiplier method, and decomposing the traffic matrix into a low-rank matrix and a sparse matrix, comprises:

constructing an augmented Lagrangian function of the convex optimization model:

$$L(A, E, Y, \mu) = \|A\|_* + \lambda \|E\|_1 + \langle Y, D - A - E \rangle + \frac{\mu}{2} \|D - A - E\|_F^2$$

wherein $\|A\|_*$ is the sum of all singular values of the low-rank matrix A, $\|E\|_1$ is the sum of the absolute values of all elements in the sparse matrix E, λ is the relaxation factor, Y is the Lagrange multiplier matrix, $\frac{\mu}{2} \|D - A - E\|_F^2$ is the penalty term of the augmented Lagrangian function, D − A − E is the constraint condition, $\langle Y, D - A - E \rangle$ is the inner product of the Lagrange multiplier matrix Y and the constraint condition D − A − E, μ is a positive parameter, and $\|\cdot\|_F$ is the Frobenius norm of a matrix.

2. The analysis method according to claim 1, wherein the solving of the convex optimization model of the robust principal component analysis using the augmented lagrange multiplier method to decompose the traffic matrix into a low rank matrix and a sparse matrix further comprises:

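wherein the soft threshold operator $S_\varepsilon(x)$ is defined as follows: when $x > \varepsilon$, $S_\varepsilon(x) = x - \varepsilon$; when $x < -\varepsilon$, $S_\varepsilon(x) = x + \varepsilon$; otherwise, $S_\varepsilon(x) = 0$;

the singular value soft threshold operator is:

$$D_\tau(X) = U\, S_\tau(\Sigma)\, V^T \qquad \text{(Formula 3)}$$

wherein $X = U \Sigma V^T$ is the singular value decomposition of the matrix X;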

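solving by using an augmented Lagrange multiplier algorithm, wherein in iteration k:

singular value decomposition is performed by using

$$(U, \Sigma, V) = \operatorname{svd}\left(D - E_k + \mu_k^{-1} Y_k\right) \qquad \text{(Formula 4)}$$

$A_{k+1}$ is solved by using

$$A_{k+1} = U\, S_{\mu_k^{-1}}(\Sigma)\, V^T \qquad \text{(Formula 5)}$$

$E_{k+1}$ is solved by using

$$E_{k+1} = S_{\lambda \mu_k^{-1}}\left(D - A_{k+1} + \mu_k^{-1} Y_k\right) \qquad \text{(Formula 6)}$$

$Y_{k+1}$ is solved by using

$$Y_{k+1} = Y_k + \mu_k \left(D - A_{k+1} - E_{k+1}\right) \qquad \text{(Formula 7)}$$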

3. An OD flow analysis device, comprising:

the data output unit is used for outputting the low-rank matrix and the sparse matrix obtained by the solving unit;

the modeling unit is specifically configured to model the traffic matrix by using a robust principal component analysis model to obtain the robust principal component analysis model of the traffic matrix:

$$\min_{A,E}\ \operatorname{rank}(A) + \gamma \|E\|_0 \quad \text{s.t.}\quad D = A + E$$

wherein $\min_{A,E}$ denotes minimization over the target matrices A and E, A is the low-rank matrix, D is the traffic matrix, E is the sparse matrix, rank(A) is the rank of the matrix A, $\|E\|_0$ is the number of non-zero elements in the matrix E, γ is a compromise factor greater than 0, and s.t. introduces the constraint condition;

the optimization unit is configured to perform convex optimization relaxation on the robust principal component analysis model of the traffic matrix to obtain a convex optimization model of the robust principal component analysis:

$$\min_{A,E}\ \|A\|_* + \lambda \|E\|_1 \quad \text{s.t.}\quad D = A + E$$

wherein $\min_{A,E}$ denotes minimization over the target matrices A and E, A is the low-rank matrix, D is the traffic matrix, E is the sparse matrix, $\|A\|_*$ is the sum of all singular values of the matrix A, $\|E\|_1$ is the sum of the absolute values of all elements in the matrix E, λ is a relaxation factor, and s.t. introduces the constraint condition;

the solving unit is specifically configured to construct an augmented Lagrangian function of the convex optimization model:

$$L(A, E, Y, \mu) = \|A\|_* + \lambda \|E\|_1 + \langle Y, D - A - E \rangle + \frac{\mu}{2} \|D - A - E\|_F^2$$

wherein $\|A\|_*$ is the sum of all singular values of the low-rank matrix A, $\|E\|_1$ is the sum of the absolute values of all elements in the sparse matrix E, λ is the relaxation factor, Y is the Lagrange multiplier matrix, $\frac{\mu}{2} \|D - A - E\|_F^2$ is the penalty term of the augmented Lagrangian function, D − A − E is the constraint condition, $\langle Y, D - A - E \rangle$ is the inner product of the Lagrange multiplier matrix Y and the constraint condition D − A − E, μ is a positive parameter, and $\|\cdot\|_F$ is the Frobenius norm of a matrix.

4. The analysis device according to claim 3, wherein the solving unit is specifically configured to construct an augmented Lagrangian function of the convex optimization model, and further comprises:

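wherein the soft threshold operator $S_\varepsilon(x)$ is defined as follows: when $x > \varepsilon$, $S_\varepsilon(x) = x - \varepsilon$; when $x < -\varepsilon$, $S_\varepsilon(x) = x + \varepsilon$; otherwise, $S_\varepsilon(x) = 0$;

the singular value soft threshold operator is:

$$D_\tau(X) = U\, S_\tau(\Sigma)\, V^T \qquad \text{(Formula 3)}$$

wherein $X = U \Sigma V^T$ is the singular value decomposition of the matrix X;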

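solving by using an augmented Lagrange multiplier algorithm, wherein in iteration k:

singular value decomposition is performed by using

$$(U, \Sigma, V) = \operatorname{svd}\left(D - E_k + \mu_k^{-1} Y_k\right) \qquad \text{(Formula 4)}$$

$A_{k+1}$ is solved by using

$$A_{k+1} = U\, S_{\mu_k^{-1}}(\Sigma)\, V^T \qquad \text{(Formula 5)}$$

$E_{k+1}$ is solved by using

$$E_{k+1} = S_{\lambda \mu_k^{-1}}\left(D - A_{k+1} + \mu_k^{-1} Y_k\right) \qquad \text{(Formula 6)}$$

$Y_{k+1}$ is solved by using

$$Y_{k+1} = Y_k + \mu_k \left(D - A_{k+1} - E_{k+1}\right) \qquad \text{(Formula 7)}$$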