CN117609814A - SD-WAN intelligent flow scheduling optimization method and system

Info

Publication number: CN117609814A
Application number: CN202410095264.3A
Authority: CN
Inventors: 韩伟; 李碧妍; 易夕冬; 张天松; 肖连菊; 翁祖逖; 冯康; 高宝军; 黄展鹏; 何烈军; 刘文佳
Original assignee: Guangdong Aofei Data Technology Co ltd
Current assignee: Guangdong Aofei Data Technology Co ltd
Priority date: 2024-01-24
Filing date: 2024-01-24
Publication date: 2024-02-27
Anticipated expiration: 2044-01-24
Also published as: CN117609814B

Abstract

The invention relates to the technical field of intelligent flow scheduling, in particular to an SD-WAN intelligent flow scheduling optimization method and system. The method comprises the following steps: acquiring a sequence corresponding to each dimension in the SD-WAN, and clustering data in the sequence corresponding to each dimension to acquire a clustering result corresponding to each dimension in each clustering mode; according to the difference of data values between the clustering results corresponding to each dimension in each clustering mode and the clustering center, the fluctuation condition of data in the clustering results corresponding to each dimension in each clustering mode, and the difference between the clustering results of each dimension and the clustering results of other dimensions, the influence evaluation value of each dimension on other features is obtained, covariance between the sequence corresponding to each dimension and the sequence corresponding to other dimensions is corrected, corrected covariance is obtained, and traffic is scheduled. The invention improves the accuracy and the credibility of intelligent flow dispatching.

Description

SD-WAN intelligent flow scheduling optimization method and system

Technical Field

The invention relates to the technical field of intelligent flow scheduling, in particular to an SD-WAN intelligent flow scheduling optimization method and system.

Background

As enterprise networks grow in scale and the number of network applications increases, the software-defined wide area network (SD-WAN), a new-generation network architecture, is gradually becoming an important component of enterprise networks. Through centralized control and intelligent routing, SD-WAN provides a more flexible and efficient way to connect networks, so that enterprises can better manage and optimize network traffic. However, in complex network environments, the dynamic changes and uncertainty of network traffic often cause network performance to fluctuate and anomalies to occur. Conventional traffic scheduling methods mostly rely on predefined rules and static parameters and therefore adapt poorly to dynamic changes in the network environment. They also generally fail to make full use of the latent information in traffic data, so the potential for performance optimization is not fully realized.

To address these problems, traffic data is usually subjected to deep analysis to extract multidimensional feature vectors, which are then reduced in dimension. A commonly used method is PCA (principal component analysis), which finds the key features in the multidimensional feature vectors and thereby helps the system better understand and adapt to dynamic changes in network state. Conventional PCA typically selects its parameters using the variance interpretation rate (explained variance ratio) of the eigenvalues. However, when PCA is combined with traffic scheduling and its dimension-reduction parameters are adjusted adaptively, the multidimensional feature vectors obtained through deep traffic analysis may be strongly correlated with one another. When the correlation between features is too strong, a principal component with high variance may not represent the direction carrying the most important information. Strongly correlated features can make the direction of the principal components insufficiently distinct, which reduces the interpretability of the principal components and in turn lowers the reliability of SD-WAN intelligent traffic scheduling.

Disclosure of Invention

In order to solve the problem that existing methods analyze SD-WAN traffic data with low accuracy, which in turn lowers the reliability of SD-WAN traffic scheduling, the invention aims to provide an SD-WAN intelligent flow scheduling optimization method and system. The adopted technical scheme is as follows:

in a first aspect, the present invention provides an SD-WAN intelligent traffic scheduling optimization method, which includes the following steps:

acquiring a flow data packet in an SD-WAN, and acquiring a sequence corresponding to each dimension based on the flow data packet;

clustering the data in the sequence corresponding to each dimension several times to obtain the clustering results corresponding to each dimension in each clustering mode; obtaining an influence evaluation value of each dimension on other features according to the difference in data values between each clustering result corresponding to each dimension in each clustering mode and its clustering center, the fluctuation of the data in each clustering result corresponding to each dimension in each clustering mode, and the difference between the clustering results corresponding to each dimension in different clustering modes and the clustering results corresponding to the other dimensions in different clustering modes;

obtaining corresponding corrected covariance according to covariance between sequences corresponding to each dimension and sequences corresponding to other dimensions, the number of types of data in the sequences corresponding to each dimension and the influence evaluation value; constructing a target covariance matrix based on the corrected covariance;

and scheduling the SD-WAN traffic based on the target covariance matrix.

Preferably, the clustering of the data in the sequence corresponding to each dimension for several times based on the data in the sequence corresponding to each dimension to obtain each clustering result corresponding to each dimension in each clustering mode includes:

for the sequence corresponding to the kth dimension:

clustering data in a sequence corresponding to a kth dimension by adopting a mean shift clustering algorithm to obtain a plurality of first clustering results;

constructing a plurality of tuples corresponding to the kth dimension and the other dimensions, based on each datum in the sequence corresponding to the kth dimension and the data of the other dimensions in the row where that datum is located;

for the j-th dimension other than the k-th dimension: clustering all the tuples corresponding to the kth dimension and the jth dimension by adopting a mean shift clustering algorithm to obtain the clustering results of the kth dimension and the jth dimension.

Preferably, the obtaining the impact evaluation value of each dimension on other features according to the difference of the data value between each clustering result corresponding to each dimension in each clustering mode and the clustering center, the fluctuation condition of the data in each clustering result corresponding to each dimension in each clustering mode, and the difference between the clustering result corresponding to each dimension in different clustering modes and the clustering result corresponding to each other dimension in different clustering modes includes:

for the kth dimension:

for any clustering result: obtaining a discrete index corresponding to the clustering result according to the difference between each datum in the clustering result and the clustering center;

according to the discrete index corresponding to the clustering result of the kth dimension and each dimension except the kth dimension and the variance of the data in the clustering result of the kth dimension and each dimension except the kth dimension, obtaining an influence degree value of the kth dimension, wherein the discrete index corresponding to the clustering result of the kth dimension and each dimension except the kth dimension and the variance of the data in the clustering result of the kth dimension and each dimension except the kth dimension are in negative correlation with the influence degree value;

for the j-th dimension other than the k-th dimension: according to the discrete index corresponding to each clustering result of the kth dimension and the jth dimension, and the variance of the kth dimension data in each clustering result of the kth dimension and the jth dimension, obtaining an influence index of the kth dimension on the jth dimension, wherein the discrete index corresponding to each clustering result of the kth dimension and the jth dimension and the variance of the kth dimension data in each such clustering result are both in a negative correlation relation with the influence index; and determining an influence evaluation value of the kth dimension on the jth dimension based on the influence index and the influence degree value.

Preferably, the obtaining of the discrete index includes:

for any clustering result of the kth dimension and a jth dimension other than the kth dimension: the difference between each jth dimension datum in the clustering result and the jth dimension datum corresponding to the clustering center is recorded as the first difference index of that jth dimension datum; and the arithmetic square root of the average value of the first difference indexes of all the jth dimension data in the clustering result is determined as the discrete index corresponding to the clustering result.

Preferably, the impact index of the kth dimension on the jth dimension is calculated using the following formula:

wherein the quantities in the formula are: the influence index of the kth dimension on the jth dimension; the number of clustering results of the kth dimension and the jth dimension other than the kth dimension; the maximum value among the discrete indexes corresponding to all clustering results of the kth dimension and the jth dimension; the discrete index corresponding to each individual clustering result of the kth dimension and the jth dimension; the variance of the kth dimension data in all tuples of each such clustering result; a logarithmic function with base 2; exp(), an exponential function with the natural constant as its base; and a preset second adjustment parameter, which is greater than 0.

Preferably, the determining the impact evaluation value of the kth dimension on the jth dimension based on the impact index and the impact degree value includes:

and determining the ratio of the influence index of the kth dimension to the influence degree value of the kth dimension as the influence evaluation value of the kth dimension to the jth dimension.

Preferably, the obtaining the corresponding corrected covariance according to the covariance between the sequence corresponding to each dimension and the sequences corresponding to other dimensions, the number of types of data in the sequence corresponding to each dimension, and the impact evaluation value includes:

for the kth dimension and the jth dimension other than the kth dimension:

the ratio between the hyperbolic tangent function value of the number of kinds of data in the sequence corresponding to the kth dimension and the influence index of the kth dimension on the jth dimension is recorded as a correction coefficient;

and determining the product of the covariance between the sequence corresponding to the kth dimension and the sequence corresponding to the jth dimension except the kth dimension and the correction coefficient as the corrected covariance between the sequence corresponding to the kth dimension and the sequence corresponding to the jth dimension except the kth dimension.
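
The correction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names are hypothetical, the covariance is taken as the sample covariance (the text does not say which estimator is used), "number of kinds of data" is read as the number of distinct values, and the influence quantity for the pair of dimensions is passed in as a precomputed positive number.

```python
import math
from typing import Sequence


def covariance(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Sample covariance (divides by n - 1); the estimator is an assumption."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def corrected_covariance(seq_k: Sequence[float], seq_j: Sequence[float],
                         influence_k_on_j: float) -> float:
    """Covariance of (k, j) scaled by tanh(#distinct values in k) / influence."""
    if influence_k_on_j <= 0:
        raise ValueError("influence value must be positive")
    kinds = len(set(seq_k))  # number of kinds of data in the kth sequence
    correction = math.tanh(kinds) / influence_k_on_j
    return covariance(seq_k, seq_j) * correction


# Hypothetical data: 3 distinct values in dimension k, influence value 2.0.
value = corrected_covariance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], 2.0)
```

Because tanh saturates at 1, the number of distinct values only matters for dimensions with very few kinds of data, which matches the later description of flag-type data as having a small value range.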

Preferably, the scheduling the traffic of the SD-WAN based on the target covariance matrix includes:

performing self-adaptive dimension reduction processing on the data in the target covariance matrix through a variance interpretation rate to obtain dimension reduced data; and scheduling the traffic of the SD-WAN based on the reduced-dimension data.
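
One common way to choose the number of retained components from a variance interpretation rate is to keep the smallest number of eigenvalues whose cumulative share of the total reaches a threshold. The sketch below shows that selection only. The function name, the 0.95 default threshold, and the clamping of negative eigenvalues to zero are assumptions made for illustration (a corrected covariance matrix is not guaranteed to be positive semidefinite); the eigenvalues of the target covariance matrix are assumed to be computed elsewhere.

```python
from typing import Iterable


def components_for_explained_variance(eigenvalues: Iterable[float],
                                      threshold: float = 0.95) -> int:
    """Smallest number of components whose cumulative variance share >= threshold."""
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be in (0, 1]")
    # Assumption: negative eigenvalues are treated as carrying no variance.
    ordered = sorted((max(v, 0.0) for v in eigenvalues), reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues carry no variance")
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return count
    return len(ordered)


# Hypothetical eigenvalues of a 4 x 4 target covariance matrix.
kept = components_for_explained_variance([4.0, 3.0, 2.0, 1.0], threshold=0.8)
```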

Preferably, the obtaining the sequence corresponding to each dimension based on the traffic data packet includes:

carrying out deep analysis on the flow data packet to obtain high-dimensional vector data, and inputting the high-dimensional vector data into a PCA algorithm to obtain an initial matrix; each column vector in the initial matrix serves as a sequence corresponding to one dimension.

In a second aspect, the present invention provides an SD-WAN intelligent traffic scheduling optimization system, including a memory and a processor, where the processor executes a computer program stored in the memory to implement the above-mentioned SD-WAN intelligent traffic scheduling optimization method.

The invention has at least the following beneficial effects:

1. According to the method, the data in the sequence corresponding to each dimension are first clustered multiple times to obtain the clustering results corresponding to each dimension in each clustering mode, thereby establishing the relevance among different features.

2. The method provided by the invention better utilizes the inherent structure and relation of the data, increases the mining of the relevance of the data, improves the interpretation and characterization capability of the flow data characteristics, provides more powerful support for further analysis and application, and further improves the accuracy and reliability of SD-WAN flow intelligent scheduling.

Drawings

In order to more clearly illustrate the embodiments of the invention or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.

Fig. 1 is a flowchart of an SD-WAN intelligent traffic scheduling optimization method according to an embodiment of the present invention.

Detailed Description

In order to further describe the technical means and effects adopted by the present invention to achieve the preset purpose, the following detailed description is given to an SD-WAN intelligent traffic scheduling optimization method according to the present invention with reference to the accompanying drawings and the preferred embodiments.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

The following specifically describes a specific scheme of the SD-WAN intelligent traffic scheduling optimization method provided by the invention with reference to the accompanying drawings.

An embodiment of an SD-WAN intelligent traffic scheduling optimization method:

The specific scenario addressed by this embodiment is as follows. Conventional PCA dimension reduction typically selects its parameters using the variance interpretation rate (explained variance ratio) of the eigenvalues. However, when PCA is combined with traffic scheduling and its dimension-reduction parameters are adjusted adaptively, the multidimensional feature vectors obtained through deep traffic analysis may be strongly correlated with one another, and when the correlation between features is too strong, a principal component with high variance may not represent the direction carrying the most important information. Strongly correlated features can make the direction of the principal components insufficiently distinct, which reduces the interpretability of the principal components. In this embodiment, a traffic data packet in the SD-WAN is first acquired and deeply analyzed to obtain a high-dimensional vector; the high-dimensional vector is then analyzed, and the covariance between the sequence corresponding to each dimension and the sequences corresponding to the other dimensions is corrected to obtain a target covariance matrix; finally, the traffic of the SD-WAN is intelligently scheduled based on the target covariance matrix.

The embodiment provides an SD-WAN intelligent traffic scheduling optimization method, as shown in fig. 1, which includes the following steps:

step S1, obtaining a flow data packet in the SD-WAN, and obtaining a sequence corresponding to each dimension based on the flow data packet.

To manage and optimize traffic better through SD-WAN, the traffic data is typically subjected to deep analysis, and high-dimensional feature vectors of the traffic data are obtained from the analysis result. These high-dimensional feature vectors represent the data features and data changes of various aspects of the traffic data.

In this embodiment, firstly, traffic data packets in SD-WAN are collected, then, these data packets are subjected to deep analysis to obtain high-dimensional vector data, and these high-dimensional vector data are input into PCA algorithm to obtain an initial matrix, where the initial matrix specifically includes:

$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}$$

wherein $A$ represents the initial matrix, $m$ represents the number of rows of the initial matrix, $n$ represents the number of columns of the initial matrix, $a_{11}$ represents the data in row 1, column 1 of the initial matrix, $a_{1n}$ represents the data in row 1, column $n$, $a_{m1}$ represents the data in row $m$, column 1, and $a_{mn}$ represents the data in row $m$, column $n$.

Each column of data in the initial matrix is used as a sequence corresponding to one dimension, so that the sequence corresponding to a plurality of dimensions is obtained in the embodiment. The PCA algorithm is prior art and will not be described in further detail herein.
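
As a small illustration of how the per-dimension sequences relate to the initial matrix, the following sketch splits a row-major matrix into its columns. The function name and the sample values are hypothetical; the matrix is assumed to be available already as a list of rows, with one row per sample and one column per dimension.

```python
from typing import List, Sequence


def column_sequences(initial_matrix: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return one sequence per column (dimension) of an m x n matrix."""
    if not initial_matrix:
        return []
    n_cols = len(initial_matrix[0])
    if any(len(row) != n_cols for row in initial_matrix):
        raise ValueError("all rows must have the same number of columns")
    # zip(*rows) transposes the matrix, so each item is one column.
    return [list(column) for column in zip(*initial_matrix)]


# Hypothetical 3 x 2 initial matrix: 3 rows (samples), 2 columns (dimensions).
matrix = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
sequences = column_sequences(matrix)
```

Keeping the row order intact in every sequence matters for the later steps, because tuples are built from data that share the same row.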

Step S2, clustering the data in the sequence corresponding to each dimension for a plurality of times based on the data in the sequence corresponding to each dimension to obtain clustering results corresponding to each dimension in each clustering mode; and obtaining the influence evaluation value of each dimension on other characteristics according to the difference of the data value between each clustering result corresponding to each dimension in each clustering mode and the clustering center, the fluctuation condition of the data in each clustering result corresponding to each dimension in each clustering mode, and the difference between the clustering result corresponding to each dimension in different clustering modes and the clustering result corresponding to each other dimension in different clustering modes.

After deep analysis of the traffic data packets, multidimensional feature vectors of the traffic data are obtained. Strong correlations may exist between these feature vectors, and such correlations can cause a principal component found because of its large variance to fail to represent the principal information of the data. Among the multidimensional features of traffic data, the features that influence other features are usually characteristic (flag-type) features of the data, for example data indicating that the traffic belongs to a certain transmission protocol or a certain data format, and such data is often strongly correlated with the other feature vectors of the traffic data. The principal component acquisition process of PCA can therefore be optimized by identifying the flag-type data in the multidimensional data and computing its correlation with the other feature data.

The characteristic (flag-type) data in traffic data usually represents characteristics of the data packet and is strongly correlated with the various features of the traffic data, so it must first be identified in the multidimensional data. Such data has strictly limited values, and each value corresponds closely to a specific meaning: its value range is small, usually only a few fixed values, and when its value is the same the data of the other feature vectors aggregates strongly, so it has a relatively large influence on other data. Based on this, the embodiment clusters the data in the sequence corresponding to each dimension several times, and evaluates the degree of influence of each dimension on other features according to the difference in data values between each clustering result corresponding to each dimension in each clustering mode and its clustering center, the fluctuation of the data in each clustering result corresponding to each dimension in each clustering mode, and the difference between the clustering results corresponding to each dimension in different clustering modes and the clustering results corresponding to the other dimensions in different clustering modes, so as to obtain the influence evaluation value of each dimension on other features.

For the sequence corresponding to the kth dimension:

Firstly, the data in the sequence corresponding to the kth dimension are clustered by a mean shift clustering algorithm to obtain a plurality of clustering results, each of which is recorded as a first clustering result; that is, a plurality of first clustering results of the kth dimension are obtained. Then, a plurality of tuples corresponding to the kth dimension and the other dimensions are constructed from each datum in the sequence corresponding to the kth dimension and the data of the other dimensions in the row where that datum is located. For the jth dimension other than the kth dimension: all the tuples corresponding to the kth dimension and the jth dimension are clustered by the mean shift clustering algorithm to obtain the clustering results of the kth dimension and the jth dimension. In this embodiment, the number of seed points is set to 20 when the mean shift clustering algorithm is used, and the drift radius is set according to the specific situation. The mean shift clustering algorithm is prior art and is not described in detail here. The clustering center of each clustering result is then obtained.
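
The two clustering modes can be illustrated with a small flat-kernel mean shift written in plain Python. This is a simplified sketch, not the embodiment's implementation: every point is used as a seed instead of the 20 seed points mentioned above, the radius values are arbitrary, and the names `build_tuples` and `mean_shift` are hypothetical. The center reported for a cluster is the mode reached from its first member.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
Cluster = Tuple[Point, List[Point]]  # (clustering center, member points)


def build_tuples(seq_k: Sequence[float], seq_j: Sequence[float]) -> List[Point]:
    """Pair each kth-dimension datum with the jth-dimension datum of the same row."""
    if len(seq_k) != len(seq_j):
        raise ValueError("sequences must have the same length")
    return list(zip(seq_k, seq_j))


def mean_shift(points: Sequence[Point], radius: float,
               max_iter: int = 100, tol: float = 1e-6) -> List[Cluster]:
    """Flat-kernel mean shift; every point acts as a seed (a simplification)."""
    modes: List[Point] = []
    for seed in points:
        center = tuple(seed)
        for _ in range(max_iter):
            neighbours = [p for p in points if math.dist(center, p) <= radius]
            if not neighbours:
                break
            new_center = tuple(sum(axis) / len(neighbours) for axis in zip(*neighbours))
            moved = math.dist(center, new_center)
            center = new_center
            if moved < tol:
                break
        modes.append(center)

    # Points whose modes lie within one radius of each other share a cluster.
    clusters: List[Cluster] = []
    for point, mode in zip(points, modes):
        for center, members in clusters:
            if math.dist(center, mode) <= radius:
                members.append(tuple(point))
                break
        else:
            clusters.append((mode, [tuple(point)]))
    return clusters


# Hypothetical sequences for dimension k and dimension j (same row order).
seq_k = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9]
seq_j = [10.0, 10.2, 9.8, 50.0, 50.1, 49.9]
first_clusters = mean_shift([(x,) for x in seq_k], radius=1.0)       # kth dimension alone
joint_clusters = mean_shift(build_tuples(seq_k, seq_j), radius=2.0)  # (k, j) tuples
```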

For the kth dimension:

For any clustering result of the kth dimension and a jth dimension other than the kth dimension: the difference between each jth dimension datum in the clustering result and the jth dimension datum corresponding to the clustering center is recorded as the first difference index of that jth dimension datum; the arithmetic square root of the average value of the first difference indexes of all the jth dimension data in the clustering result is determined as the discrete index corresponding to the clustering result. The discrete index corresponding to the clustering result is calculated as $Q = \sqrt{\frac{1}{V}\sum_{v=1}^{V} d_v}$, wherein $Q$ represents the discrete index corresponding to the clustering result, $V$ represents the number of two-tuples in the clustering result, and $d_v$ represents the first difference index of the jth dimension datum in the vth two-tuple, which reflects the difference between that datum and $O$, the jth dimension datum corresponding to the clustering center of the clustering result. The discrete index reflects the aggregation of the clustering result: the smaller the discrete index, the stronger the aggregation of the data; the larger the discrete index, the weaker the aggregation. In this way, the discrete index corresponding to each clustering result of the kth dimension in each clustering mode can be obtained.
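
A minimal sketch of the discrete index for one clustering result of (k, j) two-tuples follows. It assumes that the first difference index is the squared difference between a jth-dimension datum and the jth-dimension value of the clustering center; the text only says "difference", so the squaring is an assumption chosen to keep the square root well defined. The function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def discrete_index(cluster_tuples: Sequence[Tuple[float, float]],
                   center: Tuple[float, float]) -> float:
    """sqrt(mean((x_j - O)^2)) over the jth-dimension values of one cluster."""
    if not cluster_tuples:
        raise ValueError("clustering result must not be empty")
    center_j = center[1]  # O: jth-dimension value of the clustering center
    # Assumed first difference index: squared deviation from the center.
    first_difference = [(pair[1] - center_j) ** 2 for pair in cluster_tuples]
    return math.sqrt(sum(first_difference) / len(first_difference))


# Hypothetical clustering result with two tuples and its center.
q = discrete_index([(1.0, 9.0), (1.0, 11.0)], center=(1.0, 10.0))
```

A tightly aggregated cluster yields a value close to 0, and a spread-out cluster yields a larger value, in line with the interpretation given above.
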
According to the discrete index corresponding to the clustering result of the kth dimension and each dimension except the kth dimension and the variance of the data in the clustering result of the kth dimension and each dimension except the kth dimension, obtaining an influence degree value of the kth dimension, wherein the discrete index corresponding to the clustering result of the kth dimension and each dimension except the kth dimension and the variance of the data in the clustering result of the kth dimension and each dimension except the kth dimension are in negative correlation with the influence degree value. The negative correlation indicates that the dependent variable decreases with increasing independent variable, and the dependent variable increases with decreasing independent variable, which may be a subtraction relationship, a division relationship, or the like, and is determined by the actual application. As a specific embodiment, a specific calculation formula of the influence degree value is given, where the specific calculation formula of the influence degree value of the kth dimension is:

wherein the quantities in the formula are: the influence degree value of the kth dimension; the maximum value of the discrete indexes corresponding to all the first clustering results of the kth dimension; the number of first clustering results of the kth dimension; the discrete index corresponding to the ith first clustering result of the kth dimension; the variance of the data in the ith first clustering result of the kth dimension; a logarithmic function with base 2; exp(), an exponential function with the natural constant as its base; and a preset first adjustment parameter, which is greater than 0.

The maximum discrete index is obtained as follows: the maximum value among the discrete indexes corresponding to all the first clustering results of the kth dimension is taken. The preset first adjustment parameter is introduced into the calculation formula of the influence degree value to prevent the denominator from being 0; in this embodiment the preset first adjustment parameter is 0.01, and in a specific application the practitioner can set it according to the specific situation. The larger the discrete index corresponding to each clustering result of the kth dimension in each clustering mode, the weaker the overall aggregation of all the data in the sequence corresponding to the kth dimension, and the smaller the influence degree of the kth dimension. The logarithmic term reflects the maximum aggregation of the data corresponding to the kth dimension after clustering; the larger its value, the stronger the overall aggregation of the data corresponding to the kth dimension. The variance of the data in each clustering result reflects the fluctuation of the kth dimension data within that clustering result; the larger the variance, the weaker the overall aggregation and the smaller the influence degree value of the kth dimension.

For the j-th dimension other than the k-th dimension: according to the discrete index corresponding to each clustering result of the kth dimension and the jth dimension, and the variance of the kth dimension data in each clustering result of the kth dimension and the jth dimension, an influence index of the kth dimension on the jth dimension is obtained, wherein the discrete index corresponding to each clustering result of the kth dimension and the jth dimension and the variance of the kth dimension data in each such clustering result are both in a negative correlation relation with the influence index. The specific calculation formula of the influence index of the kth dimension on the jth dimension is as follows:

wherein the quantities in the formula are: the influence index of the kth dimension on the jth dimension; the number of clustering results of the kth dimension and the jth dimension other than the kth dimension; the maximum value of the discrete indexes corresponding to all clustering results of the kth dimension and the jth dimension; the discrete index corresponding to each individual clustering result of the kth dimension and the jth dimension; the variance of the kth dimension data in all tuples of each such clustering result; a logarithmic function with base 2; exp(), an exponential function with the natural constant as its base; and a preset second adjustment parameter, which is greater than 0.

The maximum discrete index is obtained as follows: the maximum value among the discrete indexes corresponding to all clustering results of the kth dimension and the jth dimension other than the kth dimension is taken. The preset second adjustment parameter is introduced into the calculation formula of the influence index to prevent the denominator from being 0; in this embodiment the preset second adjustment parameter is 0.01, and in a specific application the implementer can set it according to the specific situation. The larger the discrete index corresponding to each clustering result of the jth dimension in each clustering mode, the weaker the overall aggregation of all the data in the sequence corresponding to the jth dimension, and the smaller the influence degree of the jth dimension. The larger the value of the logarithmic term, the stronger the overall aggregation of the data corresponding to the jth dimension; the logarithmic function is used to control the value range of this term. The variance of the data in each clustering result reflects the fluctuation of the data corresponding to the jth dimension within that clustering result; the larger the variance, the larger the difference in the degree of influence of the kth dimension on the jth dimension, and the smaller the influence index of the kth dimension on the jth dimension.

The ratio of the influence index of the kth dimension on the jth dimension to the influence degree value of the kth dimension is determined as the influence evaluation value of the kth dimension on the jth dimension. The influence index reflects the aggregation of the kth dimension data after it is clustered jointly with the jth dimension; the influence degree value of the kth dimension reflects the aggregation obtained when clustering is based on the data values of the kth dimension alone. The ratio of the two therefore reflects how the aggregation of the kth dimension data changes once the jth dimension is taken into account: the larger the ratio, the more concentrated the distribution of the kth dimension data after joint clustering with the jth dimension, and the stronger the overall correlation between the kth dimension and the jth dimension. The influence index of the kth dimension on the jth dimension reflects the overall correlation of the features of the kth dimension with the features of the jth dimension: the larger the value, the more concentrated the data range of the jth dimension when the data of the kth dimension is the same, and the stronger the correlation between the kth dimension and the jth dimension.
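
The ratio can be computed for every ordered pair of dimensions at once. The sketch below assumes that the influence indexes and influence degree values have already been calculated and are supplied as plain Python lists; the function name, the list layout, and the sample numbers are hypothetical.

```python
from typing import List, Optional, Sequence


def influence_evaluation_matrix(
    influence_index: Sequence[Sequence[Optional[float]]],
    influence_degree: Sequence[float],
) -> List[List[Optional[float]]]:
    """result[k][j] = influence_index[k][j] / influence_degree[k] for j != k."""
    n = len(influence_degree)
    result: List[List[Optional[float]]] = [[None] * n for _ in range(n)]
    for k in range(n):
        if influence_degree[k] <= 0:
            raise ValueError("influence degree values must be positive")
        for j in range(n):
            if j != k:  # a dimension is not evaluated against itself
                result[k][j] = influence_index[k][j] / influence_degree[k]
    return result


# Hypothetical values for two dimensions.
evaluation = influence_evaluation_matrix(
    influence_index=[[None, 0.6], [0.2, None]],
    influence_degree=[0.3, 0.4],
)
```

The result is generally not symmetric, since the influence of dimension k on dimension j is normalized by the influence degree value of k only.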

According to the embodiment, the influence degree among the features of different dimensions is calculated by comparing the clustering result of clustering the kth dimension by combining the jth dimension with the clustering result of clustering only according to the data value corresponding to the kth dimension, and the obtained correlation is more accurate.

By adopting the method, the influence evaluation value of each dimension on other characteristics can be obtained.

Step S3, obtaining the corresponding corrected covariance according to the covariance between the sequence corresponding to each dimension and the sequences corresponding to other dimensions, the number of kinds of data in the sequence corresponding to each dimension, and the influence evaluation value; and constructing a target covariance matrix based on the corrected covariance.

When the covariance between different features is calculated, the stronger the correlation between the two dimension vectors, the harder it is for the covariance between them to represent their true dimension information, and the more the covariance should be corrected. In addition, the possibility that both dimensions are flag-type data needs to be considered: a characteristic of flag-type data is that its value range is small and often consists of individual values rather than a continuous range. Based on this, in this embodiment, the covariance between the sequence corresponding to each dimension and the sequences corresponding to other dimensions is corrected according to the number of kinds of data in the sequence corresponding to each dimension and the influence evaluation value of each dimension on other features, so as to obtain the corresponding corrected covariance.

Specifically, for the kth dimension and the jth dimension other than the kth dimension:

The ratio between the hyperbolic tangent function value of the number of kinds of data in the sequence corresponding to the kth dimension and the influence evaluation value of the kth dimension on the jth dimension is recorded as the correction coefficient. The product of the correction coefficient and the covariance between the sequence corresponding to the kth dimension and the sequence corresponding to the jth dimension other than the kth dimension is determined as the corrected covariance between the two sequences. The specific calculation formula is as follows:

$$C'_{k,j} = C_{k,j} \cdot \frac{\tanh(n_k)}{E_{k,j}}$$

wherein $C'_{k,j}$ represents the corrected covariance between the sequence corresponding to the kth dimension and the sequence corresponding to the jth dimension other than the kth dimension, $C_{k,j}$ represents the covariance between the sequence corresponding to the kth dimension and the sequence corresponding to the jth dimension other than the kth dimension, $n_k$ represents the number of kinds of data in the sequence corresponding to the kth dimension, $\tanh(\cdot)$ represents the hyperbolic tangent function, and $E_{k,j}$ represents the influence evaluation value of the kth dimension on the jth dimension. The specific method for calculating covariance is prior art and is not described in detail here.

The ratio of the hyperbolic tangent value to the influence evaluation value is the correction coefficient, which is used to correct the initial covariance between different features. The larger the influence evaluation value of the kth dimension on the jth dimension, the more concentrated the data range of the kth dimension when the data of the jth dimension is the same, and the stronger the correlation between the jth dimension and the kth dimension; the larger the share of the covariance that is caused by this correlation, the smaller the true covariance. The more kinds of data there are in the sequence corresponding to the kth dimension, the wider the value range of the kth dimension, the less likely the data is flag-type feature data, and the more accurate the correlation calculation.

By adopting the method, the covariance between every two different features can be corrected, and the corresponding corrected covariance is obtained.
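The pairwise correction can be sketched as follows. This is a minimal sketch: it assumes the divisor is the influence evaluation value named in step S3, and all function and parameter names are hypothetical.

```python
import math


def correction_coefficient(n_kinds_k: int, influence_eval_kj: float) -> float:
    """tanh(number of kinds in dimension k) divided by the influence
    evaluation value of dimension k on dimension j."""
    return math.tanh(n_kinds_k) / influence_eval_kj


def corrected_covariance(cov_kj: float, n_kinds_k: int, influence_eval_kj: float) -> float:
    """Initial covariance multiplied by the correction coefficient."""
    return cov_kj * correction_coefficient(n_kinds_k, influence_eval_kj)


# tanh saturates quickly, so only dimensions with very few distinct values
# (flag-like data) are noticeably damped by the numerator.
for n in (1, 2, 5, 50):
    print(n, math.tanh(n))

# Hypothetical pair: initial covariance 4.0, 50 kinds, evaluation value 2.0.
print(corrected_covariance(4.0, 50, 2.0))
```

Since tanh(1) is about 0.76 and tanh(2) about 0.96, the numerator mainly penalizes sequences with one or two distinct values; for sequences with many kinds it is effectively 1, and the correction is governed by the divisor alone.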

In this embodiment, a covariance matrix is constructed according to the corrected covariance between different features, and the covariance matrix constructed at this time is recorded as a target covariance matrix. Thus far, the present embodiment acquires the target covariance matrix.
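One way to assemble such a matrix from the raw per-dimension sequences is sketched below. Several choices here are assumptions and are not stated in the text: the sample covariance with an n-1 denominator, leaving the diagonal (the variances) uncorrected, and counting the kinds of data as the number of distinct values. The names `sequences` and `influence_eval` are hypothetical.

```python
import math


def sample_covariance(x, y):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def target_covariance_matrix(sequences, influence_eval):
    """sequences[k] is the sequence of dimension k; influence_eval[k][j] is the
    influence evaluation value of dimension k on dimension j (used for k != j)."""
    d = len(sequences)
    kinds = [len(set(seq)) for seq in sequences]  # assumption: kinds = distinct values
    target = [[0.0] * d for _ in range(d)]
    for k in range(d):
        for j in range(d):
            cov = sample_covariance(sequences[k], sequences[j])
            if k == j:
                target[k][j] = cov  # assumption: variances are not corrected
            else:
                target[k][j] = cov * math.tanh(kinds[k]) / influence_eval[k][j]
    return target


# Hypothetical data: a continuous-looking dimension and a flag-like dimension.
sequences = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
influence_eval = [[None, 2.0], [1.0, None]]
for row in target_covariance_matrix(sequences, influence_eval):
    print(row)
```

Because the correction coefficient depends on the number of kinds in the kth dimension and on the evaluation value of k on j, the entries at (k, j) and (j, k) generally differ in this sketch; the text does not state whether or how the matrix is made symmetric.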

Step S4, scheduling the SD-WAN traffic based on the target covariance matrix.

The present embodiment has obtained the target covariance matrix in step S3, and will schedule the traffic of the SD-WAN based on the target covariance matrix next.

Specifically, in this embodiment, the eigenvalues and eigenvectors of the target covariance matrix are calculated, adaptive dimension reduction is then performed on the data in the target covariance matrix according to the explained variance ratio to obtain dimension-reduced data, and the traffic of the SD-WAN is scheduled based on the dimension-reduced data.
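The adaptive part of this step can be read as choosing the number of retained components from the cumulative explained variance ratio. The sketch below shows only that selection and makes several assumptions: the eigenvalues have already been computed (for example with `numpy.linalg.eigh` on a symmetric matrix), they are non-negative, and the threshold value is a hypothetical parameter that the text does not specify.

```python
def select_component_count(eigenvalues, threshold=0.95):
    """Smallest number of leading components whose cumulative explained
    variance ratio reaches the threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        # Assumption: a non-positive total variance is treated as invalid input.
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return count
    return len(ordered)


# Hypothetical eigenvalues; the cumulative ratios are 0.5, 0.8, 0.95 and 1.0.
eigenvalues = [5.0, 3.0, 1.5, 0.5]
print(select_component_count(eigenvalues, threshold=0.9))   # 3
print(select_component_count(eigenvalues, threshold=0.75))  # 2
```

The dimension-reduced data would then be obtained by projecting onto the eigenvectors that belong to the retained eigenvalues; how the reduced data drives the scheduling decision is not detailed in the text and is left out here.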

So far, the intelligent dispatching of SD-WAN traffic is realized by adopting the method provided by the embodiment.

In this embodiment, the data in the sequence corresponding to each dimension is first clustered several times to obtain each clustering result corresponding to each dimension in each clustering mode, thereby establishing the correlation among different features.

An embodiment of an SD-WAN intelligent traffic scheduling optimization system:

the SD-WAN intelligent flow scheduling optimization system comprises a memory and a processor, wherein the processor executes a computer program stored in the memory to realize the SD-WAN intelligent flow scheduling optimization method.

Since the SD-WAN intelligent traffic scheduling optimization method has already been described in the method embodiment above, it is not described again in this embodiment.

It should be noted that: the foregoing description of the preferred embodiments of the present invention is not intended to be limiting, but rather, any modifications, equivalents, improvements, etc. that fall within the principles of the present invention are intended to be included within the scope of the present invention.

Claims

1. An intelligent SD-WAN flow scheduling optimization method is characterized by comprising the following steps:

and scheduling the SD-WAN traffic based on the target covariance matrix.

2. The method for intelligent traffic scheduling optimization of SD-WAN according to claim 1, wherein clustering the data in the sequence corresponding to each dimension for several times based on the data in the sequence corresponding to each dimension to obtain each clustering result corresponding to each dimension in each clustering mode, comprises:

for the sequence corresponding to the kth dimension:

3. The method for intelligent traffic scheduling optimization of SD-WAN according to claim 2, wherein the obtaining the impact evaluation value of each dimension on other features according to the difference of the data value between the corresponding clustering result and the clustering center in each clustering mode, the fluctuation condition of the data in the corresponding clustering result in each clustering mode, the difference between the corresponding clustering result in different clustering modes and the corresponding clustering result in different clustering modes, comprises:

for the kth dimension:

4. A method for intelligent traffic scheduling optimization for SD-WAN according to claim 3, wherein the obtaining of the discrete index comprises:

5. A method for intelligent traffic scheduling optimization for SD-WAN according to claim 3, wherein the impact index of the kth dimension on the jth dimension is calculated by using the following formula:

6. The method for optimizing SD-WAN intelligent traffic scheduling according to claim 3, wherein said determining an impact evaluation value of the kth dimension to the jth dimension based on the impact index and the impact level value comprises:

7. The method for optimizing SD-WAN intelligent traffic scheduling according to claim 1, wherein the obtaining the corresponding corrected covariance according to the covariance between the sequence corresponding to each dimension and the sequences corresponding to other dimensions, the number of kinds of data in the sequence corresponding to each dimension, and the impact evaluation value comprises:

for the kth dimension and the jth dimension other than the kth dimension:

8. The method for intelligent traffic scheduling optimization of SD-WAN according to claim 1, wherein said scheduling of SD-WAN traffic based on said target covariance matrix comprises:

9. The method for optimizing SD-WAN intelligent traffic scheduling according to claim 1, wherein said obtaining a sequence corresponding to each dimension based on the traffic data packet comprises:

10. An SD-WAN intelligent traffic scheduling optimization system comprising a memory and a processor, characterized in that said processor executes a computer program stored in said memory to implement an SD-WAN intelligent traffic scheduling optimization method according to any of claims 1-9.