CN114500335B - SDN network flow control method based on fuzzy C-means and mixed kernel least square support vector machine - Google Patents

Publication number
CN114500335B
Authority
CN
China
Prior art keywords
frequency
kernel
amplitude
low
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210071611.XA
Other languages
Chinese (zh)
Other versions
CN114500335A (en)
Inventor
李帅永
张旭云涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN202210071611.XA
Publication of CN114500335A
Application granted
Publication of CN114500335B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0876 Network utilisation, e.g. volume of load or congestion level
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/147 Network analysis or design for predicting network behaviour
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/50 Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Environmental & Geological Engineering (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the field of network traffic prediction, and particularly relates to an SDN network flow control method based on fuzzy C-means and a mixed kernel least squares support vector machine. The method comprises the following steps: converting non-stationary SDN network traffic data into stationary time series components by discrete wavelet transform; processing the stationary time series components to obtain amplitude signals of a high frequency band and a low frequency band; clustering the amplitude signals of the high frequency band and the low frequency band with a fuzzy C-means algorithm; predicting each clustered component with an optimized adaptive hybrid kernel least squares support vector machine prediction model; and reconstructing the prediction results of all components to obtain the prediction result of the SDN network data traffic. The invention uses the fuzzy C-means algorithm to introduce a membership mechanism and divides the time series components, according to their amplitude-frequency characteristics, into three classes: high-frequency low-amplitude components, medium-frequency medium-amplitude components and low-frequency high-amplitude components, which provides a sound basis for the subsequent per-class prediction.

Description

SDN network flow control method based on fuzzy C-means and mixed kernel least square support vector machine
Technical Field
The invention belongs to the field of network flow prediction, and particularly relates to an SDN network flow control method based on a fuzzy C-means and a mixed kernel least square support vector machine.
Background
Software Defined Networking (SDN) has become an emerging technology in networking. Its main idea is to separate the control plane from the data plane, which would otherwise both reside in the network switches and routers, thereby truly decoupling control from forwarding. Compared with the complexity of a traditional distributed SNMP measurement system, an SDN network can monitor network traffic data centrally, and predicting network traffic is one of the important ways to improve quality of service and guarantee service security. Traditional traffic prediction mainly organizes traffic data into a traffic time series, that is, the traffic prediction problem is formulated as a time series prediction problem. Time series prediction uses historical time series data to predict the states and development trend over a future period, so that work can be deployed or plans made in advance to cope with abnormal situations that may appear in the predicted data. In general, time series prediction is more effective for near-term and short-term forecasts than for long-term forecasts: if the prediction horizon is extended far into the future, the predicted value can deviate substantially from the actual value, which may lead to erroneous decisions.
Most network traffic time series are non-stationary. Because time series prediction depends on the stationarity of the series, a suitable method is needed to decompose the non-stationary traffic data into stationary sequences before further analysis. Tan et al. introduced a multi-scale wavelet transform to convert a non-stationary time series signal into a multi-layer, relatively stationary decomposition sequence, and then combined an ARMA model and an ARFIMA model to predict the data of the approximation layer and the detail layers respectively. The method achieves high network traffic prediction accuracy, but does not effectively analyze the amplitude-frequency characteristics of the decomposition components. The least squares support vector machine (LSSVM) is a special form of the SVM under a quadratic loss function; it only requires solving a system of linear equations and is therefore fast to solve. In the LSSVM, even when the sample data are complicated and of differing dimension, mapping the data into a high-dimensional space through a kernel function makes them easy to separate, so the LSSVM is widely applied to time series prediction. Zhu Qianyu et al. proposed a traffic prediction model combining empirical mode decomposition (EMD) with a least squares support vector machine optimized by particle swarm optimization (PSO): the traffic sequence is decomposed into stationary components by EMD, and the LSSVM prediction model parameters are then optimized by PSO, which effectively improves the prediction accuracy of the model. However, only one type of kernel function is used for prediction, and the suitability of different kernel functions for each decomposition component is not considered.
In view of the foregoing, there is a strong need for an SDN network flow control method that can both effectively analyze the amplitude-frequency characteristics of the decomposition components and use multiple kinds of kernel functions for analysis and prediction.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides an SDN network flow control method based on a fuzzy C-means and a hybrid kernel least squares support vector machine, which comprises the following steps: acquiring non-stationary SDN network flow data, and converting the non-stationary SDN network flow data into stationary time sequence components by adopting discrete wavelet transformation; calculating the signal amplitude of the stable time sequence component, and processing the signal amplitude by adopting fast Fourier transform to obtain a high-frequency-band amplitude signal and a low-frequency-band amplitude signal; clustering the high-frequency amplitude signal and the low-frequency amplitude signal by adopting a fuzzy C-means algorithm to obtain a high-frequency low-amplitude component, an intermediate-frequency medium-amplitude component and a low-frequency high-amplitude component; the high-frequency low-amplitude component, the medium-frequency medium-amplitude component and the low-frequency high-amplitude component are respectively predicted by adopting an optimized self-adaptive mixed kernel least square support vector machine prediction model, so that a prediction result of each component is obtained; reconstructing the prediction results of all components to obtain the prediction results of SDN network data traffic; and controlling SDN network data traffic according to the prediction result.
Preferably, the formula for converting the non-stationary SDN network traffic data into stationary time series components using discrete wavelet transform is:
$$A_0[s(t)] = s(t)$$
$$A_i[s(t)] = \sum_{k} H(2t-k)\,A_{i-1}[s(t)]$$
$$D_i[s(t)] = \sum_{k} G(2t-k)\,A_{i-1}[s(t)]$$
where $t$ is the time index, $s(t)$ is the initial network traffic sequence, and $i$ is the decomposition scale; $H$ and $G$ are the low-pass filter and the high-pass filter of the wavelet decomposition; $A_i$ and $D_i$ are the wavelet coefficients of the low-frequency part and the high-frequency part, respectively, obtained at the $i$-th decomposition layer of the initial time series $s(t)$.
Further, the determining the decomposition scale includes:
step 1: setting the initial decomposition scale to N;
step 2: performing stationary decomposition of the time series signal at the initial decomposition scale;
step 3: determining whether the (N+1)-th component meets the stop-decomposition criterion; if so, stopping the decomposition; otherwise, increasing the decomposition scale and continuing the decomposition; the stop-decomposition criterion is that the number of extreme points of the component is 1 or 0.
Preferably, the formula for processing the signal amplitude by using the fast fourier transform is:
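$$X(k)=\sum_{n=0}^{N-1}x(n)\,e^{-j\frac{2\pi}{N}nk}=\sum_{n=0}^{N/2-1}x_{even}(n)\,e^{-j\frac{2\pi}{N}(2n)k}+\sum_{n=0}^{N/2-1}x_{odd}(n)\,e^{-j\frac{2\pi}{N}(2n+1)k}$$
where $x(n)$ represents the decomposition component sequence, $N$ represents the length of the sequence, $n$ represents the sample index, $k=0,1,2,\dots,N-1$ is the frequency index, $x_{even}(n)$ represents the even-indexed subsequence of the component, and $x_{odd}(n)$ represents the odd-indexed subsequence of the component.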
Preferably, the clustering process of the high-frequency-band amplitude signal and the low-frequency-band amplitude signal by adopting the fuzzy C-means algorithm comprises the following steps:
step 1: initializing decomposition component information and a membership matrix; setting a maximum iteration number and a threshold epsilon;
step 2: judging whether the current iteration number is smaller than the maximum iteration number; if so, calculating the membership matrix $\mu_{ij}$ of the high-frequency-band amplitude signal and of the low-frequency-band amplitude signal respectively; otherwise, outputting the clustering result and the cluster centers;
step 3: updating the cluster center $c_j$ of each class of data according to the membership matrix;
step 4: calculating the change in the data cost function according to the membership matrix and the updated cluster centers;
step 5: comparing the change in the data cost function with the set threshold ε; if the change does not exceed the set threshold, outputting the clustering result and the cluster centers; otherwise, increasing the iteration number by 1 and returning to step 2.
Preferably, the process of respectively predicting the high-frequency low-amplitude component, the medium-frequency medium-amplitude component and the low-frequency high-amplitude component with the optimized adaptive hybrid kernel least squares support vector machine prediction model comprises: according to the different amplitude-frequency characteristics of each traffic sequence component, constructing a Gaussian kernel ($K_{GAU}$) LSSVM prediction model for the high-frequency low-amplitude components, a polynomial kernel ($K_{POL}$) LSSVM prediction model for the medium-frequency medium-amplitude components, and a linear kernel ($K_{LIN}$) LSSVM prediction model for the low-frequency high-amplitude components, and performing prediction with each model respectively.
Further, the kernel function of the Gaussian kernel ($K_{GAU}$) LSSVM prediction model is:
$$K_{GAU}(x,x_i)=\exp\left(-\frac{\|x-x_i\|^{2}}{2\sigma^{2}}\right)$$
the prediction function is:
$$f(x)=\sum_{i=1}^{m}\alpha_i K_{GAU}(x,x_i)+b$$
the kernel function of the polynomial kernel ($K_{POL}$) LSSVM prediction model is:
$$K_{POL}(x,x_i)=\left((x\cdot x_i)+1\right)^{d},\quad d\ \text{is a natural number}$$
the prediction function is:
$$f(x)=\sum_{i=1}^{m}\alpha_i K_{POL}(x,x_i)+b$$
the kernel function of the linear kernel ($K_{LIN}$) LSSVM prediction model is:
$$K_{LIN}(x,x_i)=x^{T}x_i$$
the prediction function becomes:
$$f(x)=\sum_{i=1}^{m}\alpha_i K_{LIN}(x,x_i)+b$$
where $K_{GAU}$ represents the Gaussian kernel, $x$ represents a test sample, $x_i$ represents the input vector, $\sigma$ represents the kernel parameter, $\alpha_i$ represents the $i$-th component of the weight vector, $m$ represents the sample size, $b$ represents the bias, $K_{POL}$ represents the polynomial kernel, $K_{LIN}$ represents the linear kernel, and $T$ represents the transpose.
Further, the optimizing process of the adaptive hybrid kernel least squares support vector machine prediction model comprises the following steps:
step 1: initializing the parameters of the artificial bee colony algorithm, including: the total number of bees in the colony $N_C$, the number of leading bees $N_e$, the number of following bees $N_o$, the number of solutions $N_s$, the maximum number of iterations $M$, and the food source parameter combination $(\gamma, \sigma)$;
step 2: setting a fitness function f(i);
step 3: the leading bees search the honey sources for new solutions and the fitness value of each solution is calculated; if the new fitness value is larger, the old solution is replaced;
step 4: after the leading bees update the honey sources, a following probability is calculated from the profitability of each honey source, and the following bees select a honey source to follow according to this probability and search its neighborhood;
step 5: if the number of failed updates of a solution exceeds the maximum number of searches, the solution cannot be optimized further; the following bees abandon the solution and are converted into scout bees, which start searching for new honey sources;
step 6: if the maximum number of iterations is reached, training ends and the optimal parameter combination $(\gamma, \sigma)$ is output; otherwise, returning to step 3.
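The six steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `abc_minimize`, the fitness mapping 1 / (1 + cost), the roulette-wheel choice of honey source, and the use of one leading bee and one following bee per honey source are all hypothetical choices, and `cost` stands in for whatever prediction error the LSSVM produces for a given (γ, σ).

```python
import random


def abc_minimize(cost, bounds, n_sources=10, limit=20, max_iter=100, seed=0):
    """Search for the parameter vector (e.g. (gamma, sigma)) that minimises `cost`.

    `cost` is assumed to be non-negative, such as a validation error.
    """
    rng = random.Random(seed)
    dim = len(bounds)

    def random_source():
        # x_ij = x_min(j) + rand(0, 1) * (x_max(j) - x_min(j))
        return [lo + rng.random() * (hi - lo) for lo, hi in bounds]

    def fitness(x):
        # Assumed fitness: larger is better, as step 3 requires.
        return 1.0 / (1.0 + cost(x))

    sources = [random_source() for _ in range(n_sources)]
    fits = [fitness(s) for s in sources]
    trials = [0] * n_sources  # consecutive failed updates per source

    def search_neighbourhood(i):
        # v_ij = x_ij + r_ij * (x_ij - x_kj), with k != i and r_ij in [-1, 1]
        k = rng.choice([s for s in range(n_sources) if s != i])
        j = rng.randrange(dim)
        candidate = sources[i][:]
        candidate[j] += rng.uniform(-1.0, 1.0) * (sources[i][j] - sources[k][j])
        lo, hi = bounds[j]
        candidate[j] = min(max(candidate[j], lo), hi)
        f = fitness(candidate)
        if f > fits[i]:  # greedy replacement of the old solution
            sources[i], fits[i], trials[i] = candidate, f, 0
        else:
            trials[i] += 1

    best_fit, best = max(zip(fits, sources))
    for _ in range(max_iter):
        for i in range(n_sources):  # leading bees
            search_neighbourhood(i)
        for _ in range(n_sources):  # following bees, chosen by following probability
            i = rng.choices(range(n_sources), weights=fits)[0]
            search_neighbourhood(i)
        for i in range(n_sources):  # scouts replace exhausted sources
            if trials[i] > limit:
                sources[i] = random_source()
                fits[i], trials[i] = fitness(sources[i]), 0
        top_fit, top = max(zip(fits, sources))
        if top_fit > best_fit:
            best_fit, best = top_fit, top
    return best, cost(best)
```

For example, calling `abc_minimize(lambda p: (p[0] - 3.0) ** 2 + (p[1] - 0.5) ** 2, [(0.1, 10.0), (0.01, 2.0)], max_iter=200)` searches a toy two-parameter bowl; in practice the cost would be an LSSVM validation error evaluated at each candidate (γ, σ).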
Preferably, the process of predicting the high-frequency low-amplitude component, the medium-frequency medium-amplitude component and the low-frequency high-amplitude component by adopting an optimized adaptive hybrid kernel least square support vector machine prediction model comprises the following steps:
Preferably, the formula for reconstructing the prediction results of all the components is:
Figure BDA0003482282230000051
where $s(t)$ represents the initial network traffic sequence, $H(2t-k)$ represents the low-pass filter of the wavelet decomposition, $t$ represents the time index, $k$ represents the time shift, $A_{i+1}[s(t)]$ represents the wavelet coefficient of the low-frequency part of the $(i+1)$-th layer, and $G(2t-k)$ represents the high-pass filter of the wavelet decomposition.
According to the invention, a membership mechanism is introduced by utilizing a fuzzy C-means algorithm, and the time sequence components are divided into three types of high-frequency low-amplitude components, medium-frequency medium-amplitude components and low-frequency high-amplitude components according to amplitude-frequency characteristics of the time sequence components, so that a powerful basis is provided for subsequent classification prediction. Because the high-frequency time sequence needs a local kernel function with good local learning ability, and the low-frequency time sequence needs a global kernel function with good global learning ability, the invention constructs a mixed kernel least square support vector machine prediction model according to the algorithm principle and the application field of different kernel functions, respectively predicts and reconstructs the classification result by adopting LSSVM models with different kernels, and effectively improves the prediction precision of network flow.
Drawings
FIG. 1 is an overall flow chart of an embodiment of the present invention;
FIG. 2 is a graph showing the determination of the decomposition scale of a DWT in an embodiment of the present invention;
FIG. 3 is a flow chart of the FCM algorithm in an embodiment of the present invention;
FIG. 4 is a flowchart of an ABC optimized prediction model in an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
An SDN network flow control method based on fuzzy C-means and mixed kernel least squares support vector machine, as shown in figure 1, comprises the following steps: acquiring non-stationary SDN network flow data, and converting the non-stationary SDN network flow data into stationary time sequence components by adopting discrete wavelet transformation; calculating the signal amplitude of the stable time sequence component, and processing the signal amplitude by adopting fast Fourier transform to obtain a high-frequency-band amplitude signal and a low-frequency-band amplitude signal; clustering the high-frequency amplitude signal and the low-frequency amplitude signal by adopting a fuzzy C-means algorithm to obtain a high-frequency low-amplitude component, an intermediate-frequency medium-amplitude component and a low-frequency high-amplitude component; the high-frequency low-amplitude component, the medium-frequency medium-amplitude component and the low-frequency high-amplitude component are respectively predicted by adopting an optimized self-adaptive mixed kernel least square support vector machine prediction model, so that a prediction result of each component is obtained; reconstructing the prediction results of all components to obtain the prediction results of SDN network data traffic; and controlling SDN network data traffic according to the prediction result.
The initial traffic sequence is decomposed by the discrete wavelet transform (DWT) into low-frequency components, which reflect trend features, and high-frequency components, which reflect local detail features. The original input time series is decomposed and reconstructed with the wavelet filters H, G and h, g using the Mallat fast algorithm; the calculation process is as follows:
$$A_0[s(t)] = s(t)$$
$$A_i[s(t)] = \sum_{k} H(2t-k)\,A_{i-1}[s(t)]$$
$$D_i[s(t)] = \sum_{k} G(2t-k)\,A_{i-1}[s(t)]$$
where $t$ is the time index, $s(t)$ is the initial network traffic sequence, and $i$ is the decomposition scale; $H$ and $G$ are the low-pass filter and the high-pass filter of the wavelet decomposition; $A_i$ and $D_i$ are the wavelet coefficients of the low-frequency part and the high-frequency part, respectively, obtained at the $i$-th decomposition layer of the initial time series $s(t)$.
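As a rough illustration of the Mallat recursion, the following sketch performs the decomposition with the two-tap Haar filters. The choice of the Haar wavelet, the function names `dwt_step`, `dwt` and `idwt_step`, and the pairing of samples 2t and 2t+1 (an index shift relative to the H(2t-k) convention) are assumptions made for this example; the text does not name a specific wavelet.

```python
import math

# Haar analysis filters (an assumption; the text does not name a wavelet).
H = [1 / math.sqrt(2), 1 / math.sqrt(2)]   # low-pass
G = [1 / math.sqrt(2), -1 / math.sqrt(2)]  # high-pass


def dwt_step(a_prev):
    """One Mallat step: filter A_{i-1} with H and G and keep every second output."""
    half = len(a_prev) // 2
    a_i = [H[0] * a_prev[2 * t] + H[1] * a_prev[2 * t + 1] for t in range(half)]
    d_i = [G[0] * a_prev[2 * t] + G[1] * a_prev[2 * t + 1] for t in range(half)]
    return a_i, d_i


def dwt(s, levels):
    """Return (A_levels, [D_1, ..., D_levels]); len(s) must be divisible by 2**levels."""
    approx, details = list(s), []
    for _ in range(levels):
        approx, d = dwt_step(approx)
        details.append(d)
    return approx, details


def idwt_step(a_i, d_i):
    """Invert one Haar step, recovering A_{i-1} from A_i and D_i."""
    out = []
    for a, d in zip(a_i, d_i):
        out.append((a + d) / math.sqrt(2))
        out.append((a - d) / math.sqrt(2))
    return out
```

Applying `idwt_step` to the approximation and the detail lists in reverse order recovers the original sequence, which is the property the later reconstruction of the per-component predictions relies on.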
As shown in fig. 2, the process of determining the decomposition scale includes:
step 1: setting the initial decomposition scale to N;
step 2: performing stationary decomposition of the time series signal at the initial decomposition scale;
step 3: determining whether the (N+1)-th component meets the stop-decomposition criterion; if so, stopping the decomposition; otherwise, increasing the decomposition scale and continuing the decomposition; the stop-decomposition criterion is that the number of extreme points of the component is 1 or 0.
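The stopping rule can be illustrated with a short sketch. It assumes that the "(N+1)-th component" is the residual low-frequency approximation left after N decomposition layers, and it uses pairwise averaging as a stand-in for the low-pass step; the helper names `count_extrema`, `should_stop`, `pairwise_average` and `choose_scale` are hypothetical.

```python
def count_extrema(x):
    """Count strict local maxima and minima in the interior of x."""
    count = 0
    for i in range(1, len(x) - 1):
        # A sign change between consecutive differences marks an extreme point.
        if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0:
            count += 1
    return count


def should_stop(component):
    """Stopping criterion from the text: at most one extreme point."""
    return count_extrema(component) <= 1


def pairwise_average(x):
    """Stand-in for one low-pass decomposition step (hypothetical helper)."""
    return [(x[2 * t] + x[2 * t + 1]) / 2 for t in range(len(x) // 2)]


def choose_scale(signal, initial_scale=1, max_scale=10):
    """Increase the scale until the residual approximation meets the criterion."""
    approx, scale = list(signal), 0
    while scale < max_scale and len(approx) >= 2:
        approx = pairwise_average(approx)
        scale += 1
        if scale >= initial_scale and should_stop(approx):
            break
    return scale
```

With this rule a purely monotone trend stops after one layer, while a signal that still oscillates after the first layer is decomposed further.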
Calculating the signal amplitude of the stationary time series component comprises: the signal amplitude of the decomposed component is found using a sum of squares function. The sum of squares function represents the signal amplitude intensity of the decomposition component, and the calculation formula is as follows:
$$E=\sum_{t}x^{2}(t)$$
where $E$ denotes the signal amplitude intensity of the decomposition component and $x(t)$ represents the time-series signal.
The high frequency band and the low frequency band are separated with the fast Fourier transform (FFT). The FFT has low computational requirements and runs quickly, and it is applied here to derive the frequency characteristics of the wavelet decomposition components. The bandwidth of the DWT high-frequency components is analyzed with the FFT to obtain their high frequency band and low frequency band.
The formula for processing the signal amplitude by adopting the fast Fourier transform is as follows:
$$X(k)=\sum_{n=0}^{N-1}x(n)\,e^{-j\frac{2\pi}{N}nk}=\sum_{n=0}^{N/2-1}x_{even}(n)\,e^{-j\frac{2\pi}{N}(2n)k}+\sum_{n=0}^{N/2-1}x_{odd}(n)\,e^{-j\frac{2\pi}{N}(2n+1)k}$$
where $x(n)$ represents the decomposition component sequence, $N$ represents the length of the sequence, $n$ represents the sample index, $k=0,1,2,\dots,N-1$ is the frequency index, $x_{even}(n)$ represents the even-indexed subsequence of the component, and $x_{odd}(n)$ represents the odd-indexed subsequence of the component. The whole transformation requires $2N\log_2 N$ multiplication operations.
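The even/odd split described above is the basis of the radix-2 FFT. The sketch below is a textbook recursive version checked against a direct DFT; the function names and the helper `dominant_frequency_bin`, which picks the strongest non-DC bin as a simple frequency feature, are assumptions for illustration and are not taken from the text.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of the even-indexed subsequence
    odd = fft(x[1::2])   # transform of the odd-indexed subsequence
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dft(x):
    """Direct O(N^2) DFT, used only to check the fast version."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]


def dominant_frequency_bin(x):
    """Index in 1..N/2 of the largest-magnitude non-DC bin of a real sequence."""
    spectrum = fft(x)
    half = len(x) // 2
    return max(range(1, half + 1), key=lambda k: abs(spectrum[k]))
```

A component whose dominant bin is close to N/2 would count as high-frequency, and one whose dominant bin is close to 1 as low-frequency; where exactly the band boundary lies is a design choice the text leaves open.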
The traffic sequence components are classified with the fuzzy C-means algorithm, in preparation for building the corresponding kernel prediction model for each class of component in the next step. Since high-amplitude low-frequency components represent the variation trend, while low-amplitude high-frequency components represent details, including noise, the components should be classified into at least 3 classes: high-frequency low-amplitude components, medium-frequency medium-amplitude components, and low-frequency high-amplitude components.
As shown in fig. 3, the process of clustering the high-frequency-band amplitude signal and the low-frequency-band amplitude signal by adopting the fuzzy C-means algorithm includes:
step 1: initializing decomposition component information and a membership matrix; setting a maximum iteration number and a threshold epsilon;
step 2: judging whether the current iteration number is smaller than the maximum iteration number; if so, calculating the membership matrix $\mu_{ij}$ of the high-frequency-band amplitude signal and of the low-frequency-band amplitude signal respectively; otherwise, outputting the clustering result and the cluster centers;
the membership matrix $\mu_{ij}$ is calculated as:
$$\mu_{ij}=\frac{1}{\sum_{k=1}^{c}\left(\frac{d_{ij}}{d_{ik}}\right)^{\frac{2}{m-1}}}$$
where $d_{ij}$ is the Euclidean distance between the $i$-th time point and the $j$-th cluster center, $d_{ik}$ is the Euclidean distance between the $i$-th time point and the $k$-th cluster center, $m$ is the fuzzy factor, and $c$ is the number of cluster centers.
Step 3: updating the cluster center $c_j$ of each class of data according to the membership matrix;
the cluster center is calculated as:
$$c_j=\frac{\sum_{i=1}^{n}\mu_{ij}^{m}x_i}{\sum_{i=1}^{n}\mu_{ij}^{m}}$$
where $n$ represents the number of samples and $x_i$ represents the $i$-th sample.
Step 4: calculating the change amount of the data cost function according to the membership matrix and the updated clustering center;
the objective function (cost function) is:
$$U=\sum_{i=1}^{n}\sum_{j=1}^{c}\mu_{ij}^{m}d_{ij}^{2}$$
where $c$ represents the number of clusters; $\mu_{ij}$ represents the membership; $d_{ij}$ represents the Euclidean distance from a sample point to a cluster center, $d_{ij}=\|x_i-c_j\|$, where $x_i$ is the $i$-th sample and $c_j$ is the $j$-th cluster center; $m$ represents the weighting exponent, which is essentially a parameter characterizing the degree of fuzziness; its optimal range is [1, 2.5], and 2 is generally preferred.
Step 5: comparing the change in the data cost function with the set threshold ε; if the change does not exceed the set threshold, outputting the clustering result and the cluster centers; otherwise, increasing the iteration number by 1 and returning to step 2. The condition for comparing the change in the data cost function with the set threshold ε is:
$$|U(k+1)-U(k)|\le\varepsilon$$
where $U(k+1)$ represents the cost function after $(k+1)$ iterations, and $U(k)$ represents the cost function after $k$ iterations.
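The clustering loop can be sketched as follows. The sketch assumes each decomposition component is summarized by a small feature tuple such as (frequency, amplitude), and, for reproducibility, it seeds the cluster centers from evenly spaced samples of the sorted data instead of a random membership matrix; both choices, and the names `fcm` and `hard_labels`, are assumptions for this example.

```python
import math


def fcm(points, c=3, m=2.0, max_iter=100, eps=1e-6):
    """Fuzzy C-means on a list of equal-length numeric tuples.

    Returns (mu, centers), where mu[i][j] is the membership of sample i in cluster j.
    """
    n, dim = len(points), len(points[0])
    # Assumption: seed the centres with evenly spaced samples of the sorted data.
    ordered = sorted(points)
    centers = [list(ordered[(2 * j + 1) * n // (2 * c)]) for j in range(c)]
    mu, prev_cost = [], None
    power = 2.0 / (m - 1.0)
    for _ in range(max_iter):
        # d_ij = ||x_i - c_j||, floored to avoid division by zero.
        dist = [[max(math.dist(p, cj), 1e-12) for cj in centers] for p in points]
        # mu_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1))
        mu = [[1.0 / sum((row[j] / row[k]) ** power for k in range(c))
               for j in range(c)] for row in dist]
        # c_j = sum_i mu_ij^m x_i / sum_i mu_ij^m
        centers = []
        for j in range(c):
            w = [mu[i][j] ** m for i in range(n)]
            centers.append([sum(wi * p[d] for wi, p in zip(w, points)) / sum(w)
                            for d in range(dim)])
        # Cost U = sum_i sum_j mu_ij^m d_ij^2, evaluated at the updated centres.
        cost = sum(mu[i][j] ** m * math.dist(points[i], centers[j]) ** 2
                   for i in range(n) for j in range(c))
        if prev_cost is not None and abs(cost - prev_cost) <= eps:
            break
        prev_cost = cost
    return mu, centers


def hard_labels(mu):
    """Assign each sample to the cluster with the largest membership."""
    return [max(range(len(row)), key=row.__getitem__) for row in mu]
```

The soft memberships are what distinguish this from K-means: a component near the boundary between two classes keeps a non-trivial membership in both, and only `hard_labels` collapses it to a single class for kernel selection.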
The hybrid kernel least squares support vector machine prediction model is constructed as follows. Given a set of SDN network traffic samples
$$\{(x_i,y_i)\}_{i=1}^{m},\quad x_i\in R^{n},\ y_i\in R$$
where $x_i$ is the input vector (the initial network traffic time series), $y_i$ is its label, $m$ is the sample size, $R^{n}$ denotes the $n$-dimensional real space, and $R$ denotes the set of real numbers.
A linear regression function is constructed:
$$f(x)=w^{T}\varphi(x)+b$$
where the weight vector $w\in R^{n}$, $b$ is the bias, and $\varphi(\cdot)$ maps the input data into a high-dimensional feature space.
The regression problem of the LSSVM is converted into the following minimization problem:
$$\min_{w,b,e} J(w,e)=\frac{1}{2}w^{T}w+\frac{\gamma}{2}\sum_{i=1}^{m}e_i^{2}$$
subject to the constraints:
$$y_i=w^{T}\varphi(x_i)+b+e_i,\quad i=1,\dots,m$$
where $e_i$ is the error between the $i$-th estimated value and the true value, and $\gamma$ is the regularization coefficient.
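According to the duality principle, the optimization problem of the LSSVM is converted into the Lagrangian function:
$$L(w,b,e,\alpha)=\frac{1}{2}w^{T}w+\frac{\gamma}{2}\sum_{i=1}^{m}e_i^{2}-\sum_{i=1}^{m}\alpha_i\left(w^{T}\varphi(x_i)+b+e_i-y_i\right)$$
Taking the partial derivatives with respect to $w$, $b$, $e$ and $\alpha$ gives the optimality conditions of the Lagrangian function:
$$\frac{\partial L}{\partial w}=0\Rightarrow w=\sum_{i=1}^{m}\alpha_i\varphi(x_i),\quad \frac{\partial L}{\partial b}=0\Rightarrow \sum_{i=1}^{m}\alpha_i=0,\quad \frac{\partial L}{\partial e_i}=0\Rightarrow \alpha_i=\gamma e_i,\quad \frac{\partial L}{\partial \alpha_i}=0\Rightarrow w^{T}\varphi(x_i)+b+e_i-y_i=0$$
Writing the optimality conditions in matrix form gives:
$$\begin{bmatrix}0 & \mathbf{1}^{T}\\ \mathbf{1} & \Omega+\gamma^{-1}I\end{bmatrix}\begin{bmatrix}b\\ \alpha\end{bmatrix}=\begin{bmatrix}0\\ Y\end{bmatrix}$$
where $\mathbf{1}=[1;\dots;1]$ is the $m$-dimensional vector of ones; $I$ is the identity matrix; $\Omega_{kj}=K(x_k,x_j)$, $k,j=1,\dots,m$, is the kernel function matrix; $\gamma$ is the regularization coefficient; $\alpha=[\alpha_1;\dots;\alpha_m]$; $Y=[y_1;\dots;y_m]$.
The final LSSVM prediction function is:
$$f(x)=\sum_{i=1}^{m}\alpha_i K(x,x_i)+b$$
where $(\alpha_1,\dots,\alpha_m)$ is the weight vector, $K(x,x_i)$ is the kernel function, and $b$ is the bias.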
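Because LSSVM training reduces to one linear system, it can be sketched without any numerical library. The example below assumes the standard block system with rows [0, 1ᵀ] and [1, Ω + I/γ] acting on [b; α], and solves it by Gaussian elimination with partial pivoting; the function names `solve`, `lssvm_fit` and `lssvm_predict` are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def lssvm_fit(xs, ys, kernel, gamma):
    """Build and solve the (m + 1) x (m + 1) LSSVM system; return (b, alpha)."""
    m = len(xs)
    a = [[0.0] + [1.0] * m]  # first row: [0, 1^T]
    for k in range(m):
        row = [1.0] + [kernel(xs[k], xs[j]) for j in range(m)]
        row[k + 1] += 1.0 / gamma  # Omega + I / gamma on the diagonal
        a.append(row)
    sol = solve(a, [0.0] + list(ys))
    return sol[0], sol[1:]


def lssvm_predict(x, xs, alpha, b, kernel):
    """f(x) = sum_i alpha_i K(x, x_i) + b"""
    return sum(a_i * kernel(x, x_i) for a_i, x_i in zip(alpha, xs)) + b
```

Any kernel with the signature `kernel(x, x_i)` can be passed in, which is what allows a different kernel to be used for each class of component.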
A high-frequency time series requires a local kernel function with good local learning ability, whereas a low-frequency time series requires a global kernel function with good global learning ability. Therefore, according to the different amplitude-frequency characteristics of each traffic sequence component, a Gaussian kernel ($K_{GAU}$) LSSVM prediction model is constructed for the high-frequency low-amplitude components, a polynomial kernel ($K_{POL}$) LSSVM prediction model for the medium-frequency medium-amplitude components, and a linear kernel ($K_{LIN}$) LSSVM prediction model for the low-frequency high-amplitude components.
(1) A Gaussian kernel ($K_{GAU}$) LSSVM prediction model is constructed for the high-frequency low-amplitude components. The Gaussian kernel function is:
$$K_{GAU}(x,x_i)=\exp\left(-\frac{\|x-x_i\|^{2}}{2\sigma^{2}}\right)$$
the corresponding prediction function becomes:
$$f(x)=\sum_{i=1}^{m}\alpha_i K_{GAU}(x,x_i)+b$$
(2) A polynomial kernel ($K_{POL}$) LSSVM prediction model is constructed for the medium-frequency medium-amplitude components. The polynomial kernel function is:
$$K_{POL}(x,x_i)=\left((x\cdot x_i)+1\right)^{d},\quad d\ \text{is a natural number}$$
the corresponding prediction function becomes:
$$f(x)=\sum_{i=1}^{m}\alpha_i K_{POL}(x,x_i)+b$$
(3) A linear kernel ($K_{LIN}$) LSSVM prediction model is constructed for the low-frequency high-amplitude components. The linear kernel function is:
$$K_{LIN}(x,x_i)=x^{T}x_i$$
the corresponding prediction function becomes:
$$f(x)=\sum_{i=1}^{m}\alpha_i K_{LIN}(x,x_i)+b$$
where $K_{GAU}$ represents the Gaussian kernel, $x$ represents a test sample, $x_i$ represents the input vector (the initial network traffic time series), $\sigma$ represents the kernel parameter, $\alpha_i$ represents the $i$-th component of the weight vector, $m$ represents the sample size, $b$ represents the bias, $K_{POL}$ represents the polynomial kernel, $K_{LIN}$ represents the linear kernel, and $T$ represents the transpose.
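The three kernels and their assignment to the component classes can be written down compactly. In this sketch the Gaussian kernel is assumed to use the 2σ² normalization, and the dictionary keys and function names are hypothetical labels chosen for the example.

```python
import math


def k_gau(x, xi, sigma=1.0):
    """Gaussian kernel (local): exp(-||x - xi||^2 / (2 sigma^2)), assumed normalization."""
    sq = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-sq / (2.0 * sigma ** 2))


def k_pol(x, xi, d=2):
    """Polynomial kernel: ((x . xi) + 1)^d with d a natural number."""
    return (sum(a * b for a, b in zip(x, xi)) + 1.0) ** d


def k_lin(x, xi):
    """Linear kernel (global): x^T xi."""
    return sum(a * b for a, b in zip(x, xi))


# Kernel chosen per component class, following the assignment in the text.
KERNEL_BY_CLASS = {
    "high_freq_low_amp": k_gau,
    "mid_freq_mid_amp": k_pol,
    "low_freq_high_amp": k_lin,
}
```

The Gaussian kernel decays with distance, so it responds only to nearby training samples, whereas the linear kernel weighs all samples through their inner product; this is the local versus global distinction the text draws.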
The process for optimizing the adaptive hybrid kernel least squares support vector machine prediction model is as follows. Two parameters of the LSSVM mainly need to be optimized: the regularization coefficient $\gamma$ and the kernel function parameter $\sigma$. The regularization coefficient $\gamma$ controls the trade-off between model complexity and approximation error, and its value affects the generalization ability of the model; the kernel function parameter $\sigma$ determines the kernel width, and optimizing it mainly concerns the Gaussian kernel function. Taking the prediction accuracy of the least squares support vector machine as the objective function, the following optimization problem is established: $\min F(\gamma,\sigma)$, s.t. $\gamma\in[\gamma_{\min},\gamma_{\max}]$, $\sigma\in[\sigma_{\min},\sigma_{\max}]$.
As shown in Fig. 4, the process of optimizing the adaptive hybrid-kernel least squares support vector machine prediction model includes the following steps:
Step 1: Initialize the parameters of the artificial bee colony algorithm, including the total number of bees in the colony $N_C$, the number of leading bees $N_e$, the number of following bees $N_o$, the number of solutions $N_s$, the maximum number of iterations $M$, and the food-source parameter combination $(\gamma, \sigma)$.
Step 2: setting a fitness function f (i), wherein the expression of the function is as follows:
Figure BDA0003482282230000112
wherein y is i And
Figure BDA0003482282230000113
respectively, the actual value and the predicted value of the time sequence, and n is a training sample.
Step 3: The leading bees search for honey sources and generate new solutions. The fitness value of each new solution is calculated, and a new solution replaces the old one if its fitness value is larger. The neighborhood search formula is:
$v_{ij} = x_{ij} + r_{ij}(x_{ij} - x_{kj})$
where $x_{ij}$ is the $j$-th coordinate of the $i$-th honey source, $r_{ij} \in [-1, 1]$ is a randomly selected number, and $x_{kj}$ is the $j$-th coordinate of the $k$-th honey source.
Step 4: After the leading bees update the honey sources, the following probability is calculated from the profitability of each honey source; each following bee selects a honey source to follow according to this probability and searches its neighborhood. The following probability function is:
Figure BDA0003482282230000121
where $f(X_i)$ is the fitness value of the $i$-th honey source and $X_i$ is the $i$-th honey source.
Step 5: If the number of failed updates of a solution exceeds the maximum number of searches, the solution cannot be improved further; the following bees abandon it and become scout bees, which start searching for new honey sources. The honey-source update formula is:
$x_{ij} = x_{\min}(j) + \mathrm{rand}(0, 1)\,(x_{\max}(j) - x_{\min}(j))$
where $x_{\min}(j)$ and $x_{\max}(j)$ are the minimum and maximum values in the $j$-th dimension, respectively, and $\mathrm{rand}(0, 1)$ is a random number in the interval $(0, 1)$.
Step 6: If the maximum number of iterations is reached, training ends and the optimal parameter combination $(\gamma, \sigma)$ is output; otherwise, return to Step 3.
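Steps 1 to 6 can be sketched as a compact artificial bee colony loop. The sketch below is illustrative only and makes several assumptions that are not stated in the text: the number of leading bees and following bees both equal the number of honey sources, the following probability is taken as each source's fitness divided by the total fitness (the original formula is only available as a figure), candidates are clipped to the search box, and the fitness passed in is positive and larger for better solutions. The name `abc_optimize` and all of its parameters are hypothetical.

```python
import random


def abc_optimize(fitness, bounds, n_sources=10, limit=20, max_iter=100, seed=0):
    """Maximize a positive `fitness` over the box `bounds` with a basic bee colony."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_source():
        # Step 5 formula: x_ij = x_min(j) + rand(0, 1) * (x_max(j) - x_min(j))
        return [lo + rng.random() * (hi - lo) for lo, hi in bounds]

    sources = [random_source() for _ in range(n_sources)]
    fits = [fitness(s) for s in sources]
    trials = [0] * n_sources  # consecutive failed updates per honey source

    def try_improve(i):
        # Step 3 formula: v_ij = x_ij + r_ij * (x_ij - x_kj), with k != i
        k = rng.choice([s for s in range(n_sources) if s != i])
        j = rng.randrange(dim)
        candidate = sources[i][:]
        candidate[j] += rng.uniform(-1.0, 1.0) * (sources[i][j] - sources[k][j])
        lo, hi = bounds[j]
        candidate[j] = min(max(candidate[j], lo), hi)  # stay inside the box
        f = fitness(candidate)
        if f > fits[i]:  # greedy replacement of the old solution
            sources[i], fits[i], trials[i] = candidate, f, 0
        else:
            trials[i] += 1

    best_i = max(range(n_sources), key=fits.__getitem__)
    best, best_fit = sources[best_i][:], fits[best_i]

    for _ in range(max_iter):
        for i in range(n_sources):  # leading bees (Step 3)
            try_improve(i)
        total = sum(fits)
        for _ in range(n_sources):  # following bees (Step 4), roulette wheel
            threshold, acc, chosen = rng.random() * total, 0.0, n_sources - 1
            for i in range(n_sources):
                acc += fits[i]
                if acc >= threshold:
                    chosen = i
                    break
            try_improve(chosen)
        for i in range(n_sources):  # scout bees (Step 5)
            if trials[i] > limit:
                sources[i] = random_source()
                fits[i] = fitness(sources[i])
                trials[i] = 0
        best_i = max(range(n_sources), key=fits.__getitem__)
        if fits[best_i] > best_fit:
            best, best_fit = sources[best_i][:], fits[best_i]
    return best, best_fit
```

In the setting described above, `bounds` would be `[(gamma_min, gamma_max), (sigma_min, sigma_max)]` and `fitness` would train an LSSVM with the candidate $(\gamma, \sigma)$ and score its prediction accuracy. The best solution is tracked separately so that a scout reset cannot discard it.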
After the prediction model has been optimized, the predicted values of the wavelet components are reconstructed and integrated, and the final SDN network traffic prediction result is output. The formula for reconstructing the prediction results of all components is:
Figure BDA0003482282230000122
where $s(t)$ denotes the initial network traffic sequence, $H(2t-k)$ denotes the wavelet-decomposition low-pass filter, $t$ denotes the time index, $k$ denotes the time shift, $A_{i+1}[s(t)]$ denotes the wavelet coefficients of the high-frequency part of layer $i+1$, and $G(2t-k)$ denotes the wavelet-decomposition high-pass filter.
The predicted components are reconstructed as follows. The average (approximation) predicted component is up-sampled, that is, stretched by inserting a zero between every two samples, and then passed through the low-pass filter to obtain a large-scale, low-resolution approximation, the low-pass output. The detail signal is likewise up-sampled and passed through the high-pass filter to obtain the high-pass output. The low-pass output and the high-pass output are added to obtain the reconstructed SDN network traffic prediction value.
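The up-sample, filter, and add procedure can be illustrated with a one-level Haar example. This is a sketch under assumptions: the source does not name the wavelet, so orthonormal Haar filters are used here, which absorb the scaling factor into the filter taps, and all function names are hypothetical. In the actual pipeline the `approx` and `detail` inputs would be the predicted components rather than the decomposition of a known signal.

```python
import math

S = 1.0 / math.sqrt(2.0)
H = [S, S]    # Haar low-pass synthesis filter (assumed wavelet)
G = [S, -S]   # Haar high-pass synthesis filter (assumed wavelet)


def haar_decompose(s):
    """One-level Haar analysis of an even-length sequence."""
    half = len(s) // 2
    approx = [(s[2 * k] + s[2 * k + 1]) * S for k in range(half)]
    detail = [(s[2 * k] - s[2 * k + 1]) * S for k in range(half)]
    return approx, detail


def upsample(c):
    """Stretch a component by inserting a zero after every sample."""
    out = []
    for v in c:
        out.extend([v, 0.0])
    return out


def fir(x, taps):
    """Causal FIR filter: y[t] = sum_k taps[k] * x[t - k]."""
    return [
        sum(taps[k] * x[t - k] for k in range(len(taps)) if t - k >= 0)
        for t in range(len(x))
    ]


def reconstruct(approx, detail):
    """Add the low-pass output and the high-pass output."""
    low = fir(upsample(approx), H)
    high = fir(upsample(detail), G)
    return [lo + hi for lo, hi in zip(low, high)]


# Round trip on a toy traffic sequence (values are made up for illustration).
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, detail = haar_decompose(signal)
restored = reconstruct(approx, detail)
```

For a multi-level decomposition the same step is applied repeatedly, from the coarsest layer back to layer 0.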
The embodiments described above further explain the objects, technical solutions, and advantages of the present invention. It should be understood that they are merely exemplary embodiments and are not intended to limit the invention; any modification, equivalent substitution, or improvement made without departing from the spirit and principles of the invention shall fall within its scope of protection.

Claims (9)

1. An SDN network flow control method based on fuzzy C-means and a hybrid kernel least squares support vector machine, characterized by comprising the following steps: acquiring non-stationary SDN network traffic data, and converting the non-stationary SDN network traffic data into stationary time-series components by discrete wavelet transform; calculating the signal amplitude of the stationary time-series components, and processing the signal amplitude by fast Fourier transform to obtain a high-band amplitude signal and a low-band amplitude signal; clustering the high-band amplitude signal and the low-band amplitude signal by a fuzzy C-means algorithm to obtain a high-frequency low-amplitude component, a medium-frequency medium-amplitude component and a low-frequency high-amplitude component; predicting the high-frequency low-amplitude component, the medium-frequency medium-amplitude component and the low-frequency high-amplitude component respectively with an optimized adaptive hybrid kernel least squares support vector machine prediction model to obtain a prediction result for each component; reconstructing the prediction results of all components to obtain the prediction result of SDN network data traffic; and controlling SDN network data traffic according to the prediction result.
2. The SDN network flow control method based on fuzzy C-means and hybrid kernel least squares support vector machine of claim 1, wherein the formula for converting non-stationary SDN network traffic data into stationary time series components using discrete wavelet transform is:
$A_0[s(t)] = s(t)$
Figure FDA0003482282220000011
Figure FDA0003482282220000012
where $t$ is the time index, $s(t)$ is the initial network traffic sequence, and $i$ is the decomposition scale; $H$ and $G$ are the low-pass filter and the high-pass filter of the wavelet decomposition; $A_i$ and $D_i$ are the wavelet coefficients of the low-frequency part and the high-frequency part, respectively, obtained in the $i$-th layer decomposition of the initial time series $s(t)$.
3. The SDN network flow control method of claim 2, wherein determining the decomposition scale includes:
Step 1: setting the initial decomposition scale to $N$;
Step 2: performing stationary decomposition of the time-series signal at the initial decomposition scale;
Step 3: determining whether the $(N+1)$-th component meets the stopping criterion; if so, stopping the decomposition; otherwise, increasing the decomposition scale and continuing the decomposition; the stopping criterion is that the number of extreme points is 1 or 0.
4. The SDN network flow control method based on fuzzy C-means and hybrid kernel least squares support vector machine of claim 1, wherein the formula for processing the signal amplitude using fast Fourier transform is:
Figure FDA0003482282220000021
$k = 0, 1, 2, \ldots, N-1$
where $x(n)$ denotes the decomposed component sequence, $N$ denotes the length of the sequence, $n$ denotes the sample index, $x_{even}(n)$ denotes the even-indexed subsequence of the component, and $x_{odd}(n)$ denotes the odd-indexed subsequence of the component.
5. The SDN network flow control method based on fuzzy C-means and hybrid kernel least squares support vector machine of claim 1, wherein clustering the high-band amplitude signal and the low-band amplitude signal using a fuzzy C-means algorithm comprises:
Step 1: initializing the decomposition component information and the membership matrix; setting a maximum number of iterations and a threshold $\varepsilon$;
Step 2: judging whether the current iteration count is smaller than the maximum number of iterations; if so, calculating the membership matrices $\mu_{ij}$ of the high-band amplitude signal and the low-band amplitude signal respectively; otherwise, outputting the clustering result and the cluster centers;
Step 3: updating the cluster center $c_j$ of each class of data according to the membership matrix;
Step 4: calculating the change in the data cost function according to the membership matrix and the updated cluster centers;
Step 5: comparing the change in the data cost function with the set threshold $\varepsilon$; if the change is smaller than the set threshold, outputting the clustering result and the cluster centers; otherwise, incrementing the iteration count by 1 and returning to Step 2.
6. The SDN network flow control method based on fuzzy C-means and hybrid kernel least squares support vector machine of claim 1, wherein predicting the high-frequency low-amplitude component, the medium-frequency medium-amplitude component and the low-frequency high-amplitude component respectively with the optimized adaptive hybrid kernel least squares support vector machine prediction model comprises: according to the different amplitude-frequency characteristics of each traffic sequence component, constructing a Gaussian-kernel ($K_{GAU}$) LSSVM prediction model for the high-frequency low-amplitude component, a polynomial-kernel ($K_{POL}$) LSSVM prediction model for the medium-frequency medium-amplitude component, and a linear-kernel ($K_{LIN}$) LSSVM prediction model for the low-frequency high-amplitude component, and performing prediction analysis with each model respectively.
7. The SDN network flow control method based on fuzzy C-means and hybrid kernel least squares support vector machine of claim 6, wherein the kernel function of the Gaussian-kernel ($K_{GAU}$) LSSVM prediction model is:
Figure FDA0003482282220000031
the prediction function is:
Figure FDA0003482282220000032
the kernel function of the polynomial-kernel ($K_{POL}$) LSSVM prediction model is:
$K_{POL}(x, x_i) = (\langle x, x_i \rangle + 1)^d$, where $d$ is a natural number;
the prediction function is:
Figure FDA0003482282220000033
the kernel function of the linear-kernel ($K_{LIN}$) LSSVM prediction model is:
$K_{LIN}(x, x_i) = x^{T} x_i$
the prediction function is:
Figure FDA0003482282220000034
where $K_{GAU}$ denotes the Gaussian kernel, $x$ denotes a test sample, $x_i$ denotes an input vector, $\sigma$ denotes the kernel parameter, $\alpha_i$ denotes the weight coefficients, $m$ denotes the sample size, $b$ denotes the bias, $K_{POL}$ denotes the polynomial kernel, $K_{LIN}$ denotes the linear kernel, and $T$ denotes the transpose.
8. The SDN network flow control method based on fuzzy C-means and hybrid kernel least squares support vector machine of claim 6, wherein optimizing the parameters of the adaptive hybrid kernel least squares support vector machine prediction model using an artificial bee colony algorithm comprises:
Step 1: initializing the parameters of the artificial bee colony algorithm, including the total number of bees in the colony $N_C$, the number of leading bees $N_e$, the number of following bees $N_o$, the number of solutions $N_s$, the maximum number of iterations $M$, and the food-source parameter combination $(\gamma, \sigma)$;
Step 2: setting a fitness function $f(i)$;
Step 3: the leading bees searching for honey sources and generating new solutions, calculating the fitness value of each solution, and replacing the old solution if the new fitness value is larger;
Step 4: after the leading bees update the honey sources, calculating the following probability from the profitability of each honey source, each following bee selecting a honey source to follow according to the following probability and searching its neighborhood;
Step 5: if the number of failed updates of a solution exceeds the maximum number of searches, the solution cannot be improved further, and the following bees abandon the solution and become scout bees that start searching for new honey sources;
Step 6: if the maximum number of iterations is reached, ending the training and outputting the optimal parameter combination $(\gamma, \sigma)$; otherwise, returning to Step 3.
9. The SDN network flow control method based on fuzzy C-means and hybrid kernel least squares support vector machine of claim 1, wherein the formula for reconstructing the prediction results of all components is:
Figure FDA0003482282220000041
where $s(t)$ denotes the initial network traffic sequence, $H(2t-k)$ denotes the wavelet-decomposition low-pass filter, $t$ denotes the time index, $k$ denotes the time shift, $A_{i+1}[s(t)]$ denotes the wavelet coefficients of the high-frequency part of layer $i+1$, and $G(2t-k)$ denotes the wavelet-decomposition high-pass filter.
CN202210071611.XA 2022-01-21 2022-01-21 SDN network flow control method based on fuzzy C-means and mixed kernel least square support vector machine Active CN114500335B (en)

Publications (2)

Publication Number Publication Date
CN114500335A CN114500335A (en) 2022-05-13
CN114500335B true CN114500335B (en) 2023-06-16



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant