CN114500335A - SDN network flow control method based on fuzzy C-means and mixed kernel least square support vector machine - Google Patents


Info

Publication number
CN114500335A
Authority
CN
China
Prior art keywords
frequency
amplitude
kernel
low
decomposition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210071611.XA
Other languages
Chinese (zh)
Other versions
CN114500335B (en)
Inventor
李帅永
张旭云涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202210071611.XA priority Critical patent/CN114500335B/en
Publication of CN114500335A publication Critical patent/CN114500335A/en
Application granted granted Critical
Publication of CN114500335B publication Critical patent/CN114500335B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0876 Network utilisation, e.g. volume of load or congestion level
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/147 Network analysis or design for predicting network behaviour
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/50 Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Environmental & Geological Engineering (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the field of network flow prediction, and particularly relates to an SDN network flow control method based on fuzzy C-means and a mixed kernel least square support vector machine, which comprises the following steps: converting non-stationary SDN network flow data into stationary time sequence components by discrete wavelet transform; processing the stationary time sequence components to obtain amplitude signals of a high frequency band and a low frequency band; clustering the amplitude signals of the high and low frequency bands with a fuzzy C-means algorithm; predicting the clustered components with an optimized adaptive mixed kernel least square support vector machine prediction model; and reconstructing the prediction results of all the components to obtain the prediction result of the SDN network data flow. The invention introduces a membership mechanism through the fuzzy C-means algorithm and divides the time sequence components, according to their amplitude-frequency characteristics, into three classes: high-frequency low-amplitude components, medium-frequency medium-amplitude components and low-frequency high-amplitude components, which provides a sound basis for the subsequent class-wise prediction.

Description

SDN network flow control method based on fuzzy C-means and mixed kernel least square support vector machine
Technical Field
The invention belongs to the field of network flow prediction, and particularly relates to an SDN network flow control method based on a fuzzy C-means and a mixed kernel least square support vector machine.
Background
Software Defined Networking (SDN) has gradually become an emerging technology in the networking field. Its main idea is to separate the control plane from the data plane, which traditionally reside together in network switches and routers, so that control and forwarding are truly decoupled. Compared with the complexity of a traditional SNMP-based distributed network measurement system, an SDN network can monitor network flow data centrally, and predicting network flow is one of the important ways to improve the quality of service and guarantee the service security of an SDN network. The traditional approach to flow prediction is to organize flow data into a flow time series, that is, to cast the flow prediction problem as a time series prediction problem. Time series prediction uses historical time series data to predict the state and development trend of the series over a future period, so that work can be deployed or plans made in advance to deal with abnormal situations that may appear in the predicted data. Generally, time series prediction is more effective for near-term and short-term horizons than for long-term ones: the further the prediction point extends into the future, the larger the deviation between the predicted and actual values tends to be, which can lead to wrong decisions.
Most network flow time series are non-stationary. Because time series prediction depends on the stationarity of the series, an appropriate method is needed to decompose the non-stationary flow data into stationary sequences before further analysis. Tan et al. introduced a multi-scale wavelet transform to convert non-stationary time series signals into multi-layer, relatively stationary decomposition sequences, and then combined an ARMA model and an ARFIMA model to predict the approximation-layer and detail-layer data respectively; this method achieves higher network flow prediction accuracy, but it does not effectively analyze the amplitude-frequency characteristics of the decomposition components. The Least Squares Support Vector Machine (LSSVM) is a special form of the SVM under a quadratic loss function; it only requires solving a set of linear equations and is therefore very fast to solve. Even when the sample data are complex and of differing dimensions, the LSSVM maps the data into a high-dimensional space through a kernel function, where they become easier to separate, so the LSSVM is widely applied to time series prediction. One existing approach proposes a flow prediction model that combines Empirical Mode Decomposition (EMD) with a least squares support vector machine optimized by Particle Swarm Optimization (PSO): the flow sequence is decomposed and stabilized by EMD, and the parameters of the LSSVM prediction model are then optimized by PSO, which effectively improves the prediction accuracy of the model; however, only one type of kernel function is used for prediction, and the suitability of different kernel functions for each decomposition component is not considered.
In view of the foregoing, there is an urgent need for an SDN network flow control method that can not only effectively analyze the amplitude-frequency characteristics of decomposed components, but also perform analysis and prediction by using multiple classes of kernel functions.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides an SDN network flow control method based on fuzzy C-means and a mixed kernel least square support vector machine, which comprises the following steps: acquiring non-stationary SDN network flow data, and converting the non-stationary SDN network flow data into stationary time sequence components by discrete wavelet transform; calculating the signal amplitude of the stationary time sequence components, and processing the signal amplitude by fast Fourier transform to obtain a high-frequency-band amplitude signal and a low-frequency-band amplitude signal; clustering the high-frequency-band amplitude signal and the low-frequency-band amplitude signal with a fuzzy C-means algorithm to obtain high-frequency low-amplitude components, medium-frequency medium-amplitude components and low-frequency high-amplitude components; predicting the high-frequency low-amplitude components, medium-frequency medium-amplitude components and low-frequency high-amplitude components respectively with an optimized adaptive mixed kernel least square support vector machine prediction model to obtain the prediction results of all the components; reconstructing the prediction results of all the components to obtain the prediction result of the SDN network data flow; and controlling the SDN network data flow according to the prediction result.
Preferably, the formula for converting non-stationary SDN network traffic data into stationary time sequence components by using discrete wavelet transform is as follows:
\[ A_0[s(t)] = s(t) \]
\[ A_i[s(t)] = \sum_k H(2t-k)\, A_{i-1}[s(t)] \]
\[ D_i[s(t)] = \sum_k G(2t-k)\, A_{i-1}[s(t)] \]
wherein t is the time sequence number, s(t) is the initial network flow sequence, and i is the decomposition scale; H and G are the low-pass filter and the high-pass filter of the wavelet decomposition; $A_i$ and $D_i$ are respectively the wavelet coefficients of the low-frequency part and the high-frequency part obtained in the i-th layer decomposition of the initial time series s(t).
Further, the process of determining the decomposition scale includes:
step 1: setting the initial decomposition scale to N;
step 2: performing stationary decomposition on the time sequence signal according to the initial decomposition scale;
step 3: determining whether the (N+1)-th component meets the decomposition stopping criterion; if so, stopping the decomposition; otherwise, increasing the decomposition scale and continuing the decomposition; the stopping criterion is that the number of extreme points of the component is 1 or 0.
Preferably, the formula for processing the signal amplitude by using the fast fourier transform is as follows:
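\[ X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi nk/N} = \sum_{n=0}^{N/2-1} x_{even}(n)\, e^{-j 2\pi nk/(N/2)} + e^{-j 2\pi k/N} \sum_{n=0}^{N/2-1} x_{odd}(n)\, e^{-j 2\pi nk/(N/2)} \]
where x(n) denotes a decomposition component sequence, N denotes the sequence length, n denotes the sample index, k = 0, 1, 2, ..., N-1, $x_{even}(n)$ denotes the even-indexed subsequence of the component, and $x_{odd}(n)$ denotes the odd-indexed subsequence of the component.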
Preferably, the process of clustering the high-frequency-band amplitude signal and the low-frequency-band amplitude signal by using the fuzzy C-means algorithm includes:
step 1: initializing the decomposition component information and the membership matrix; setting the maximum number of iterations and a threshold ε;
step 2: judging whether the current iteration count is less than the maximum number of iterations; if so, calculating the membership matrices $\mu_{ij}$ of the high-frequency-band amplitude signal and the low-frequency-band amplitude signal respectively; otherwise, outputting the clustering result and the cluster centers;
step 3: updating the cluster center $c_j$ of each class of data according to the membership matrix;
step 4: calculating the change in the cost function according to the membership matrix and the updated cluster centers;
step 5: comparing the change in the cost function with the set threshold ε; if the change is less than or equal to the set threshold, outputting the clustering result and the cluster centers; otherwise, adding 1 to the iteration count and returning to step 2.
Preferably, the process of predicting the high-frequency low-amplitude component, the medium-frequency medium-amplitude component and the low-frequency high-amplitude component with the optimized adaptive mixed kernel least squares support vector machine prediction model comprises the following steps: according to the different amplitude-frequency characteristics of the flow sequence components, a Gaussian kernel ($K_{GAU}$) LSSVM prediction model is constructed for the high-frequency low-amplitude component, a polynomial kernel ($K_{POL}$) LSSVM prediction model is constructed for the medium-frequency medium-amplitude component, and a linear kernel ($K_{LIN}$) LSSVM prediction model is constructed for the low-frequency high-amplitude component, and prediction analysis is carried out with each LSSVM prediction model respectively.
Further, the kernel function of the Gaussian kernel ($K_{GAU}$) LSSVM prediction model is:
\[ K_{GAU}(x, x_i) = \exp\left(-\frac{\|x - x_i\|^2}{2\sigma^2}\right) \]
the prediction function is:
\[ f(x) = \sum_{i=1}^{m} \alpha_i \exp\left(-\frac{\|x - x_i\|^2}{2\sigma^2}\right) + b \]
the kernel function of the polynomial kernel ($K_{POL}$) LSSVM prediction model is:
\[ K_{POL}(x, x_i) = ((x \cdot x_i) + 1)^d, \quad d \text{ is a natural number} \]
the prediction function is:
\[ f(x) = \sum_{i=1}^{m} \alpha_i ((x \cdot x_i) + 1)^d + b \]
the kernel function of the linear kernel ($K_{LIN}$) LSSVM prediction model is:
\[ K_{LIN}(x, x_i) = x^T x_i \]
the prediction function is:
\[ f(x) = \sum_{i=1}^{m} \alpha_i\, x^T x_i + b \]
wherein $K_{GAU}$ denotes the Gaussian kernel, x denotes the test sample, $x_i$ denotes the input vector, σ denotes the kernel function parameter, $\alpha_i$ denotes the weight vector, m denotes the sample size, b denotes the offset, $K_{POL}$ denotes the polynomial kernel, $K_{LIN}$ denotes the linear kernel, and T denotes the transpose.
Further, the process of optimizing the prediction model of the adaptive hybrid kernel least squares support vector machine includes:
step 1: initializing the parameters of the artificial bee colony algorithm, wherein the initialized parameters comprise: the total number of bees in the colony $N_C$, the number of leading bees $N_e$, the number of following bees $N_o$, the number of solutions $N_s$, the maximum number of iterations M, and the food source parameter combination (γ, σ);
step 2: setting a fitness function f(i);
step 3: the leading bees search the honey sources for new solutions and calculate the fitness value of each solution; if the new fitness value is larger, the old solution is replaced;
step 4: after the leading bees have updated the honey sources, the following probability is calculated according to the profitability of each honey source, and the following bees select a honey source to follow according to the following probability and carry out a neighborhood search;
step 5: if the number of failed updates of a solution exceeds the maximum number of searches, the solution cannot be optimized further; the follower bee gives up the solution, is converted into a scout bee, and starts to search for a new honey source;
step 6: if the maximum number of iterations is reached, finishing the training and outputting the optimal parameter combination (γ, σ); otherwise, returning to step 3.
Preferably, the formula for reconstructing the prediction results of all components is as follows:
\[ A_i[s(t)] = 2\left\{ \sum_k H(2t-k)\, A_{i+1}[s(t)] + \sum_k G(2t-k)\, D_{i+1}[s(t)] \right\} \]
wherein s(t) represents the initial network flow sequence, H(2t-k) represents the low-pass filter of the wavelet decomposition, G(2t-k) represents the high-pass filter of the wavelet decomposition, t represents the time sequence number, k represents the time shift, and $A_{i+1}[s(t)]$ and $D_{i+1}[s(t)]$ represent the wavelet coefficients of the low-frequency part and the high-frequency part of the (i+1)-th layer respectively.
According to the invention, a membership mechanism is introduced by utilizing a fuzzy C-means algorithm, and the time sequence component is divided into three types of high-frequency low-amplitude component, medium-frequency medium-amplitude component and low-frequency high-amplitude component according to the amplitude-frequency characteristic of the time sequence component, so that a powerful basis is provided for subsequent classification prediction. Because the high-frequency time sequence needs a local kernel function with good local learning capability, and the low-frequency time sequence needs a global kernel function with good global learning capability, the invention constructs a hybrid kernel least square support vector machine prediction model according to the algorithm principles and the application fields of different kernel functions, and adopts LSSVM models with different kernels to predict and reconstruct the classification result, thereby effectively improving the prediction precision of network flow.
Drawings
FIG. 1 is an overall flow chart in an embodiment of the present invention;
FIG. 2 is a DWT decomposition scale determination in an embodiment of the present invention;
FIG. 3 is a FCM algorithm flow in an embodiment of the present invention;
FIG. 4 is a flow chart of an ABC optimization prediction model in an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
An SDN network flow control method based on fuzzy C-means and a mixed kernel least squares support vector machine, as shown in fig. 1, includes: acquiring non-stationary SDN network flow data, and converting the non-stationary SDN network flow data into stationary time sequence components by discrete wavelet transform; calculating the signal amplitude of the stationary time sequence components, and processing the signal amplitude by fast Fourier transform to obtain a high-frequency-band amplitude signal and a low-frequency-band amplitude signal; clustering the high-frequency-band amplitude signal and the low-frequency-band amplitude signal with a fuzzy C-means algorithm to obtain high-frequency low-amplitude components, medium-frequency medium-amplitude components and low-frequency high-amplitude components; predicting the high-frequency low-amplitude components, medium-frequency medium-amplitude components and low-frequency high-amplitude components respectively with an optimized adaptive mixed kernel least square support vector machine prediction model to obtain the prediction results of all the components; reconstructing the prediction results of all the components to obtain the prediction result of the SDN network data flow; and controlling the SDN network data flow according to the prediction result.
The initial flow sequence is decomposed into low frequency components (reflecting trend features) and high frequency components (reflecting local detail features) using Discrete Wavelet Transform (DWT). The original input time sequence is decomposed and reconstructed by wavelet filter H, G and h, g mainly in combination with the Mallat fast algorithm, and the calculation process is as follows:
\[ A_0[s(t)] = s(t) \]
\[ A_i[s(t)] = \sum_k H(2t-k)\, A_{i-1}[s(t)] \]
\[ D_i[s(t)] = \sum_k G(2t-k)\, A_{i-1}[s(t)] \]
wherein t is the time sequence number, s(t) is the initial network flow sequence, and i is the decomposition scale; H and G are the low-pass filter and the high-pass filter of the wavelet decomposition; $A_i$ and $D_i$ are respectively the wavelet coefficients of the low-frequency part and the high-frequency part obtained in the i-th layer decomposition of the initial time series s(t).
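As an illustration of the filter-and-downsample recursion described above, the following is a minimal sketch, not the patented implementation. It assumes the Haar wavelet (the source does not name the wavelet), pads odd-length input by repeating the last sample, and uses the hypothetical names `dwt_step` and `dwt`.

```python
import math

# Haar analysis filters (an assumption; the source does not specify the wavelet).
H = [1 / math.sqrt(2), 1 / math.sqrt(2)]    # low-pass filter
G = [1 / math.sqrt(2), -1 / math.sqrt(2)]   # high-pass filter


def dwt_step(a_prev):
    """One Mallat step: filter A_{i-1} with H and G, keeping every second output."""
    if len(a_prev) % 2:
        a_prev = a_prev + [a_prev[-1]]      # pad odd-length input
    approx, detail = [], []
    for t in range(0, len(a_prev), 2):
        pair = a_prev[t:t + 2]
        approx.append(sum(h * v for h, v in zip(H, pair)))   # A_i
        detail.append(sum(g * v for g, v in zip(G, pair)))   # D_i
    return approx, detail


def dwt(s, levels):
    """Return (A_levels, [D_1, ..., D_levels]), starting from A_0 = s."""
    a, details = list(s), []
    for _ in range(levels):
        a, d = dwt_step(a)
        details.append(d)
    return a, details
```

For example, `dwt([4.0, 6.0, 10.0, 12.0], 2)` yields one approximation coefficient and two detail layers; the Haar filters preserve the sum of squares of the input across the coefficients.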
As shown in fig. 2, the process of determining the decomposition scale includes:
step 1: setting the initial decomposition scale to N;
step 2: performing stationary decomposition on the time sequence signal according to the initial decomposition scale;
step 3: determining whether the (N+1)-th component meets the decomposition stopping criterion; if so, stopping the decomposition; otherwise, increasing the decomposition scale and continuing the decomposition; the stopping criterion is that the number of extreme points of the component is 1 or 0.
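The scale-selection loop can be sketched as follows. This is only an illustration under stated assumptions: the (N+1)-th component is taken to be the residual approximation after N decomposition steps, and `haar_approx` (pairwise averaging) is a hypothetical stand-in for one stationary-decomposition step. The names `count_extrema`, `choose_scale` and `max_scale` are likewise illustrative.

```python
def count_extrema(x):
    """Count strict interior local maxima and minima of a sequence."""
    count = 0
    for i in range(1, len(x) - 1):
        # A sign change between consecutive differences marks an extremum.
        if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0:
            count += 1
    return count


def haar_approx(x):
    """Hypothetical stand-in for one decomposition step: pairwise averages."""
    if len(x) % 2:
        x = x + [x[-1]]
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]


def choose_scale(s, initial_scale=1, max_scale=20):
    """Increase the scale until the residual component has at most one extremum."""
    residual = list(s)
    for _ in range(initial_scale):
        residual = haar_approx(residual)
    scale = initial_scale
    while count_extrema(residual) > 1 and scale < max_scale:
        residual = haar_approx(residual)
        scale += 1
    return scale, residual
```

The `max_scale` cap is a safety guard added for the sketch; the source only specifies the extremum-count criterion.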
Calculating the signal amplitude of the stationary time series component comprises: the signal amplitudes of the decomposed components are solved using a sum-of-squares function. The sum of squares function represents the signal amplitude intensity of the decomposed components, which is calculated by the formula:
Figure BDA0003482282230000071
where x (t) represents a time series signal.
A Fast Fourier Transform (FFT) is used to separate the high frequency band from the low frequency band. The FFT has a low computational cost and runs quickly, so it is used to derive the frequency characteristics of the wavelet decomposition components. The bandwidth of the DWT high-frequency components is analyzed with the FFT to obtain their high frequency band and low frequency band.
The formula for processing the signal amplitude by fast Fourier transform is as follows:
\[ X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi nk/N} = \sum_{n=0}^{N/2-1} x_{even}(n)\, e^{-j 2\pi nk/(N/2)} + e^{-j 2\pi k/N} \sum_{n=0}^{N/2-1} x_{odd}(n)\, e^{-j 2\pi nk/(N/2)} \]
where x(n) denotes a decomposition component sequence, N denotes the sequence length, n denotes the sample index, k = 0, 1, 2, ..., N-1, $x_{even}(n)$ denotes the even-indexed subsequence of the component, and $x_{odd}(n)$ denotes the odd-indexed subsequence of the component. The whole transformation requires $2N\log_2 N$ multiplication operations.
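The even/odd split described above is the basis of the radix-2 decimation-in-time FFT. A minimal recursive sketch follows, assuming the sequence length is a power of two; `fft`, `amplitude_spectrum` and `dominant_bin` are illustrative names, and using the dominant bin as a frequency feature of a component is an assumption rather than something the source states.

```python
import cmath


def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])     # transform of the even-indexed subsequence
    odd = fft(x[1::2])      # transform of the odd-indexed subsequence
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def amplitude_spectrum(x):
    """Magnitude of each frequency bin, normalized by the sequence length."""
    return [abs(c) / len(x) for c in fft(x)]


def dominant_bin(x):
    """Index of the strongest non-DC bin up to Nyquist; requires len(x) >= 2."""
    amp = amplitude_spectrum(x)
    return max(range(1, len(x) // 2 + 1), key=lambda k: amp[k])
```

A component whose dominant bin is high relative to N/2 would fall in the high frequency band, and one with a low dominant bin in the low frequency band; where to place the boundary is left open here.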
And classifying the flow sequence components by using a fuzzy C-means algorithm to prepare for establishing a corresponding kernel prediction model according to different class components in the next step. Since the low frequency components with high amplitude represent the variation tendency of these components and the high frequency components with low amplitude represent the details including noise, these components should be classified into at least 3 classes including a high frequency low amplitude component, a middle frequency middle amplitude component and a low frequency high amplitude component.
As shown in fig. 3, the process of clustering the high-band amplitude signal and the low-band amplitude signal by using the fuzzy C-means algorithm includes:
step 1: initializing the decomposition component information and the membership matrix; setting the maximum number of iterations and a threshold ε;
step 2: judging whether the current iteration count is less than the maximum number of iterations; if so, calculating the membership matrices $\mu_{ij}$ of the high-frequency-band amplitude signal and the low-frequency-band amplitude signal respectively; otherwise, outputting the clustering result and the cluster centers;
the membership matrix $\mu_{ij}$ is calculated as:
\[ \mu_{ij} = \frac{1}{\sum_{k=1}^{c} \left( d_{ij} / d_{ik} \right)^{2/(m-1)}} \]
wherein $d_{ij}$ represents the Euclidean distance between the i-th time point and the j-th cluster center, $d_{ik}$ represents the Euclidean distance between the i-th time point and the k-th cluster center, m represents the fuzzy factor, and c represents the number of cluster centers.
step 3: updating the cluster center $c_j$ of each class of data according to the membership matrix;
the cluster center is calculated as:
\[ c_j = \frac{\sum_{i=1}^{n} \mu_{ij}^{m}\, x_i}{\sum_{i=1}^{n} \mu_{ij}^{m}} \]
where n denotes the number of samples and $x_i$ denotes the i-th sample.
step 4: calculating the change in the cost function according to the membership matrix and the updated cluster centers;
the objective function (cost function) is:
\[ U = \sum_{i=1}^{n} \sum_{j=1}^{c} \mu_{ij}^{m}\, d_{ij}^{2} \]
wherein c represents the number of clusters; $\mu_{ij}$ represents the degree of membership; $d_{ij}$ represents the Euclidean distance from a sample to a cluster center, given by $d_{ij} = \|x_i - c_j\|$, in which $x_i$ is the i-th sample and $c_j$ is the j-th cluster center; m represents a weighting exponent, which is essentially a parameter describing the degree of fuzziness, with an optimal range of [1, 2.5]; in general, m = 2 is preferred.
step 5: comparing the change in the cost function with the set threshold ε; if the change is less than or equal to the set threshold, outputting the clustering result and the cluster centers; otherwise, adding 1 to the iteration count and returning to step 2. The condition for comparing the change in the cost function with the set threshold ε is:
\[ |U(k+1) - U(k)| \le \varepsilon \]
wherein U(k+1) represents the cost function after (k+1) iterations, and U(k) represents the cost function after k iterations.
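The five steps above can be sketched in plain Python as follows. This is a hedged illustration, not the patented code: it uses the standard fuzzy C-means update rules, assumes each decomposition component is described by a small feature vector (for example a frequency feature and an amplitude feature), initializes the memberships randomly with a fixed seed, and updates the centers before the memberships in each pass. The function name `fcm` and its parameters are hypothetical.

```python
import math
import random


def fcm(points, c=3, m=2.0, eps=1e-6, max_iter=100, seed=0):
    """Fuzzy C-means on a list of feature vectors; returns (centers, u, labels)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Step 1: random membership matrix whose rows sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    prev_cost, centers = None, []
    for _ in range(max_iter):
        # Step 3: centers as membership-weighted means of the samples.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append([sum(w[i] * points[i][k] for i in range(n)) / total
                            for k in range(dim)])
        # Euclidean distances d_ij, floored to avoid division by zero.
        d = [[max(math.dist(p, cj), 1e-12) for cj in centers] for p in points]
        # Step 2: membership update.
        u = [[1.0 / sum((d[i][j] / d[i][k]) ** (2.0 / (m - 1.0)) for k in range(c))
              for j in range(c)] for i in range(n)]
        # Steps 4 and 5: cost function and stopping test on its change.
        cost = sum(u[i][j] ** m * d[i][j] ** 2 for i in range(n) for j in range(c))
        if prev_cost is not None and abs(prev_cost - cost) <= eps:
            break
        prev_cost = cost
    labels = [max(range(c), key=lambda j: u[i][j]) for i in range(n)]
    return centers, u, labels
```

Each row of the returned membership matrix sums to 1; assigning a component to the cluster with the largest membership gives a hard label that can then select the kernel used for prediction.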
The hybrid kernel least squares support vector machine prediction model is constructed as follows. Given a set of SDN network flow samples
\[ \{(x_i, y_i)\}_{i=1}^{m}, \quad x_i \in R^n,\ y_i \in R \]
wherein $x_i$ is the input vector (initial network traffic time series), $y_i$ is the corresponding output value, m is the sample capacity, $R^n$ represents the n-dimensional real number set and R represents the real number set.
Constructing a linear regression function:
\[ f(x) = w^T \varphi(x) + b \]
wherein the weight vector $w \in R^n$, b is the offset, and $\varphi(\cdot)$ maps the input data into a high-dimensional feature space.
The regression problem of the LSSVM is converted into solving the following minimization problem:
\[ \min_{w,b,e} J(w, e) = \frac{1}{2} w^T w + \frac{\gamma}{2} \sum_{i=1}^{m} e_i^2 \]
subject to the constraints:
\[ y_i = w^T \varphi(x_i) + b + e_i, \quad i = 1, \dots, m \]
wherein $e_i$ is the error between the i-th estimate and the true value, and γ is a regularization coefficient.
According to the duality principle, the optimization problem of the LSSVM is converted into the Lagrangian:
\[ L(w, b, e, \alpha) = J(w, e) - \sum_{i=1}^{m} \alpha_i \left[ w^T \varphi(x_i) + b + e_i - y_i \right] \]
Taking the partial derivatives with respect to w, b, e and α gives the optimality conditions of the Lagrangian:
\[ \frac{\partial L}{\partial w} = 0 \Rightarrow w = \sum_{i=1}^{m} \alpha_i \varphi(x_i), \quad \frac{\partial L}{\partial b} = 0 \Rightarrow \sum_{i=1}^{m} \alpha_i = 0, \quad \frac{\partial L}{\partial e_i} = 0 \Rightarrow \alpha_i = \gamma e_i, \quad \frac{\partial L}{\partial \alpha_i} = 0 \Rightarrow w^T \varphi(x_i) + b + e_i - y_i = 0 \]
Writing the optimality conditions in matrix form gives:
\[ \begin{bmatrix} 0 & \mathbf{1}^T \\ \mathbf{1} & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ Y \end{bmatrix} \]
wherein $\mathbf{1} = [1; \dots; 1]$ is the m-dimensional all-ones vector and I is the m×m identity matrix; $\Omega_{kj} = K(x_k, x_j)$, k, j = 1, ..., m, is the kernel function matrix; γ is the regularization coefficient; $\alpha = [\alpha_1; \dots; \alpha_m]$; $Y = [y_1; \dots; y_m]$.
The prediction function of the LSSVM is finally obtained as:
\[ f(x) = \sum_{i=1}^{m} \alpha_i K(x, x_i) + b \]
wherein $(\alpha_1, \dots, \alpha_m)$ is the weight vector, $K(x, x_i)$ is the kernel function and b is the offset.
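Training an LSSVM therefore reduces to solving one linear system in (b, α). The sketch below illustrates this in plain Python under stated assumptions: it uses the standard LSSVM block system with a dense Gaussian-elimination solver, and the names `solve`, `lssvm_fit`, `lssvm_predict` and `gaussian_kernel` (with the 2σ² normalization) are illustrative choices, not taken from the source.

```python
import math


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel with an assumed 2*sigma^2 normalization."""
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / (2.0 * sigma ** 2))


def solve(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def lssvm_fit(xs, ys, gamma, kernel):
    """Build and solve [[0, 1^T], [1, Omega + I/gamma]] [b; alpha] = [0; Y]."""
    m = len(xs)
    a = [[0.0] + [1.0] * m]
    for k in range(m):
        row = [1.0] + [kernel(xs[k], xs[j]) for j in range(m)]
        row[k + 1] += 1.0 / gamma       # regularization on the diagonal
        a.append(row)
    sol = solve(a, [0.0] + list(ys))
    return sol[0], sol[1:]              # offset b, weights alpha


def lssvm_predict(x, xs, alpha, b, kernel):
    """Evaluate f(x) = sum_i alpha_i K(x, x_i) + b."""
    return sum(a_i * kernel(x, x_i) for a_i, x_i in zip(alpha, xs)) + b
```

A typical call would be `b, alpha = lssvm_fit(xs, ys, gamma, lambda p, q: gaussian_kernel(p, q, sigma))`, followed by `lssvm_predict` on new inputs. The first row of the system enforces that the weights sum to zero.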
A high-frequency time series requires a local kernel function with good local learning capability, while a low-frequency time series requires a global kernel function with good global learning capability. Therefore, according to the different amplitude-frequency characteristics of the flow sequence components, a Gaussian kernel ($K_{GAU}$) LSSVM prediction model is constructed for the high-frequency low-amplitude component, a polynomial kernel ($K_{POL}$) LSSVM prediction model is constructed for the medium-frequency medium-amplitude component, and a linear kernel ($K_{LIN}$) LSSVM prediction model is constructed for the low-frequency high-amplitude component.
(1) Constructing a Gaussian kernel ($K_{GAU}$) LSSVM prediction model for the high-frequency low-amplitude component. The Gaussian kernel function is:
\[ K_{GAU}(x, x_i) = \exp\left(-\frac{\|x - x_i\|^2}{2\sigma^2}\right) \]
the corresponding prediction function becomes:
\[ f(x) = \sum_{i=1}^{m} \alpha_i \exp\left(-\frac{\|x - x_i\|^2}{2\sigma^2}\right) + b \]
(2) Constructing a polynomial kernel ($K_{POL}$) LSSVM prediction model for the medium-frequency medium-amplitude component. The polynomial kernel function is:
\[ K_{POL}(x, x_i) = ((x \cdot x_i) + 1)^d, \quad d \text{ is a natural number} \]
the corresponding prediction function becomes:
\[ f(x) = \sum_{i=1}^{m} \alpha_i ((x \cdot x_i) + 1)^d + b \]
(3) Constructing a linear kernel ($K_{LIN}$) LSSVM prediction model for the low-frequency high-amplitude component. The linear kernel function is:
\[ K_{LIN}(x, x_i) = x^T x_i \]
the corresponding prediction function becomes:
\[ f(x) = \sum_{i=1}^{m} \alpha_i\, x^T x_i + b \]
wherein $K_{GAU}$ denotes the Gaussian kernel, x denotes the test sample, $x_i$ denotes the input vector (initial network traffic time series), σ denotes the kernel function parameter, $\alpha_i$ denotes the weight vector, m denotes the sample size, b denotes the offset, $K_{POL}$ denotes the polynomial kernel, $K_{LIN}$ denotes the linear kernel, and T denotes the transpose.
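The per-class kernel choice can be expressed as a simple lookup. The sketch below is illustrative only: the class names in `KERNEL_BY_CLASS`, the default values of `sigma` and `d`, and the 2σ² normalization of the Gaussian kernel are assumptions, and the weights and offset passed to `predict` are presumed to come from an already trained LSSVM.

```python
import math


def k_gau(x, xi, sigma=1.0):
    """Gaussian (local) kernel; the 2*sigma^2 normalization is an assumption."""
    sq = sum((p - q) ** 2 for p, q in zip(x, xi))
    return math.exp(-sq / (2.0 * sigma ** 2))


def k_pol(x, xi, d=2):
    """Polynomial kernel ((x . xi) + 1)^d with natural-number degree d."""
    return (sum(p * q for p, q in zip(x, xi)) + 1) ** d


def k_lin(x, xi):
    """Linear (global) kernel x^T xi."""
    return sum(p * q for p, q in zip(x, xi))


# Hypothetical class names for the three fuzzy C-means clusters.
KERNEL_BY_CLASS = {
    "high_freq_low_amp": k_gau,
    "mid_freq_mid_amp": k_pol,
    "low_freq_high_amp": k_lin,
}


def predict(x, support, alpha, b, component_class):
    """Evaluate f(x) = sum_i alpha_i K(x, x_i) + b with the class-specific kernel."""
    kernel = KERNEL_BY_CLASS[component_class]
    return sum(a * kernel(x, xi) for a, xi in zip(alpha, support)) + b
```

Keeping the kernel as a plain function argument means the same training and prediction code can serve all three component classes.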
The adaptive hybrid kernel least squares support vector machine prediction model is optimized as follows. Two main parameters of the LSSVM need to be optimized: the regularization coefficient γ and the kernel function parameter σ. The regularization coefficient γ controls the trade-off between model complexity and approximation error, and its magnitude also affects the generalization ability of the model; the kernel function parameter σ determines the kernel width, and is optimized mainly for the Gaussian kernel function. The invention takes the prediction accuracy of the least squares support vector machine as the objective function and establishes: $\min F(\gamma, \sigma)$, s.t. $\gamma \in [\gamma_{min}, \gamma_{max}]$, $\sigma \in [\sigma_{min}, \sigma_{max}]$.
As shown in fig. 4, the process of optimizing the adaptive hybrid kernel least squares support vector machine prediction model includes:
step 1: initializing the parameters of the artificial bee colony algorithm, wherein the initialized parameters comprise: the total number of bees in the colony $N_C$, the number of leading bees $N_e$, the number of following bees $N_o$, the number of solutions $N_s$, the maximum number of iterations M, and the food source parameter combination (γ, σ).
Step 2: setting a fitness function f (i), wherein the expression of the function is as follows:
Figure BDA0003482282230000112
where $y_i$ and $\hat{y}_i$ are the actual value and the predicted value of the time series, respectively, and $n$ is the number of training samples.
Step 3: The leading bees search for honey sources, that is, for new solutions, and the fitness value of each solution is calculated; if the new fitness value is larger, the new solution replaces the old one. The neighborhood search formula is:
$v_{ij} = x_{ij} + r_{ij}(x_{ij} - x_{kj})$
where $x_{ij}$ is the j-th dimensional coordinate of the i-th honey source, $r_{ij} \in [-1, 1]$ is a randomly selected number, and $x_{kj}$ is the j-th dimensional coordinate of the k-th honey source.
Step 4: After the leading bees update the honey sources, the following probability is calculated from the profitability of each honey source; the following bees select a honey source to follow according to this probability and perform a neighborhood search. The following probability function is:
Figure BDA0003482282230000121
where $f(X_i)$ denotes the fitness value of the i-th honey source and $X_i$ denotes the i-th honey source.
Step 5: If the number of failed updates of a solution exceeds the maximum number of searches, the solution cannot be improved further; the following bee abandons it, becomes a scout bee, and starts searching for a new honey source. The honey source update formula is:
$x_{ij} = x_{\min}(j) + \mathrm{rand}(0,1)\,(x_{\max}(j) - x_{\min}(j))$
where $x_{\min}(j)$ and $x_{\max}(j)$ are the minimum and maximum values of the j-th dimension, respectively, and $\mathrm{rand}(0,1)$ is a random number in the interval (0, 1).
Step 6: If the maximum number of iterations is reached, training ends and the optimal parameter combination (γ, σ) is output; otherwise, return to Step 3.
After the optimization of the prediction model is completed, data reconstruction needs to be performed on the predicted values of the wavelet components, and a final SDN network flow prediction result is output after integration. The formula for reconstructing the prediction results of all components is as follows:
Figure BDA0003482282230000122
where $s(t)$ represents the initial network flow sequence, $H(2t-k)$ represents the low-pass filter of the wavelet decomposition, $t$ represents the time sequence number, $k$ represents the time shift, $A_{i+1}[s(t)]$ represents the wavelet coefficients of the high-frequency part of the (i+1)-th layer, and $G(2t-k)$ represents the high-pass filter of the wavelet decomposition.
The prediction components are reconstructed as follows. The average (approximation) prediction component is upsampled by inserting a zero between every two samples and is then passed through the low-pass filter to obtain a large-scale, low-resolution approximation, namely the low-pass output. The detail signal is likewise upsampled and passed through the high-pass filter to obtain the high-pass output. The low-pass output and the high-pass output are added to obtain the reconstructed SDN network flow predicted value.
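The upsample, filter, and add procedure can be sketched for a single decomposition level as follows. The choice of the Haar wavelet is an assumption made only to keep the filters short (the text does not name a wavelet here), and the function names are hypothetical.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)
H_REC = np.array([1.0, 1.0]) / SQRT2    # Haar low-pass synthesis filter (assumed wavelet)
G_REC = np.array([1.0, -1.0]) / SQRT2   # Haar high-pass synthesis filter

def haar_decompose(s):
    """One-level Haar analysis: returns (approximation, detail) coefficients."""
    s = np.asarray(s, float)
    even, odd = s[0::2], s[1::2]
    return (even + odd) / SQRT2, (even - odd) / SQRT2

def upsample(c):
    """Insert a zero after every coefficient ("up sampling")."""
    up = np.zeros(2 * len(c))
    up[0::2] = c
    return up

def haar_reconstruct(approx, detail):
    """Upsample both components, filter them, and add the two outputs."""
    n = 2 * len(approx)
    low = np.convolve(upsample(approx), H_REC)[:n]    # low-pass output
    high = np.convolve(upsample(detail), G_REC)[:n]   # high-pass output
    return low + high

# Usage with a hypothetical even-length traffic sequence.
signal = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
a, d = haar_decompose(signal)
print(haar_reconstruct(a, d))
```

In a prediction setting, `approx` and `detail` would be the predicted components rather than the coefficients of a known signal; for a multi-level decomposition the same step is applied from the coarsest level upward.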
The embodiments described above further illustrate the objects, technical solutions, and advantages of the present invention. It should be understood that they are only preferred embodiments and do not limit the invention; any modification, equivalent substitution, or improvement made within the spirit and principle of the present invention shall fall within its scope of protection.

Claims (9)

1. An SDN network flow control method based on fuzzy C-means and a mixed kernel least squares support vector machine, characterized by comprising the following steps: acquiring non-stationary SDN network flow data, and converting the non-stationary SDN network flow data into stationary time sequence components by adopting discrete wavelet transform; calculating the signal amplitude of the stationary time sequence components, and processing the signal amplitude by adopting fast Fourier transform to obtain a high-frequency-band amplitude signal and a low-frequency-band amplitude signal; clustering the high-frequency-band amplitude signal and the low-frequency-band amplitude signal by adopting a fuzzy C-means algorithm to obtain a high-frequency low-amplitude component, a medium-frequency medium-amplitude component, and a low-frequency high-amplitude component; predicting the high-frequency low-amplitude component, the medium-frequency medium-amplitude component, and the low-frequency high-amplitude component respectively by adopting an optimized adaptive mixed kernel least squares support vector machine prediction model to obtain the prediction results of all the components; reconstructing the prediction results of all the components to obtain the prediction result of the SDN network data flow; and controlling the SDN network data flow according to the prediction result.
2. The SDN network flow control method based on the fuzzy C-means and mixed kernel least squares support vector machine as claimed in claim 1, wherein the formula for transforming non-stationary SDN network traffic data into stationary time sequence components by using discrete wavelet transform is as follows:
$A_0[s(t)] = s(t)$
Figure FDA0003482282220000011
Figure FDA0003482282220000012
where $t$ is the time sequence number, $s(t)$ is the initial network flow sequence, and $i$ is the decomposition scale; $H$ and $G$ are the low-pass filter and the high-pass filter of the wavelet decomposition; $A_i$ and $D_i$ are the wavelet coefficients of the low-frequency part and the high-frequency part, respectively, obtained in the i-th layer decomposition of the initial time series $s(t)$.
3. The SDN network flow control method based on the fuzzy C-means and mixed kernel least squares support vector machine as claimed in claim 2, wherein the process of determining the decomposition scale comprises:
step 1: setting the initial decomposition scale to N;
step 2: performing stationary decomposition on the time sequence signal according to the current decomposition scale;
step 3: determining whether the (N+1)-th component meets the stop criterion; if so, stopping the decomposition; otherwise, increasing the decomposition scale and continuing the decomposition; the stop criterion is that the number of extremum points of the component is 1 or 0.
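The scale-selection loop of claim 3 can be sketched as follows. Several points are assumptions made for illustration: the Haar wavelet, the reading of "the (N+1)-th component" as the approximation left after N decomposition levels, the `max_scale` safety cap, and all function names.

```python
import numpy as np

def count_extrema(x):
    """Number of interior local maxima and minima (sign changes of the first difference)."""
    d = np.diff(np.asarray(x, float))
    d = d[d != 0]                       # ignore flat segments
    return int(np.sum(d[:-1] * d[1:] < 0))

def haar_step(a):
    """One Haar decomposition level: returns (approximation, detail)."""
    a = a[: len(a) // 2 * 2]            # drop a trailing sample if the length is odd
    return (a[0::2] + a[1::2]) / np.sqrt(2), (a[0::2] - a[1::2]) / np.sqrt(2)

def choose_scale(s, initial_scale=1, max_scale=10):
    """Increase the decomposition scale until the residual has at most one extremum."""
    approx = np.asarray(s, float)
    details = []
    scale = 0
    while scale < max_scale and len(approx) >= 2:
        approx, d = haar_step(approx)
        details.append(d)
        scale += 1
        # Stop criterion: the remaining component has 1 or 0 extremum points.
        if scale >= initial_scale and count_extrema(approx) <= 1:
            break
    return scale, approx, details

# Usage with a hypothetical oscillating sequence.
demo = np.sin(2 * np.pi * 4 * np.arange(64) / 64)
print(choose_scale(demo)[0])
```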
4. The SDN network flow control method based on the fuzzy C-means and mixed kernel least squares support vector machine according to claim 1, wherein the formula for processing the signal amplitude by using the fast Fourier transform is as follows:
Figure FDA0003482282220000021
k=0,1,2,...,N-1
where $x(n)$ represents the decomposition component sequence, $N$ the length of the sequence, $n$ the sample index, $x_{even}(n)$ the even-indexed subsequence of the component, and $x_{odd}(n)$ the odd-indexed subsequence of the component.
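The even/odd split named in claim 4 is the basis of the radix-2 decimation-in-time FFT, which can be sketched as below. The function name is hypothetical, and the sketch assumes a power-of-two sequence length; the magnitude of the result gives an amplitude spectrum of the component.

```python
import numpy as np

def fft_radix2(x):
    """Recursive radix-2 FFT built from the even and odd subsequences of x."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    if N == 0 or (N & (N - 1)):
        raise ValueError("length must be a power of two")
    if N == 1:
        return x
    even = fft_radix2(x[0::2])          # transform of x_even(n)
    odd = fft_radix2(x[1::2])           # transform of x_odd(n)
    twiddle = np.exp(-2j * np.pi * np.arange(N // 2) / N)
    # First half: k = 0..N/2-1; second half: k = N/2..N-1
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

# Usage with a hypothetical component: amplitude spectrum of an 8-sample sequence.
component = np.array([1.0, 2.0, 0.5, -1.0, 0.0, 1.5, -0.5, 2.5])
print(np.abs(fft_radix2(component)))
```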
5. The SDN network flow control method based on the fuzzy C-means and mixed kernel least squares support vector machine according to claim 1, wherein the process of clustering the high-band amplitude signals and the low-band amplitude signals by using the fuzzy C-means algorithm comprises:
step 1: initializing the decomposition component information and the membership matrix; setting the maximum number of iterations and a threshold ε;
step 2: judging whether the current number of iterations is less than the maximum number of iterations; if so, calculating the membership matrices $\mu_{ij}$ of the high-frequency-band amplitude signals and the low-frequency-band amplitude signals respectively; otherwise, outputting the clustering result and the cluster centers;
step 3: updating the cluster center $c_j$ of each class of data according to the membership matrix;
step 4: calculating the change in the value function of the data according to the membership matrix and the updated cluster centers;
step 5: comparing the change in the value function with the set threshold ε; if the change is smaller than the set threshold, outputting the clustering result and the cluster centers; otherwise, adding 1 to the number of iterations and returning to step 2.
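A standard fuzzy C-means loop matching these steps can be sketched as follows. The fuzzifier `m = 2`, the random initialization of the membership matrix, the Euclidean distance, and the function name are assumptions for illustration; the order of the center and membership updates inside the loop follows the common textbook formulation and may differ from the claimed ordering.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters=3, m=2.0, max_iter=100, eps=1e-5, seed=0):
    """Cluster the rows of X; returns (membership matrix U, cluster centers)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, float)
    # Step 1: random membership matrix whose rows sum to 1
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    prev_J = np.inf
    for _ in range(max_iter):                           # Step 2: iteration limit
        Um = U ** m
        # Step 3: cluster centers c_j as membership-weighted means
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)                     # avoid division by zero
        J = np.sum(Um * dist ** 2)                      # Step 4: value (objective) function
        inv = dist ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)        # membership update mu_ij
        if abs(prev_J - J) < eps:                       # Step 5: change below threshold
            break
        prev_J = J
    return U, centers

# Usage with hypothetical one-dimensional amplitude features.
features = np.array([[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]])
U_demo, centers_demo = fuzzy_c_means(features, n_clusters=2)
print(U_demo.argmax(axis=1), centers_demo.ravel())
```

Taking `U.argmax(axis=1)` turns the fuzzy memberships into hard labels, which is one way to assign each component to a frequency and amplitude class.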
6. The SDN network flow control method according to claim 1, wherein the step of predicting the high-frequency low-amplitude component, the medium-frequency medium-amplitude component, and the low-frequency high-amplitude component using the optimized adaptive hybrid kernel least squares support vector machine prediction model comprises: according to the different amplitude-frequency characteristics of each flow sequence component, constructing a Gaussian-kernel ($K_{GAU}$) LSSVM prediction model for the high-frequency low-amplitude component, a polynomial-kernel ($K_{POL}$) LSSVM prediction model for the medium-frequency medium-amplitude component, and a linear-kernel ($K_{LIN}$) LSSVM prediction model for the low-frequency high-amplitude component, and performing prediction analysis with each LSSVM prediction model respectively.
7. The SDN network flow control method based on fuzzy C-means and mixed kernel least squares support vector machine as claimed in claim 6, wherein the kernel function of the Gaussian-kernel ($K_{GAU}$) LSSVM prediction model is:
Figure FDA0003482282220000031
the prediction function is:
Figure FDA0003482282220000032
the kernel function of the polynomial-kernel ($K_{POL}$) LSSVM prediction model is:
$K_{POL}(x, x_i) = ((x \cdot x_i) + 1)^{d}$, where $d$ is a natural number
The prediction function is:
Figure FDA0003482282220000033
the kernel function of the linear-kernel ($K_{LIN}$) LSSVM prediction model is:
$K_{LIN}(x, x_i) = x^{T} x_i$
the prediction function is:
Figure FDA0003482282220000034
where $K_{GAU}$ denotes the Gaussian kernel, $x$ the test sample, $x_i$ the input vector, $\sigma$ the kernel function parameter, $\alpha_i$ the weight vector, $m$ the sample size, $b$ the offset, $K_{POL}$ the polynomial kernel, $K_{LIN}$ the linear kernel, and $T$ the transpose.
8. The SDN network flow control method based on the fuzzy C-means and the hybrid kernel least squares support vector machine as claimed in claim 6, wherein the process of optimizing the parameters in the prediction model of the adaptive hybrid kernel least squares support vector machine by using the artificial bee colony algorithm comprises:
step 1: initializing the parameters of the artificial bee colony algorithm, the initialized parameters comprising: the total number of bees in the colony $N_C$, the number of leading bees $N_e$, the number of following bees $N_o$, the number of solutions $N_s$, the maximum number of iterations $M$, and the food source parameter combination (γ, σ);
step 2: setting a fitness function f(i);
step 3: the leading bees searching for honey sources, that is, for new solutions, calculating the fitness value of each solution, and replacing the old solution if the new fitness value is larger;
step 4: after the leading bees update the honey sources, calculating the following probability according to the profitability of each honey source, the following bees selecting a honey source to follow according to the following probability and performing a neighborhood search;
step 5: if the number of failed updates of a solution exceeds the maximum number of searches, the solution cannot be improved further, and the following bee abandons the solution, becomes a scout bee, and starts to search for a new honey source;
step 6: if the maximum number of iterations is reached, finishing the training and outputting the optimal parameter combination (γ, σ); otherwise, returning to step 3.
9. The SDN network flow control method based on the fuzzy C-means and mixed kernel least squares support vector machine as claimed in claim 1, wherein the formula for reconstructing the prediction results of all components is:
Figure FDA0003482282220000041
where $s(t)$ represents the initial network flow sequence, $H(2t-k)$ represents the low-pass filter of the wavelet decomposition, $t$ represents the time sequence number, $k$ represents the time shift, $A_{i+1}[s(t)]$ represents the wavelet coefficients of the high-frequency part of the (i+1)-th layer, and $G(2t-k)$ represents the high-pass filter of the wavelet decomposition.
CN202210071611.XA 2022-01-21 2022-01-21 SDN network flow control method based on fuzzy C-means and mixed kernel least square support vector machine Active CN114500335B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210071611.XA CN114500335B (en) 2022-01-21 2022-01-21 SDN network flow control method based on fuzzy C-means and mixed kernel least square support vector machine

Publications (2)

Publication Number Publication Date
CN114500335A true CN114500335A (en) 2022-05-13
CN114500335B CN114500335B (en) 2023-06-16

Family

ID=81472505

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210071611.XA Active CN114500335B (en) 2022-01-21 2022-01-21 SDN network flow control method based on fuzzy C-means and mixed kernel least square support vector machine

Country Status (1)

Country Link
CN (1) CN114500335B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115389888A (en) * 2022-10-28 2022-11-25 山东科华电力技术有限公司 Partial discharge real-time monitoring system based on high-voltage cable
CN115442246A (en) * 2022-08-31 2022-12-06 武汉烽火技术服务有限公司 Flow prediction method, device, equipment and storage medium of data plane network

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109086928A (en) * 2018-07-27 2018-12-25 福州大学 Photovoltaic plant realtime power prediction technique based on SAGA-FCM-LSSVM model

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MOJTABA YEGANEJOU ET.AL: "Classification via Deep Fuzzy c-Means Clustering", 《2018 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE)》 *
李媛媛;陈捷;黄筱调;洪荣晶;: "基于改进模糊C均值的回转支承寿命状态识别", 计算机集成制造系统 *
陆一祎: "模糊多输出最小二乘支持向量机的分类与回归研究", 《中国优秀硕士学位论文全文数据库(电子期刊)》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115442246A (en) * 2022-08-31 2022-12-06 武汉烽火技术服务有限公司 Flow prediction method, device, equipment and storage medium of data plane network
CN115442246B (en) * 2022-08-31 2023-09-26 武汉烽火技术服务有限公司 Traffic prediction method, device, equipment and storage medium of data plane network
CN115389888A (en) * 2022-10-28 2022-11-25 山东科华电力技术有限公司 Partial discharge real-time monitoring system based on high-voltage cable
CN115389888B (en) * 2022-10-28 2023-01-31 山东科华电力技术有限公司 Partial discharge real-time monitoring system based on high-voltage cable

Also Published As

Publication number Publication date
CN114500335B (en) 2023-06-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant