CN115018038B - Network interface flow prediction method based on sparrow optimization width learning system - Google Patents
- Publication number
- CN115018038B (application CN202210247884.5A)
- Authority
- CN
- China
- Prior art keywords
- sparrow
- parameters
- network
- learning system
- width learning
- Prior art date
- Legal status (assumed from the record, not a legal conclusion)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/147—Network analysis or design for predicting network behaviour
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a network interface flow prediction method based on a sparrow optimization width learning system, which comprises the following steps: (1) for the obtained network interface flow data, determine the prediction time T and predict the network flow at time T from the flow values at times [T-12, T-1]; (2) initialize the network parameters; (3) randomly generate p groups of contraction coefficients and regularization coefficients within the parameter value ranges as the initial hyper-parameters; (4) train the width learning system model with each of the p groups of initial hyper-parameters to generate the initial fitness values; (5) optimize the hyper-parameters with the sparrow optimization algorithm; (6) train the width learning system with the updated hyper-parameters and update the fitness values; (7) if the maximum number of iterations is reached, output the network parameters corresponding to the best fitness, train the width learning model with them, and establish the network interface flow prediction model; otherwise, return to step (5). The invention can improve the accuracy of network flow prediction and reduce the influence of the network hyper-parameters on the prediction result.
Description
Technical Field
The invention belongs to the technical field of network interface flow prediction, and relates to a network interface flow prediction method based on a sparrow optimization width learning system.
Background
Most existing network traffic prediction models are deep learning models. Deep learning models typically have a large number of parameters and therefore consume considerable time and computing resources during training. The width learning system can complete model training in a much shorter time, but its hyper-parameters strongly influence the performance of the network model. At present, hyper-parameters are mostly tuned by hand. Manual tuning depends heavily on the researcher's experience and requires repeatedly retraining the model, which is time consuming.
Disclosure of Invention
The technical problem the invention aims to solve is as follows: to provide a network interface flow prediction method based on a sparrow optimization width learning system, so as to solve the technical problems existing in the prior art.
The technical scheme adopted by the invention is as follows: a network interface flow prediction method based on a sparrow optimization width learning system comprises the following steps:
(1) For the obtained network interface flow data, determine the prediction time T, and predict the network flow at time T from the network flow values at times [T-12, T-1];
(2) Initialize the network parameters: the population size, explorer proportion, and maximum number of iterations of the sparrow optimization algorithm; and, for the width learning system, the value ranges of the contraction coefficient and the regularization coefficient, the number of windows of the feature mapping layer, the number of nodes in a single window, and the number of nodes of the enhancement layer;
(3) Randomly generate p groups of contraction coefficients and regularization coefficients within the parameter value ranges as the initial hyper-parameters, where p is the number of sparrows;
(4) Train the width learning system model with each of the p groups of initial hyper-parameters and the network interface flow data to generate the initial fitness values;
(5) Optimizing the super parameters by utilizing a sparrow optimization algorithm;
(6) Train the width learning system with the network interface flow data and the updated hyper-parameters, and update the fitness values;
(7) Judge whether the maximum number of iterations has been reached; if so, output the network parameters corresponding to the best fitness, train the width learning model with them, and establish the network interface flow prediction model; otherwise, return to step (5).
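Step (1) amounts to building a supervised dataset with a sliding window of width 12. Below is a minimal sketch of that windowing, assuming the traffic has already been read into a one-dimensional array; the function and variable names are illustrative, not from the patent:

```python
import numpy as np

def make_windows(series, width=12):
    """Turn a traffic series into supervised samples: the features are the
    traffic values at times [t-12, t-1] and the target is the value at time t."""
    X, y = [], []
    for t in range(width, len(series)):
        X.append(series[t - width:t])   # the 12 preceding values
        y.append(series[t])             # the value to predict
    return np.array(X), np.array(y)

series = np.arange(20.0)                # toy stand-in for interface traffic
X, y = make_windows(series)
print(X.shape, y.shape)                 # (8, 12) (8,)
```

Each row of X is one [T-12, T-1] window and the matching entry of y is the flow at time T, which is the input/output pairing every model in the experiments consumes.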
The detailed steps of optimizing the hyper-parameters with the sparrow optimization algorithm in step (5) are as follows:
1) Take the p groups of initial hyper-parameters generated in step (3) as the initial positions of the p sparrows, and compute the number of explorers by the formula:
pNum=p×p_scale
where pNum is the number of explorers in the population, p is the population size, and p_scale is the proportion of explorers;
2) Update the explorer positions according to:

X_ij^(t+1) = X_ij^t · exp(−i / (α · t_max)),  if R_2 < ST
X_ij^(t+1) = X_ij^t + Q · L,                  if R_2 ≥ ST

where X is the hyper-parameter to be optimized, i.e. the sparrow position, and X_ij^t denotes the value of the j-th dimension of the i-th sparrow at the t-th iteration; α ∈ (0, 1] is a uniform random number and t_max is the maximum number of iterations; R_2 ∈ [0, 1] is the alarm value, a uniformly distributed random number; ST is the safety threshold; Q is a normally distributed random number; L is a 1 × d matrix of ones, where d is the problem dimension. When R_2 < ST, the current position is safe and the sparrows search for food nearby; otherwise, the current position is dangerous and the explorers must lead the flock to a new foraging area;
3) Update the follower positions according to:

X_ij^(t+1) = Q · exp((X_worst^t − X_ij^t) / i²),          if i > n/2
X_ij^(t+1) = X_P^(t+1) + |X_ij^t − X_P^(t+1)| · A⁺ · L,   otherwise

where X_P is the best position currently occupied by an explorer and X_worst^t is the position of the sparrow in the worst state; A is a 1 × d matrix whose elements are randomly assigned the value 1 or −1, and A⁺ = A^T(AA^T)^(−1). When i > n/2, the i-th follower has not obtained food, its state is poor, and it must fly elsewhere to obtain more food; otherwise it continues to search for food near the explorer;
4) Sparrows that detect danger act as scouts, and their positions are updated according to:

X_ij^(t+1) = X_best^t + β · |X_ij^t − X_best^t|,                      if f_i > f_g
X_ij^(t+1) = X_ij^t + K · |X_ij^t − X_worst^t| / ((f_i − f_w) + ε),   if f_i = f_g

where f_i is the fitness of the i-th sparrow, f_g is the current best fitness, and f_w is the worst fitness; β is a random number from the standard normal distribution, K is a uniform random number in [−1, 1], and ε is a small constant (e.g. 10^(−10)) that prevents the denominator from being 0.
The invention has the following beneficial effects: compared with the prior art, the invention uses the sparrow optimization algorithm to optimize two hyper-parameters of the width learning system, the regularization coefficient and the contraction coefficient, thereby avoiding the tedious manual tuning process; the width learning system is then trained with the optimal hyper-parameters found by the optimization, improving the accuracy of network flow prediction.
Drawings
FIG. 1 is a flow chart of a width learning system algorithm;
FIG. 2 is a flow chart of a sparrow search algorithm;
FIG. 3 is a pseudo code diagram of the SSA-BLS algorithm;
FIG. 4 is a flowchart of the SSA-BLS algorithm;
FIG. 5 is a graph of the predicted results on the core network traffic dataset of a certain European city;
FIG. 6 is a graph of predicted outcome for a British academic backbone network traffic dataset;
FIG. 7 is a graph of the prediction results of the enterprise cloud platform switch interface traffic data set;
FIG. 8 is a comparison of the running time of the width learning system and the LSTM model.
Detailed Description
The invention will be further described with reference to specific examples.
Preliminary knowledge: the width learning system (broad learning system, BLS) is a novel random-weight neural network. It is built on the random vector functional-link neural network and is mainly intended to avoid the large computational load and high training cost of deep learning. The width learning system contains three layers: an input layer, a hidden layer, and an output layer. The hidden layer is a single-layer structure composed of a feature mapping layer and an enhancement node layer.
The algorithm flow of the width learning system is as follows. Let the training data be X ∈ R^(N×M), containing N samples of M dimensions each, with corresponding labels Y ∈ R^(N×C). The n feature mapping functions φ_i (i = 1, …, n) map the training data X into n groups of feature nodes, the i-th group containing K_i nodes; the i-th group of feature nodes Z_i is computed as in formula (1):

Z_i = φ_i(X W_ei + β_ei)    (1)
where W_ei and β_ei are a randomly generated feature mapping weight matrix and bias matrix. In practical applications, φ_i is usually a nonlinear mapping function.
Denote the n groups of feature mapping nodes produced by the feature mapping layer as Z^n = (Z_1, Z_2, …, Z_n). Z^n is connected to the enhancement node layer; through the activation function ζ_j of the enhancement layer, Z^n is mapped into m groups of enhancement nodes of q nodes each, and the j-th group of enhancement nodes H_j can be expressed as:

H_j = ζ_j(Z^n W_hj + β_hj)    (2)
where W_hj is a randomly generated enhancement layer weight matrix and β_hj is the corresponding bias matrix; different nonlinear activation functions ζ_j may be chosen so as to fully extract the characteristic information of the input data.
Similarly, denote the m groups of enhancement nodes as H^m = (H_1, H_2, …, H_m). The combination of Z^n and H^m is expressed as:

A = (Z^n | H^m)    (3)

The output of the width learning system is then:

Y = (Z^n | H^m) W = A W    (4)
where W is the output layer connection weight matrix, with:

W = A⁺ Y    (5)

A⁺ is the pseudo-inverse of the matrix A; model training is completed by computing the connection weight matrix W through the ridge-regression approximation of the pseudo-inverse:

A⁺ = lim_{λ→0} (λI + A^T A)^(−1) A^T    (6)

where λ > 0 is the regularization coefficient and I is the identity matrix.
Compared with traditional neural networks, the width learning system has two main characteristics. First, it uses a sparse autoencoder to refine the random features of the input data into a sparse, compact feature set; exploring the important features through sparse feature learning describes the input data better and improves computational efficiency. Second, in a deep learning system, if the network does not reach the required precision, one must add layers or change the structure and then retrain the model, which increases the amount of computation and the training time; the width learning system instead adopts incremental learning and dynamically adjusts the model by adding enhancement nodes, which greatly shortens training time while maintaining accuracy.
An algorithm flow chart of the width learning system is shown in fig. 1.
The sparrow search algorithm is a novel swarm intelligence optimization algorithm based on the foraging and anti-predation behavior of sparrows. It divides the sparrows into explorers and followers and formulates the following rules for their movement.
1) Sparrows in the population are classified into explorers and followers according to their fitness. The fitness reflects the quality of a sparrow's foraging position: a sparrow with higher fitness occupies a better position and can find food more easily.
2) The sparrows with high fitness are the explorers; the other sparrows act as followers. The explorers are responsible for locating food-rich areas and guiding the followers to the foraging site. The followers track the position of the best explorer and search for food around it.
3) The fitness values of individual sparrows change dynamically, so the identities of explorer and follower can interchange, but the proportion of explorers remains unchanged.
4) The lower a follower's fitness value, the worse its foraging position. Such followers may fly at random to other places to look for food.
5) A certain proportion of individuals are randomly selected from the population as scouts, responsible for monitoring the safety of the surroundings. When a predator is detected, a scout raises an alarm; when the alarm value exceeds the safety threshold, the explorers lead the followers to a safer area to forage.
6) When danger is perceived, sparrows at the edge of the population quickly move toward the safe area to obtain a better position, while sparrows at the center move randomly.
The algorithm flow of SSA is as follows; the flow chart is shown in FIG. 2.
Step 1: parameter initialization, mainly setting the number of sparrows, the proportion of explorers, the sparrow positions, the maximum number of iterations, and so on.
Step 2: judging whether the current position of the population is safe or not, and updating the position of the explorer by using a formula (7).
Wherein,the value of the j dimension of the i-th sparrow in the t-th iteration is represented. R is R 2 Indicating a warning value in the range of 0,1]Is a random number which is uniformly distributed. ST represents a safety threshold value, and the value range is [0.5,1.0]Q is a random number in normal distribution, L is a 1 x d dimension matrix, and d is the population size. When R is 2 When ST is less than the current position, the current position is safe, and sparrow groups search for food; otherwise, the current position is threatened, and the cable is detectedThe user needs to guide the sparrow group to find new places to find food.
Step 3: judge the state of the followers and update the follower positions according to formula (8).
Wherein,indicating the position of sparrow in worst condition, A + =A T (AA T ) -1 A is a 1 xd-dimensional matrix, each latitude value is from [ -1,1]Is randomly generated. When i is less than n/2, the follower does not get food, the state is poor, and the follower needs to go to other places where more food can be obtained; otherwise, continue to find food in the vicinity of the seeker.
Step 4: some sparrows in the population detect danger and become scouts; their positions are updated according to formula (9).
X_ij^(t+1) = X_best^t + β · |X_ij^t − X_best^t|,                      if f_i > f_g
X_ij^(t+1) = X_ij^t + K · |X_ij^t − X_worst^t| / ((f_i − f_w) + ε),   if f_i = f_g    (9)

where f_i and f_g are the fitness of the i-th sparrow and the current best fitness respectively, and f_w is the worst fitness; β is a random number from the standard normal distribution, K is a uniform random number in [−1, 1], and ε is a small constant that prevents the denominator from being 0.
Step 5: updating the sparrow fitness.
Step 6: judging whether the iteration stop condition is met, and if the iteration stop condition is not met, repeating the steps 2 to 5.
Example 1: as shown in fig. 1-8, a network interface flow prediction method based on a sparrow optimization width learning system, the method comprises the following steps:
(1) For the obtained network interface flow data, determine the prediction time T, and predict the network flow at time T from the network flow values at times [T-12, T-1];
(2) Initialize the network parameters: the population size, explorer proportion, and maximum number of iterations of the sparrow optimization algorithm; and, for the width learning system, the value ranges of the contraction coefficient and the regularization coefficient, the number of windows of the feature mapping layer, the number of nodes in a single window, and the number of nodes of the enhancement layer;
(3) Randomly generate p groups of contraction coefficients and regularization coefficients within the parameter value ranges as the initial hyper-parameters, where p is the number of sparrows;
(4) Train the width learning system model with each of the p groups of initial hyper-parameters and the network interface flow data to generate the initial fitness values;
(5) Optimize the hyper-parameters with the sparrow optimization algorithm; the detailed steps are as follows:
1) Take the p groups of initial hyper-parameters generated in step (3) as the initial positions of the p sparrows, and compute the number of explorers by the formula:
pNum=p×p_scale
where pNum is the number of explorers in the population, p is the population size, and p_scale is the proportion of explorers;
2) Update the explorer positions according to:

X_ij^(t+1) = X_ij^t · exp(−i / (α · t_max)),  if R_2 < ST
X_ij^(t+1) = X_ij^t + Q · L,                  if R_2 ≥ ST

where X is the hyper-parameter to be optimized, i.e. the sparrow position, and X_ij^t denotes the value of the j-th dimension of the i-th sparrow at the t-th iteration; α ∈ (0, 1] is a uniform random number and t_max is the maximum number of iterations; R_2 ∈ [0, 1] is the alarm value, a uniformly distributed random number; ST is the safety threshold; Q is a normally distributed random number; L is a 1 × d matrix of ones, where d is the problem dimension. When R_2 < ST, the current position is safe and the sparrows search for food nearby; otherwise, the current position is dangerous and the explorers must lead the flock to a new foraging area;
3) Update the follower positions according to:

X_ij^(t+1) = Q · exp((X_worst^t − X_ij^t) / i²),          if i > n/2
X_ij^(t+1) = X_P^(t+1) + |X_ij^t − X_P^(t+1)| · A⁺ · L,   otherwise

where X_P is the best position currently occupied by an explorer and X_worst^t is the position of the sparrow in the worst state; A is a 1 × d matrix whose elements are randomly assigned the value 1 or −1, and A⁺ = A^T(AA^T)^(−1). When i > n/2, the i-th follower has not obtained food, its state is poor, and it must fly elsewhere to obtain more food; otherwise it continues to search for food near the explorer;
4) Sparrows that detect danger act as scouts, and their positions are updated according to:

X_ij^(t+1) = X_best^t + β · |X_ij^t − X_best^t|,                      if f_i > f_g
X_ij^(t+1) = X_ij^t + K · |X_ij^t − X_worst^t| / ((f_i − f_w) + ε),   if f_i = f_g

where f_i is the fitness of the i-th sparrow, f_g is the current best fitness, and f_w is the worst fitness; β is a random number from the standard normal distribution, K is a uniform random number in [−1, 1], and ε is a small constant that prevents the denominator from being 0;
(6) Train the width learning system with the network interface flow data and the updated hyper-parameters, and update the fitness values;
(7) Judge whether the maximum number of iterations has been reached; if so, output the network parameters corresponding to the best fitness, train the width learning model with them, and establish the network interface flow prediction model; otherwise, return to step (5).
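The outer loop of the example — generate p candidate (r, λ) pairs in their value ranges, score each by training a model, and keep the best — can be sketched as below. The fitness here is a hypothetical toy function standing in for the validation error of a trained width learning system, and the position update is a deliberately simplified stand-in for the full sparrow update:

```python
import random

def fitness(r, lam):
    """Toy stand-in for 'train the BLS with (r, lam) and return the
    validation error'; the optimum is placed at r = 0.9 (an assumption)."""
    return (r - 0.9) ** 2 + lam

def optimize(p=10, iters=20, seed=0):
    rng = random.Random(seed)
    # step (3): p random hyper-parameter pairs inside their value ranges
    pop = [(rng.uniform(0.09, 0.999999), rng.uniform(2 ** -35, 2 ** -30))
           for _ in range(p)]
    best = min(pop, key=lambda h: fitness(*h))           # step (4)
    for _ in range(iters):                               # steps (5)-(7)
        pop = [(min(max(r + rng.gauss(0, 0.05), 0.09), 0.999999), lam)
               for r, lam in pop]                        # perturb positions
        cand = min(pop, key=lambda h: fitness(*h))
        if fitness(*cand) < fitness(*best):
            best = cand                                  # keep the best fitness
    return best

r_best, lam_best = optimize()
print(r_best, lam_best)
```

Replacing the Gaussian perturbation with the explorer/follower/scout updates of step (5), and the toy fitness with an actual BLS training run, recovers the full SSA-BLS procedure.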
To improve the accuracy of network flow prediction and reduce the influence of the network hyper-parameters on the prediction result, the two hyper-parameters of the width learning system, the contraction coefficient (r) and the regularization coefficient (λ), are optimized with the sparrow search algorithm, and the optimal hyper-parameters output by the optimization are used to establish the network flow prediction model. This method is called SSA-BLS; its pseudo code and algorithm flow are shown in FIGS. 3 and 4.
The experiments use two public datasets: the core network traffic dataset of a certain European city and the UK academic backbone network traffic dataset. The UK academic backbone dataset records the aggregate traffic, in bits, of the UK academic network backbone from 09:30 on 19 November 2004 to 11:11 on 27 January 2005, with a sampling interval of five minutes.
The data from 1 to 25 July 2005 in the European core network traffic dataset are used as the training set and the data from 26 to 28 July as the test set; the data from 1 to 24 January 2005 in the UK academic backbone dataset are used as the training set and the data from 25 to 27 January as the test set.
1) Parameters and evaluation index
The SSA-BLS parameters are configured as follows: population size 50, explorer proportion 20%, maximum number of iterations 5, and dimension 2; the feature mapping layer has 10 windows of 10 nodes each, and the enhancement layer has 50 nodes; the contraction coefficient (r) and the regularization coefficient (λ) are searched in the ranges [0.09, 0.999999] and [2^(−35), 2^(−30)], respectively.
MSE, RMSE, MAE, MAPE and MA are used as evaluation indices; they are computed as shown in formulas (10), (11), (12), (13) and (14).
where n is the total number of samples, ŷ_i is the predicted value, and y_i is the true value. The smaller MSE, RMSE, MAE and MAPE are, and the closer MA is to 100%, the better the predictive performance of the model.
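The five indices can be written out directly. MA is taken here as (1 − MAPE) expressed as a percentage, which matches the relation between the MAPE and MA columns in the tables below (an inference, since the formula images did not survive extraction):

```python
import numpy as np

def metrics(y, y_hat):
    """MSE, RMSE, MAE, MAPE and MA for a prediction y_hat of the series y."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    e = y - y_hat
    mse = np.mean(e ** 2)
    rmse = np.sqrt(mse)
    mae = np.mean(np.abs(e))
    mape = np.mean(np.abs(e / y))       # assumes y contains no zeros
    ma = (1.0 - mape) * 100.0           # prediction accuracy in percent
    return mse, rmse, mae, mape, ma

mse, rmse, mae, mape, ma = metrics([1.0, 2.0, 4.0], [1.0, 2.0, 5.0])
print(round(mape, 4), round(ma, 2))     # 0.0833 91.67
```

For example, the SSA-BLS row of Table 1 satisfies (1 − 0.0294284) × 100 ≈ 97.057%, the listed MA value.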
2) Results and discussion
The experiments use the SSA-BLS model with the traffic values at times [T-12, T-1] to predict the traffic value at time T. To verify the performance of the SSA-BLS model, it is compared with the structurally similar BLS, ELM, SCN and RVFL models, with the RVFL variant dRVFL, and with an LSTM model commonly used in network traffic prediction; each model is run independently 100 times, and the averages of the evaluation indices over the runs are taken as the final results. The contraction coefficient r and the regularization coefficient λ of the BLS model are taken from {0.1, 0.5, 0.9, 0.99, 0.9999, 0.99999} and {2^(−30), 2^(−20), 2^(−10), 0.5, 1, 5, 10} respectively, with the remaining parameters identical to the SSA-BLS model; the SCN model has at most 250 hidden-layer nodes and at most 100 candidate nodes; the regularization coefficient of RVFL is 1e-3 and its hidden layer has 100 nodes; the dRVFL parameters are the same as RVFL; the hidden layers of the LSTM model contain 12 units each, with training learning rate 1e-2, batch_size 64, and 15 epochs. The predictive performance of each model on the test sets of the two datasets is shown in Tables 1-2.
Table 1 experimental results of flow data set of core network of certain city in europe
MSE | RMSE | MAE | MAPE | MA | |
SSA-BLS | 0.0159047 | 0.1261069 | 0.0937315 | 0.0294284 | 97.057155% |
BLS | 0.0781227 | 0.2551322 | 0.1878021 | 0.0571019 | 94.289801% |
SCN | 0.0154907 | 0.1244372 | 0.0934485 | 0.0295662 | 97.043378% |
RVFL | 0.0254023 | 0.1593589 | 0.1208186 | 0.0388347 | 96.116525% |
dRVFL | 0.0227553 | 0.1507691 | 0.1135728 | 0.0367191 | 96.328085% |
ELM | 0.1394488 | 0.3686439 | 0.2710739 | 0.0780252 | 92.197470% |
LSTM | 0.0781441 | 0.2502372 | 0.1884968 | 0.0517535 | 94.824642% |
Table 2 experimental results of the uk academic backbone network traffic dataset
Table 3 Experimental results of enterprise cloud platform switch interface flow data set
MSE | RMSE | MAE | MAPE | MA | |
SSA-BLS | 0.0000734 | 0.0082991 | 0.0063628 | 0.0021407 | 99.785924% |
BLS | 0.0103714 | 0.0811761 | 0.0563080 | 0.0176119 | 98.238804% |
SCN | 0.0001742 | 0.0130396 | 0.0067544 | 0.0021857 | 99.781427% |
RVFL | 0.0361230 | 0.1899801 | 0.1288009 | 0.0400804 | 95.991952% |
dRVFL | 0.0327578 | 0.1807739 | 0.1277397 | 0.0403579 | 95.964208% |
ELM | 0.0579519 | 0.2382928 | 0.1327614 | 0.0400340 | 95.996599% |
LSTM | 0.0283041 | 0.1057008 | 0.0759722 | 0.0238097 | 97.619024% |
FIGS. 5 and 6 are graphs comparing the predicted and actual values of the SSA-BLS model and the other models on the two public datasets.
In addition, to further verify the predictive performance of the SSA-BLS model, the model is applied to a private traffic dataset: the actual ingress traffic of an enterprise switch interface from 5 to 18 October 2021. The data from 5 to 16 October are used as the training set and the data from 17 to 18 October as the test set.
Because the sampling intervals of the enterprise switch interface traffic data are unequal, the data are first resampled: the mean interface traffic within each 5-minute interval is computed, and if no traffic data exist within a 5-minute interval, the previous value is carried forward. In addition, the original data contain many abnormal traffic values; to reduce the influence of outliers on prediction, the data are smoothed with a spectral smoother. The SSA-BLS and comparison model parameters are as above, and the predictive performance of each model is shown in Table 3.
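The resampling described above — 5-minute bin means with forward fill of empty bins — can be sketched with the standard library. This is illustrative only; the patent does not give code, and all names are assumptions:

```python
from datetime import datetime, timedelta

def resample_5min(samples, start, end):
    """Irregular (timestamp, value) samples -> regular 5-minute series:
    average the values inside each 5-minute bin; if a bin is empty,
    carry the previous bin's value forward."""
    step = timedelta(minutes=5)
    bins, out, prev = {}, [], None
    for ts, v in samples:
        key = int((ts - start) // step)     # which 5-minute bin ts falls in
        bins.setdefault(key, []).append(v)
    n = int((end - start) // step)
    for k in range(n):
        if k in bins:
            prev = sum(bins[k]) / len(bins[k])
        out.append(prev)                    # forward fill when bin is empty
    return out

t0 = datetime(2021, 10, 5)
data = [(t0 + timedelta(minutes=1), 10.0),
        (t0 + timedelta(minutes=3), 20.0),
        (t0 + timedelta(minutes=12), 30.0)]
series = resample_5min(data, t0, t0 + timedelta(minutes=20))
print(series)   # [15.0, 15.0, 30.0, 30.0]
```

The first bin averages the two samples at minutes 1 and 3, the empty second bin repeats that value, and the third bin's single sample carries into the empty fourth bin.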
FIG. 7 is a graph comparing the predicted and actual values of the SSA-BLS model and the other models on the private dataset. As Tables 1, 2 and 3 show, in terms of single-step prediction accuracy the SSA-BLS model outperforms the other models on the UK academic backbone network traffic dataset and the enterprise cloud platform switch interface traffic dataset; on the European core network traffic dataset its performance is only slightly below that of the SCN model. Overall, the SSA-BLS model achieves higher accuracy, indicating that it can select better network hyper-parameters and better capture the temporal characteristics of the traffic. In terms of time consumption, FIG. 8 shows the time taken by the width learning system to complete one training run and by the LSTM model to run one epoch, where dataset1, dataset2 and dataset3 denote the UK academic backbone network traffic dataset, the European core network traffic dataset, and the enterprise cloud platform switch interface traffic dataset, respectively. As FIG. 8 clearly shows, the width learning system completes training in a much shorter time, and the larger the data volume, the greater the time advantage of the BLS.
The invention provides a width learning model based on the sparrow search algorithm: the algorithm optimizes two hyperparameters of the width learning model, the shrinkage coefficient (r) and the regularization coefficient (λ), and the model is then trained with the output optimal hyperparameters, reducing the influence of hyperparameter choice on the model and improving its accuracy. The model avoids tedious manual parameter tuning by using the algorithm to select a better combination of network hyperparameters, so that the BLS performs at its best. The SSA-BLS model is applied to short-term network traffic prediction, with experiments on two public network traffic datasets and a real dataset of enterprise cloud platform network switch interface traffic. To verify its effect, the SSA-BLS model is compared with BLS and other models; experiments show that SSA-BLS selects better hyperparameters and achieves a network traffic prediction accuracy above 97%.
The foregoing is merely illustrative of the present invention, and the scope of the present invention is not limited thereto, and any person skilled in the art can easily think about variations or substitutions within the scope of the present invention, and therefore, the scope of the present invention shall be defined by the scope of the appended claims.
Claims (2)
1. A network interface flow prediction method based on a sparrow optimization width learning system, characterized in that the method comprises the following steps:
(1) For the obtained network interface traffic data, determining the prediction moment T and predicting the network traffic at moment T using the network traffic at moments [T-12, T-1];
(2) Initializing the network parameters: the population size, discoverer proportion and maximum number of iterations of the sparrow search algorithm; and the value range of the shrinkage coefficient, the value range of the regularization coefficient, the number of feature-mapping-layer windows, the number of nodes per window and the number of enhancement-layer nodes in the width learning system;
(3) Randomly generating p groups of shrinkage coefficients and regularization coefficients within the parameter value ranges as initial hyperparameters, wherein p is the number of sparrows;
(4) Training the width learning system model with each of the p groups of initial hyperparameters and the network interface traffic data to generate the initial fitness values;
(5) Optimizing the hyperparameters using the sparrow search algorithm;
(6) Training the width learning system with the network interface traffic data and the updated hyperparameters, and updating the fitness values;
(7) Judging whether the maximum number of iterations has been reached; if so, outputting the network parameters corresponding to the optimal fitness, training the width learning model with these parameters, and establishing the network interface traffic prediction model; otherwise, returning to step (5).
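The steps of claim 1 can be sketched as a generic optimization loop. The `fitness` callback below stands in for training the width learning system with a candidate (shrinkage coefficient, regularization coefficient) pair and returning its validation error, and the position-update rules are deliberately simplified; the full update rules are the subject of claim 2:

```python
import numpy as np

def sparrow_search(fitness, bounds, p=20, p_scale=0.2, max_iter=50, seed=0):
    """Sketch of claim 1, steps (2)-(7). fitness(x) stands in for training
    a width learning system with hyperparameters x and returning a
    validation error (lower is better)."""
    rng = np.random.default_rng(seed)
    b = np.asarray(bounds, dtype=float)
    lo, hi = b[:, 0], b[:, 1]
    d = len(lo)
    p_num = max(1, int(p * p_scale))              # number of discoverers (pNum)
    X = lo + rng.random((p, d)) * (hi - lo)       # step (3): p random parameter groups
    fit = np.array([fitness(x) for x in X])       # step (4): initial fitness
    for _ in range(max_iter):                     # steps (5)-(7)
        order = np.argsort(fit)                   # best sparrows first
        X, fit = X[order], fit[order]
        R2, ST = rng.random(), 0.8                # warning value and safety threshold
        for i in range(p_num):                    # discoverers (simplified update)
            if R2 < ST:
                X[i] = X[i] * np.exp(-(i + 1) / (rng.random() * max_iter + 1e-12))
            else:
                X[i] = X[i] + rng.standard_normal(d)
        for i in range(p_num, p):                 # followers move toward the best sparrow
            X[i] = X[0] + rng.uniform(-1.0, 1.0, d) * np.abs(X[i] - X[0])
        X = np.clip(X, lo, hi)                    # keep parameters in their value ranges
        fit = np.array([fitness(x) for x in X])   # step (6): retrain, update fitness
    i_best = int(np.argmin(fit))                  # step (7): best hyperparameters
    return X[i_best], float(fit[i_best])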
2. The network interface flow prediction method based on the sparrow optimization width learning system according to claim 1, characterized in that optimizing the hyperparameters with the sparrow search algorithm in step (5) comprises the following steps:
1) Taking the p groups of initial hyperparameters generated in step (3) as the initial positions of the p sparrows, and calculating the number of discoverers according to:
pNum = p × p_scale
wherein pNum is the number of discoverers in the population, p is the population size, and p_scale is the proportion of discoverers;
2) Updating the discoverer positions according to:

$$X_{i,j}^{t+1}=\begin{cases}X_{i,j}^{t}\cdot\exp\!\left(\dfrac{-i}{\alpha\cdot t_{\max}}\right), & R_{2}<ST\\[4pt] X_{i,j}^{t}+Q\cdot L, & R_{2}\ge ST\end{cases}$$

wherein X is the hyperparameter to be optimized, i.e. the sparrow position, and $X_{i,j}^{t}$ denotes the value of the j-th dimension of the i-th sparrow at the t-th iteration; $R_{2}$ denotes the warning value, a uniformly distributed random number in [0, 1]; ST denotes the safety threshold; $\alpha\in(0,1]$ is a uniform random number and $t_{\max}$ is the maximum number of iterations; Q is a random number drawn from a normal distribution; L is a 1×d matrix of ones, where d is the problem dimension. When $R_{2}<ST$, the current position is safe and the sparrows search for food in the vicinity; otherwise the position is dangerous and the discoverers must lead the flock to a new area to forage;
3) Updating the follower positions according to:

$$X_{i,j}^{t+1}=\begin{cases}Q\cdot\exp\!\left(\dfrac{X_{worst}^{t}-X_{i,j}^{t}}{i^{2}}\right), & i>n/2\\[4pt] X_{P}^{t+1}+\left|X_{i,j}^{t}-X_{P}^{t+1}\right|\cdot A^{+}\cdot L, & \text{otherwise}\end{cases}$$

wherein $A^{+}=A^{T}(AA^{T})^{-1}$, and A is a 1×d matrix whose elements are randomly generated in [-1, 1]; $X_{worst}^{t}$ denotes the position of the sparrow in the worst state, and $X_{P}^{t+1}$ the best position occupied by a discoverer. When i > n/2, the i-th follower has not obtained food and is in a poor state, so it must fly elsewhere where more food can be found; otherwise it continues to forage in the vicinity of the discoverers;
4) Sparrows that detect danger act as scouts, and their positions are updated according to:

$$X_{i,j}^{t+1}=\begin{cases}X_{best}^{t}+\beta\cdot\left|X_{i,j}^{t}-X_{best}^{t}\right|, & f_{i}>f_{g}\\[4pt] X_{i,j}^{t}+K\cdot\dfrac{\left|X_{i,j}^{t}-X_{worst}^{t}\right|}{(f_{i}-f_{w})+\varepsilon}, & f_{i}=f_{g}\end{cases}$$

wherein $f_{i}$ and $f_{g}$ are the fitness of the i-th sparrow and the best fitness, respectively, and $f_{w}$ is the worst fitness; $X_{best}^{t}$ is the current global best position; β is a random number drawn from the standard normal distribution; K is a uniform random number in [-1, 1]; and ε is a small constant in the range $0<\varepsilon<10^{-10}$, used to avoid division by zero.
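The three update rules of claim 2 can be sketched in one function. The concrete forms below follow the standard sparrow search algorithm, which the claim's variable descriptions match; since the claim's own formula images are not reproduced above, treat the exact expressions as an assumption, and note that `ssa_update` and its signature are illustrative:

```python
import numpy as np

def ssa_update(X, fit, p_num, s_num, max_iter, rng):
    """One sparrow-search iteration: discoverer, follower and scout
    position updates (standard-SSA forms, assumed)."""
    p, d = X.shape
    order = np.argsort(fit)                       # minimisation: best sparrow first
    X, fit = X[order].copy(), fit[order]
    best, worst = X[0].copy(), X[-1].copy()
    f_g, f_w = fit[0], fit[-1]
    R2, ST = rng.random(), 0.8                    # warning value R2 and safety threshold ST
    for i in range(p_num):                        # 2) discoverers
        if R2 < ST:                               # safe: contract around the current position
            X[i] = X[i] * np.exp(-(i + 1) / (rng.random() * max_iter + 1e-12))
        else:                                     # danger: random move, X + Q * L
            X[i] = X[i] + rng.standard_normal() * np.ones(d)
    A = rng.choice([-1.0, 1.0], size=(1, d))      # 3) followers: A is 1 x d
    A_plus = A.T / d                              # A+ = A^T (A A^T)^-1; A A^T = d for +-1 entries
    for i in range(p_num, p):
        if i + 1 > p / 2:                         # worst-fed half flies elsewhere to forage
            X[i] = rng.standard_normal(d) * np.exp((worst - X[i]) / (i + 1) ** 2)
        else:                                     # otherwise feed near the best discoverer X_P
            step = (np.abs(X[i] - X[0]) @ A_plus).item()
            X[i] = X[0] + step * np.ones(d)       # X_P + |X_i - X_P| A+ L
    eps = 1e-10                                   # 4) scouts: sparrows that sense danger
    for i in rng.choice(p, size=s_num, replace=False):
        if fit[i] > f_g:                          # not at the optimum: jump toward the best
            X[i] = best + rng.standard_normal() * np.abs(X[i] - best)
        else:                                     # at the optimum: move relative to the worst
            K = rng.uniform(-1.0, 1.0)
            X[i] = X[i] + K * np.abs(X[i] - worst) / ((fit[i] - f_w) + eps)
    return X
```

Here `p_num` is the pNum of step 1) and `s_num` the number of scouts; the caller re-evaluates fitness and clips positions to the hyperparameter value ranges after each call.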
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210247884.5A CN115018038B (en) | 2022-03-14 | 2022-03-14 | Network interface flow prediction method based on sparrow optimization width learning system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115018038A (en) | 2022-09-06
CN115018038B (en) | 2024-03-05
Family
ID=83066479
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210247884.5A Active CN115018038B (en) | 2022-03-14 | 2022-03-14 | Network interface flow prediction method based on sparrow optimization width learning system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115018038B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2012168582A (en) * | 2011-02-09 | 2012-09-06 | Ntt Docomo Inc | Potential class analysis device, potential class analysis method, and program |
WO2021007812A1 (en) * | 2019-07-17 | 2021-01-21 | 深圳大学 | Deep neural network hyperparameter optimization method, electronic device and storage medium |
CN112653142A (en) * | 2020-12-18 | 2021-04-13 | 武汉大学 | Wind power prediction method and system for optimizing depth transform network |
CN114021689A (en) * | 2021-09-16 | 2022-02-08 | 湖州师范学院 | Chaotic search optimization method for self-adaptive neural network traffic flow prediction |
Non-Patent Citations (1)
Title |
---|
An optimized RBF neural network model for network traffic prediction; Yu Jian; Guo Ping; Computer Applications and Software; 2008-12-15 (12); 39-41+51 *
Also Published As
Publication number | Publication date |
---|---|
CN115018038A (en) | 2022-09-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |