CN115018038A - Network interface flow prediction method based on sparrow optimization width learning system - Google Patents

Network interface flow prediction method based on sparrow optimization width learning system

Info

Publication number
CN115018038A
CN115018038A (application CN202210247884.5A)
Authority
CN
China
Prior art keywords
parameters
network
sparrow
hyper
learning system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210247884.5A
Other languages
Chinese (zh)
Other versions
CN115018038B (en)
Inventor
李少波
李笑瑜
周鹏
陈光林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou University
Original Assignee
Guizhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou University filed Critical Guizhou University
Priority to CN202210247884.5A priority Critical patent/CN115018038B/en
Publication of CN115018038A publication Critical patent/CN115018038A/en
Application granted granted Critical
Publication of CN115018038B publication Critical patent/CN115018038B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/004: Artificial life, i.e. computing arrangements simulating life
    • G06N3/006: Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14: Network analysis or design
    • H04L41/147: Network analysis or design for predicting network behaviour
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a network interface flow prediction method based on a sparrow optimization width learning system, which comprises the following steps: (1) determining a prediction time T from the obtained network interface flow data, and predicting the network flow at time T from the network flow information in the interval [T-12, T-1]; (2) initializing the network parameters; (3) randomly generating p groups of shrinkage coefficients and regularization coefficients within the value ranges of the parameters as the initial hyper-parameters; (4) training a width learning system model automatically with each of the p groups of initial hyper-parameters to generate the initial fitness values; (5) optimizing the hyper-parameters with a sparrow optimization algorithm; (6) training the width learning system with the updated hyper-parameters and updating the fitness values; (7) judging whether the maximum number of iterations has been reached; if so, outputting the network parameters corresponding to the best fitness, training the width learning model with these parameters, and establishing the network interface flow prediction model; otherwise, returning to step (5). The method improves the accuracy of network flow prediction and reduces the influence of the network hyper-parameters on the prediction result.

Description

Network interface flow prediction method based on sparrow optimization width learning system
Technical Field
The invention belongs to the technical field of network interface flow prediction, and relates to a network interface flow prediction method based on a sparrow optimization width learning system.
Background
Most existing network traffic prediction models are deep learning models. Deep learning models usually contain a large number of network parameters, so training them consumes considerable time and computing resources. The width learning system can complete model training in a short time, but its hyper-parameters strongly affect the performance of the network model. At present, these parameters are mostly tuned by hand, an approach that depends heavily on the researcher's experience and requires the model to be trained repeatedly during adjustment, which is time-consuming.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: to provide a network interface flow prediction method based on a sparrow optimization width learning system, so as to solve the technical problems existing in the prior art.
The technical scheme adopted by the invention is as follows: a network interface flow prediction method based on a sparrow optimization width learning system comprises the following steps:
(1) determining a prediction time T from the obtained network interface flow data, and predicting the network flow at time T from the network flow information in the interval [T-12, T-1] (a minimal windowing sketch is given after this list of steps);
(2) initializing the network parameters: the population size, seeker proportion and maximum number of iterations of the sparrow optimization algorithm; the value range of the shrinkage coefficient, the value range of the regularization coefficient, the number of windows of the feature mapping layer, the number of nodes in a single window and the number of enhancement-layer nodes of the width learning system;
(3) randomly generating p groups of shrinkage coefficients and regularization coefficients within the value ranges of the parameters as the initial hyper-parameters, where p is the number of sparrows;
(4) training the width learning system model automatically with each of the p groups of initial hyper-parameters and the network interface flow data to generate the initial fitness values;
(5) optimizing the hyper-parameters with the sparrow optimization algorithm;
(6) training the width learning system with the network interface flow data and the updated hyper-parameters, and updating the fitness values;
(7) judging whether the maximum number of iterations has been reached; if so, outputting the network parameters corresponding to the best fitness, training the width learning model with these parameters, and establishing the network interface flow prediction model; otherwise, returning to step (5).
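As an illustration of the windowing in step (1), the following is a minimal sketch of how a traffic series can be cut into supervised samples: the twelve values in [T-12, T-1] form the input and the value at time T is the target. The function name and the synthetic series are assumptions made only for illustration; they are not the claimed implementation.

```python
import numpy as np

def make_window_samples(traffic, window=12):
    """Turn a 1-D traffic series into supervised samples:
    the values at [T-12, T-1] are the input, the value at T is the target."""
    X, y = [], []
    for t in range(window, len(traffic)):
        X.append(traffic[t - window:t])   # network flow information of [T-12, T-1]
        y.append(traffic[t])              # network flow at time T
    return np.asarray(X), np.asarray(y)

# Example with a synthetic 5-minute traffic series (for illustration only)
series = np.sin(np.linspace(0, 20, 500)) + 0.1 * np.random.default_rng(0).normal(size=500)
X, y = make_window_samples(series)
print(X.shape, y.shape)   # (488, 12) (488,)
```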
The method of optimizing the hyper-parameters with the sparrow optimization algorithm in step (5) comprises the following steps:
1) taking the p groups of initial hyper-parameters generated in step (3) as the initial positions of p sparrows, and calculating the number of seekers with the formula:
pNum = p × p_scale
where pNum is the number of seekers in the population, p is the population size, and p_scale is the proportion of seekers;
2) updating the positions of the seekers, where the calculation formula is:

X_{i,j}^{t+1} = X_{i,j}^t · exp(-i / (α · T_max)),    if R_2 < ST
X_{i,j}^{t+1} = X_{i,j}^t + Q · L,                    if R_2 ≥ ST

where X is the hyper-parameter to be optimized, i.e. the position of a sparrow, and X_{i,j}^t denotes the value of the j-th dimension of the i-th sparrow in the t-th iteration; R_2 ∈ [0, 1] is a warning value drawn as a uniformly distributed random number; ST is the safety threshold; α ∈ (0, 1] is a random number and T_max is the maximum number of iterations; Q is a normally distributed random number matrix and L is a 1 × d matrix of ones, where d is the dimension of the problem. When R_2 < ST, the current position is safe and the sparrows forage nearby; otherwise the current position is dangerous and the seekers must lead the sparrows to a new place to find food;
3) updating the positions of the followers, where the calculation formula is:

X_{i,j}^{t+1} = Q · exp((X_worst^t - X_{i,j}^t) / i^2),          if i > n/2
X_{i,j}^{t+1} = X_P^{t+1} + |X_{i,j}^t - X_P^{t+1}| · A^+ · L,   otherwise

where A^+ = A^T (A A^T)^{-1}, A is a 1 × d matrix whose elements are randomly generated in [-1, 1], X_P^{t+1} is the best position occupied by the seekers, and X_worst^t denotes the position of the sparrow in the worst state. When i > n/2 (n being the number of sparrows), the i-th follower has not obtained food, its condition is poor, and it must fly elsewhere to obtain more food; otherwise, it continues to search for food near the seeker;
4) treating the sparrows that perceive danger as scouts and updating their positions, where the calculation formula is:

X_{i,j}^{t+1} = X_best^t + β · |X_{i,j}^t - X_best^t|,                            if f_i > f_g
X_{i,j}^{t+1} = X_{i,j}^t + K · (|X_{i,j}^t - X_worst^t| / ((f_i - f_w) + ε)),    if f_i = f_g

where X_best^t is the current global best position, f_i and f_g are respectively the fitness of the i-th sparrow and the best fitness, and f_w is the worst fitness; β is a random number drawn from the standard normal distribution, K is a uniform random number in [-1, 1], and ε is a small value (0 < ε < 10^{-10}) that prevents the denominator from being 0.
The invention has the beneficial effects that: compared with the prior art, the regularization coefficient and the shrinkage coefficient of the width learning system are optimized with a sparrow optimization algorithm, which avoids the tedious manual parameter-tuning process; the width learning system is then trained with the optimal hyper-parameters obtained by the optimization, which improves the accuracy of network flow prediction.
Drawings
FIG. 1 is a flowchart of a breadth learning system algorithm;
FIG. 2 is a flowchart of a sparrow search algorithm;
FIG. 3 is a pseudo-code diagram of the SSA-BLS algorithm;
FIG. 4 is a flowchart of the SSA-BLS algorithm;
FIG. 5 is a diagram of a prediction result of a core network traffic data set in a city of Europe;
FIG. 6 is a graph of UK academic backbone network traffic data set prediction results;
fig. 7 is a graph of enterprise cloud platform switch interface traffic data set prediction results;
FIG. 8 is a diagram of the running times of the width learning system and of the LSTM model.
Detailed Description
The invention is further described below with reference to specific examples.
Preliminary knowledge: the width learning system (broad learning system, BLS) is a novel random-weight neural network. It is built on the random vector functional-link neural network and is mainly intended to avoid the large amount of computation and high training cost of deep learning. The width learning system has three layers: an input layer, a hidden layer and an output layer. The hidden layer is a single-layer structure composed of a feature mapping layer and an enhancement node layer.
The algorithm flow of the width learning system is as follows. Let the training data be X ∈ R^{N×M}, containing N samples each with M dimensions, and let the corresponding labels be Y ∈ R^{N×C}. Through n feature mapping functions φ_i (i = 1, ..., n), the training data X are mapped into n groups of feature nodes, the i-th group containing K_i nodes. The i-th group of feature maps Z_i is computed as in formula (1):

Z_i = φ_i(X W_{ei} + β_{ei})        (1)
where W_{ei} and β_{ei} are a randomly generated feature-mapping weight matrix and bias matrix. In practical applications, φ_i is usually a non-linear mapping function.
The n groups of feature nodes produced by the feature mapping layer are written Z^n = (Z_1, Z_2, ..., Z_n). Z^n is connected to the enhancement node layer, whose activation functions ζ_j map Z^n into m groups of enhancement nodes with q nodes per group; the j-th group of enhancement nodes H_j can then be expressed as formula (2):

H_j = ζ_j(Z^n W_{hj} + β_{hj})        (2)

where W_{hj} is a randomly generated enhancement-layer weight matrix and β_{hj} is the corresponding bias matrix; different non-linear activation functions ζ_j can be chosen to fully extract the feature information of the input data.
Similarly, the m groups of enhancement nodes H_j are written H^m = (H_1, H_2, ..., H_m), and the concatenation of Z^n and H^m is denoted:

A = (Z^n | H^m)        (3)

The output of the width learning system is then:

Y = (Z^n | H^m) W = A W        (4)
where W is the output-layer connection weight matrix, for which:

W = A^+ Y        (5)
A^+ is the pseudo-inverse of the matrix A, and training the model amounts to computing the connection weight matrix W through the regularized pseudo-inverse:

W = (λI + A^T A)^{-1} A^T Y        (6)

where λ > 0 is the regularization coefficient and I is the identity matrix.
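As a concrete reading of formulas (1)-(6), the following is a minimal NumPy sketch of one possible BLS training pass and of prediction with the trained weights. The tanh activations, the way the shrinkage coefficient scales the random enhancement weights, and all function and variable names are assumptions for illustration, not the exact implementation of the invention.

```python
import numpy as np

def train_bls(X, Y, n_windows=10, nodes_per_window=10, n_enh=50,
              shrink=0.8, reg=2**-30, seed=0):
    """One BLS training pass following formulas (1)-(6).
    shrink: shrinkage coefficient r, scales the random enhancement weights.
    reg: regularization coefficient lambda of formula (6)."""
    rng = np.random.default_rng(seed)
    N, M = X.shape
    # Feature mapping layer, formula (1): Z_i = phi_i(X W_ei + beta_ei)
    maps = [(rng.standard_normal((M, nodes_per_window)),
             rng.standard_normal((1, nodes_per_window))) for _ in range(n_windows)]
    Z = np.hstack([np.tanh(X @ W_e + b_e) for W_e, b_e in maps])
    # Enhancement layer, formula (2): H_j = zeta_j(Z^n W_hj + beta_hj)
    W_h = rng.standard_normal((Z.shape[1], n_enh))
    W_h *= shrink / np.abs(W_h).max()          # scale weights by the shrinkage coefficient
    b_h = rng.standard_normal((1, n_enh))
    H = np.tanh(Z @ W_h + b_h)
    # Concatenation, formula (3), and ridge pseudo-inverse solution, formulas (5)-(6)
    A = np.hstack([Z, H])
    W = np.linalg.solve(reg * np.eye(A.shape[1]) + A.T @ A, A.T @ Y)
    return W, maps, (W_h, b_h)

def bls_predict(X, W, maps, enh):
    """Apply the stored random mappings and the learned output weights, formula (4)."""
    W_h, b_h = enh
    Z = np.hstack([np.tanh(X @ W_e + b_e) for W_e, b_e in maps])
    A = np.hstack([Z, np.tanh(Z @ W_h + b_h)])
    return A @ W
```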
Compared with traditional neural networks, the width learning system has two major characteristics. First, a sparse auto-encoder refines the random features of the input data into a sparse and compact feature set, and important features are explored through the sparse feature-learning model, so the input data are described better and the operation efficiency is improved. Second, when a deep network fails to reach the required accuracy, adding layers or retraining after changing the structure greatly increases the amount of computation and the computation time; the width learning system instead adopts incremental learning and adjusts the model dynamically by adding enhancement nodes, which greatly shortens the training time while maintaining accuracy.
The algorithm flow diagram of the width learning system is shown in fig. 1.
The sparrow search algorithm is a new swarm intelligence optimization algorithm based on the foraging and anti-predation behaviour of sparrows. It divides the sparrows into seekers and followers and imposes the following rules on their movement.
1) Sparrows in the population are divided into seekers and followers according to their fitness. The fitness reflects the quality of a sparrow's foraging position: the higher the fitness, the better the position and the easier it is to find food.
2) The sparrows with the highest fitness act as seekers and the remaining sparrows act as followers. The seekers are responsible for locating food-rich areas and provide the foraging position and direction for the followers. The followers look for the seeker with the best foraging position and then forage around it.
3) The fitness of each sparrow changes dynamically, so a seeker and a follower may exchange identities, but the proportion of seekers remains unchanged.
4) The lower a follower's fitness, the worse its foraging position; such followers may fly elsewhere at random to feed.
5) A certain proportion of individuals is randomly selected from the population as scouts to monitor the safety of the surroundings. When a predator is found, the scouts raise an alarm, and when the alarm value exceeds the safety threshold the seekers lead the followers to a safer area to forage.
6) When danger is perceived, sparrows at the edge of the colony quickly move towards the safe area to obtain better positions, while sparrows at the centre of the colony move randomly.
The algorithm flow of the SSA is as follows, and the algorithm flow chart thereof is shown in fig. 2.
Step 1: initialize the parameters, mainly the sparrow population size, the proportion of seekers, the sparrow positions and the maximum number of iterations.
Step 2: judge whether the current position of the population is safe, and update the positions of the seekers with formula (7):

X_{i,j}^{t+1} = X_{i,j}^t · exp(-i / (α · T_max)),    if R_2 < ST
X_{i,j}^{t+1} = X_{i,j}^t + Q · L,                    if R_2 ≥ ST        (7)

where X_{i,j}^t denotes the value of the j-th dimension of the i-th sparrow in the t-th iteration; R_2 ∈ [0, 1] is a warning value drawn as a uniformly distributed random number; ST is the safety threshold, with values in [0.5, 1.0]; α ∈ (0, 1] is a random number and T_max is the maximum number of iterations; Q is a normally distributed random number and L is a 1 × d matrix of ones, where d is the dimension of the problem. When R_2 < ST, the current position is safe and the sparrows forage nearby; otherwise the current position is threatened and the seekers must lead the sparrows to a new place to find food.
Step 3: judge the state of the followers and update their positions according to formula (8):

X_{i,j}^{t+1} = Q · exp((X_worst^t - X_{i,j}^t) / i^2),          if i > n/2
X_{i,j}^{t+1} = X_P^{t+1} + |X_{i,j}^t - X_P^{t+1}| · A^+ · L,   otherwise        (8)

where X_worst^t denotes the position of the sparrow in the worst state, X_P^{t+1} is the best position currently occupied by the seekers, A^+ = A^T (A A^T)^{-1}, and A is a 1 × d matrix whose elements are randomly generated in [-1, 1]. When i > n/2 (n being the number of sparrows), the i-th follower has not obtained food and is in poor condition, and it must fly to other places where more food can be obtained; otherwise, it continues to search for food near the seeker.
Step 4: some sparrows in the population perceive danger and become scouts; their positions are determined according to formula (9):

X_{i,j}^{t+1} = X_best^t + β · |X_{i,j}^t - X_best^t|,                            if f_i > f_g
X_{i,j}^{t+1} = X_{i,j}^t + K · (|X_{i,j}^t - X_worst^t| / ((f_i - f_w) + ε)),    if f_i = f_g        (9)

where X_best^t is the current global best position, f_i and f_g are the fitness of the i-th sparrow and the current best fitness respectively, and f_w is the worst fitness; β is a random number drawn from the standard normal distribution, K is a uniform random number in [-1, 1], and ε is a small constant that prevents the denominator from being zero.
Step 5: update the fitness of the sparrows.
Step 6: judge whether the iteration stop condition is met; if not, repeat Steps 2 to 5.
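The following is a compact NumPy sketch of the SSA loop of Steps 1-6 for a generic minimization objective. The bound handling, the proportion of scouts (sd_scale) and the handling of fitness ties are assumptions made only to obtain runnable code; it is a sketch, not a definitive implementation of the algorithm.

```python
import numpy as np

def ssa_minimize(objective, dim, lb, ub, pop=50, p_scale=0.2,
                 max_iter=5, ST=0.8, sd_scale=0.2, seed=0):
    """Sparrow search: seekers, followers and scouts, formulas (7)-(9)."""
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    X = lb + rng.random((pop, dim)) * (ub - lb)            # initial sparrow positions
    fit = np.array([objective(x) for x in X])
    p_num = int(pop * p_scale)                             # pNum = p * p_scale seekers
    best, f_best = X[fit.argmin()].copy(), fit.min()
    for _ in range(max_iter):
        order = fit.argsort()
        X, fit = X[order], fit[order]                      # best sparrows first
        R2 = rng.random()                                  # warning value
        for i in range(p_num):                             # seeker update, formula (7)
            if R2 < ST:
                alpha = rng.random() + 1e-12
                X[i] = X[i] * np.exp(-(i + 1) / (alpha * max_iter))
            else:
                X[i] = X[i] + rng.standard_normal() * np.ones(dim)
        xp, worst = X[0].copy(), X[-1].copy()
        for i in range(p_num, pop):                        # follower update, formula (8)
            if i > pop / 2:
                X[i] = rng.standard_normal() * np.exp((worst - X[i]) / (i + 1) ** 2)
            else:
                A = rng.choice([-1.0, 1.0], dim)           # A+ . L reduces to a scalar times ones
                X[i] = xp + (np.abs(X[i] - xp) @ A / dim) * np.ones(dim)
        scouts = rng.choice(pop, max(1, int(sd_scale * pop)), replace=False)
        for i in scouts:                                   # scout update, formula (9)
            if fit[i] > f_best:
                X[i] = best + rng.standard_normal() * np.abs(X[i] - best)
            else:
                K = rng.uniform(-1, 1)
                X[i] = X[i] + K * np.abs(X[i] - worst) / (fit[i] - fit[-1] + 1e-50)
        X = np.clip(X, lb, ub)                             # keep positions inside the search range
        fit = np.array([objective(x) for x in X])
        if fit.min() < f_best:
            f_best, best = fit.min(), X[fit.argmin()].copy()
    return best, f_best
```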
Example 1: as shown in fig. 1 to 8, a method for predicting network interface traffic based on a sparrow optimization breadth learning system includes the following steps:
(1) determining a prediction time T from the obtained network interface flow data, and predicting the network flow at time T from the network flow information in the interval [T-12, T-1];
(2) initializing the network parameters: the population size, seeker proportion and maximum number of iterations of the sparrow optimization algorithm; the value range of the shrinkage coefficient, the value range of the regularization coefficient, the number of windows of the feature mapping layer, the number of nodes in a single window and the number of enhancement-layer nodes of the width learning system;
(3) randomly generating p groups of shrinkage coefficients and regularization coefficients within the value ranges of the parameters as the initial hyper-parameters, where p is the number of sparrows;
(4) training the width learning system model automatically with each of the p groups of initial hyper-parameters and the network interface flow data to generate the initial fitness values;
(5) optimizing the hyper-parameters with the sparrow optimization algorithm, which comprises the following steps:
1) taking the p groups of initial hyper-parameters generated in step (3) as the initial positions of p sparrows, and calculating the number of seekers with the formula:
pNum = p × p_scale
where pNum is the number of seekers in the population, p is the population size, and p_scale is the proportion of seekers;
2) updating the positions of the seekers, where the calculation formula is:

X_{i,j}^{t+1} = X_{i,j}^t · exp(-i / (α · T_max)),    if R_2 < ST
X_{i,j}^{t+1} = X_{i,j}^t + Q · L,                    if R_2 ≥ ST

where X is the hyper-parameter to be optimized, i.e. the position of a sparrow, and X_{i,j}^t denotes the value of the j-th dimension of the i-th sparrow in the t-th iteration; R_2 ∈ [0, 1] is a warning value drawn as a uniformly distributed random number; ST is the safety threshold; α ∈ (0, 1] is a random number and T_max is the maximum number of iterations; Q is a normally distributed random number matrix and L is a 1 × d matrix of ones, where d is the dimension of the problem. When R_2 < ST, the current position is safe and the sparrows forage nearby; otherwise the current position is dangerous and the seekers must lead the sparrows to a new place to find food;
3) updating the positions of the followers, where the calculation formula is:

X_{i,j}^{t+1} = Q · exp((X_worst^t - X_{i,j}^t) / i^2),          if i > n/2
X_{i,j}^{t+1} = X_P^{t+1} + |X_{i,j}^t - X_P^{t+1}| · A^+ · L,   otherwise

where A^+ = A^T (A A^T)^{-1}, A is a 1 × d matrix whose elements are randomly generated in [-1, 1], X_P^{t+1} is the best position occupied by the seekers, and X_worst^t denotes the position of the sparrow in the worst state. When i > n/2 (n being the number of sparrows), the i-th follower has not obtained food, its condition is poor, and it must fly elsewhere to obtain more food; otherwise, it continues to search for food near the seeker;
4) treating the sparrows that perceive danger as scouts and updating their positions, where the calculation formula is:

X_{i,j}^{t+1} = X_best^t + β · |X_{i,j}^t - X_best^t|,                            if f_i > f_g
X_{i,j}^{t+1} = X_{i,j}^t + K · (|X_{i,j}^t - X_worst^t| / ((f_i - f_w) + ε)),    if f_i = f_g

where X_best^t is the current global best position, f_i and f_g are respectively the fitness of the i-th sparrow and the best fitness, and f_w is the worst fitness; β is a random number drawn from the standard normal distribution, K is a uniform random number in [-1, 1], and ε is a small value that prevents the denominator from being 0;
(6) training the width learning system with the network interface flow data and the updated hyper-parameters, and updating the fitness values;
(7) judging whether the maximum number of iterations has been reached; if so, outputting the network parameters corresponding to the best fitness, training the width learning model with these parameters, and establishing the network interface flow prediction model; otherwise, returning to step (5).
In order to improve the accuracy of network flow prediction and reduce the influence of the network hyper-parameters on the prediction result, the invention optimizes two hyper-parameters of the width learning system, namely the shrinkage coefficient (r) and the regularization coefficient (λ), with the sparrow search algorithm, and establishes the network flow prediction model with the optimal hyper-parameters output by the optimization. This method is named SSA-BLS; its pseudo-code and algorithm flow are shown in FIGS. 3 and 4.
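To show how the sparrow search wraps the width learning system in SSA-BLS, the following is a minimal sketch of a fitness function: each sparrow position is a candidate (r, λ) pair and its fitness is the validation RMSE of a BLS trained with that pair. It reuses the train_bls, bls_predict and ssa_minimize sketches given earlier; all of these names are illustrative assumptions rather than the actual code of the invention.

```python
import numpy as np

def make_fitness(X_train, y_train, X_val, y_val):
    """Fitness of a sparrow position [r, lam]: validation RMSE of the BLS it parameterizes."""
    def fitness(position):
        r, lam = position
        W, maps, enh = train_bls(X_train, y_train, shrink=r, reg=lam)
        pred = bls_predict(X_val, W, maps, enh)
        return float(np.sqrt(np.mean((pred - y_val) ** 2)))
    return fitness

# Search the two hyper-parameters inside their configured ranges, then fit the final predictor
# lb = np.array([0.09, 2**-35]); ub = np.array([0.999999, 2**-30])
# best, _ = ssa_minimize(make_fitness(X_tr, y_tr, X_va, y_va), dim=2, lb=lb, ub=ub)
# W, maps, enh = train_bls(X_tr, y_tr, shrink=best[0], reg=best[1])
```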
The experiments use the core network traffic data set of a certain city in Europe and the British academic backbone network traffic data set.
British academic backbone network traffic data set: the data set collects the aggregated traffic of the British academic network backbone from 09:30 on 19 November 2004 to 11:00 on 27 January 2005, in bits, with a sampling interval of five minutes.
In the core network traffic data set of a certain city in Europe, data from 1 July to 25 July 2005 are used as the training set and data from 26 July to 28 July as the test set; in the British academic backbone network traffic data set, data from 1 January to 24 January 2005 are used as the training set and data from 25 January to 27 January as the test set.
1) Parameters and evaluation indexes
The SSA-BLS parameters are configured as follows: the population size is 50, the seeker (producer) proportion is 20%, the maximum number of iterations is 5, and the dimension is 2; the number of windows of the mapping layer is 10, the number of nodes in each mapping window is 10, and the number of enhancement-layer nodes is 50; the value ranges of the shrinkage coefficient (r) and the regularization coefficient (λ) are [0.09, 0.999999] and [2^{-30}, 2^{-35}] respectively.
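For reference, the same configuration can be collected in a single parameter dictionary; the key names are assumptions chosen only for readability.

```python
SSA_BLS_CONFIG = {
    "population_size": 50,       # number of sparrows p
    "producer_ratio": 0.20,      # proportion of seekers p_scale
    "max_iterations": 5,
    "dimension": 2,              # the two optimized hyper-parameters (r, lambda)
    "mapping_windows": 10,
    "nodes_per_window": 10,
    "enhancement_nodes": 50,
    "r_range": (0.09, 0.999999),
    "lambda_range": (2**-30, 2**-35),   # ranges as stated above
}
```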
MSE, RMSE, MAE, MAPE and MA are used as evaluation indexes, and the calculation mode is shown in formulas (10), (11), (12), (13) and (14).
MSE = (1/n) Σ_{i=1}^{n} (y_i - ŷ_i)^2        (10)

RMSE = sqrt( (1/n) Σ_{i=1}^{n} (y_i - ŷ_i)^2 )        (11)

MAE = (1/n) Σ_{i=1}^{n} |y_i - ŷ_i|        (12)

MAPE = (1/n) Σ_{i=1}^{n} |(y_i - ŷ_i) / y_i|        (13)

MA = (1 - MAPE) × 100%        (14)

where n is the total number of samples, ŷ_i is the predicted value and y_i is the true value. The smaller the MSE, RMSE, MAE and MAPE, the better; the closer the MA is to 100%, the better the prediction performance of the model.
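Formulas (10)-(14) translate directly into NumPy as below; the assumption that MA is reported as a percentage, MA = (1 - MAPE) × 100%, matches the values in Tables 1-3.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Evaluation indexes of formulas (10)-(14)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)                    # (10)
    rmse = np.sqrt(mse)                        # (11)
    mae = np.mean(np.abs(err))                 # (12)
    mape = np.mean(np.abs(err / y_true))       # (13)
    ma = (1.0 - mape) * 100.0                  # (14), prediction accuracy in percent
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "MAPE": mape, "MA": ma}
```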
2) Results and discussion
In the experiments, the SSA-BLS model predicts the traffic value at time T from the interval [T-12, T-1]. To verify the performance of SSA-BLS, it is compared with BLS, ELM, SCN and RVFL, which have similar structures, with the RVFL variant dRVFL, and with the LSTM model commonly used in network traffic prediction; each model is run independently 100 times and the average of the evaluation indexes over the runs is taken as the final result. The shrinkage coefficient r and the regularization coefficient λ of the BLS model are selected from {0.1, 0.5, 0.9, 0.99, 0.9999, 0.99999} and {2^{-30}, 2^{-20}, 2^{-10}, 0.5, 1, 5, 10} respectively, and the remaining parameters are the same as those of the SSA-BLS model; the maximum number of hidden-layer nodes of the SCN model is 250 and the maximum number of candidate nodes is 100; the regularization coefficient of RVFL is 1e-3 and the number of hidden-layer nodes is 100; dRVFL uses the same parameters as RVFL; the LSTM model has 3 hidden layers with 12 blocks per layer, a learning rate of 1e-2, a batch_size of 64 and 15 epochs. The prediction performance of each model on the test sets of the two public data sets is shown in Tables 1-2.
TABLE 1 Experimental results of core network traffic data set in a certain city of Europe
Model      MSE        RMSE       MAE        MAPE       MA
SSA-BLS 0.0159047 0.1261069 0.0937315 0.0294284 97.057155%
BLS 0.0781227 0.2551322 0.1878021 0.0571019 94.289801%
SCN 0.0154907 0.1244372 0.0934485 0.0295662 97.043378%
RVFL 0.0254023 0.1593589 0.1208186 0.0388347 96.116525%
dRVFL 0.0227553 0.1507691 0.1135728 0.0367191 96.328085%
ELM 0.1394488 0.3686439 0.2710739 0.0780252 92.197470%
LSTM 0.0781441 0.2502372 0.1884968 0.0517535 94.824642%
Table 2 british academic backbone network traffic data set experimental results
Table 3 enterprise cloud platform switch interface flow data set experimental results
Model      MSE        RMSE       MAE        MAPE       MA
SSA-BLS 0.0000734 0.0082991 0.0063628 0.0021407 99.785924%
BLS 0.0103714 0.0811761 0.0563080 0.0176119 98.238804%
SCN 0.0001742 0.0130396 0.0067544 0.0021857 99.781427%
RVFL 0.0361230 0.1899801 0.1288009 0.0400804 95.991952%
dRVFL 0.0327578 0.1807739 0.1277397 0.0403579 95.964208%
ELM 0.0579519 0.2382928 0.1327614 0.0400340 95.996599%
LSTM 0.0283041 0.1057008 0.0759722 0.0238097 97.619024%
FIGS. 5 and 6 compare the predicted values and the actual values of the SSA-BLS model and the other models on the training sets of the two public data sets.
In addition, to further verify the prediction performance of the SSA-BLS model, the model is applied to a private traffic data set. The private traffic data set comes from the real inbound traffic of a switch interface of a certain enterprise from 5 October to 18 October 2021. Data from 5 October to 16 October 2021 are used as the training set, and data from 17 October to 18 October are used as the test set to verify the model.
Because the sampling interval of the enterprise switch interface traffic data is uneven, the data are first resampled: the mean port traffic is computed over each 5-minute interval, and if an interval contains no traffic data it is filled with the previous value. In addition, the raw data contain very large abnormal traffic values; to reduce the influence of these outliers on the prediction, the data are smoothed with spectral smoothing. The SSA-BLS and comparison model parameters are as above, and the prediction performance of each model is shown in Table 3.
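A minimal pandas sketch of this resampling and gap-filling step is given below. The column names are assumptions, and a Savitzky-Golay filter stands in for the spectral smoothing mentioned above, since the exact smoothing routine is not specified here.

```python
import pandas as pd
from scipy.signal import savgol_filter

def preprocess_switch_traffic(df):
    """df: DataFrame with a datetime column 'timestamp' and a numeric column 'traffic'."""
    s = (df.set_index("timestamp")["traffic"]
           .resample("5min").mean()   # mean port traffic over each 5-minute interval
           .ffill())                  # intervals with no data take the previous value
    # Dampen large abnormal values; the invention uses spectral smoothing,
    # a Savitzky-Golay filter is used here only as an illustrative substitute.
    smoothed = pd.Series(savgol_filter(s.to_numpy(), window_length=11, polyorder=2),
                         index=s.index, name="traffic")
    return smoothed
```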
FIG. 7 compares the predicted values and the true values of the SSA-BLS model and the other models on the training set of the private data set. As can be seen from Tables 1, 2 and 3, the SSA-BLS model achieves better ultra-short-term prediction performance than the other models on both the British academic backbone network traffic data set and the private data set, and also reaches high accuracy on the core network traffic data set of a certain city in Europe. In terms of single-step prediction accuracy, the SSA-BLS model is more accurate than the other models on the British academic backbone network traffic data set and the enterprise cloud platform switch interface traffic data set, and on the European city core network traffic data set it is only slightly inferior to the SCN model while remaining highly accurate overall, which indicates that the model can select good network hyper-parameters and capture the temporal characteristics of the traffic well. In terms of time consumption, FIG. 8 shows the time needed by the width learning system for one training run and by the LSTM model for one epoch; dataset1, dataset2 and dataset3 denote the British academic backbone network traffic data set, the European city core network traffic data set and the enterprise cloud platform switch interface traffic data set respectively. As is clear from FIG. 8, the width learning system completes training in a much shorter time, and the larger the amount of data, the greater the time advantage of BLS.
The invention provides a width learning model based on the sparrow optimization algorithm: the two hyper-parameters of the width learning model, the shrinkage coefficient (r) and the regularization coefficient (λ), are optimized with the sparrow optimization algorithm, and the model is trained with the optimal hyper-parameters output by the optimization, which reduces the influence of the hyper-parameters on the model and improves its accuracy. The model avoids the tedious manual parameter-tuning process and uses the algorithm to select a better combination of network hyper-parameters, so that the performance of the BLS is optimal. The SSA-BLS model is applied to short-term network flow prediction, and two public network traffic data sets and a real data set of enterprise cloud platform network switch interface traffic are selected for the experiments. To verify the effect of the model, SSA-BLS is compared with the BLS model and other models, and the experiments show that SSA-BLS can select better hyper-parameters and bring the prediction accuracy of the network flow above 97%.
The above description is only an embodiment of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the present invention, and therefore, the scope of the present invention should be determined by the scope of the claims.

Claims (2)

1. A network interface flow prediction method based on a sparrow optimization width learning system, characterized by comprising the following steps:
(1) determining a prediction time T from the obtained network interface flow data, and predicting the network flow at time T from the network flow information in the interval [T-12, T-1];
(2) initializing the network parameters: the population size, seeker proportion and maximum number of iterations of the sparrow optimization algorithm; the value range of the shrinkage coefficient, the value range of the regularization coefficient, the number of windows of the feature mapping layer, the number of nodes in a single window and the number of enhancement-layer nodes of the width learning system;
(3) randomly generating p groups of shrinkage coefficients and regularization coefficients within the value ranges of the parameters as the initial hyper-parameters, where p is the number of sparrows;
(4) training the width learning system model automatically with each of the p groups of initial hyper-parameters and the network interface flow data to generate the initial fitness values;
(5) optimizing the hyper-parameters with the sparrow optimization algorithm;
(6) training the width learning system with the network interface flow data and the updated hyper-parameters, and updating the fitness values;
(7) judging whether the maximum number of iterations has been reached; if so, outputting the network parameters corresponding to the best fitness, training the width learning model with these parameters, and establishing the network interface flow prediction model; otherwise, returning to step (5).
2. The network interface flow prediction method based on a sparrow optimization width learning system according to claim 1, characterized in that the method of optimizing the hyper-parameters with the sparrow optimization algorithm in step (5) comprises the following steps:
1) taking the p groups of initial hyper-parameters generated in step (3) as the initial positions of p sparrows, and calculating the number of seekers with the formula:
pNum = p × p_scale
where pNum is the number of seekers in the population, p is the population size, and p_scale is the proportion of seekers;
2) updating the positions of the seekers, where the calculation formula is:

X_{i,j}^{t+1} = X_{i,j}^t · exp(-i / (α · T_max)),    if R_2 < ST
X_{i,j}^{t+1} = X_{i,j}^t + Q · L,                    if R_2 ≥ ST

where X is the hyper-parameter to be optimized, i.e. the position of a sparrow, and X_{i,j}^t denotes the value of the j-th dimension of the i-th sparrow in the t-th iteration; R_2 ∈ [0, 1] is a warning value drawn as a uniformly distributed random number; ST is the safety threshold; α ∈ (0, 1] is a random number and T_max is the maximum number of iterations; Q is a normally distributed random number matrix and L is a 1 × d matrix of ones, where d is the dimension of the problem. When R_2 < ST, the current position is safe and the sparrows forage nearby; otherwise the current position is dangerous and the seekers must lead the sparrows to a new place to find food;
3) updating the positions of the followers, where the calculation formula is:

X_{i,j}^{t+1} = Q · exp((X_worst^t - X_{i,j}^t) / i^2),          if i > n/2
X_{i,j}^{t+1} = X_P^{t+1} + |X_{i,j}^t - X_P^{t+1}| · A^+ · L,   otherwise

where A^+ = A^T (A A^T)^{-1}, A is a 1 × d matrix whose elements are randomly generated in [-1, 1], X_P^{t+1} is the best position occupied by the seekers, and X_worst^t denotes the position of the sparrow in the worst state. When i > n/2 (n being the number of sparrows), the i-th follower has not obtained food, its condition is poor, and it must fly elsewhere to obtain more food; otherwise, it continues to search for food near the seeker;
4) treating the sparrows that perceive danger as scouts and updating their positions, where the calculation formula is:

X_{i,j}^{t+1} = X_best^t + β · |X_{i,j}^t - X_best^t|,                            if f_i > f_g
X_{i,j}^{t+1} = X_{i,j}^t + K · (|X_{i,j}^t - X_worst^t| / ((f_i - f_w) + ε)),    if f_i = f_g

where X_best^t is the current global best position, f_i and f_g are respectively the fitness of the i-th sparrow and the best fitness, and f_w is the worst fitness; β is a random number drawn from the standard normal distribution, K is a uniform random number in [-1, 1], and ε is a small value (0 < ε < 10^{-10}) that prevents the denominator from being 0.
CN202210247884.5A 2022-03-14 2022-03-14 Network interface flow prediction method based on sparrow optimization width learning system Active CN115018038B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210247884.5A CN115018038B (en) 2022-03-14 2022-03-14 Network interface flow prediction method based on sparrow optimization width learning system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210247884.5A CN115018038B (en) 2022-03-14 2022-03-14 Network interface flow prediction method based on sparrow optimization width learning system

Publications (2)

Publication Number Publication Date
CN115018038A true CN115018038A (en) 2022-09-06
CN115018038B CN115018038B (en) 2024-03-05

Family

ID=83066479

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210247884.5A Active CN115018038B (en) 2022-03-14 2022-03-14 Network interface flow prediction method based on sparrow optimization width learning system

Country Status (1)

Country Link
CN (1) CN115018038B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012168582A (en) * 2011-02-09 2012-09-06 Ntt Docomo Inc Potential class analysis device, potential class analysis method, and program
WO2021007812A1 (en) * 2019-07-17 2021-01-21 深圳大学 Deep neural network hyperparameter optimization method, electronic device and storage medium
CN112653142A (en) * 2020-12-18 2021-04-13 武汉大学 Wind power prediction method and system for optimizing depth transform network
CN114021689A (en) * 2021-09-16 2022-02-08 湖州师范学院 Chaotic search optimization method for self-adaptive neural network traffic flow prediction

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
余健; 郭平: "An optimized RBF neural network model for network traffic prediction" (一种优化的RBF神经网络模型用于网络流量预测), Computer Applications and Software (计算机应用与软件), no. 12, 15 December 2008 (2008-12-15), pages 39-41 *

Also Published As

Publication number Publication date
CN115018038B (en) 2024-03-05

Similar Documents

Publication Publication Date Title
CN112653142B (en) Wind power prediction method and system for optimizing depth transform network
CN110728401B (en) Short-term power load prediction method of neural network based on squirrel and weed hybrid algorithm
CN110751318A (en) IPSO-LSTM-based ultra-short-term power load prediction method
CN111726243B (en) Method and device for predicting node state
CN113554466B (en) Short-term electricity consumption prediction model construction method, prediction method and device
CN108280998A (en) Short-time Traffic Flow Forecasting Methods based on historical data dynamic select
CN113449464A (en) Wind power prediction method based on improved depth extreme learning machine
CN112348168A (en) Ultra-short-term load prediction method and system considering data loss and characteristic redundancy
CN113031555A (en) Intelligent purification system for harmful gas in environment of livestock and poultry house
CN114169251A (en) Ultra-short-term wind power prediction method
CN114091758A (en) Medium-short term load prediction method, system and storage medium
CN115018038A (en) Network interface flow prediction method based on sparrow optimization width learning system
CN117592593A (en) Short-term power load prediction method based on improved quadratic modal decomposition and WOA optimization BILSTM-intent
CN111524348A (en) Long-short term traffic flow prediction model and method
CN112562312A (en) GraphSAGE traffic network data prediction method based on fusion characteristics
CN117034060A (en) AE-RCNN-based flood classification intelligent forecasting method
CN111310974A (en) Short-term water demand prediction method based on GA-ELM
CN115526308A (en) Time sequence prediction method based on width pruning echo state network
CN116522747A (en) Two-stage optimized extrusion casting process parameter optimization design method
CN115545294A (en) ISSA-HKELM-based short-term load prediction method
CN115130662A (en) Power distribution network time-varying topological state estimation method based on transfer learning
CN114444763A (en) Wind power prediction method based on AFSA-GNN
CN114169603A (en) XGboost-based regional primary school entrance academic degree prediction method and system
CN114897204A (en) Method and device for predicting short-term wind speed of offshore wind farm
Chen Research on the Application of Intelligent Learning Algorithms in Network Security Situation Awareness and Prediction Methods

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant