CN111242270A

CN111242270A - Time series prediction model based on an echo state network optimized by improved multi-objective differential evolution

Info

Publication number: CN111242270A
Application number: CN202010029463.6A
Authority: CN
Inventors: 任伟杰; 王依雯; 韩敏
Original assignee: Dalian University of Technology
Current assignee: Dalian University of Technology
Priority date: 2020-01-13
Filing date: 2020-01-13
Publication date: 2020-06-05

Abstract

A time series prediction model based on an echo state network optimized by an improved multi-objective differential evolution algorithm. First, a population is randomly initialized, and the fitness of each individual in the population is evaluated in turn. Second, the maximum number of iterations of the population is set. Third, differential evolution is used to generate a mutant individual and a trial individual for each individual in the population; the trial individuals form a trial population, and the current-generation population and the trial population are combined into a mixed population. The mixed population is decomposed into K sub-populations using reference vectors, and the sub-populations are updated. The updated sub-populations together form the next-generation population, and the procedure returns to the previous step. Finally, when the number of iterations reaches the maximum, a final population is obtained, and one individual is selected from it as the optimal output individual. The invention optimizes the reservoir (reserve pool) parameters with the improved multi-objective differential evolution algorithm, which improves the optimization performance of the model. The proposed model improves the prediction accuracy for time series and has good generalization ability and practical application value.

Description

Time series prediction model based on an echo state network optimized by improved multi-objective differential evolution

Technical Field

The invention relates to a prediction model for complex time series, in particular to a time series prediction model based on an echo state network optimized by improved multi-objective differential evolution.

Background

Time series arise in many areas of social life. They are common in economics, for example commodity prices and stock indices, and in industry, for example bearing health monitoring. Other examples include annual runoff in hydrology and galaxy morphology. Prediction accuracy for time series has therefore long been a goal for many researchers. Over the past decades, various time series prediction models have been proposed, such as autoregressive models, neural networks, support vector regression and fuzzy systems. Neural networks in particular show a great advantage in handling nonlinearity, and they have developed considerably through the continued efforts of many scholars. For example, Jaeger and Haas proposed a stochastic algorithm for training neural networks, called the echo state network (ESN). ESNs have been found to have certain advantages in time series prediction: the training algorithm is simple, the training time is short, and global optimality of the solution is guaranteed. Despite these advantages, the echo state network still has some problems, such as the adaptability and stability of the reservoir (reserve pool), the occurrence of ill-conditioned solutions, and the need to set different ESN parameter values for different time series.

In recent years, many scholars have optimized neural networks with swarm intelligence optimization algorithms and obtained good results. For example, "optimization of BP neural network trinocular vision calibration based on swarm algorithm" (publication number: CN 110009696A) by George crystal et al. uses the artificial bee colony algorithm to select the optimal initial weights and biases for a BP neural network. However, the artificial bee colony algorithm depends strongly on the specific problem and application environment and has high computational complexity. The BP neural network also takes a long time to train, so optimizing a BP neural network with the bee colony algorithm has limited applicability overall. "Method for estimating SOC based on BP neural network optimized by ant colony algorithm" (publication number: CN 109738807A) by Xuandongji et al. uses the ant colony algorithm to optimize a BP neural network that estimates the SOC of a power battery at the present stage. Although the ant colony algorithm is robust, it relies heavily on experience, converges slowly and has low applicability, so it is not suitable for direct application to practical problems. "Power transformer fault diagnosis method and system based on the improved firefly algorithm optimized probabilistic neural network" (publication number: CN 110363277A) by Yihui et al. optimizes a probabilistic neural network to diagnose power transformer faults, achieving high diagnostic precision and small error. However, the firefly algorithm has a low discovery rate, low solution precision and low solution speed, so optimizing a probabilistic neural network with it is costly.

The swarm intelligence optimization algorithms known at present all have certain disadvantages. The genetic algorithm has weak local search capability, may fail to find the global optimal solution, and has low search efficiency. The particle swarm algorithm has poor robustness, easily falls into a local optimum, and is strongly affected by parameter initialization. The artificial fish swarm algorithm does not easily fall into a local optimum, but it is not suitable for simultaneously optimizing parameters with different ranges, and its structure is complex and its efficiency low. The pigeon swarm optimization algorithm converges slowly and is relatively unstable. Teaching-learning-based optimization suffers from premature convergence. Research shows that the differential evolution algorithm has a simple structure and strong global search capability, and can balance local and global information during the search. The invention therefore selects differential evolution to optimize the reservoir parameters of the echo state network, and uses multi-objective differential evolution to optimize two objective functions in order to further improve the accuracy and stability of prediction.

Disclosure of Invention

The main aim of the invention is to solve the problem that the reservoir parameters of the echo state network are difficult to determine for different time series. It provides a prediction model in which the reservoir parameters of the echo state network are optimized by an improved multi-objective differential evolution algorithm, so that the parameters need not be adjusted manually, time is saved, and the model is suitable for predicting different time series with improved prediction accuracy.

In order to achieve the purpose, the invention adopts the technical scheme that:

The echo state network is composed of an input layer, a reservoir (reserve pool) and an output layer, and the setting of the reservoir parameters directly determines the prediction performance of the echo state network. Manual adjustment is time-consuming and may not find the most appropriate parameter values. The invention therefore uses an improved multi-objective differential evolution algorithm to optimize the reservoir parameters of the echo state network, and constructs a time series prediction model on that basis. The parameters to be optimized are the reservoir parameters of the echo state network: reservoir size, spectral radius, sparsity and input transformation factor. The differential evolution method used by the invention has a simple structure, is easy to implement, is highly robust and has strong global search capability. The differential evolution algorithm consists mainly of four operations: initialization, mutation, crossover and selection. In single-objective optimization the objective function is the minimum absolute error of the predicted value, which has the drawback that the stability of the prediction cannot be ensured. The improved multi-objective differential evolution algorithm of the invention therefore adopts two objective functions, f₁(x) and f₂(x). The first function minimizes the sum of absolute errors to ensure prediction accuracy. The second minimizes the variance of the error to ensure the stability of the prediction. The objective functions are shown in equation (1):

Here, y(t) is the true value of the data, y'(t) is the output value, i.e., the predicted value, of the echo state network, and n is the number of samples.
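One way to write two objectives that match the verbal description (a sum of absolute errors and a variance of the error) is sketched below. The error symbol e(t), its mean, and the 1/n normalization are assumptions made for illustration and are not taken from the original.

```latex
\begin{aligned}
f_1(x) &= \sum_{t=1}^{n} \left| y(t) - y'(t) \right|, \\
f_2(x) &= \frac{1}{n} \sum_{t=1}^{n} \bigl( e(t) - \bar{e} \bigr)^2,
\qquad e(t) = y(t) - y'(t), \quad \bar{e} = \frac{1}{n} \sum_{t=1}^{n} e(t).
\end{aligned}
```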

The method comprises the following steps:

Step 1: Randomly initialize a population, where NP is the number of individuals in the population and G is the current iteration number. Each dimension of an individual represents one reservoir parameter of the echo state network. Each parameter has a selection range, so each dimension of an individual is constrained.
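A minimal sketch of this initialization might look as follows. The numeric bounds are hypothetical placeholders, not the ranges used by the invention, and the helper names `init_population` and `clip` are introduced here only for illustration.

```python
import random

# Hypothetical bounds for the four reservoir parameters (placeholders only;
# the actual selection ranges are not assumed here).
BOUNDS = [
    (20, 500),    # reservoir size (would be rounded to an integer when used)
    (0.1, 1.0),   # spectral radius
    (0.01, 0.5),  # sparsity
    (0.01, 1.0),  # input transformation factor
]

def init_population(np_size, bounds, rng):
    """Draw np_size individuals uniformly at random inside the bounds."""
    return [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(np_size)]

def clip(individual, bounds):
    """Force every dimension back into its allowed range."""
    return [min(max(v, lo), hi) for v, (lo, hi) in zip(individual, bounds)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = init_population(100, BOUNDS, rng)
```

The `clip` helper reflects the constraint on each dimension: any value that later leaves its range can be pulled back to the nearest bound.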

Step 2: For each individual x_i in the population, where i = 1, 2, ..., NP, each dimension of x_i is assigned to the corresponding reservoir parameter of the echo state network: reservoir size, sparsity, spectral radius and input transformation factor. The experimental data are divided into a training set and a test set. The training set data are input into the echo state network to obtain predicted values, and the fitness of the individual, F(x), is evaluated. The fitness of all individuals in the population is evaluated in turn.
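The fitness evaluation can be illustrated with a small function that turns true and predicted values into the pair (f₁, f₂). This is a sketch only: it assumes the population variance (division by n) for the second objective, and the function name `objectives` is hypothetical.

```python
def objectives(y_true, y_pred):
    """Return (f1, f2): sum of absolute errors and variance of the errors."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("sequences must be non-empty and of equal length")
    errors = [yt - yp for yt, yp in zip(y_true, y_pred)]
    n = len(errors)
    f1 = sum(abs(e) for e in errors)
    mean_error = sum(errors) / n
    # Population variance (division by n) is an assumption of this sketch.
    f2 = sum((e - mean_error) ** 2 for e in errors) / n
    return f1, f2

f1, f2 = objectives([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
```

In the full model the predicted values would come from an echo state network built with the parameters encoded in the individual; that part is omitted here.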

Step 3: Set the maximum number of iterations of the population to Max-Iteration. If the number of iterations exceeds the set maximum, stop iterating and go to Step 8; otherwise, go to Step 4.

Step 4: Use the mutation and crossover operations of differential evolution to generate a mutant individual and a trial individual for each individual in the population. The generated trial individuals form a trial population, denoted Trial-Pop. The current-generation population and the trial population are then combined into a mixed population, denoted Mix-Pop. The scaling factor F used in the mutation operation and the crossover probability CR used in the crossover operation may each take any value strictly between 0 and 1.
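The mutation and crossover of this step could be sketched as below. The text does not name a specific differential evolution strategy, so the common DE/rand/1 mutation with binomial crossover is assumed here; the function name `de_trial`, the placeholder bounds, and the values of F and CR are illustrative.

```python
import random

def de_trial(population, i, f, cr, bounds, rng):
    """Build one trial individual for population[i] (DE/rand/1/bin, assumed)."""
    candidates = [j for j in range(len(population)) if j != i]
    r1, r2, r3 = rng.sample(candidates, 3)
    a, b, c = population[r1], population[r2], population[r3]
    # Mutation: base vector plus the scaled difference of two other individuals.
    mutant = [a[d] + f * (b[d] - c[d]) for d in range(len(a))]
    # Crossover: take each dimension from the mutant with probability CR;
    # j_rand guarantees that at least one dimension comes from the mutant.
    j_rand = rng.randrange(len(mutant))
    trial = [
        mutant[d] if (rng.random() < cr or d == j_rand) else population[i][d]
        for d in range(len(mutant))
    ]
    # Keep every dimension inside its allowed range.
    return [min(max(v, lo), hi) for v, (lo, hi) in zip(trial, bounds)]

rng = random.Random(1)
bounds = [(0.0, 1.0)] * 4  # placeholder ranges for four parameters
pop = [[rng.random() for _ in bounds] for _ in range(10)]
trial_pop = [de_trial(pop, i, 0.9, 0.3, bounds, rng) for i in range(len(pop))]
mix_pop = pop + trial_pop  # Mix-Pop: current generation plus Trial-Pop
```

Note that no one-to-one selection between parent and trial is applied here, since the text instead merges both populations and selects later through the sub-population update.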

Step 5: Using a set of randomly uniformly distributed reference vectors L = {l₁, l₂, ..., l_K}, k = 1, 2, ..., K, decompose the mixed population into K sub-populations. The population decomposition principle is to find the reference vector nearest to each individual in the mixed population and to assign each individual to the sub-population corresponding to that nearest vector. The projected distance of individual x_i on vector l_k is denoted d₁, and the perpendicular distance is denoted d₂. d₁ and d₂ represent the convergence and the diversity of the individual, respectively.

Specifically, the population decomposition principle is to calculate the included angle between each individual in the mixed population and each of the K reference vectors, find the minimum included angle, expressed as α_i = min{arg(cos(x_i, l_k))}, and assign individual x_i to the sub-population Subpop{k} corresponding to that vector l_k.
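The assignment by minimum angle can be sketched as follows for two objectives. Several points are assumptions of this sketch: the individuals are represented by non-negative two-dimensional objective vectors, the reference vectors are evenly spaced unit vectors in the first quadrant rather than randomly drawn, and all function names are hypothetical.

```python
import math

def reference_vectors(k):
    """K unit vectors spread evenly over the first quadrant (two objectives)."""
    if k < 2:
        raise ValueError("need at least two reference vectors")
    angles = [(math.pi / 2) * j / (k - 1) for j in range(k)]
    return [(math.cos(a), math.sin(a)) for a in angles]

def nearest_vector(point, vectors):
    """Index of the reference vector with the smallest angle to point."""
    norm = math.hypot(point[0], point[1])
    if norm == 0.0:
        return 0  # a zero vector has no direction; pick the first vector
    best_k, best_cos = 0, -2.0
    for k, (lx, ly) in enumerate(vectors):
        # Reference vectors are unit length, so only the point is normalized.
        cos_angle = (point[0] * lx + point[1] * ly) / norm
        if cos_angle > best_cos:  # larger cosine means smaller angle
            best_k, best_cos = k, cos_angle
    return best_k

def decompose(points, vectors):
    """Group point indices by their nearest reference vector."""
    subpops = [[] for _ in vectors]
    for idx, p in enumerate(points):
        subpops[nearest_vector(p, vectors)].append(idx)
    return subpops

vectors = reference_vectors(3)  # directions at 0, 45 and 90 degrees
subpops = decompose([(1.0, 0.1), (0.5, 0.5), (0.1, 2.0)], vectors)
```

Maximizing the cosine is equivalent to minimizing the angle, which avoids calling an inverse cosine for every pair.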

Step 6: Each sub-population is divided into two subsets, P_con and P_div, according to the average value of d₁ over all individuals of generation G-1. If the d₁ of an individual is less than this average, the individual is placed in subset P_con; otherwise, it is placed in subset P_div. Clearly, subset P_con is better than subset P_div.

The sub-population update rule is as follows:

① If |P_con| > NP/K, the solutions in subset P_con are sorted in ascending order of ψ(x) = d₁(x) + θ × d₂(x), and the first NP/K solutions form the next generation of the sub-population. Here θ is a penalty function, |P_con| is the size of the subset, NP is the size of the population, and K is the number of sub-populations.

② If |P_con| = NP/K, the solutions in subset P_con are taken as the next-generation sub-population.

③ If |P_con| < NP/K, all solutions in subset P_con enter the next-generation sub-population; subset P_div is sorted in ascending order of d₁, and its first (NP/K - |P_con|) solutions also enter the next-generation sub-population.
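The split into P_con and P_div and the three update rules can be sketched in one function. This is an illustration under stated assumptions: θ is treated as a constant, `quota` stands for NP/K, the distances are supplied as dictionaries keyed by individual id, and the function name is hypothetical.

```python
def update_subpopulation(members, d1, d2, d1_mean_prev, quota, theta):
    """Select the survivors of one sub-population.

    members: ids of the individuals in the sub-population
    d1, d2: dicts mapping id -> projected / perpendicular distance
    d1_mean_prev: average d1 over the previous generation
    quota: number of survivors, i.e. NP/K
    theta: penalty weight (treated as a constant in this sketch)
    """
    p_con = [m for m in members if d1[m] < d1_mean_prev]
    p_div = [m for m in members if d1[m] >= d1_mean_prev]
    if len(p_con) > quota:
        # Rule 1: rank by psi = d1 + theta * d2 and keep the best `quota`.
        return sorted(p_con, key=lambda m: d1[m] + theta * d2[m])[:quota]
    if len(p_con) == quota:
        # Rule 2: P_con is exactly the next generation.
        return p_con
    # Rule 3: keep all of P_con and fill up from P_div by ascending d1.
    fill = sorted(p_div, key=lambda m: d1[m])[: quota - len(p_con)]
    return p_con + fill

d1 = {"a": 0.2, "b": 0.9, "c": 0.4, "d": 0.7}
d2 = {"a": 0.5, "b": 0.1, "c": 0.1, "d": 0.3}
members = ["a", "b", "c", "d"]
rule1 = update_subpopulation(members, d1, d2, 0.5, quota=1, theta=5.0)
rule2 = update_subpopulation(members, d1, d2, 0.5, quota=2, theta=5.0)
rule3 = update_subpopulation(members, d1, d2, 0.5, quota=3, theta=5.0)
```

With these toy values P_con is {a, c} and P_div is {b, d}, so the three calls exercise rules ①, ② and ③ in turn.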

Step 7: All updated sub-populations together form the next-generation population. Set G = G + 1 and repeat Steps 3 to 6.

Step 8: When the number of iterations reaches the maximum, randomly select one individual from the population as the optimal output individual, assign each dimension of this individual to the corresponding reservoir parameter of the echo state network, input the test set data into the echo state network, and output the predicted values. Plot a prediction curve and an error curve from the true values of the test set data and the predicted values of the network.

In summary, steps 1 to 8 are descriptions of the time series prediction model based on the improved multi-objective differential optimization echo state network proposed by the present invention.

The features and benefits of the invention are as follows. The multi-objective differential evolution algorithm has a simple structure, is easy to implement, is highly robust and has strong global search capability, and the multi-objective optimization allows the prediction to be both accurate and stable. The final population obtained by the improved multi-objective differential evolution algorithm has good convergence and diversity, and the reservoir parameters of the echo state network are selected for each time series, so the prediction accuracy is improved.

Drawings

FIG. 1 is a flow chart of the steps of the model presented herein.

FIG. 2 is a schematic diagram of population decomposition of the multi-objective differential evolution algorithm.

FIG. 3 is a diagram illustrating a sub-population decomposition of a multi-objective differential evolution algorithm.

FIG. 4 shows the simulation results of the IMODE-ESN prediction curve and error curve for the Dalian monthly average air temperature in the first embodiment; FIG. 4(a) is the prediction curve, FIG. 4(b) is the error curve, FIG. 4(c) is a local prediction curve, and FIG. 4(d) is a local error curve.

FIG. 5 is a diagram showing the simulation results of the prediction curve and error curve of the IMODE-ESN in the Beijing PM2.5 time series according to the second embodiment; fig. 5(a) is a prediction graph, fig. 5(b) is an error graph, fig. 5(c) is a local prediction curve, and fig. 5(d) is a local error curve.

Detailed Description

The technical scheme of the invention is explained in detail below with reference to the accompanying drawings:

The advantage of the echo state network (ESN) is that only the output connection weights from the reservoir to the output layer need to be adjusted during network learning; the other connection weights are generally kept unchanged after being randomly assigned. The method optimizes the ESN with the improved multi-objective differential evolution algorithm and is abbreviated as the IMODE-ESN model. The model makes the ESN parameters self-adaptive; the parameters to be optimized are the reservoir size, the spectral radius of the reservoir, the sparsity and the input transformation factor.
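For orientation, a commonly used form of the ESN state update and readout is sketched below. The symbols u(t), x(t), W_in, W and W_out are notation introduced here and do not appear in the original; variants with leaky integration or output feedback also exist.

```latex
\begin{aligned}
x(t) &= \tanh\bigl( W_{\mathrm{in}}\, u(t) + W\, x(t-1) \bigr), \\
y'(t) &= W_{\mathrm{out}}\, x(t).
\end{aligned}
```

In this form W_in and W are generated randomly and then left fixed, and only W_out is fitted, which matches the training property described above. The four optimized parameters would govern how W_in and W are generated.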

Differential evolution (DE) has a simple structure, is easy to implement, is highly robust and has strong global search capability. The algorithm consists mainly of four operations: initialization, mutation, crossover and selection. To balance the accuracy and stability of prediction, the improved multi-objective differential evolution is used to optimize the reservoir parameters, making up for the shortcomings of single-objective optimization. The invention therefore uses two objective functions, f₁(x) and f₂(x). The first minimizes the sum of absolute errors to ensure prediction accuracy. The second minimizes the variance of the error to ensure the stability of the prediction.

The time series prediction model of the invention, based on an echo state network optimized by improved multi-objective differential evolution, is shown in FIG. 1 and proceeds according to the following steps:

Step 1: Randomly initialize a population. The population size NP is 100, and G is the current iteration number. Each dimension of an individual represents one reservoir parameter of the echo state network. Each parameter has a selection range, so each dimension of an individual is constrained.

Step 2: For each individual x_i in the population, where i = 1, 2, ..., NP, each dimension of x_i is assigned to the corresponding reservoir parameter of the echo state network: reservoir size, sparsity, spectral radius and input transformation factor. The experimental data are divided into a training set and a test set. The training set data are input into the echo state network to obtain predicted values, and the fitness of the individual, F(x), is evaluated. The fitness of all individuals in the population is evaluated in turn.

Step 3: Set the maximum number of iterations of the population, Max-Iteration, to 30. If the number of iterations exceeds the set maximum, stop iterating and go to Step 8; otherwise, go to Step 4.

Step 4: Use the mutation and crossover operations of differential evolution to generate a mutant individual and a trial individual for each individual in the population. The generated trial individuals form a trial population, denoted Trial-Pop. The current-generation population and the trial population are then combined into a mixed population, denoted Mix-Pop. The scaling factor F used in the mutation operation is 0.9, and the crossover probability CR used in the crossover operation is 0.3.

Step 5: As shown in FIG. 2, using a set of randomly uniformly distributed reference vectors L = {l₁, l₂, ..., l₂₀}, K = 20, decompose the mixed population into 20 sub-populations. The population decomposition principle is to find the reference vector nearest to each individual in the mixed population and to assign each individual to the sub-population corresponding to that nearest vector. The projected distance of individual x_i on vector l_k is denoted d₁, and the perpendicular distance is denoted d₂; the formulas are shown as (2) and (3). d₁ and d₂ represent the convergence and the diversity of the individual, respectively.

d₁(x) = lᵀF*(x)    (2)

where l is a randomly uniformly distributed reference vector, F(x)_min = {min(f₁(x)), min(f₂(x))} is the ideal point, and F*(x) = F(x) - F(x)_min.
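The two distances can be illustrated numerically. The sketch below computes d₁ exactly as in formula (2) and, as an assumption, takes d₂ to be the usual perpendicular distance from F*(x) to the line through the origin along l, with l taken to have unit length. The function name `distances` and the sample numbers are hypothetical.

```python
import math

def distances(f, ideal, l):
    """Return (d1, d2) of objective vector f with respect to unit vector l."""
    # Translate by the ideal point: F*(x) = F(x) - F(x)_min.
    f_star = [fi - zi for fi, zi in zip(f, ideal)]
    # Projected distance along l, as in formula (2).
    d1 = sum(li * fi for li, fi in zip(l, f_star))
    # Foot of the perpendicular from F*(x) onto the direction l.
    foot = [d1 * li for li in l]
    # Perpendicular distance (assumed form; requires l to have unit length).
    d2 = math.sqrt(sum((fi - pi) ** 2 for fi, pi in zip(f_star, foot)))
    return d1, d2

d1, d2 = distances(f=[4.0, 5.0], ideal=[1.0, 1.0], l=[1.0, 0.0])
```

Here F*(x) = (3, 4), so the projection onto the first axis has length 3 and the perpendicular offset has length 4.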

Specifically, the population decomposition principle is to calculate the included angle between each individual in the mixed population and each of the 20 reference vectors, find the minimum included angle, expressed as α_i = min{arg(cos(x_i, l_k))}, and assign individual x_i to the sub-population Subpop{k} corresponding to that vector l_k.

Step 6: As shown in FIG. 3, each sub-population is divided into two subsets, P_con and P_div, according to the average value of d₁ over all individuals of generation G-1.

The sub-population update rule is as follows:

① If |P_con| > 10, the solutions in subset P_con are sorted in ascending order of ψ(x) = d₁(x) + θ × d₂(x), and the first 10 solutions form the next generation of the sub-population. Here θ is a penalty function, |P_con| is the size of the subset, NP is the size of the population (NP = 100), and K is the number of sub-populations (K = 20).

② If |P_con| = 10, the solutions in subset P_con are taken as the next-generation sub-population.

③ If |P_con| < 10, all solutions in subset P_con enter the next-generation sub-population; subset P_div is sorted in ascending order of d₁, and its first (10 - |P_con|) solutions also enter the next-generation sub-population.

The invention selects the Dalian monthly average air temperature time series and the Beijing PM2.5 time series for simulation experiments. Four optimization algorithms are used to optimize the ESN under the same conditions as comparison models: the ESN optimized by the artificial fish swarm algorithm (AFSA-ESN), by the particle swarm algorithm (PSO-ESN), by teaching-learning-based optimization (TLBO-ESN) and by the indicator-based evolutionary algorithm (IBEA-ESN). The evaluation indicators of prediction performance are the root mean square error (RMSE), the symmetric mean absolute percentage error (SMAPE) and the normalized root mean square error (NRMSE). RMSE, SMAPE and NRMSE are defined as follows:

Here y(t) is the true value of the data, y'(t) is the predicted value of the network, ȳ is the average of the data, and n is the number of samples.
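The three indicators can be sketched in code using widely used definitions. These are assumptions rather than the exact formulas of the original: SMAPE is written in its 0-to-2 form, and NRMSE normalizes the squared error by the spread of the true data around its mean. The function names are chosen here for illustration.

```python
import math

def rmse(y, y_hat):
    """Root mean square error."""
    n = len(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / n)

def smape(y, y_hat):
    """Symmetric MAPE in the 0..2 form; undefined if a pair is (0, 0)."""
    n = len(y)
    return sum(abs(a - b) / ((abs(a) + abs(b)) / 2) for a, b in zip(y, y_hat)) / n

def nrmse(y, y_hat):
    """Squared error normalized by the spread of y around its mean."""
    mean_y = sum(y) / len(y)
    num = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    den = sum((a - mean_y) ** 2 for a in y)
    return math.sqrt(num / den)

y_true = [2.0, 4.0, 6.0]
y_pred = [2.0, 5.0, 5.0]
scores = (rmse(y_true, y_pred), smape(y_true, y_pred), nrmse(y_true, y_pred))
```

Lower values are better for all three indicators, so they can be compared directly across models on the same test set.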

The invention specifies the ranges of the parameters to be optimized for the two sets of experiments. The optimization algorithm then finds the most suitable parameter values within the given ranges. The reservoir parameter ranges are shown in Table 1.

Table 1. Reservoir parameter ranges

Example one:

Example one illustrates the beneficial effects of the invention using a data set of monthly average air temperature and monthly precipitation in Dalian. A total of 792 samples are selected from the Dalian average temperature and precipitation data sets, covering January 1951 to December 2017. The sampling interval is one month. The Dalian monthly precipitation is used as a second dimension of the experimental data to assist prediction of the Dalian monthly average air temperature. 75% of the data are used for training and 25% for testing. In the training samples, the states of the first 50 samples are discarded to remove the effect of the initial transient. To demonstrate the effectiveness of the invention, the reservoir parameter ranges shown in Table 1 are also applied to PSO-ESN, AFSA-ESN, TLBO-ESN and IBEA-ESN, and the population size and the maximum number of iterations are likewise the same. Table 2 shows the most suitable reservoir parameter values selected for the Dalian monthly average air temperature time series.
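The data split and the discarded initial transient can be illustrated with a short sketch. The series used here is a stand-in list of indices rather than the real temperature data, and the function name `split_series` is hypothetical.

```python
def split_series(series, train_fraction, washout):
    """Split chronologically and drop the washout part of the training set."""
    n_train = int(len(series) * train_fraction)
    train, test = series[:n_train], series[n_train:]
    # The reservoir is still driven by the washout samples; only their
    # states are left out when the output weights are fitted.
    scored_train = train[washout:]
    return train, test, scored_train

series = list(range(792))  # stand-in for the 792 monthly samples
train, test, scored = split_series(series, 0.75, 50)
```

With 792 samples this gives 594 training samples and 198 test samples, of which 544 training samples remain after the first 50 are discarded.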

Table 2. IMODE-ESN model parameters for the Dalian monthly average air temperature time series

The single-step prediction results and run times of the different models on the Dalian monthly average air temperature time series are shown in Table 3. The IMODE-ESN outperforms the other comparison models in terms of RMSE, SMAPE and NRMSE. As can be seen from Table 3, the SMAPE of PSO-ESN is large, indicating that PSO-ESN cannot predict the trend of the monthly average air temperature series. Although AFSA-ESN predicts the trend better than PSO-ESN, its RMSE and NRMSE are large, indicating a large deviation in the prediction at some points. Table 3 also shows the run times of the different models on the Dalian monthly average air temperature time series. AFSA-ESN has the longest run time. The run time of IMODE-ESN is slightly longer than those of PSO-ESN and IBEA-ESN, and the run time of TLBO-ESN is four times that of IMODE-ESN. Although IBEA-ESN has the shortest run time, its prediction accuracy is worse than that of TLBO-ESN and IMODE-ESN. The advantage of the IMODE-ESN model in RMSE is apparent.

FIG. 4(a) shows the IMODE-ESN prediction curve for the Dalian monthly average air temperature, and FIG. 4(b) shows the corresponding error curve. As can be seen from FIG. 4(a), the IMODE-ESN fits the true curve of the monthly average air temperature well, and as can be seen from FIG. 4(b), the error between the true and predicted values is small. Because the true curve and the predicted curve in FIG. 4(a) nearly coincide, FIG. 4(c) gives an enlarged view of the part of the curve marked by the black box in FIG. 4(a) to make it easier to observe. FIG. 4(d) is the corresponding enlarged error curve.

Table 3. Test-set simulation results for the Dalian monthly average air temperature time series

Example two:

Example two uses the Beijing PM2.5 time series from the Beijing air quality index (AQI) data. The invention uses convergent cross mapping (CCM) to select auxiliary variables for the Beijing PM2.5 time series and finds that PM₁₀, CO and SO₂ have a large influence on PM2.5. Accordingly, the invention selects PM₁₀, CO and SO₂ as auxiliary variables for PM2.5 time series prediction. A total of 8759 samples from January 2, 2016 to December 31, 2016 are selected for simulation. 70% of the samples are used for training and 30% for testing. In the training samples, the states of the first 100 samples are discarded to eliminate the effect of the initial transient.

The reservoir parameter ranges are given in Table 1. To demonstrate the effectiveness of the proposed model, the same settings are applied to the AFSA-ESN, PSO-ESN, TLBO-ESN and IBEA-ESN models. Table 4 shows the most suitable reservoir parameters selected by the invention for the Beijing PM2.5 time series. As can be seen from Table 4, the reservoir size is 22, slightly greater than the lower limit of the reservoir size range. For the ESN, such a reservoir has a simple structure and a higher operating speed.

Table 4. IMODE-ESN model parameters for the Beijing PM2.5 time series

Table 5 shows the single-step prediction results of the different models on the Beijing PM2.5 time series. The IMODE-ESN is again superior to the other comparison models in terms of RMSE, SMAPE and NRMSE. The RMSE values of AFSA-ESN, TLBO-ESN and IBEA-ESN are large, indicating that they do not predict the Beijing PM2.5 time series well. FIGS. 5(a) and 5(b) show, respectively, the prediction curve and the error curve of the Beijing PM2.5 time series obtained by IMODE-ESN on the test data. IMODE-ESN predicts the trend of the Beijing PM2.5 time series well. Because the true curve and the predicted curve in FIG. 5(a) nearly coincide, FIG. 5(c) gives an enlarged view of the curve from step 1000 to step 1020 in FIG. 5(a) to make it easier to observe. FIG. 5(d) is the corresponding enlarged error curve.

Table 5 also shows the run times of the different models after 20 iterations. AFSA-ESN still has the longest run time. The run time of IMODE-ESN is slightly longer than that of IBEA-ESN. Although IBEA-ESN has the shortest run time, its prediction accuracy is worse than that of IMODE-ESN.

Table 5. Test-set simulation results for the Beijing PM2.5 time series

The simulation results show that the IMODE-ESN model performs very well on both the Dalian monthly average air temperature time series and the Beijing PM2.5 time series. The invention can select reservoir parameter settings suited to different time series, which improves the generalization performance of the model. In terms of prediction accuracy, the RMSE of the model's predictions is the smallest; in terms of run time, the computational complexity of the model is relatively low. The model provided by the invention therefore has good prospects with respect to both prediction accuracy and computational complexity.

Although the above description describes the embodiments of the present invention in detail with reference to the drawings, the present invention is not limited to the description of the above embodiments. It will be understood by those skilled in the art that various changes and substitutions may be made without departing from the spirit and scope of the invention.

Claims

1. A time series prediction model for optimizing an echo state network based on an improved multi-objective differential evolution algorithm, characterized in that the improved multi-objective differential evolution algorithm adopts two objective functions, f₁(x) and f₂(x), wherein the first function represents the minimum sum of absolute errors to ensure prediction accuracy; the other function represents the minimum variance of the error to ensure the stability of the prediction; the objective functions are shown in equation (1):

here, y (t) is the true value of the data, and y' (t) is the output value of the echo state network, namely the predicted value; n is the number of sample data;

the method comprises the following steps:

step 1: randomly initializing a population, wherein NP is the number of individuals in the population and G is the current iteration number; each dimension of an individual represents one reservoir parameter of the echo state network, and each parameter has a selection range, so that each dimension of an individual is constrained;

step 2: for each individual x_i in the population, wherein i = 1, 2, ..., NP, assigning each dimension of x_i to the corresponding reservoir parameter of the echo state network: reservoir size, sparsity, spectral radius and input transformation factor; dividing the experimental data into a training set and a test set; inputting the training set data into the echo state network to obtain predicted values and evaluating the fitness of the individual, namely F(x); evaluating the fitness of the individuals in the population in turn;

step 3: setting the maximum number of iterations of the population to Max-Iteration; if the number of iterations exceeds the set maximum, stopping the iteration and executing step 8; otherwise, executing step 4;

step 4: generating a mutant individual and a trial individual for each individual in the population by the mutation operation and the crossover operation of differential evolution, the generated trial individuals forming a trial population, denoted Trial-Pop; the current-generation population and the trial population then forming a mixed population, denoted Mix-Pop;

step 5: decomposing the mixed population into K sub-populations using a set of randomly uniformly distributed reference vectors L = {l₁, l₂, ..., l_K}, k = 1, 2, ..., K; the population decomposition principle being to find the reference vector nearest to each individual in the mixed population and to assign each individual to the sub-population corresponding to that nearest vector; the projected distance of individual x_i on vector l_k being denoted d₁, and the perpendicular distance being denoted d₂;

step 6: dividing each sub-population into two subsets, P_con and P_div, according to the average value of d₁ over all individuals of generation G-1; if the d₁ of an individual is less than this average, placing the individual in subset P_con; otherwise, placing it in subset P_div; clearly, subset P_con is better than subset P_div;

The sub-population update rule is as follows:

① if |P_con| > NP/K, the solutions in subset P_con are sorted in ascending order of ψ(x) = d₁(x) + θ × d₂(x), and the first NP/K solutions form the next generation of the sub-population; where θ is a penalty function, |P_con| represents the size of the subset, NP represents the size of the population, and K represents the number of sub-populations;

② if |P_con| = NP/K, the solutions in subset P_con are taken as the next-generation sub-population;

③ if |P_con| < NP/K, all solutions in subset P_con enter the next-generation sub-population; subset P_div is sorted in ascending order of d₁, and its first (NP/K - |P_con|) solutions also enter the next-generation sub-population;

step 7: all updated sub-populations together forming the next-generation population; setting G = G + 1 and executing steps 3 to 6;

step 8: when the number of iterations reaches the maximum, obtaining a final population, randomly selecting one individual from the population as the optimal output individual, assigning each dimension of the output individual to the corresponding reservoir parameter of the echo state network, inputting the test set data into the echo state network, and outputting the predicted values; and plotting a prediction curve and an error curve from the true values of the test set data and the predicted values of the network.

2. The time series prediction model for optimizing the echo state network based on the improved multi-objective differential evolution algorithm as claimed in claim 1, wherein the scaling factor F used in the mutation operation and the crossover probability CR used in the crossover operation in step 4 each take any value strictly between 0 and 1.

3. The model of claim 1, wherein the population decomposition principle of step 5 is to calculate the included angle between each individual in the mixed population and each of the K reference vectors, find the minimum included angle, expressed as α_i = min{arg(cos(x_i, l_k))}, and assign individual x_i to the sub-population Subpop{k} corresponding to that vector l_k.