CN112733996B - GA-PSO (genetic algorithm-particle swarm optimization) based hydrological time sequence prediction method for optimizing XGboost - Google Patents
- Publication number
- CN112733996B
- Authority
- CN
- China
- Prior art keywords
- pso
- xgboost
- hydrological
- model
- optimal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/12—Computing arrangements based on biological models using genetic models
- G06N3/126—Evolutionary algorithms, e.g. genetic algorithms or genetic programming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A10/00—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
- Y02A10/40—Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping
Abstract
The invention discloses a hydrological time series prediction method based on GA-PSO-optimized XGBoost. The method comprises: collecting the rainfall values of the rainfall stations and the flow of the corresponding hydrological stations, and organizing them into a hydrological time series dataset; preprocessing the data and dividing the sample dataset into a training set and a test set; optimizing hyperparameters of XGBoost such as the learning rate lr, the number of base learners n_estimators, the minimum leaf weight min_weights and the maximum tree depth max_depth with an improved GA-PSO combined optimization algorithm, and training the XGBoost model on the sample dataset to obtain a GA-PSO-optimized XGBoost hydrological time series prediction model; and testing the GA-PSO-optimized XGBoost hydrological prediction model. Because GA-PSO is used to optimize the parameters of the XGBoost model, the model built with the optimal parameters predicts hydrological series with higher accuracy.
Description
Technical Field
The invention belongs to the field of hydrological prediction technology, and in particular relates to a hydrological time series prediction method based on XGBoost optimized by GA-PSO (genetic algorithm-particle swarm optimization).
Background
At present, the hydrology industry in China is moving from traditional hydrology to modern hydrology. Automatic hydrological station observation technology is spreading rapidly, and data collection has gone from manual recording to automatic stations that record every few minutes or even every second, so the coverage of hydrological data is increasingly comprehensive. Hydrological data are large in volume, varied in category, spatiotemporal and quickly updated. They are also influenced by conditions such as seasonal climate, geomorphic characteristics and hydrological laws, so many valuable patterns and much information are hidden in them. How to analyze these data effectively and extract useful information to serve hydrological forecasting, flood detection and similar tasks has become a focus of attention. In the traditional hydrology industry, a physical model is generally built according to the hydrological environment and process, and manual experience is then added for prediction. From an information perspective, if specific patterns can be mined from the long-term historical time series of a drainage basin, the future water level and flow of the basin can be predicted effectively from the approximate trend, which helps prevent flood disasters. The importance of hydrological time series prediction is therefore self-evident.
In recent years, some scholars have applied machine learning methods to hydrological time series prediction. These methods achieve good results and improve on the computation speed and accuracy of traditional models, but they also have problems: LSTM and BP neural networks have strong learning ability but easily fall into local optima, need a large number of parameters, and converge slowly; the support vector machine predicts well, but on large-scale training samples it is slow to compute and depends on the choice of hyperparameters. It is therefore necessary to find a prediction model that offers both efficiency and accuracy.
The genetic algorithm (GA) and the particle swarm optimization (PSO) algorithm are the most frequently used and most basic algorithms for optimizing model parameters. In GA optimization the whole population exists in encoded form and moves gradually and uniformly toward the optimal region, but the GA is "memoryless": individuals are updated only through crossover and mutation, which gives it stronger global search capability. In contrast, the PSO algorithm "has memory": it updates each particle by changing its velocity and position, which depend closely on the position at the previous time step. PSO is better suited to local search, has fewer parameters to tune and converges quickly, but it is prone to premature convergence.
Disclosure of Invention
Purpose of the invention: the invention aims to overcome the defects of the prior art and provides a hydrological time series prediction method based on GA-PSO-optimized XGBoost.
Technical scheme: the hydrological time series prediction method based on XGBoost optimized by GA-PSO (genetic algorithm-particle swarm optimization) comprises the following steps:
Step S1: collect the rainfall values of all rainfall stations of a water system basin within a certain time period and the water levels of the corresponding water level stations, and organize them into a hydrological time series dataset;
Step S2: preprocess each hydrological sample in the hydrological time series dataset of step S1, and divide the sample dataset into a hydrological training dataset L and a hydrological test dataset T;
Step S3: optimize hyperparameters of the XGBoost model such as the learning rate lr, the number of base learners n_estimators, the minimum leaf weight and the maximum tree depth with an improved GA-PSO combined optimization algorithm, and train the XGBoost model on the sample dataset to obtain a GA-PSO-optimized XGBoost hydrological time series prediction model;
Step S4: test the GA-PSO-optimized XGBoost hydrological prediction model.
Step S1 obtains the dataset and the corresponding label information. More specifically, in step S1 the current and previous 7 hours of rainfall values of the rainfall stations of the water system basin and the current and previous 7 hours of flow values of the corresponding hydrological station are organized into the hydrological time series dataset.
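The windowing described in step S1 can be sketched as follows. This is only an illustration under stated assumptions: the function name `make_lagged_samples`, the array layout and the `horizon` argument (forecast period in hours) are hypothetical and do not come from the patent text.

```python
import numpy as np

def make_lagged_samples(rain, flow, n_lags=7, horizon=1):
    """Build (X, y) from hourly rainfall and flow series.

    rain: array of shape (n_hours, n_rain_stations)
    flow: array of shape (n_hours,)
    Each sample at hour t holds the current and previous n_lags values of
    every rainfall series and of the flow; the label is the flow at
    t + horizon (assumed definition of the forecast period).
    """
    rain = np.asarray(rain, dtype=float)
    flow = np.asarray(flow, dtype=float)
    series = np.column_stack([rain, flow])       # (n_hours, n_stations + 1)
    n_hours = series.shape[0]
    rows, targets = [], []
    for t in range(n_lags, n_hours - horizon):
        window = series[t - n_lags : t + 1]       # hours t-n_lags .. t
        rows.append(window.ravel())
        targets.append(flow[t + horizon])
    return np.array(rows), np.array(targets)
```

With four rainfall stations and one flow series, each sample then has 8 x 5 = 40 input values.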
Step S2 preprocesses the data in the dataset and partitions it. More specifically:
Step S2.1: the preprocessing of the hydrological sample data x(t) in step S2 includes missing value processing, error value correction and normalization;
the normalization formula is as follows:
$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x^{*}$ is the normalized value, $x$ is the initial value, $x_{\min}$ is the minimum value in the original sequence and $x_{\max}$ is the maximum value in the original sequence;
Step S2.2: take the first 80% of the preprocessed hydrological time series dataset as the hydrological training dataset L and the remaining 20% as the hydrological test dataset T.
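A minimal sketch of steps S2.1 and S2.2 is given below. The function names are illustrative, and missing value processing and error value correction are not shown. The sketch follows the order in the text (normalize with the minimum and maximum of the whole sequence, then split chronologically).

```python
import numpy as np

def min_max_normalize(x):
    """Scale each column to [0, 1] using the min and max of the sequence."""
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    # Assumes x_max > x_min; a constant column would divide by zero.
    return (x - x_min) / (x_max - x_min)

def chronological_split(data, train_fraction=0.8):
    """Keep time order: the first part is the training set L, the rest is T."""
    n_train = int(len(data) * train_fraction)
    return data[:n_train], data[n_train:]
```

A common alternative, not described in the source, is to compute the minimum and maximum on the training portion only, so that no information from the test period enters the scaling.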
The XGBoost model has many parameters, and better parameter values improve the accuracy of sequence prediction. An improved GA-PSO algorithm is therefore used to optimize hyperparameters of the XGBoost model such as the learning rate lr, the number of base learners n_estimators, the minimum leaf weight min_weights and the maximum tree depth max_depth. Step S3 specifically comprises the following steps:
Step S3.1: initialize the value ranges of the learning rate lr, the number of base learners n_estimators, the minimum leaf weight min_weights and the maximum tree depth max_depth of the XGBoost model, and set the number of iterations of the overall GA-PSO optimization algorithm to $T^{*}$;
Step S3.2: randomly generate N subgroups; the chromosome of each particle in a subgroup corresponds to one set of XGBoost parameters (lr, n_estimators, min_weights, max_depth);
Step S3.3: use $R^2$ as the individual fitness value and initialize the fitness values of all particles in the N subgroups of step S3.2;
Step S3.4: perform one round of classical GA optimization on each of the N subgroups, ultimately yielding N optimal particles. The GA optimization proceeds as follows: each subgroup contains m individuals and runs for $T_1$ iterations; selection, crossover and mutation are applied to the m encoded individuals to update the population;
Step S3.5: compute the fitness value of each particle after mutation, and update the best individual of the current iteration according to the fitness values;
Step S3.6: return to step S3.4 and continue the classical GA optimization until the iteration limit $T_1$ is reached and the termination condition is satisfied. Each subgroup then has $T_1$ historical best particles; their fitness values are compared and the particle with the highest fitness is taken as the best individual of the subgroup, which gives N best individuals from the N subgroups;
Step S3.7: decode the N best individuals obtained in step S3.6 and use them as the initial particle swarm of the PSO algorithm for improved PSO optimization, with the number of PSO iterations set to $T_2$;
Step S3.8: initialize the velocities of the initial particles of the PSO algorithm and, still using $R^2$ as the fitness value, update the velocity and position of each particle with the improved formulas, thereby updating each particle's historical best position pbest and the global best position gbest of the swarm;
the particle velocity and position update formulas in PSO are:
$$v_i^{t+1} = \omega v_i^{t} + c_1\,\mathrm{rand}_1\,(pbest_i^{t} - x_i^{t}) + c_2\,\mathrm{rand}_2\,(gbest^{t} - x_i^{t})$$
$$x_i^{t+1} = x_i^{t} + v_i^{t+1}$$
where $v_i^{t}$ is the velocity of the particle at the current time t, $x_i^{t}$ is the position of the particle at the current time t, $pbest_i^{t}$ is the individual extreme point, $gbest^{t}$ is the global extreme point, $\omega$ is the inertia weight, $c_1$ and $c_2$ are the learning factors, and $\mathrm{rand}_1$ and $\mathrm{rand}_2$ are random numbers in the interval [0, 1];
a nonlinearly decreasing weight method is used for the inertia weight $\omega$:
the learning factors are likewise nonlinear functions of the weight:
Step S3.9: check whether the current iteration count is less than or equal to $T_2$; if so, return to step S3.8 and continue the current PSO optimization, otherwise go to step S3.10;
Step S3.10: check whether the total number of iterations has reached $T^{*}$. If not, randomly select K of the historical best particles from the PSO to replace K individuals in each GA subgroup of step S3.2, then return to step S3.2 and continue the optimization; if so, output the optimal solution;
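The GA stage of steps S3.4 to S3.6 can be sketched for a single subgroup as below. Several choices here are assumptions rather than the patent's method: individuals are real-valued vectors instead of encoded chromosomes, selection is a binary tournament, crossover is single-point, and `fitness` is any callable that returns a score to maximize (in the patent it would be the $R^2$ of an XGBoost model trained with the decoded parameters). The sketch needs at least two genes per individual.

```python
import numpy as np

def ga_best_of_subgroup(fitness, bounds, m=20, t1=30, cp=0.85, mp=0.05, rng=None):
    """Run a simple real-coded GA on one subgroup; return (best, best_fitness)."""
    rng = np.random.default_rng() if rng is None else rng
    low, high = np.asarray(bounds, dtype=float).T
    pop = rng.uniform(low, high, size=(m, len(low)))
    best, best_fit = None, -np.inf
    for _ in range(t1):
        fit = np.array([fitness(ind) for ind in pop])
        i = int(fit.argmax())
        if fit[i] > best_fit:                     # keep the historical best
            best, best_fit = pop[i].copy(), fit[i]
        # Selection: binary tournament (assumed operator).
        a, b = rng.integers(0, m, size=(2, m))
        parents = np.where((fit[a] >= fit[b])[:, None], pop[a], pop[b])
        # Crossover: single-point swap between consecutive parents.
        children = parents.copy()
        for j in range(0, m - 1, 2):
            if rng.random() < cp:
                cut = rng.integers(1, len(low))
                children[j, cut:] = parents[j + 1, cut:]
                children[j + 1, cut:] = parents[j, cut:]
        # Mutation: resample a gene uniformly within its bounds.
        mask = rng.random(children.shape) < mp
        children[mask] = rng.uniform(low, high, size=children.shape)[mask]
        pop = children
    # Evaluate the final population once more.
    fit = np.array([fitness(ind) for ind in pop])
    i = int(fit.argmax())
    if fit[i] > best_fit:
        best, best_fit = pop[i].copy(), fit[i]
    return best, best_fit
```

Calling this function once per subgroup yields the N best individuals that seed the PSO stage in step S3.7.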
The XGBoost model in step S4 is a tree ensemble model whose internal decision trees are regression trees. The detailed process of step S4 is as follows:
the loss function of the GA-PSO-optimized XGBoost hydrological time series prediction model is set as:
$$Obj = \sum_{i=1}^{n} l(\hat{y}_i, y_i) + \sum_{k=1}^{K} \Omega(f_k), \qquad \Omega(f_k) = \gamma M + \frac{1}{2}\lambda \sum_{m=1}^{M} w_m^{2}$$
where $l(\hat{y}_i, y_i)$ is the loss function, which measures the difference between the predicted value $\hat{y}_i$ and the actual value $y_i$; K is the number of decision trees contained in the model; $\Omega(f_k)$ is the regularization term, in which $\gamma$ is the penalty constant of the gain function for splitting leaf nodes, M is the number of leaf nodes, $\lambda$ is the coefficient of the L2 regularization penalty and $w_m$ is the weight of leaf m;
the simplified objective function of the j-th training round is:
$$Obj^{(j)} \simeq \sum_{i=1}^{n}\left[g_i f_j(x_i) + \frac{1}{2} h_i f_j^{2}(x_i)\right] + \Omega(f_j)$$
where $g_i = \partial_{\hat{y}_i^{(j-1)}} l(\hat{y}_i^{(j-1)}, y_i)$ is the first derivative of the loss function and $h_i = \partial^{2}_{\hat{y}_i^{(j-1)}} l(\hat{y}_i^{(j-1)}, y_i)$ is the second derivative of the loss function.
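To make the first and second derivatives concrete, the sketch below assumes a squared-error loss $l = \frac{1}{2}(\hat{y}_i - y_i)^2$; the patent text does not name the loss, so this choice is an assumption. Under it, $g_i = \hat{y}_i - y_i$ and $h_i = 1$, and the weight that minimizes the second-order objective for the samples in one leaf is $-G/(H+\lambda)$, where G and H are the sums of $g_i$ and $h_i$ in that leaf. Function names are illustrative.

```python
import numpy as np

def squared_error_grad_hess(y_true, y_pred):
    """g_i and h_i of l = 0.5 * (y_pred - y_true)**2 w.r.t. y_pred."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    g = y_pred - y_true
    h = np.ones_like(g)
    return g, h

def optimal_leaf_weight(g, h, lam=1.0):
    """Minimizer of sum(g*w + 0.5*h*w**2) + 0.5*lam*w**2 over one leaf."""
    return -np.sum(g) / (np.sum(h) + lam)
```

With `lam=0` and a zero initial prediction, the leaf weight is simply the mean of the targets in the leaf; a larger `lam` shrinks it toward zero.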
Beneficial effects: compared with the prior art, the invention has the following advantages:
The parameters of the XGBoost model are optimized with a GA-PSO combined optimization algorithm, which avoids falling into a local optimum when searching for the optimal parameters, and the model built with the optimal parameters predicts hydrological series with higher accuracy. While maintaining prediction accuracy, the method also converges faster, and the computation speed on large-scale training samples is improved to a certain extent.
The XGBoost prediction model after parameter optimization has better prediction performance and prediction accuracy, and the generalization ability of the prediction model is improved.
Drawings
FIG. 1 is a schematic overall flow diagram of the present invention;
FIG. 2 is a schematic diagram of GA-PSO optimization according to an embodiment of the present invention;
FIG. 3 compares the fitness value curves of the GA-PSO, GA and PSO optimization algorithms in the embodiment;
FIG. 4 shows the detailed sequence (471, 481) for a forecast period of 1 h in the embodiment;
FIG. 5 shows the detailed sequence (2068, 2107) for a forecast period of 1 h in the embodiment.
Detailed Description
The technical solution of the present invention is described in detail below, but the scope of the present invention is not limited to the embodiments.
As shown in FIG. 1, the hydrological time series prediction method based on GA-PSO-optimized XGBoost of the invention mainly comprises 4 steps:
Step S1: data from the Longshan watershed are selected to build the hydrological time series dataset. The data span from 01:00 on 24 December 2010 to 25 July 2014, 31416 hourly records in total. Each record consists of five attributes: the flow value of the Longshan station and the rainfall values of four rainfall stations. The four rainfall stations are: Longshan, rear love, stream and moon;
Step S2: preprocess each hydrological sample in the hydrological time series dataset of step S1, and divide the sample dataset into a hydrological training dataset L and a hydrological test dataset T;
Step S2.1: the preprocessing of the hydrological sample data in step S2 includes missing value processing, error value correction and normalization;
the normalization formula is as follows:
$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x^{*}$ is the normalized value, $x$ is the initial value, $x_{\min}$ is the minimum value in the original sequence and $x_{\max}$ is the maximum value in the original sequence;
Step S2.2: take the first 80% of the preprocessed hydrological time series dataset as the hydrological training dataset L and the remaining 20% as the hydrological test dataset T. The 26000 hourly records from 01:00 on 24 December 2010 to 08:00 on 11 December 2013 are selected as the training set L, and the 5416 records from 08:00 on 11 December 2013 to 01:00 on 25 July 2014 as the test set T;
Step S3: optimize the learning rate lr, the number of base learners n_estimators, the minimum leaf weight min_weights and the maximum tree depth max_depth of the XGBoost model with the improved GA-PSO combined optimization algorithm, and train the XGBoost model on the sample dataset L to obtain the GA-PSO-optimized XGBoost hydrological time series prediction model;
Step S3.1: initialize the value ranges of the learning rate lr, the number of base learners n_estimators, the minimum leaf weight min_weights and the maximum tree depth max_depth of the XGBoost model: the range of lr is set to (0.01, 0.4), the range of n_estimators to (10, 220), the range of gamma to (3, 10) and the range of max_depth to (0, 0.2). For GA-PSO, the initial population number is set to N = 50 and the number of iterations $T^{*}$ of the overall optimization algorithm to 100. In the GA, the crossover probability is cp = 0.85, the mutation probability is mp = 0.05 and the number of iterations is $T_1$ = 50; the improved PSO optimization runs for $T_2$ iterations. The parameters of the XGBoost model are then optimized with the GA-PSO optimization algorithm. The flow is shown in FIG. 2, and the specific steps are as follows:
Step S3.2: randomly generate N subgroups; the chromosome of each particle in a subgroup corresponds to one set of XGBoost parameters (lr, n_estimators, min_weights, max_depth);
Step S3.3: use $R^2$ as the individual fitness value and initialize the fitness values of all particles in the N subgroups of step S3.2;
Step S3.4: perform one round of classical GA optimization on each of the 50 subgroups, ultimately yielding 50 optimal particles. The GA optimization proceeds as follows: each subgroup contains 50 individuals and runs for $T_1$ iterations; selection, crossover and mutation are applied to the 50 encoded individuals to update the population;
Step S3.5: compute the fitness value of each particle after mutation, and update the best individual of the current iteration according to the fitness values;
Step S3.6: return to step S3.4 and continue the classical GA optimization until the iteration limit $T_1$ is reached and the termination condition is satisfied. Each subgroup then has $T_1$ historical best particles; their fitness values are compared and the particle with the highest fitness is taken as the best individual of the subgroup, which gives 50 best individuals from the 50 subgroups;
Step S3.7: decode the N best individuals obtained in step S3.6 and use them as the initial particle swarm of the PSO algorithm for improved PSO optimization, with the number of PSO iterations set to $T_2$;
Step S3.8: initialize the velocities of the initial particles of the PSO algorithm and, still using $R^2$ as the fitness value, update the velocity and position of each particle with the improved formulas, thereby updating each particle's historical best position pbest and the global best position gbest of the swarm;
the particle velocity and position update formulas in PSO are:
$$v_i^{t+1} = \omega v_i^{t} + c_1\,\mathrm{rand}_1\,(pbest_i^{t} - x_i^{t}) + c_2\,\mathrm{rand}_2\,(gbest^{t} - x_i^{t})$$
$$x_i^{t+1} = x_i^{t} + v_i^{t+1}$$
where $v_i^{t}$ is the velocity of the particle at the current time t, $x_i^{t}$ is the position of the particle at the current time t, $pbest_i^{t}$ is the individual extreme point, $gbest^{t}$ is the global extreme point, $\omega$ is the inertia weight, $c_1$ and $c_2$ are the learning factors, and $\mathrm{rand}_1$ and $\mathrm{rand}_2$ are random numbers in the interval [0, 1];
a nonlinearly decreasing weight method is used for the inertia weight $\omega$:
the learning factors are likewise nonlinear functions of the weight:
Step S3.9: check whether the current iteration count is less than or equal to $T_2$; if so, return to step S3.8 and continue the current PSO optimization, otherwise go to step S3.10;
Step S3.10: check whether the total number of iterations has reached $T^{*}$. If not, randomly select K = N/2 = 25 of the historical best particles from the PSO to replace K individuals in each GA subgroup of step S3.2, then return to step S3.2 and continue the optimization; if so, output the optimal solution;
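The PSO stage of steps S3.7 to S3.9 can be sketched as follows. The exact nonlinear formulas for the inertia weight and the learning factors are not recoverable from this text, so the sketch makes its own assumptions: a quadratic decrease of $\omega$ from 0.9 to 0.4, constant learning factors $c_1 = c_2 = 2$, zero initial velocities, and clipping of positions to the parameter bounds. The names `inertia_weight` and `pso_refine` are illustrative, and `fitness` is any callable to maximize.

```python
import numpy as np

def inertia_weight(t, t2, w_max=0.9, w_min=0.4):
    """Assumed nonlinear (quadratic) decrease from w_max to w_min."""
    return w_max - (w_max - w_min) * (t / t2) ** 2

def pso_refine(fitness, init_positions, bounds, t2=50, c1=2.0, c2=2.0, rng=None):
    """Refine the GA's best individuals with PSO; return (gbest, gbest_fitness)."""
    rng = np.random.default_rng() if rng is None else rng
    low, high = np.asarray(bounds, dtype=float).T
    x = np.clip(np.asarray(init_positions, dtype=float), low, high)
    v = np.zeros_like(x)                          # assumed initial velocity
    pbest = x.copy()
    pbest_fit = np.array([fitness(p) for p in x])
    g = int(pbest_fit.argmax())
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for t in range(1, t2 + 1):
        w = inertia_weight(t, t2)
        r1 = rng.random(x.shape)
        r2 = rng.random(x.shape)
        # Velocity update, then position update kept inside the bounds.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, low, high)
        fit = np.array([fitness(p) for p in x])
        improved = fit > pbest_fit
        pbest[improved] = x[improved]
        pbest_fit[improved] = fit[improved]
        g = int(pbest_fit.argmax())
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest, gbest_fit
```

Because pbest and gbest are only replaced by strictly better positions, the returned fitness is never worse than the best of the initial particles handed over from the GA stage.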
Step S4: test the GA-PSO-optimized XGBoost hydrological prediction model.
The loss function of the GA-PSO-optimized XGBoost hydrological time series prediction model is set as:
$$Obj = \sum_{i=1}^{n} l(\hat{y}_i, y_i) + \sum_{k=1}^{K} \Omega(f_k), \qquad \Omega(f_k) = \gamma M + \frac{1}{2}\lambda \sum_{m=1}^{M} w_m^{2}$$
where $l(\hat{y}_i, y_i)$ is the loss function, which measures the difference between the predicted value $\hat{y}_i$ and the actual value $y_i$; K is the number of decision trees contained in the model; $\Omega(f_k)$ is the regularization term, in which $\gamma$ is the penalty constant of the gain function for splitting leaf nodes, M is the number of leaf nodes, $\lambda$ is the coefficient of the L2 regularization penalty and $w_m$ is the weight of leaf m;
the simplified objective function of the j-th training round is:
$$Obj^{(j)} \simeq \sum_{i=1}^{n}\left[g_i f_j(x_i) + \frac{1}{2} h_i f_j^{2}(x_i)\right] + \Omega(f_j)$$
where $g_i = \partial_{\hat{y}_i^{(j-1)}} l(\hat{y}_i^{(j-1)}, y_i)$ is the first derivative of the loss function and $h_i = \partial^{2}_{\hat{y}_i^{(j-1)}} l(\hat{y}_i^{(j-1)}, y_i)$ is the second derivative of the loss function.
In the embodiment, the optimal parameters of the XGBoost model with parameters optimized by the GA-PSO optimization algorithm in the forecast period of 1 to 6 hours are shown in table 1 below:
TABLE 1
The flow data of the Longshan station are predicted with the optimal model and compared with the predictions of an SVM (support vector machine) model and an LSTM (long short-term memory) model; the prediction results are shown in FIG. 4. Four evaluation indices are used for the prediction results: MRE (mean relative error), MAE (mean absolute error), RMSE (root mean square error) and $R^2$. They are calculated as follows:
$$MRE = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|, \qquad MAE = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}, \qquad R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^{2}}$$
where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, $\bar{y}$ is the mean of the actual values and n is the number of samples.
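The four evaluation indices can be computed with a few lines of NumPy. The sketch uses their usual textbook definitions, which is an assumption because the formula images are missing from this text; the function name `evaluate` is illustrative.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return MRE, MAE, RMSE and R2 for one set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    return {
        # MRE assumes no actual value is zero.
        "MRE": float(np.mean(np.abs(err / y_true))),
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "R2": float(1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)),
    }
```

Lower MRE, MAE and RMSE are better, while $R^2$ is better the closer it is to 1, which is why $R^2$ can serve directly as the fitness value to maximize in the GA and PSO stages.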
Table 2 compares the predicted values of XGBoost with the optimal parameters against those of the two other prediction models, SVM and LSTM, for a forecast period of 1 h.
TABLE 2
Table 3 shows the comparison of the evaluation indexes of the three models in all the forecast periods.
TABLE 3
Fig. 3 shows a fitness value change curve of the GA-PSO optimization algorithm (GPSO for short) at a forecast period of 1h, compared with the classical GA and the classical PSO algorithms. Two detailed sequences (471, 481) and (2068, 2107) in the test set are selected for display in fig. 4 and fig. 5, respectively.
Claims (5)
1. A hydrological time series prediction method based on GA-PSO-optimized XGBoost, characterized by comprising the following steps:
Step S1: collect the rainfall values of all rainfall stations of a water system basin within a certain time period and the water levels of the corresponding water level stations, and organize them into a hydrological time series dataset;
Step S2: preprocess each hydrological sample in the hydrological time series dataset of step S1, and divide the sample dataset into a hydrological training dataset L and a hydrological test dataset T;
Step S3: optimize the learning rate lr, the number of base learners n_estimators, the minimum leaf weight min_weights and the maximum tree depth max_depth of the XGBoost model with an improved GA-PSO combined optimization algorithm, and train the XGBoost model on the hydrological training dataset L to obtain the GA-PSO-optimized XGBoost hydrological time series prediction model; specifically:
Step S3.1: initialize the value ranges of the learning rate lr, the number of base learners n_estimators, the minimum leaf weight min_weights and the maximum tree depth max_depth of the XGBoost model, and set the number of iterations of the overall GA-PSO optimization algorithm to $T^{*}$;
Step S3.2: randomly generate N subgroups; the chromosome of each particle in a subgroup corresponds to one set of XGBoost parameters (lr, n_estimators, min_weights, max_depth);
Step S3.3: use $R^2$ as the individual fitness value and initialize the fitness values of all particles in the N subgroups of step S3.2;
Step S3.4: perform classical GA optimization on the N subgroups, ultimately yielding N optimal particles. The GA optimization proceeds as follows: each subgroup contains m individuals and runs for $T_1$ iterations; selection, crossover and mutation are applied to the m encoded individuals to update the population;
Step S3.5: compute the fitness value of each particle after mutation, and update the best individual of the current iteration according to the fitness values;
Step S3.6: return to step S3.4 and continue the GA optimization of the subgroups until the iteration limit $T_1$ is reached and the termination condition is satisfied. Each subgroup then has $T_1$ historical best particles; their fitness values are compared and the particle with the highest fitness is taken as the best individual of the subgroup, which gives N best individuals from the N subgroups;
Step S3.7: decode the N best individuals obtained in step S3.6 and use them as the initial particle swarm of the PSO algorithm for improved PSO optimization, with the number of PSO iterations set to $T_2$;
Step S3.8: initialize the velocities of the initial particles of the PSO algorithm and, still using $R^2$ as the fitness value, update the velocity and position of each particle with the improved formulas, thereby updating each particle's historical best position pbest and the global best position gbest of the swarm;
Step S3.9: check whether the current iteration count is less than or equal to $T_2$; if so, return to step S3.8 and continue the current PSO optimization, otherwise go to step S3.10;
Step S3.10: check whether the total number of iterations has reached $T^{*}$. If not, randomly select K of the historical best particles from the PSO to replace K individuals in each GA subgroup of step S3.2, then return to step S3.2 and continue the optimization; if so, output the optimal solution;
Step S4: test the optimal GA-PSO-optimized XGBoost hydrological prediction model obtained in step S3 on the test set T.
2. The hydrological time series prediction method based on GA-PSO-optimized XGBoost according to claim 1, characterized in that: the hydrological time series dataset in step S1 includes the current and previous 7 hours of rainfall values of the rainfall stations of the water system basin, and the current and previous 7 hours of flow values of the corresponding hydrological station.
3. The hydrological time series prediction method based on GA-PSO-optimized XGBoost according to claim 1, characterized in that: the preprocessing of the hydrological sample data x(t) in step S2 includes missing value processing, error value correction and normalization;
the normalization formula is as follows:
$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x^{*}$ is the normalized value, $x$ is the initial value, $x_{\min}$ is the minimum value in the original sequence and $x_{\max}$ is the maximum value in the original sequence;
the first 80% of the preprocessed hydrological time series dataset is taken as the hydrological training dataset L, and the remaining 20% as the hydrological test dataset T.
4. The hydrological time series prediction method based on GA-PSO-optimized XGBoost according to claim 1, characterized in that: the particle velocity and position update formulas in step S3.8 are:
$$v_i^{t+1} = \omega v_i^{t} + c_1\,\mathrm{rand}_1\,(pbest_i^{t} - x_i^{t}) + c_2\,\mathrm{rand}_2\,(gbest^{t} - x_i^{t})$$
$$x_i^{t+1} = x_i^{t} + v_i^{t+1}$$
where $v_i^{t}$ is the velocity of the particle at the current time t, $x_i^{t}$ is the position of the particle at the current time t, $pbest_i^{t}$ is the individual extreme point, $gbest^{t}$ is the global extreme point, $\omega$ is the inertia weight, $c_1$ and $c_2$ are the learning factors, and $\mathrm{rand}_1$ and $\mathrm{rand}_2$ are random numbers in the interval [0, 1];
a nonlinearly decreasing weight method is used for the inertia weight $\omega$:
the learning factors are likewise nonlinear functions of the weight:
5. The hydrological time series prediction method based on GA-PSO-optimized XGBoost according to claim 1, characterized in that: the detailed process of step S4 is:
the loss function of the GA-PSO-optimized XGBoost hydrological time series prediction model is set as:
$$Obj = \sum_{i=1}^{n} l(\hat{y}_i, y_i) + \sum_{k=1}^{K} \Omega(f_k), \qquad \Omega(f_k) = \gamma M + \frac{1}{2}\lambda \sum_{m=1}^{M} w_m^{2}$$
where $l(\hat{y}_i, y_i)$ is the loss function, which measures the difference between the predicted value $\hat{y}_i$ and the actual value $y_i$; K is the number of decision trees contained in the model; $\Omega(f_k)$ is the regularization term, in which $\gamma$ is the penalty constant of the gain function for splitting leaf nodes, M is the number of leaf nodes, $\lambda$ is the coefficient of the L2 regularization penalty and $w_m$ is the weight of leaf m;
the simplified objective function of the j-th training round is:
$$Obj^{(j)} \simeq \sum_{i=1}^{n}\left[g_i f_j(x_i) + \frac{1}{2} h_i f_j^{2}(x_i)\right] + \Omega(f_j)$$
where $g_i = \partial_{\hat{y}_i^{(j-1)}} l(\hat{y}_i^{(j-1)}, y_i)$ is the first derivative of the loss function and $h_i = \partial^{2}_{\hat{y}_i^{(j-1)}} l(\hat{y}_i^{(j-1)}, y_i)$ is the second derivative of the loss function;
the test set is then tested with the optimal parameters of the XGBoost model found by the GA-PSO optimization algorithm.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110049321.0A CN112733996B (en) | 2021-01-14 | 2021-01-14 | GA-PSO (genetic algorithm-particle swarm optimization) based hydrological time sequence prediction method for optimizing XGboost |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112733996A (en) | 2021-04-30 |
CN112733996B (en) | 2022-07-12 |
Family
ID=75593039
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110049321.0A Active CN112733996B (en) | 2021-01-14 | 2021-01-14 | GA-PSO (genetic algorithm-particle swarm optimization) based hydrological time sequence prediction method for optimizing XGboost |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112733996B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113326660B (en) * | 2021-06-17 | 2022-11-29 | 广西路桥工程集团有限公司 | Tunnel surrounding rock extrusion deformation prediction method based on GA-XGboost model |
CN113503750B (en) * | 2021-06-25 | 2022-07-29 | 太原理工大学 | Method for determining optimal back pressure of direct air cooling unit |
CN113553760A (en) * | 2021-06-25 | 2021-10-26 | 太原理工大学 | Soft measurement method for final-stage exhaust enthalpy of steam turbine |
CN114282431B (en) * | 2021-12-09 | 2023-08-18 | 淮阴工学院 | Runoff interval prediction method and system based on improved SCA and QRGRU |
CN115225560B (en) * | 2022-07-15 | 2023-08-22 | 国网河南省电力公司信息通信公司 | Route planning method in power communication service |
CN115169243A (en) * | 2022-07-28 | 2022-10-11 | 中铁三局集团有限公司 | GA-PSO-GLSSVM algorithm-based soil-rock composite stratum deep foundation pit deformation time sequence prediction method |
CN117272051B (en) * | 2023-11-21 | 2024-03-08 | 浪潮通用软件有限公司 | Time sequence prediction method, device and medium based on LSTM optimization model |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112015719A (en) * | 2020-08-27 | 2020-12-01 | 河海大学 | Regularization and adaptive genetic algorithm-based hydrological prediction model construction method |
Also Published As
Publication number | Publication date |
---|---|
CN112733996A (en) | 2021-04-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112733996B (en) | GA-PSO (genetic algorithm-particle swarm optimization) based hydrological time sequence prediction method for optimizing XGboost | |
CN109214592B (en) | Multi-model-fused deep learning air quality prediction method | |
CN104239489B (en) | Utilize the method for similarity searching and improved BP forecast level | |
CN106650767B (en) | Flood forecasting method based on cluster analysis and real-time correction | |
CN113468803B (en) | WOA-GRU flood flow prediction method and system based on improvement | |
CN111401599B (en) | Water level prediction method based on similarity search and LSTM neural network | |
Piltan et al. | Energy demand forecasting in Iranian metal industry using linear and nonlinear models based on evolutionary algorithms | |
CN110363349B (en) | ASCS-based LSTM neural network hydrological prediction method and system | |
CN113537600B (en) | Medium-long-term precipitation prediction modeling method for whole-process coupling machine learning | |
CN107346459B (en) | Multi-mode pollutant integrated forecasting method based on genetic algorithm improvement | |
CN110969290A (en) | Runoff probability prediction method and system based on deep learning | |
CN106600959A (en) | Traffic congestion index-based prediction method | |
CN116596044B (en) | Power generation load prediction model training method and device based on multi-source data | |
CN115374995A (en) | Distributed photovoltaic and small wind power station power prediction method | |
CN113361761A (en) | Short-term wind power integration prediction method and system based on error correction | |
CN112015719A (en) | Regularization and adaptive genetic algorithm-based hydrological prediction model construction method | |
CN113554466A (en) | Short-term power consumption prediction model construction method, prediction method and device | |
KR102585381B1 (en) | The method, system and equipment for vegetation restoration or rehabilitation of simulating natural ecosystem based on machine learnig | |
CN114580762A (en) | Hydrological forecast error correction method based on XGboost | |
CN113722980A (en) | Ocean wave height prediction method, system, computer equipment, storage medium and terminal | |
CN112330487A (en) | Photovoltaic power generation short-term power prediction method | |
CN115329930A (en) | Flood process probability forecasting method based on mixed deep learning model | |
Shang et al. | Research on intelligent pest prediction of based on improved artificial neural network | |
CN111310974A (en) | Short-term water demand prediction method based on GA-ELM | |
CN116542382A (en) | Sewage treatment dissolved oxygen concentration prediction method based on mixed optimization algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||