CN117273192A - Macroscopic economic prediction method and system based on gradient lifting model

Publication number
CN117273192A
Authority
CN
China
Prior art keywords
data
index data
economic prediction
economic
target
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310976243.8A
Other languages
Chinese (zh)
Inventor
商文颖
赵琳
程孟增
潘霄
张娜
吉星
侯依昕
刘禹彤
胡旌伟
刘广朔
蒋海玮
赵竞智
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
STATE GRID LIAONING ECONOMIC TECHNIQUE INSTITUTE
Original Assignee
STATE GRID LIAONING ECONOMIC TECHNIQUE INSTITUTE

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G06N20/20: Ensemble learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00: Computing arrangements using knowledge-based models
    • G06N5/01: Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201: Market modelling; Market analysis; Collecting market data
    • G06Q30/0202: Market predictions or forecasting for commercial activities
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10: Services
    • G06Q50/26: Government or public services

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Accounting & Taxation (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Tourism & Hospitality (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Human Resources & Organizations (AREA)
  • Finance (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Game Theory and Decision Science (AREA)
  • Computational Linguistics (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Operations Research (AREA)
  • Educational Administration (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the field of comprehensive energy systems and discloses a macroscopic economic prediction method based on a gradient lifting model. The method comprises: obtaining original index data related to the economy, performing correlation analysis on the original index data and the economic prediction to obtain an association degree, and determining target index data; training an original economic prediction model on sample index data to obtain a target economic prediction model; and taking the target index data as a data set and inputting the data set into the target economic prediction model to obtain an economic prediction result. Based on the gradient lifting model, the invention can accurately predict GDP growth from existing index data and provides a reference for decision makers.

Description

Macroscopic economic prediction method and system based on gradient lifting model
Technical Field
The invention belongs to the field of comprehensive energy systems and relates to a macroscopic economic prediction method based on a gradient lifting model.
Background
The ability to predict macroscopic economic variables is essential for designing and implementing timely policy actions. Among these variables, real GDP growth is one of the most important. However, measuring real GDP growth requires complex calculations, and official data typically become available only after a delay of at least one quarter. Because of this delay, policy makers often design and implement policies without the necessary information, so accurately predicting real GDP growth in advance would be of great value. Predicting macroscopic economic data such as real GDP growth is not a simple process. A conventional economic prediction model requires the relevant variables to be determined in advance, taking into account the causal relationships between dependent and independent variables, and usually follows a top-down, theory-driven approach. This process also depends on the forecaster's economic intuition and judgment about the data and methods used, and flawed assumptions may lead to inaccurate predictions. In view of this, how to improve the accuracy of macroscopic economic prediction is a problem to be solved.
Disclosure of Invention
This section is intended to outline some aspects of embodiments of the invention and to briefly introduce some preferred embodiments. Some simplifications or omissions may be made in this section as well as in the description summary and in the title of the application, to avoid obscuring the purpose of this section, the description summary and the title of the invention, which should not be used to limit the scope of the invention.
The present invention has been made in view of the above-described problems occurring in the prior art.
Therefore, the present invention provides a macroscopic economic prediction method based on a gradient lifting model, which aims to improve the accuracy of macroscopic economic prediction.
In order to solve the technical problems, the invention provides a macroscopic economic prediction method based on a gradient lifting model, which comprises the following steps:
acquiring original index data related to economy, performing correlation analysis on the original index data and the economic forecast to obtain a correlation degree, and determining target index data; training the original economic prediction model based on sample index data to obtain a target economic prediction model; and taking the target index data as a data set, and inputting the data set data into a target economic prediction model to obtain an economic prediction result.
As a preferable scheme of the gradient lifting model-based macroscopic economic prediction method of the invention, the method comprises the following steps: obtaining the association degree comprises performing dimensionless processing on the original index data, computing the absolute differences from a reference sequence, and calculating the gray correlation degree;
the dimensionless processing uses the mean value method, expressed as:
$$a_i(l) = \frac{x_i(l)}{\bar{x}_i}, \qquad \bar{x}_i = \frac{1}{N}\sum_{l=1}^{N} x_i(l)$$
the absolute value of the difference between the reference sequence and each comparison sequence is computed term by term, and the resulting absolute difference matrix is expressed as:
$$\Delta_{0i}(l) = |a_0(l) - a_i(l)|$$
the association coefficient and the gray correlation degree are expressed as:
$$\zeta_{0i}(l) = \frac{\Delta_{\min} + \rho\,\Delta_{\max}}{\Delta_{0i}(l) + \rho\,\Delta_{\max}}, \qquad r_{0i} = \frac{1}{N}\sum_{l=1}^{N} \zeta_{0i}(l)$$
wherein $\rho$ is the resolution factor, taken from 0.1 to 0.5; $\zeta_{0i}(l)$ is the association coefficient, whose value does not exceed 1, between the $i$-th comparison sequence $a_i$ and the reference sequence $a_0$ at index $l$; $r_{0i}$ is the gray correlation degree; $N$ is the length of each sequence; $x_0$ is the reference variable sequence data; $x_i$ is the variable sequence data of the $i$-th evaluation object; $l$ is the current evaluation index number; $i$ is the evaluation object number; $a_i(l)$ is the value of the $i$-th evaluation object on the $l$-th index variable after dimensionless processing; $\Delta_{0i}(l)$ is the absolute difference between the processed evaluation object sequence and the reference sequence at index $l$; and $\Delta_{\min}$ and $\Delta_{\max}$ are respectively the minimum and maximum values in the absolute difference matrix.
As a preferable scheme of the gradient lifting model-based macroscopic economic prediction method of the invention, the method comprises the following steps: the determining of the target index data comprises comparing and judging the association degree obtained by carrying out correlation analysis on the original index data and economic prediction with a preset association degree threshold value:
when the association degree is larger than an association degree threshold value, taking the original index data as target index data;
and when the association degree is smaller than the association degree threshold value, the original index data is discarded and is not used as target index data.
As a preferable scheme of the gradient lifting model-based macroscopic economic prediction method of the invention, the method comprises the following steps: the target economic prediction model is obtained from sample index data, wherein the sample index data comprises training sample index data and verification sample index data;
the training sample index data is used to train the original economic prediction model to obtain a trained first economic prediction model;
the verification sample index data is used to cross-verify the first economic prediction model to obtain a verification result, and the target economic prediction model is determined based on the verification result.
As a preferable scheme of the gradient lifting model-based macroscopic economic prediction method of the invention, the method comprises the following steps: obtaining the first economic prediction model comprises calculating a predicted value to obtain an initial learner, calculating the loss function, and judging the error rate of the first economic prediction model result;
the predicted value is calculated by initializing a weak learner:
$$F_0(x) = \arg\min_{c} \sum_{i=1}^{N} L(y_i, c)$$
at initialization, $c$ takes the average of all training sample label values (the minimizer under the squared loss), and the resulting initial learner is expressed as:
$$f_0(x) = c$$
wherein $y_i$ is the observed value, $c$ is the predicted value, $L$ is the loss function, $F_0(x)$ is the average value of the observed values, $N$ is the number of training samples, and $f_0(x)$ is the initial learner.
As a preferable scheme of the gradient lifting model-based macroscopic economic prediction method of the invention, the method comprises the following steps: calculating the loss function comprises calculating the negative gradient, calculating the best fit value and updating the learner;
for each sample sequence number i=1, 2, …, N, the negative gradient is calculated as:
$$\gamma_{mi} = -\left[\frac{\partial L(y_i, F(x_i))}{\partial F(x_i)}\right]_{F(x)=F_{m-1}(x)}$$
wherein $F(x_i)$ is the learner; $\gamma_{mi}$ is the residual, which serves as the new true value of sample $i$; the data $(x_i, \gamma_{mi})$ are used as training data of the next tree to obtain the target regression tree, where $m$ is the serial number of the tree; $R_{mj}$, $j = 1, 2, \ldots, J$, are the leaf node areas of the target regression tree; $J$ is the number of leaf nodes of the regression tree; and $j$ is the sequence number of a leaf node;
for the leaf nodes j=1, 2, …, J, the best fit value is calculated as:
$$c_{mj} = \arg\min_{c} \sum_{x_i \in R_{mj}} L\big(y_i, F_{m-1}(x_i) + c\big)$$
wherein $F_{m-1}(x)$ is the learner obtained after the (m-1)-th regression tree, and $c_{mj}$ is the value that minimizes the least squares loss over $R_{mj}$.
As a preferable scheme of the gradient lifting model-based macroscopic economic prediction method of the invention, the method comprises the following steps: updating the learner comprises updating the $F_m(x)$ function in each of the M iterations; after all M iterations, the learner of the GBDT is expressed as:
$$F_m(x) = F_{m-1}(x) + \nu \sum_{j=1}^{J} c_{mj}\, I(x \in R_{mj}), \qquad F_M(x) = F_0(x) + \nu \sum_{m=1}^{M} \sum_{j=1}^{J} c_{mj}\, I(x \in R_{mj})$$
wherein $F_m(x)$ is the m-th learner, $F_{m-1}(x)$ is the (m-1)-th learner, $\nu$ is the learning rate (the learning rate is set and the loss function to be used is user-defined), $I$ is the indicator function of the leaf node area, and $\sum_{j=1}^{J} c_{mj}\, I(x \in R_{mj})$ is the m-th regression tree;
after the learner is updated, data are input into the first economic prediction model to obtain a first economic prediction result, and the error rate of the first economic prediction model is determined based on the first economic prediction result:
when the error rate is smaller than or equal to a preset error rate threshold, the first economic prediction model is used as the target economic prediction model;
and when the error rate is greater than a preset error rate threshold, the first economic prediction model is not used as a target economic prediction model, and the learner is updated again.
Another object of the present invention is to provide a system for the macroscopic economic prediction method based on a gradient lifting model. The system obtains original index data related to the economy, wherein the original index data includes energy data and economic data; performs correlation analysis on the original index data and the economic prediction to obtain an association degree and determines target index data; and takes the target index data as a data set, inputting the training set and verification set data into the target economic prediction model to obtain an economic prediction result.
The macroscopic economic prediction system based on the gradient lifting model comprises a data acquisition module, a correlation analysis module and a prediction module.
The data acquisition module is configured to acquire original index data related to the economy.
The correlation analysis module is configured to perform correlation analysis on the original index data and the economic prediction to obtain an association degree, and to determine target index data based on the association degree.
The prediction module is configured to input the target index data into the target economic prediction model to obtain an economic prediction result.
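The three modules can be pictured as a small pipeline. The following is only an illustrative sketch: the class name `MacroPredictionSystem`, the callable-based wiring and the stub functions are hypothetical and are not taken from the patent, which does not specify an implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

IndexData = Dict[str, List[float]]


@dataclass
class MacroPredictionSystem:
    """Hypothetical wiring of the three modules described in the text."""
    acquire: Callable[[], IndexData]           # data acquisition module
    select: Callable[[IndexData], IndexData]   # correlation analysis module
    predict: Callable[[IndexData], float]      # prediction module

    def run(self) -> float:
        raw = self.acquire()            # original index data
        target = self.select(raw)       # target index data
        return self.predict(target)     # economic prediction result


# Stub modules with made-up data, for illustration only.
def acquire_stub() -> IndexData:
    return {"electricity": [1.0, 2.0], "noise": [5.0, 1.0]}


def select_stub(raw: IndexData) -> IndexData:
    # Placeholder for association-degree screening: keep one index.
    return {"electricity": raw["electricity"]}


def predict_stub(target: IndexData) -> float:
    # Placeholder for the target economic prediction model.
    return target["electricity"][-1]


system = MacroPredictionSystem(acquire_stub, select_stub, predict_stub)
result = system.run()
```

Passing the modules in as callables keeps each stage replaceable, which matches the text's separation of acquisition, correlation analysis and prediction.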
A computer device comprising a memory and a processor, said memory storing a computer program, characterized in that said processor, when executing said computer program, implements the steps of a macro-economic prediction method based on a gradient lifting model.
A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of a macroscopic economic prediction method based on a gradient lifting model.
The invention has the following beneficial effects: a macroscopic economic prediction method based on a gradient lifting model and related equipment are provided. Original index data related to the economy are obtained, wherein the original index data comprise energy data and economic data; correlation analysis is performed on the original index data and the economic prediction to obtain an association degree, target index data are determined, and the target index data are taken as a data set; the data set is divided into a training set and a verification set, and the data of the training set and the verification set are input into the target economic prediction model to obtain an economic prediction result. With this scheme, GDP growth can be predicted accurately from existing index data, providing a reference for decision makers.
Drawings
For a clearer description of the technical solutions of embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the description below are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art, wherein:
Fig. 1 is a schematic flow chart of a macroscopic economic prediction method based on a gradient lifting model according to an embodiment of the present invention.
Fig. 2 is a flowchart of the macroscopic economic prediction method based on a gradient lifting model according to an embodiment of the present invention.
Fig. 3 is a flow chart of the decision tree used in the macroscopic economic prediction method based on a gradient lifting model according to an embodiment of the present invention.
Fig. 4 is a model diagram of the gradient lifting model used in the method according to an embodiment of the present invention.
Fig. 5 is a schematic structural diagram of the gradient lifting model used in the method according to an embodiment of the present invention.
Fig. 6 is a schematic structural diagram of the cross-validation algorithm used in the method according to an embodiment of the present invention.
Fig. 7 is a schematic structural diagram of macroscopic economic prediction based on the gradient lifting model according to an embodiment of the present invention.
Fig. 8 is a schematic structural diagram of an electronic device for the method according to an embodiment of the present invention.
Fig. 9 is a schematic workflow diagram of a macroscopic economic prediction system based on a gradient lifting model according to an embodiment of the present invention.
Detailed Description
So that the manner in which the above recited objects, features and advantages of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to the embodiments, some of which are illustrated in the appended drawings. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways other than those described herein, and persons skilled in the art will readily appreciate that the present invention is not limited to the specific embodiments disclosed below.
Further, reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic can be included in at least one implementation of the invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
While the embodiments of the present invention have been illustrated and described in detail in the drawings, the cross-sectional view of the device structure is not to scale in the general sense for ease of illustration, and the drawings are merely exemplary and should not be construed as limiting the scope of the invention. In addition, the three-dimensional dimensions of length, width and depth should be included in actual fabrication.
Also in the description of the present invention, it should be noted that the orientation or positional relationship indicated by the terms "upper, lower, inner and outer", etc. are based on the orientation or positional relationship shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the apparatus or elements referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first, second, or third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
The terms "mounted, connected, and coupled" should be construed broadly in this disclosure unless otherwise specifically indicated and defined, such as: can be fixed connection, detachable connection or integral connection; it may also be a mechanical connection, an electrical connection, or a direct connection, or may be indirectly connected through an intermediate medium, or may be a communication between two elements. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.
Example 1
Referring to fig. 1, a first embodiment of the present invention provides a macroscopic economic prediction method based on a gradient lifting model, including:
S1: Acquire the original index data related to the economy, perform correlation analysis on the original index data and the economic prediction to obtain an association degree, and determine the target index data.
Further, economic analysis is performed: after the historical data are obtained, the data are preprocessed and features are extracted; a gradient lifting model is constructed based on the historical data and the feature data; and a recursive algorithm combining cross-validation and hyperparameter tuning is applied to find the hyperparameter combination that minimizes the mean of the MSE.
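The idea of choosing the hyperparameter combination with the smallest mean cross-validated MSE can be sketched as follows. This is a minimal illustration under stated assumptions: it uses a plain exhaustive grid search rather than the recursive algorithm mentioned in the text, a toy forecasting function in place of the gradient lifting model, and hypothetical hyperparameters `window` and `trend_weight`.

```python
from itertools import product
from statistics import mean
from typing import Dict, List, Sequence, Tuple


def forecast(history: Sequence[float], window: int, trend_weight: float) -> float:
    """Toy stand-in model: recent mean plus a weighted last difference."""
    level = mean(history[-window:])
    trend = history[-1] - history[-2]
    return level + trend_weight * trend


def cv_mean_mse(series: Sequence[float], window: int, trend_weight: float,
                min_train: int = 3) -> float:
    """Mean squared one-step-ahead error over expanding training windows."""
    errors = []
    for split in range(min_train, len(series)):
        train, actual = series[:split], series[split]
        errors.append((actual - forecast(train, window, trend_weight)) ** 2)
    return mean(errors)


def grid_search(series: Sequence[float],
                grid: Dict[str, List[float]]) -> Tuple[Dict[str, float], float]:
    """Return the parameter combination with the lowest mean CV MSE."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for combo in product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = cv_mean_mse(series, **params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


series = [float(t) for t in range(12)]   # hypothetical, perfectly linear series
grid = {"window": [1, 2, 3], "trend_weight": [0.0, 0.5, 1.0]}
best_params, best_score = grid_search(series, grid)
```

Each training window contains only observations that precede the point being predicted, so the search never scores a combination using future data.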
It should be noted that obtaining the association degree comprises performing dimensionless processing on the original index data and calculating the gray correlation degree with respect to a reference sequence. A reference data sequence $x_0 = (x_0(1), \ldots, x_0(N))$ and comparison sequences $x_i = (x_i(1), \ldots, x_i(N))$, $i = 1, \ldots, n$, are determined. After dimensionless processing by the mean value method, $a_i(l) = x_i(l) / \bar{x}_i$ with $\bar{x}_i = \frac{1}{N}\sum_{l=1}^{N} x_i(l)$, the n+1 data sequences form the matrix:
$$(a_0, a_1, \ldots, a_n) = \begin{pmatrix} a_0(1) & a_1(1) & \cdots & a_n(1) \\ a_0(2) & a_1(2) & \cdots & a_n(2) \\ \vdots & \vdots & & \vdots \\ a_0(N) & a_1(N) & \cdots & a_n(N) \end{pmatrix}$$
The absolute value of the difference between the reference sequence and each comparison sequence is calculated term by term to form the absolute difference matrix:
$$\Delta_{0i}(l) = |a_0(l) - a_i(l)|$$
For a comparison sequence $a_i$ and the reference sequence $a_0$, the gray correlation degree is the average of the N association coefficients:
$$\zeta_{0i}(l) = \frac{\Delta_{\min} + \rho\,\Delta_{\max}}{\Delta_{0i}(l) + \rho\,\Delta_{\max}}, \qquad r_{0i} = \frac{1}{N}\sum_{l=1}^{N} \zeta_{0i}(l)$$
wherein $\rho$ is the resolution factor, taken from 0.1 to 0.5; $\zeta_{0i}(l)$ is the association coefficient, whose value does not exceed 1, between the $i$-th comparison sequence $a_i$ and the reference sequence $a_0$ at index $l$; $r_{0i}$ is the gray correlation degree; $N$ is the length of each sequence; $x_0$ is the reference variable sequence data; $x_i$ is the variable sequence data of the $i$-th evaluation object; $l$ is the current evaluation index number; $i$ is the evaluation object number; $a_i(l)$ is the value of the $i$-th evaluation object on the $l$-th index variable after dimensionless processing; $\Delta_{0i}(l)$ is the absolute difference between the processed evaluation object sequence and the reference sequence at index $l$; and $\Delta_{\min}$ and $\Delta_{\max}$ are respectively the minimum and maximum values in the absolute difference matrix.
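The gray correlation degree calculation can be sketched in a few lines of Python. This is a minimal sketch assuming the standard gray relational analysis formulas with mean-value normalization; the function names and the sample data are hypothetical, and `rho` defaults to 0.5, the upper end of the range stated above.

```python
from typing import Dict, List, Sequence


def mean_normalize(seq: Sequence[float]) -> List[float]:
    """Mean-value dimensionless processing: divide each term by the mean."""
    seq_mean = sum(seq) / len(seq)
    return [value / seq_mean for value in seq]


def grey_relational_degrees(reference: Sequence[float],
                            comparisons: Dict[str, Sequence[float]],
                            rho: float = 0.5) -> Dict[str, float]:
    """Return the gray correlation degree r_0i of each comparison sequence."""
    a0 = mean_normalize(reference)
    # Absolute difference matrix: one row per comparison sequence.
    deltas = {
        name: [abs(r - c) for r, c in zip(a0, mean_normalize(seq))]
        for name, seq in comparisons.items()
    }
    flat = [d for row in deltas.values() for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0.0:
        # Every comparison sequence coincides with the reference.
        return {name: 1.0 for name in deltas}
    degrees = {}
    for name, row in deltas.items():
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        degrees[name] = sum(coeffs) / len(coeffs)
    return degrees


# Hypothetical data: a reference sequence and two candidate indices.
gdp = [100.0, 110.0, 121.0, 133.0]
indices = {
    "electricity_use": [50.0, 55.0, 60.5, 66.5],   # proportional to gdp
    "coal_output": [80.0, 70.0, 90.0, 60.0],       # unrelated pattern
}
degrees = grey_relational_degrees(gdp, indices)
```

A sequence that is proportional to the reference has zero absolute differences after normalization and therefore reaches the maximum degree of 1.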
It should be further noted that determining the target index data comprises comparing the association degree, obtained by the correlation analysis of the original index data and the economic prediction, with a preset association degree threshold: when the association degree is larger than the threshold, the original index data is taken as target index data; when the association degree is smaller than the threshold, the original index data is discarded and is not used as target index data.
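The threshold comparison amounts to a simple filter. In the sketch below, the function name, the association degree values and the threshold are hypothetical; the text does not say what happens when the degree equals the threshold, so the sketch assumes such indices are discarded.

```python
from typing import Dict, List


def select_target_indices(degrees: Dict[str, float], threshold: float) -> List[str]:
    """Keep indices whose association degree is strictly above the threshold."""
    # Assumption: a degree exactly equal to the threshold is discarded.
    return [name for name, degree in degrees.items() if degree > threshold]


# Hypothetical association degrees for three candidate indices.
example_degrees = {"x10": 0.91, "x7": 0.55, "x3": 0.80}
selected = select_target_indices(example_degrees, threshold=0.80)
```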
S2: Train the original economic prediction model based on the sample index data to obtain a target economic prediction model.
Further, the target economic prediction model is obtained from sample index data, wherein the sample index data comprises training sample index data and verification sample index data; the training sample index data is used to train the original economic prediction model to obtain a trained first economic prediction model; the verification sample index data is used to cross-verify the first economic prediction model to obtain a verification result, and the target economic prediction model is determined based on the verification result.
It should be noted that obtaining the first economic prediction model comprises calculating a predicted value to obtain an initial learner, calculating the loss function, and judging the error rate of the first economic prediction model result;
the predicted value is calculated by initializing a weak learner:
$$F_0(x) = \arg\min_{c} \sum_{i=1}^{N} L(y_i, c)$$
the predicted value is obtained by taking the loss function as the squared loss,
so that at initialization $c$ takes the average of all training sample label values, and the resulting initial learner is expressed as:
$$f_0(x) = c$$
wherein $y_i$ is the observed value, $c$ is the predicted value, $L$ is the loss function, $F_0(x)$ is the average value of the observed values, $N$ is the number of training samples, and $f_0(x)$ is the initial learner.
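That the mean of the labels is the constant minimizing the squared loss can be checked numerically. The sketch below assumes the squared loss in the form L(y, c) = (y - c)^2 / 2; the function names and the label values are hypothetical.

```python
from typing import Sequence


def squared_loss(labels: Sequence[float], c: float) -> float:
    """Total squared loss of the constant prediction c."""
    return sum(0.5 * (y - c) ** 2 for y in labels)


def initial_learner(labels: Sequence[float]) -> float:
    """Constant c of the initial learner f_0(x) = c: the label mean."""
    return sum(labels) / len(labels)


labels = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical training labels
c0 = initial_learner(labels)
loss_at_mean = squared_loss(labels, c0)
loss_nearby = [squared_loss(labels, c0 + shift) for shift in (-0.1, 0.1)]
```

Both nearby constants give a larger total loss than `c0`, which is what the argmin in the initialization formula expresses.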
It should also be noted that calculating the loss function comprises calculating the negative gradient, calculating the best fit value, and updating the learner. For each sample sequence number i=1, 2, …, N, the negative gradient is calculated as:
$$\gamma_{mi} = -\left[\frac{\partial L(y_i, F(x_i))}{\partial F(x_i)}\right]_{F(x)=F_{m-1}(x)}$$
wherein $F(x_i)$ is the learner; $\gamma_{mi}$ is the residual, which serves as the new true value of sample $i$; the data $(x_i, \gamma_{mi})$ are used as training data of the next tree to obtain the target regression tree, where $m$ is the serial number of the tree; $R_{mj}$, $j = 1, 2, \ldots, J$, are the leaf node areas of the target regression tree; $J$ is the number of leaf nodes of the regression tree; and $j$ is the sequence number of a leaf node;
furthermore, for the leaf nodes j=1, 2, …, J, the best fit value is calculated as:
$$c_{mj} = \arg\min_{c} \sum_{x_i \in R_{mj}} L\big(y_i, F_{m-1}(x_i) + c\big)$$
wherein $F_{m-1}(x)$ is the learner obtained after the (m-1)-th regression tree, and $c_{mj}$ is the value that minimizes the least squares loss over $R_{mj}$.
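It should also be noted that the learner is updated in each of the M iterations by updating the $F_m(x)$ function; after all M iterations, the learner of the GBDT is expressed as:
$$F_m(x) = F_{m-1}(x) + \nu \sum_{j=1}^{J} c_{mj}\, I(x \in R_{mj}), \qquad F_M(x) = F_0(x) + \nu \sum_{m=1}^{M} \sum_{j=1}^{J} c_{mj}\, I(x \in R_{mj})$$
wherein $F_m(x)$ is the m-th learner, $F_{m-1}(x)$ is the (m-1)-th learner, $\nu$ is the learning rate (the learning rate is set and the loss function to be used is user-defined), $I$ is the indicator function of the leaf node area, and $\sum_{j=1}^{J} c_{mj}\, I(x \in R_{mj})$ is the m-th regression tree.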
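The steps above (initial learner, negative gradient, best fit value per leaf, learner update) can be sketched as a small gradient boosting (the "gradient lifting" model of the text) regressor. This is a simplified illustration, not the patented implementation: it assumes the squared loss, a single input feature, and two-leaf regression trees (J = 2); all function names and the sample data are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Stump = Tuple[float, float, float]   # (split threshold, left value, right value)


def fit_stump(x: Sequence[float], residuals: Sequence[float]) -> Stump:
    """Fit a two-leaf regression tree to the residuals by least squares."""
    values = sorted(set(x))
    best = None
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        # Under the squared loss, the best fit value of a leaf is its mean.
        c_left, c_right = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - c_left) ** 2 for r in left)
               + sum((r - c_right) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, (threshold, c_left, c_right))
    if best is None:                  # all x identical: no split possible
        c = sum(residuals) / len(residuals)
        return (values[0], c, c)
    return best[1]


def fit_gbdt(x: Sequence[float], y: Sequence[float], n_trees: int = 50,
             learning_rate: float = 0.1) -> Callable[[float], float]:
    """Return a predictor F_M built from n_trees boosted stumps."""
    f0 = sum(y) / len(y)              # initial learner: mean of the labels
    preds = [f0] * len(y)
    stumps: List[Stump] = []
    for _ in range(n_trees):
        # Negative gradient of (y - F)^2 / 2 with respect to F is y - F.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        threshold, c_left, c_right = fit_stump(x, residuals)
        stumps.append((threshold, c_left, c_right))
        preds = [p + learning_rate * (c_left if xi <= threshold else c_right)
                 for xi, p in zip(x, preds)]

    def predict(x_new: float) -> float:
        return f0 + learning_rate * sum(
            c_left if x_new <= t else c_right for t, c_left, c_right in stumps)

    return predict


x = [float(i) for i in range(10)]     # hypothetical feature values
y = [xi ** 2 for xi in x]             # hypothetical target values
model = fit_gbdt(x, y)
baseline = sum(y) / len(y)
mse_model = sum((yi - model(xi)) ** 2 for xi, yi in zip(x, y)) / len(y)
mse_baseline = sum((yi - baseline) ** 2 for yi in y) / len(y)
```

With the squared loss the residual equals the negative gradient, so each tree is fitted directly to what the current learner still gets wrong, and the learning rate shrinks each correction.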
S3: Take the target index data as a data set, and input the data set into the target economic prediction model to obtain an economic prediction result.
Further, after the learner is updated, data are input into the first economic prediction model to obtain a first economic prediction result, and the error rate of the first economic prediction model is determined based on the first economic prediction result: when the error rate is smaller than or equal to a preset error rate threshold, the first economic prediction model is used as the target economic prediction model; when the error rate is greater than the preset error rate threshold, the first economic prediction model is not used as the target economic prediction model, and the learner is updated again.
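The acceptance rule can be sketched as below. The text does not define how the error rate is computed, so this sketch assumes the mean absolute percentage error (MAPE); the function names, data and thresholds are hypothetical.

```python
from typing import Sequence


def mean_absolute_percentage_error(actual: Sequence[float],
                                   predicted: Sequence[float]) -> float:
    """Assumed error rate: mean of |actual - predicted| / |actual|."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def is_target_model(actual: Sequence[float], predicted: Sequence[float],
                    error_threshold: float) -> bool:
    """Accept the first model when its error rate is within the threshold."""
    return mean_absolute_percentage_error(actual, predicted) <= error_threshold


actual_growth = [6.0, 6.5, 7.0]        # hypothetical observed values
predicted_growth = [6.3, 6.5, 6.65]    # hypothetical first prediction result
accepted = is_target_model(actual_growth, predicted_growth, error_threshold=0.05)
accepted_strict = is_target_model(actual_growth, predicted_growth, error_threshold=0.01)
```

When the check fails, the text calls for updating the learner again, for example by running further boosting iterations before re-evaluating.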
It should be noted that the present invention uses k-fold cross-validation, which divides the training data into k segments (folds) and fits and tests the model on them in turn. Because the data are time dependent, the folds are used in chronological order: the first k folds serve as the training set and the fold immediately following them serves as the test set. This ensures that future data are never used to predict past data, since a prediction model should be fitted without any data on events that occur chronologically after the period it is tested on.
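A chronological split of this kind can be sketched as follows. The function name and the exact fold layout (k + 1 equal blocks, with the last test fold absorbing any remainder) are assumptions made for illustration.

```python
from typing import Iterator, List, Tuple


def forward_chaining_splits(n_samples: int,
                            k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists where training data precede test data."""
    if k < 1 or n_samples < k + 1:
        raise ValueError("need k >= 1 and at least k + 1 samples")
    fold = n_samples // (k + 1)
    for i in range(1, k + 1):
        train_end = i * fold
        # The last test fold takes any samples left over by integer division.
        test_end = (i + 1) * fold if i < k else n_samples
        yield list(range(train_end)), list(range(train_end, test_end))


splits = list(forward_chaining_splits(n_samples=10, k=4))
```

For 10 quarterly observations and k = 4, the first split trains on observations 0-1 and tests on 2-3, and the last trains on 0-7 and tests on 8-9.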
Example 2
Referring to figs. 2-8, another embodiment of the present invention provides a macroscopic economic prediction method based on a gradient lifting model; to verify the beneficial effects of the invention, a scientific demonstration is carried out through experiments.
Ten energy-related influence indices, such as electric power and energy indices with a large influence on economic production, and 32 economy-related influence indices are analyzed comprehensively to assess the influence of the energy and economic aspects on China's GDP.
1. Feature analysis
TABLE 1 energy index names and meanings
TABLE 2 economic index names and meanings
1. Correlation analysis in electric power energy data
A. Pearson correlation coefficient
When studying the linear correlation between two data sets, the Pearson correlation coefficient is the most commonly used measure of the degree of correlation. The larger its absolute value, the stronger the linear correlation between the two; its sign indicates whether the correlation is positive or negative.
Variable selection is critical when influencing factors are added for prediction. To screen out suitable influencing variables, the Pearson correlation coefficients between each influencing variable x1-x10 and the GDP indicator y are calculated first.
All ten selected indices have Pearson correlation coefficients with GDP greater than 0.8, and 90% of them are greater than 0.9, so the ten indices selected here have a strong linear relationship with GDP.
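The screening by Pearson correlation coefficient can be sketched as below. The function names and the sample sequences are hypothetical; only the 0.8 cut-off is taken from the text, and the sketch assumes that no sequence is constant (a constant sequence has zero variance and no defined coefficient).

```python
import math
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def screen_by_pearson(candidates: Dict[str, Sequence[float]],
                      target: Sequence[float],
                      min_abs_r: float = 0.8) -> Dict[str, float]:
    """Keep candidates whose |r| with the target exceeds min_abs_r."""
    scores = {name: pearson(seq, target) for name, seq in candidates.items()}
    return {name: r for name, r in scores.items() if abs(r) > min_abs_r}


gdp_series = [1.0, 2.0, 3.0, 4.0, 5.0]           # hypothetical GDP values
candidates = {
    "x_up": [2.0, 4.0, 6.0, 8.0, 10.0],          # rises with GDP
    "x_down": [5.0, 4.0, 3.0, 2.0, 1.0],         # falls as GDP rises
    "x_noise": [1.0, -1.0, 1.0, -1.0, 1.0],      # no linear relation
}
kept = screen_by_pearson(candidates, gdp_series)
```

Filtering on the absolute value keeps strongly negative indices as well as strongly positive ones, since both carry linear information about the target.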
B. Gray correlation analysis
Gray correlation analysis is used to judge the degree of influence of the 10 selected factors on GDP. According to the calculation results, X10 has the highest gray correlation degree with GDP and X7 the lowest. Ranking the factors from largest to smallest, the top four variables are selected as the main indices affecting GDP, namely total social electricity consumption, electricity production, total energy consumption, and consumption of primary electricity and other energy.
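A minimal sketch of a gray correlation degree calculation is given here for orientation. It assumes the mean method for dimensionless treatment and a resolution factor of 0.5, in line with the quantities named in claim 2; the function names and the toy series are hypothetical and the code is not the patent's implementation.

```python
def mean_normalize(seq):
    """Dimensionless treatment by the mean method: divide each value by the series mean."""
    m = sum(seq) / len(seq)
    return [v / m for v in seq]

def gray_relational_degrees(reference, comparisons, rho=0.5):
    """Return {name: gray correlation degree} of each comparison series with the reference."""
    a0 = mean_normalize(reference)
    normalized = {name: mean_normalize(seq) for name, seq in comparisons.items()}
    # Absolute difference matrix between the reference and every comparison series
    deltas = {name: [abs(r - c) for r, c in zip(a0, a)] for name, a in normalized.items()}
    all_deltas = [d for row in deltas.values() for d in row]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0:
        # Every series coincides with the reference after normalization
        return {name: 1.0 for name in deltas}
    degrees = {}
    for name, row in deltas.items():
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        degrees[name] = sum(coeffs) / len(coeffs)
    return degrees

# Hypothetical toy series, for illustration only
gdp = [1.0, 2.0, 3.0, 4.0]
degrees = gray_relational_degrees(gdp, {"a": [10.0, 20.0, 30.0, 40.0], "b": [4.0, 3.0, 2.0, 1.0]})
ranking = sorted(degrees, key=degrees.get, reverse=True)
```

Sorting the degrees from largest to smallest gives the kind of ranking from which the top variables are taken.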
2. Correlation analysis in economic data
Whether the prediction indices selected in modeling are reasonable strongly affects the accuracy of the prediction result. A Pearson correlation coefficient heat map shows the correlation between every pair of features clearly. Pearson correlation coefficient heat maps are plotted from the economic data indices used herein.
For GDP, the 20 indices with the highest importance for GDP are selected for modeling, and the corresponding index system of prediction influence factors is constructed. The invention selects traditional economic indices related to national accounts, employment, currency, trade, and inflation statistics as regressors. National account variables include real government consumption, real private consumption, the balance of payments account, real annual GDP, real GDP growth (quarterly), real GDP growth (year-on-year), the ratio of the government balance to GDP, total government debt as a percentage of GDP, foreign exchange reserves, real inventories, total foreign debt, and foreign direct investment. Employment variables include the total employment rate and the unemployment rate. Monetary variables include the exchange rate against the US dollar, the exchange rate against the euro, and the 10-year government bond yield. Inflation variables include the consumer price index and the GDP deflator. All variables are quarterly data from the fourth quarter of 1981 to the second quarter of 2022. In this study, year-on-year real GDP growth is set as the dependent variable and the other variables are set as independent variables. The number of observations for each variable is 155. More detailed information about the variables.
3. Comprehensive power energy data and economic data aspect index
The method further selects 4 influence index factors related to electric power and energy that strongly affect economic production in the energy economy, namely total social electricity consumption, electricity production, total energy consumption, and consumption of primary electricity and other energy, together with 20 influence index factors on the economic side, including national tax revenue, so as to construct a comprehensive economic prediction index system.
2. Constructing gradient lifting model
The gradient lifting model (gradient boosting) is an ensemble machine learning model proposed by Friedman (2001). Its main idea is to combine multiple weak learners to improve the accuracy and robustness of the final model.
2.1 decision Tree
As shown in fig. 2, input: training data set D; output: regression tree f(x).
In the input space where the training data set is located, each region is recursively divided into two sub-regions and the output value on each sub-region is determined, constructing a binary decision tree:
(1) Select the optimal splitting variable j and splitting point s by solving:
$$\min_{j,s}\left[\min_{c_1}\sum_{x_i\in R_1(j,s)}(y_i-c_1)^2+\min_{c_2}\sum_{x_i\in R_2(j,s)}(y_i-c_2)^2\right]$$
Traverse the variables j; for each fixed splitting variable j, scan the splitting points s; and select the pair (j, s) that minimizes the above expression.
(2) Divide the regions with the selected pair (j, s) and determine the corresponding output values:
$$R_1(j,s)=\{x\mid x^{(j)}\le s\},\quad R_2(j,s)=\{x\mid x^{(j)}>s\}$$
$$\hat{c}_m=\frac{1}{N_m}\sum_{x_i\in R_m(j,s)}y_i,\quad m=1,2$$
where N_m is the number of samples falling in R_m.
(3) Continue to apply steps (1) and (2) to the two sub-regions until the stopping condition is met.
(4) Divide the input space into M regions R_1, R_2, …, R_M and generate the decision tree:
$$f(x)=\sum_{m=1}^{M}\hat{c}_m\,I(x\in R_m)$$
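Steps (1) and (2) can be sketched as an exhaustive search over splitting variables and splitting points. The sketch assumes the usual least-squares criterion, in which each sub-region is scored by the squared error around its mean; the function name `best_split` and the toy data are hypothetical.

```python
def best_split(X, y):
    """Return (j, s, loss) minimizing the summed squared error of the two sub-regions."""
    def sse(values):
        # Squared error around the mean, which is the optimal constant output for a region
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best = None
    n_features = len(X[0])
    for j in range(n_features):                     # traverse the splitting variable j
        for s in sorted({row[j] for row in X}):     # scan the splitting point s
            left = [yi for row, yi in zip(X, y) if row[j] <= s]
            right = [yi for row, yi in zip(X, y) if row[j] > s]
            if not left or not right:
                continue                            # a split must produce two non-empty regions
            loss = sse(left) + sse(right)
            if best is None or loss < best[2]:
                best = (j, s, loss)
    return best

# Hypothetical toy data: the target depends only on feature 0
X = [[1, 5], [2, 3], [3, 8], [4, 1]]
y = [1.0, 1.0, 5.0, 5.0]
j, s, loss = best_split(X, y)
```

Step (3) would call the same search recursively on the two sub-regions until a stopping condition, such as a maximum depth, is met.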
2.2 gradient lifting model
For traditional linear regression models, which focus on interpreting the influence of the regressors, high correlation among predictors may lead to multicollinearity problems. Ensemble models such as gradient boosting, by contrast, focus on prediction: their decision trees do not use all predictors but select a subset of regressors to maximize prediction accuracy, and they are robust to multicollinearity.
The gradient lifting model begins by making a single leaf and then constructs regression trees; a regression tree is a decision tree that estimates a continuous real-valued function rather than acting as a classifier. The regression tree is constructed by an iterative process that repeatedly splits the data into nodes or branches containing smaller groups. Initially, all observations are placed in the same group. The data are then divided into two partitions, with every possible split on every available predictor being tried. The predictor chosen for the split is the one that most clearly divides the observations into two distinct groups and minimizes the residual, which in this study is measured by the Friedman MSE introduced in Friedman (2001).
Based on the errors of the previous tree, the gradient lifting model trains another tree, and it continues to build trees in this way until the specified number of trees is reached or the fit can no longer be improved. To avoid overfitting, the gradient lifting model uses a learning rate to scale the contribution of each new tree. Following Friedman (2001), the algorithm requires the input data and a differentiable loss function, which in this study is the squared-error loss for regression.
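The boosting loop can be sketched as follows. This is a simplified illustration under stated assumptions, not the patent's implementation: it uses a single feature, depth-1 trees (stumps), the squared-error loss (so the negative gradient is the plain residual), and a fixed number of trees; the names `fit_stump` and `gradient_boost` are hypothetical.

```python
def fit_stump(x, residuals):
    """Fit a depth-1 regression tree on one feature by least squares."""
    best = None
    for s in sorted(set(x))[:-1]:  # the largest value would leave the right side empty
        left = [r for xi, r in zip(x, residuals) if xi <= s]
        right = [r for xi, r in zip(x, residuals) if xi > s]
        c1, c2 = sum(left) / len(left), sum(right) / len(right)
        loss = sum((r - c1) ** 2 for r in left) + sum((r - c2) ** 2 for r in right)
        if best is None or loss < best[0]:
            best = (loss, s, c1, c2)
    _, s, c1, c2 = best
    return lambda xi: c1 if xi <= s else c2

def gradient_boost(x, y, n_trees=50, learning_rate=0.1):
    """Return a predict function built from an initial leaf plus n_trees scaled stumps."""
    f0 = sum(y) / len(y)           # initial leaf: mean of the observed values
    trees = []
    pred = [f0] * len(y)
    for _ in range(n_trees):
        # For squared-error loss the negative gradient equals the residual y - F(x)
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        tree = fit_stump(x, residuals)
        trees.append(tree)
        # The learning rate scales the contribution of each new tree
        pred = [pi + learning_rate * tree(xi) for pi, xi in zip(pred, x)]

    def predict(xi):
        return f0 + learning_rate * sum(t(xi) for t in trees)
    return predict

# Hypothetical toy data with a step between x = 3 and x = 4
x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
predict = gradient_boost(x, y)
```

A smaller learning rate slows how quickly the residuals shrink, which is why it is usually tuned together with the number of trees.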
3. Cross validation
The machine learning model has several hyperparameters. To tune them, this study uses k-fold cross-validation, which divides the training data into k segments (folds) and evaluates the fitted model on each segment in turn. Because the data are time-dependent, the folds are used in chronological order: the earlier folds serve as the training set and the fold that follows them serves as the test set. This ensures that future data will not be used to predict past data, as the predictive model should exclude all data on events that occur chronologically after the events used to fit the model. In this study, following the previous literature, k is set to 10, so the training data are divided into 10 subsets used to train and fit the model; fig. 6 is a schematic diagram of the cross-validation.
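One possible reading of this chronological scheme is an expanding-window split, sketched here. The exact fold layout of fig. 6 is not recoverable from the text, so the layout (k + 1 equal blocks, with the last test fold absorbing the remainder) and the function name `expanding_window_folds` are assumptions.

```python
def expanding_window_folds(n_samples, k=10):
    """Yield (train_indices, test_indices); every test fold lies after its training data."""
    if n_samples < k + 1:
        raise ValueError("need at least k + 1 samples")
    fold_size = n_samples // (k + 1)
    for i in range(1, k + 1):
        train_end = i * fold_size
        # The last fold takes whatever remains so that no observation is dropped
        test_end = n_samples if i == k else train_end + fold_size
        yield list(range(train_end)), list(range(train_end, test_end))

# 155 quarterly observations and k = 10, as stated in the text
folds = list(expanding_window_folds(155, k=10))
```

With 155 observations this gives ten folds whose training windows grow from 14 to 140 quarters, and no test index ever precedes a training index.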
The cross-validation process aims to select the optimal hyperparameter set, that is, the one that yields the lowest average mean squared error (MSE) over the tests on the 10 subsets. The optimal hyperparameter set identified by cross-validation is then used to make predictions on the test data set. The hyperparameter tuning strategy used in this study is a grid search, in which all possible combinations of the given hyperparameter values are tested. Regarding the number of predictors, all predictors are considered, and the depth of the trees is controlled by the number of splits in the gradient boosting model and the random forest model. The purpose of cross-validation is to find the combination of hyperparameters that minimizes the average MSE.
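The grid search can be sketched as a loop over every hyperparameter combination, scoring each by its average MSE across chronological folds. To keep the sketch short and self-contained, a toy forecaster stands in for the gradient lifting model; the forecaster, its hyperparameters `window` and `drift`, the fold layout, and the data are all assumptions made for illustration.

```python
from itertools import product

def expanding_window_folds(n_samples, k):
    """Chronological folds: train on everything before the test fold."""
    fold_size = n_samples // (k + 1)
    for i in range(1, k + 1):
        train_end = i * fold_size
        test_end = n_samples if i == k else train_end + fold_size
        yield range(train_end), range(train_end, test_end)

def forecast(train, horizon, window, drift):
    """Toy forecaster (an assumption): mean of the last `window` values plus a linear drift."""
    recent = train[-window:]
    base = sum(recent) / len(recent)
    return [base + drift * h for h in range(1, horizon + 1)]

def grid_search(series, param_grid, k=10):
    """Try every combination in param_grid and return (best_params, best_average_mse)."""
    names = sorted(param_grid)
    best_params, best_mse = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        fold_mses = []
        for train_idx, test_idx in expanding_window_folds(len(series), k):
            train = [series[i] for i in train_idx]
            actual = [series[i] for i in test_idx]
            predicted = forecast(train, len(actual), **params)
            fold_mses.append(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))
        mean_mse = sum(fold_mses) / len(fold_mses)
        if mean_mse < best_mse:
            best_params, best_mse = params, mean_mse
    return best_params, best_mse

# Hypothetical series with a perfectly linear trend
series = [float(t) for t in range(44)]
best_params, best_mse = grid_search(series, {"window": [1, 4, 8], "drift": [0.0, 1.0]}, k=10)
```

For a real model, `forecast` would be replaced by fitting and predicting with the chosen learner, and the grid would hold hyperparameters such as the number of trees, the learning rate, and the number of splits.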
To predict year-on-year real GDP growth for year x, the same cross-validation procedure is performed twice. The first run uses the data of one quarter to predict year-on-year real GDP growth two quarters ahead. The second run repeats the prediction with the data of the second quarter and produces the final prediction.
As shown in fig. 7, the macroscopic economic prediction apparatus based on the gradient lifting model includes: a data acquisition module 601 configured to acquire raw index data related to economy; a correlation analysis module 602 configured to perform correlation analysis on the original index data and economic prediction to obtain a correlation degree, and determine target index data based on the correlation degree; a prediction module 603 configured to input the target index data into a target economic prediction model to obtain an economic prediction result; the target economic prediction model is obtained by training an original economic prediction model based on sample index data.
Fig. 8 shows a more specific hardware architecture of an electronic device according to this embodiment, where the device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 implement communication connections therebetween within the device via a bus 1050.
The processor 1010 may be implemented by a general-purpose CPU, a microprocessor, an application-specific integrated circuit, or one or more integrated circuits, etc. for executing relevant programs to implement the technical solutions provided in the embodiments of the present disclosure.
The memory 1020 may be implemented in the form of ROM, RAM, static storage device, dynamic storage device, etc. Memory 1020 may store an operating system and other application programs, and when the embodiments of the present specification are implemented in software or firmware, the associated program code is stored in memory 1020 and executed by processor 1010.
The input/output interface 1030 is used to connect with an input/output module for inputting and outputting information. The input/output module may be configured as a component in the device or may be external to the device to provide corresponding functionality. Wherein the input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and the output devices may include a display, speaker, vibrator, indicator lights, etc.
The communication interface 1040 is used to connect with a communication module to implement communication interaction between the device and other devices. The communication module can realize communication in a wired mode or in a wireless mode.
Bus 1050 includes a path to transfer information between the various components of the device.
It should be noted that although the above-described device only shows processor 1010, memory 1020, input/output interface 1030, communication interface 1040, and bus 1050, in an implementation, the device may include other components necessary to achieve proper operation. Furthermore, it will be understood by those skilled in the art that the above-described apparatus may include only the components necessary to implement the embodiments of the present description, and not all the components shown in the drawings.
It should be noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the technical solution of the present invention, which is intended to be covered by the scope of the claims of the present invention.
Example 3
A third embodiment of the present invention, which is different from the first two embodiments, is:
the functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-only memory (ROM), a random access memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer readable medium may even be paper or other suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques, which are well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
Example 4
Referring to fig. 9, a fourth embodiment of the present invention provides a macroscopic economic prediction system based on a gradient lifting model, which includes a data acquisition module, a correlation analysis module, and a prediction module.
The data acquisition module is configured to acquire raw index data related to the economy.
The correlation analysis module is configured to perform correlation analysis on the original index data and the economic prediction to obtain a correlation degree, and to determine target index data based on the correlation degree.
The prediction module is configured to input the target index data into a target economic prediction model to obtain an economic prediction result.

Claims (10)

1. A macroscopic economic prediction method based on a gradient lifting model, characterized by comprising the following steps:
acquiring original index data related to the economy, performing correlation analysis on the original index data and the economic prediction to obtain a correlation degree, and determining target index data;
training the original economic prediction model based on sample index data to obtain a target economic prediction model;
and taking the target index data as a data set, and inputting the data set into the target economic prediction model to obtain an economic prediction result.
2. The method for macroscopic economic prediction based on gradient lifting model as set forth in claim 1, wherein: obtaining the association degree comprises carrying out dimensionless treatment on the original index data and calculating against a reference sequence to obtain the gray association degree;
the dimensionless treatment is carried out by the mean method, expressed as:
$$a_i(l)=\frac{x_i(l)}{\frac{1}{N}\sum_{k=1}^{N}x_i(k)}$$
the absolute values of the differences between corresponding terms of the reference sequence and each comparison sequence form the absolute difference matrix, expressed as:
$$\Delta_{0i}(l)=|a_0(l)-a_i(l)|$$
the gray association degree is expressed as:
$$\zeta_{0i}(l)=\frac{\Delta_{\min}+\rho\,\Delta_{\max}}{\Delta_{0i}(l)+\rho\,\Delta_{\max}},\qquad r_{0i}=\frac{1}{N}\sum_{l=1}^{N}\zeta_{0i}(l)$$
wherein ρ is the resolution factor, taken from 0.1 to 0.5; ζ_0i(l) is the association coefficient, a value not exceeding 1 that represents the degree of association between the i-th comparison sequence a_i and the reference sequence a_0; r_0i is the gray association degree; N is the length of the sequence variable; x_0(l) is the reference variable sequence data; l is the current evaluation index number; i is the evaluation object number; x_i(l) is the variable sequence data; a_i(l) is the processed variable sequence data, that is, the value of the i-th evaluation object on the l-th index variable x_l after dimensionless processing; Δ_0i(l) is the absolute difference between each term of the mean-processed evaluation object sequence data and the reference sequence; and Δ_min and Δ_max are the minimum and the maximum in the absolute difference matrix, respectively.
3. A method of macroscopic economic prediction based on a gradient lifting model as recited in claim 2, wherein: the determining of the target index data comprises comparing and judging the association degree obtained by carrying out correlation analysis on the original index data and economic prediction with a preset association degree threshold value:
when the association degree is larger than an association degree threshold value, taking the original index data as target index data;
and when the association degree is smaller than the association degree threshold value, discarding the original index data, which is no longer used as target index data for association degree analysis.
4. A method of macroscopic economic prediction based on a gradient lifting model as claimed in claim 3, wherein: the target economic prediction model comprises sample index data, wherein the sample index data comprises training sample index data and verification sample index data;
the training sample index data comprises training the original economic prediction model to obtain a trained first economic prediction model;
the verification sample index data includes cross-verifying the first economic prediction model to obtain a verification result, and determining the target economic prediction model based on the verification result.
5. The method for macroscopic economic prediction based on gradient lifting model as recited in claim 4, wherein: the first economic prediction model comprises an initial learner, which is obtained by calculating a predicted value, calculating a loss function, and judging an error rate according to the result of the first economic prediction model;
the specific step of calculating the predicted value is to initialize a weak learner:
$$F_0(x)=\arg\min_{c}\sum_{i=1}^{N}L(y_i,c)$$
the predicted value is obtained through operation processing:
$$c=\frac{1}{N}\sum_{i=1}^{N}y_i$$
at initialization, c takes the average of the label values of all training samples, and the initial learner obtained is expressed as:
$$f_0(x)=c$$
wherein y_i is the observed value, c is the predicted value, L is the loss function, F_0(x) is the average value of the observed values, N is the number of training samples, and f_0(x) is the initial learner.
6. The method for macroscopic economic prediction based on gradient lifting model as recited in claim 5, wherein: calculating the loss function comprises calculating a negative gradient, calculating a best fit value, and updating the learner;
for each sample sequence number i=1, 2, …, N, the negative gradient is calculated as:
$$\gamma_{mi}=-\left[\frac{\partial L(y_i,F(x_i))}{\partial F(x_i)}\right]_{F(x)=F_{m-1}(x)}$$
wherein F(x_i) is the learner; γ_mi is the residual, which is taken as the new true value of the sample, and the data (x_i, γ_mi) are used as the training data of the next tree, m being the sequence number of the tree, to obtain a target regression tree; R_mj is the leaf node region corresponding to the target regression tree, j=1, 2, …, J; J is the number of leaf nodes of the regression tree; and j is the sequence number of the leaf nodes of the regression tree;
for the j=1, 2, …, J leaf nodes, the best fit value is calculated as:
$$c_{mj}=\arg\min_{c}\sum_{x_i\in R_{mj}}L(y_i,F_{m-1}(x_i)+c)$$
wherein F_{m-1}(x) is the (m-1)-th regression tree, and c_mj is the value that minimizes the least-squares loss over R_mj.
7. The method for macroscopic economic prediction based on gradient lifting model as recited in claim 6, wherein: updating the learner comprises updating the F_m(x) function over all M iterations, after which the GBDT learner obtained is expressed as:
$$F_m(x)=F_{m-1}(x)+\nu\sum_{j=1}^{J}c_{mj}\,I(x\in R_{mj}),\qquad m=1,2,\ldots,M$$
wherein F_m(x) is the m-th learner, F_{m-1}(x) is the (m-1)-th learner, ν is the learning rate, which is set for the loss function used, I is the indicator function of the interval R_mj, and the sum of c_mj I(x ∈ R_mj) over j is a regression tree;
updating a learner and inputting data into the first economic prediction model to obtain a first economic prediction result, and determining the error rate of the first economic prediction model based on the first economic prediction result:
when the error rate is smaller than or equal to a preset error rate threshold, the first economic prediction model is used as the target economic prediction model;
and when the error rate is greater than a preset error rate threshold, the first economic prediction model is not used as a target economic prediction model, and the learner is updated again.
8. A system adopting the macroscopic economic prediction method based on the gradient lifting model as claimed in any one of claims 1 to 7, which is characterized by comprising a data acquisition module, a correlation analysis module and a prediction module;
the data acquisition module is responsible for acquiring original index data related to economy;
the correlation analysis module is responsible for carrying out correlation analysis on the original index data and economic prediction to obtain a correlation degree, and determining target index data based on the correlation degree;
and the prediction module is responsible for inputting the target index data into a target economic prediction model to obtain an economic prediction result.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.
CN202310976243.8A 2023-08-04 2023-08-04 Macroscopic economic prediction method and system based on gradient lifting model Pending CN117273192A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310976243.8A CN117273192A (en) 2023-08-04 2023-08-04 Macroscopic economic prediction method and system based on gradient lifting model

Publications (1)

Publication Number Publication Date
CN117273192A true CN117273192A (en) 2023-12-22

Family

ID=89205138

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310976243.8A Pending CN117273192A (en) 2023-08-04 2023-08-04 Macroscopic economic prediction method and system based on gradient lifting model

Country Status (1)

Country Link
CN (1) CN117273192A (en)

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination