CN115796358A

CN115796358A - Carbon emission prediction method and terminal

Info

Publication number: CN115796358A
Application number: CN202211503703.7A
Authority: CN
Inventors: 黄夏楠; 杨丝雨; 刘林; 胡臻达; 涂夏哲; 洪居华; 林伟伟; 邹艺超; 郑欢; 张林垚
Original assignee: State Grid Fujian Electric Power Co Ltd; Economic and Technological Research Institute of State Grid Fujian Electric Power Co Ltd
Current assignee: State Grid Fujian Electric Power Co Ltd; Economic and Technological Research Institute of State Grid Fujian Electric Power Co Ltd
Priority date: 2022-11-28
Filing date: 2022-11-28
Publication date: 2023-03-14

Abstract

The invention discloses a carbon emission prediction method and terminal. Historical statistical data are acquired from statistical yearbooks and cleaned; based on the cleaned data, a random forest algorithm judges the importance of the influence factors, the influence factors are selected according to their importance, and a feature matrix is generated; a carbon emission prediction model is established based on an improved firework algorithm (IFWA) and a generalized regression neural network (GRNN); the model is trained on the feature matrix and, after training, predicts carbon emissions from the input values of the influence factors. To address the complexity of the influence factors, the invention uses the random forest algorithm to screen the factors and judge their importance; to address the tendency of the traditional firework algorithm to trap the model in a local optimum, it adopts an improved firework algorithm, thereby effectively improving the prediction performance of the model and the accuracy of carbon emission prediction.

Description

Carbon emission prediction method and terminal

Technical Field

The invention relates to the technical field of carbon emission prediction, in particular to a carbon emission prediction method and a terminal.

Background

Climate warming caused by carbon emissions from fossil energy is a common challenge for all countries. As the world's largest developing country and largest carbon emitter, China has pledged to reduce its carbon emission intensity by 60-65% from the 2005 level by 2030, has set the target of peaking total carbon emissions by 2030 and striving to peak as early as possible, and strives to achieve carbon neutrality before 2060.

Facing unprecedented pressure for emission reduction and economic transformation, it is important to explore an energy-saving and emission-reduction strategy suited to China's current national conditions. Predicting the carbon emissions of China's energy consumption under current policy, and proposing more scientific and reasonable carbon reduction policies on that basis, therefore has great practical significance for energy conservation, emission reduction and sustainable development in China.

In achieving the dual-carbon targets and building a low-carbon society, an important problem is to predict the carbon emission level accurately, providing a reference for the relevant departments to formulate medium- and long-term economic development strategies and adjust current policies. To formulate a targeted emission reduction policy that achieves the desired effect, it is necessary to know the factors that increase carbon dioxide emissions so that emissions can be reduced by controlling them; these factors change with policy and with the application of low-carbon technology. Although conventional multi-factor prediction models consider the influence of external variables, they predict carbon dioxide emissions on the premise that the influence factors remain unchanged, which inevitably causes errors to accumulate and makes the prediction inaccurate. Therefore, in contrast to traditional prediction under "invariant influence factors", a medium- and long-term carbon prediction model with "varying influence factors" needs to be developed in combination with policy trend analysis. In addition, the accuracy of the prediction model itself needs to be improved so as to improve the accuracy of carbon dioxide emission prediction.

For carbon prediction research, the methods mainly adopted in the prior art are as follows. First, an extended gray prediction model has been proposed on the basis of the gray model: the gray model is combined with an autoregressive integrated moving average model and a second-order polynomial regression model, and the model parameters are optimized by particle swarm optimization (PSO); this finally achieves higher carbon emission prediction accuracy, but the method ignores relevant factors that reduce the prediction accuracy. Second, an artificial neural network and a nonlinear gray multivariate model have been proposed to study the relationship among population, GDP, oil trade, natural gas trade and carbon dioxide emissions; although the influence of multiple factors is considered, changes in the influence factors are ignored, so errors accumulate and the prediction result is inaccurate. Third, machine learning algorithms such as the BP neural network, extreme learning machine and support vector machine have been applied to carbon prediction and show good prediction performance, but the prediction model easily falls into a local optimum. In summary, although some results have been obtained for medium- and long-term carbon prediction, there is still room for improvement in prediction accuracy.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a carbon emission prediction method and terminal that can effectively improve the accuracy of carbon emission prediction.

In order to solve the technical problems, the invention adopts the technical scheme that:

a method of carbon emission prediction comprising the steps of:

S1, acquiring historical statistical data from statistical yearbooks, and performing data cleaning on the statistical data;

S2, judging the importance of the influence factors by a random forest algorithm according to the cleaned statistical data, selecting the influence factors according to their importance, and generating a feature matrix;

S3, establishing a carbon emission prediction model based on an improved firework algorithm (IFWA) and a generalized regression neural network (GRNN);

S4, training the carbon emission prediction model according to the feature matrix, and, after training, predicting the carbon emission according to the input values of the influence factors.

In order to solve the technical problem, the invention adopts another technical scheme as follows:

a terminal for carbon emission prediction, comprising a processor, a memory and a computer program stored in the memory and executable on the processor, the processor implementing the steps of one of the above methods for carbon emission prediction when executing the computer program.

The invention has the following beneficial effects: in the carbon emission prediction method and terminal, to address the complexity of the influence factors, the random forest algorithm is used to screen the factors and judge their importance, which improves the accuracy of the prediction result; and to address the tendency of the traditional firework algorithm to trap the model in a local optimum, the improved firework algorithm is used, which effectively improves the prediction performance of the model and the accuracy of carbon emission prediction.

Drawings

FIG. 1 is a flow chart of a method of carbon emissions prediction in accordance with an embodiment of the present invention;

FIG. 2 is a block diagram of a carbon emission prediction terminal according to an embodiment of the present invention;

FIG. 3 is a detailed flow chart of a method of carbon emissions prediction according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a random forest feature selection method of a method for predicting carbon emissions according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of a GRNN model of a method for carbon emission prediction according to an embodiment of the invention;

Description of the reference symbols:

1. a terminal for carbon emission prediction; 2. a processor; 3. a memory.

Detailed Description

In order to explain technical contents, achieved objects, and effects of the present invention in detail, the following description is made with reference to the accompanying drawings in combination with the embodiments.

Referring to fig. 1 and 2, a method for predicting carbon emissions includes the steps of:

From the above description, the beneficial effects of the present invention are as follows: to address the complexity of the influence factors, the random forest algorithm is used to screen the factors and judge their importance, which improves the accuracy of the prediction result; and to address the tendency of the traditional firework algorithm to trap the model in a local optimum, the improved firework algorithm is used, which effectively improves the prediction performance of the model and the accuracy of carbon emission prediction.

Further, the mathematical expression of GRNN is:

wherein P_i is the pattern-layer neuron transfer function; X is the input variable, X = [x_1, x_2, …, x_n]^T; X_i is the learning sample corresponding to the i-th neuron; α is the smoothing factor; n is the number of pattern-layer neurons; S_Nj is the summation-layer transfer function; y_ij is the connection weight between the i-th pattern-layer neuron and the j-th numerator (weighted-sum) neuron in the summation layer;

y_j is the output of the output layer, and k is the dimension of the output vector.

The GRNN algorithm is adopted. The GRNN is a variant of the radial basis function neural network; it has strong nonlinear mapping capability and fast learning speed, offers advantages over the radial basis function neural network, and predicts well even when little sample data is available.
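For orientation, the layer outputs that the symbols above refer to are commonly written as follows in the standard GRNN formulation. This is a sketch under that assumption rather than a reproduction of the patent's own formula; S_D denotes the unweighted sum of the pattern-layer outputs.

```latex
\begin{aligned}
P_i &= \exp\!\left[-\frac{(X - X_i)^{T}(X - X_i)}{2\alpha^{2}}\right], \quad i = 1, 2, \dots, n \\
S_D &= \sum_{i=1}^{n} P_i \\
S_{Nj} &= \sum_{i=1}^{n} y_{ij}\, P_i, \quad j = 1, 2, \dots, k \\
y_j &= \frac{S_{Nj}}{S_D}, \quad j = 1, 2, \dots, k
\end{aligned}
```

A small α makes each prediction follow the nearest learning samples closely, while a large α pulls every prediction toward the mean of the sample outputs, which is why α is the parameter worth optimizing.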

Further, the step of the IFWA algorithm comprises:

A1, initializing the positions of a preset number of fireworks;

A2, calculating the number and the value range of the fireworks, calculating the escape fitness, redefining the firework distance according to the escape fitness, and obtaining new fireworks;

A3, normalizing the firework distance, and calculating and selecting the first n fireworks with the largest product of the escape fitness and the normalized firework distance to obtain the peak fireworks, wherein n is a preset selection number;

A4, defining and searching for exploration fireworks, which together with the peak fireworks form the next-generation fireworks;

A5, iterating steps A3 and A4 up to the set maximum number of iterations, and selecting the firework with the highest fitness after the iteration finishes, thereby obtaining the smoothing factor of the GRNN.

As can be seen from the above description, the invention provides an improved firework algorithm in which peak sparks and exploration sparks enhance the global search capability, so that the algorithm escapes local optima more easily and reaches a better fitness value.

Further, calculating the escape fitness and redefining the firework distance according to the escape fitness specifically comprise:

normalizing the fitness of all the fireworks to obtain the escape fitness of all the fireworks:

wherein f_i, i = 1, 2, …, N, represents the fitness of the i-th firework x_i; f_i', i = 1, 2, …, N, represents the escape fitness of the i-th firework x_i; N represents the total number of fireworks; f_min represents the minimum of the fitness values f_i of all fireworks, and f_max represents the maximum;

redefining the firework distance δ_i for the firework whose escape fitness is 1, the calculation formula being as follows:

wherein x_i represents the i-th firework and x_j represents the j-th firework.

As can be seen from the above description, the invention introduces the concept of an escape fitness value, redefines the distance for the spark whose escape fitness is 1, and selects the top n peak sparks in descending order of the product of the escape fitness value and the normalized distance. Both the fitness value and the relative positions of the individuals are thus considered, which prevents several offspring from being selected in the same region.
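The escape fitness and firework distance can be illustrated with a minimal Python sketch. Two points are assumptions, not taken from the text: the escape fitness is taken to be min-max normalization of the fitness (suggested by the f_min and f_max definitions), and the distance rule follows a density-peak style convention in which the firework with escape fitness 1 receives its largest distance to any other firework while every other firework receives the distance to its nearest fitter neighbor. The function names are hypothetical.

```python
import math


def escape_fitness(fitness):
    """Min-max normalize fitness values to [0, 1] (assumed form of f_i')."""
    f_min, f_max = min(fitness), max(fitness)
    if f_max == f_min:
        # Degenerate case: all fireworks are equally fit.
        return [1.0] * len(fitness)
    return [(f - f_min) / (f_max - f_min) for f in fitness]


def firework_distances(positions, esc):
    """Assumed distance rule (not confirmed by the source text).

    The firework with escape fitness 1 gets its largest Euclidean distance
    to any other firework; every other firework gets the distance to the
    nearest firework with a higher escape fitness.
    """
    distances = []
    for i, x_i in enumerate(positions):
        if esc[i] >= 1.0:
            candidates = [math.dist(x_i, x_j)
                          for j, x_j in enumerate(positions) if j != i]
            distances.append(max(candidates, default=0.0))
        else:
            candidates = [math.dist(x_i, x_j)
                          for j, x_j in enumerate(positions) if esc[j] > esc[i]]
            distances.append(min(candidates))
    return distances
```

Under this rule a firework that is both fit and far from any fitter firework stands out, which is the property the peak selection relies on.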

Further, the step A3 specifically includes:

normalizing the firework distance δ_i:

calculating the product γ_i of the normalized firework distance δ_i' and the escape fitness f_i':

γ_i = f_i' · δ_i';

sorting the fireworks by γ_i in descending order and marking the first n fireworks as peak fireworks;

wherein δ_min represents the minimum firework distance, δ_max represents the maximum firework distance, δ_i represents the firework distance of the i-th firework, and n ≤ N - 1.

As can be seen from the above description, the above is a part of the specific steps of the IFWA algorithm of the present invention.
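Step A3 can be sketched in Python as follows. The min-max form of the distance normalization is an assumption inferred from the δ_min and δ_max definitions, and the function name `select_peak_fireworks` is hypothetical.

```python
def select_peak_fireworks(esc_fitness, distances, n):
    """Return the indices of the n peak fireworks and all gamma values.

    gamma_i = f_i' * delta_i', where delta_i' is the min-max normalized
    firework distance (assumed normalization).
    """
    total = len(esc_fitness)
    if n > total - 1:
        raise ValueError("n must satisfy n <= N - 1")
    d_min, d_max = min(distances), max(distances)
    span = d_max - d_min
    # If all distances are equal, treat every normalized distance as 1.
    normalized = [(d - d_min) / span if span > 0 else 1.0 for d in distances]
    gamma = [f * d for f, d in zip(esc_fitness, normalized)]
    # Sort indices by gamma in descending order and keep the first n.
    order = sorted(range(total), key=lambda i: gamma[i], reverse=True)
    return order[:n], gamma
```

For example, with escape fitness `[0.0, 1.0, 0.5]` and distances `[1.0, 5.0, 3.0]`, the two peak fireworks are those at indices 1 and 2.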

Further, the step A4 specifically includes:

defining exploration fireworks, the calculation formula of which is as follows:

wherein x_i represents the i-th firework and x_j represents the j-th firework;

the peak fireworks and the exploration fireworks form the next-generation fireworks.

As can be seen from the above description, the present invention defines the exploration spark, the location of which is at the edge of the search area, and the search area can be expanded after explosion, thereby expanding the search range. The peak spark and the exploration spark are combined, so that the improved firework algorithm can enhance the global search capability of the algorithm, the algorithm can easily jump out of a local optimal solution, and the improved firework algorithm has a better fitness value.

Further, the step S1 specifically includes:

acquiring historical statistical data from statistical yearbooks, handling abnormal missing values and out-of-limit outliers in the statistical data, and standardizing the processed samples.

As can be seen from the above description, handling the abnormal missing values and out-of-limit outliers in the statistical data improves the validity of the data.

Further, the step S2 specifically includes:

B1, constructing feature vector data from the cleaned statistical data;

B2, according to the feature vector data, applying the bootstrap method to randomly draw k bootstrap sample sets with replacement, constructing k classification and regression trees from the bootstrap sample sets, the samples not drawn each time forming k out-of-bag (OOB) data sets;

B3, at each node of each classification and regression tree, randomly drawing m_y features as a randomly generated feature subset, calculating the information content of each feature in the feature subset, and selecting one feature according to the information content for node splitting to obtain a decision tree;

B4, scoring the importance of the features by the Gini index: if the nodes at which feature X_j appears in decision tree i belong to set M, the importance of X_j in the i-th tree is:

B5, the total score over the RF is:

wherein n represents the number of decision trees;

B6, normalizing all the calculated importance scores:

the features with VIM_j ≥ ε constitute the feature matrix, where ε = 1.

According to the description, the method adopts the random forest algorithm to screen the carbon emission influence factors and judge the importance degree, and eliminates the factors with small influence on the carbon emission, so that the model is simplified to a certain extent, and the prediction speed and precision are improved.
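The Gini-based scoring of steps B4 and B5 can be sketched in Python. This is a minimal illustration under stated assumptions: the importance of one split is taken as the sample-weighted decrease in Gini index (the convention used by common random forest libraries, which may differ from the patent's exact formula), node contents are given as class labels, and the feature names in the usage note are hypothetical.

```python
def gini(labels):
    """Gini index of a node: 1 - sum_k p_k^2 over the class proportions."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def split_importance(parent, left, right):
    """Importance of one split: weighted decrease in the Gini index."""
    n = len(parent)
    return (gini(parent)
            - len(left) / n * gini(left)
            - len(right) / n * gini(right))


def forest_importance(splits_per_tree):
    """Sum split importances per feature over all nodes and all trees.

    splits_per_tree holds one list per tree; each entry is a tuple
    (feature_name, parent_labels, left_labels, right_labels).
    """
    totals = {}
    for tree in splits_per_tree:
        for feature, parent, left, right in tree:
            gain = split_importance(parent, left, right)
            totals[feature] = totals.get(feature, 0.0) + gain
    return totals
```

A feature such as `"gdp"` that splits `[0, 0, 1, 1]` cleanly into `[0, 0]` and `[1, 1]` in two trees accumulates a total score of 1.0 before normalization.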

Referring to fig. 2, a terminal for carbon emission prediction includes a processor, a memory, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the above method for carbon emission prediction when executing the computer program.

The carbon emission prediction method and terminal are suitable for predicting carbon emissions, including medium- and long-term prediction.

Referring to fig. 1 and fig. 3 to 5, a first embodiment of the present invention is:

a method of carbon emission prediction comprising the steps of:

Step S1 specifically comprises:

acquiring historical statistical data from statistical yearbooks, handling abnormal missing values and out-of-limit outliers in the statistical data, and standardizing the processed samples.

In this embodiment, the input data need to be standardized: first a time series of the carbon emission influence factor variables is selected, then identifiable errors in the historical data set (abnormal missing values, out-of-limit outliers, and the like) are corrected, and finally the influence factor data are standardized. The specific operations are as follows:

By consulting statistical yearbooks or the literature, the resident population, energy consumption, per capita GDP, carbon emission coefficient, industrial share and carbon emissions over the past 20 years are collected as raw data. Taking a certain region as an example, the data are shown in Table 1 below:

TABLE 1

Here, the carbon emission coefficient is the amount of carbon emitted per unit of energy during the combustion or use of each energy source, and it reflects the technical level.

Handling abnormal missing values:

When the missing proportion is small (< 5%) and the series attribute is not strongly correlated with the target prediction series (the Pearson correlation coefficient r between the series and the target prediction series satisfies |r| < 0.8), the missing values are filled with the median of the series. The Pearson correlation coefficient r is expressed as follows:

where x_i is the series attribute, i.e. an influence factor related to carbon emissions, and y_i is the target prediction series, i.e. carbon emissions.

When the missing rate is high (> 95%) and the importance of the attribute is low (the Pearson correlation coefficient r between the series and the target prediction series satisfies |r| < 0.3), the attribute is deleted directly.

When the missing rate is high and the importance of the attribute is high (the Pearson correlation coefficient r between the series and the target prediction series satisfies 0.8 ≤ |r| < 1), hot-deck imputation is used: a time series similar to the sample containing the missing value (a matching time series) is found among the time series without missing data, and its observed values are used to fill in the missing value.
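The three missing-value rules can be expressed as a small decision function. The sketch below is illustrative only: the function names are hypothetical, the "high" missing rate in the third rule is assumed to use the same > 95% threshold as the second rule, and combinations the rules do not cover are reported as "unspecified" rather than guessed.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def missing_value_strategy(missing_ratio, r):
    """Map a missing ratio and correlation r to one of the three rules."""
    strength = abs(r)
    if missing_ratio < 0.05 and strength < 0.8:
        return "median_fill"
    if missing_ratio > 0.95 and strength < 0.3:
        return "drop_attribute"
    # Assumption: "high" missing rate means > 95%, as in the rule above.
    if missing_ratio > 0.95 and 0.8 <= strength < 1.0:
        return "hot_deck"
    return "unspecified"
```

For instance, a factor with 2% missing values and r = 0.5 against carbon emissions would be median-filled, while one with 97% missing values and r = 0.1 would be dropped.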

Handling out-of-limit outliers:

outliers that are clearly identifiable and few in number are deleted directly;

when outliers are few (< 5%) and the time series is important (the Pearson correlation coefficient r between the series and the target prediction series satisfies 0.8 ≤ |r| < 1), they are replaced with the mean of the time series;

when the proportion of outliers is high (> 95%), the time series is invalid and is handled by the missing-value method.

Standardization:

The data are standardized to eliminate differences in magnitude between data of different dimensions and to avoid the large output errors such differences cause. Z-score standardization is applied to the data of each influence factor; the specific formula is as follows:

in the formula: μ is the mean of all sample data; σ is the standard deviation of all sample data.
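Z-score standardization maps each value x to (x - μ) / σ. A minimal Python sketch follows; using the population standard deviation is an assumption (the text says only "the standard deviation of all sample data"), and the function name is hypothetical.

```python
import statistics


def z_score(values):
    """Standardize one influence-factor series to zero mean, unit variance."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation (assumed)
    if sigma == 0:
        # A constant series carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]
```

Each influence factor is standardized separately, so that, for example, population counts and percentage shares end up on a comparable scale.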

Step S2 specifically comprises:

B1, constructing feature vector data from the cleaned statistical data;

B2, according to the feature vector data, applying the bootstrap method to randomly draw k bootstrap sample sets with replacement, constructing k classification and regression trees from the bootstrap sample sets, the samples not drawn each time forming k out-of-bag (OOB) data sets;

B4, scoring the importance of the features by the Gini index: if the nodes at which feature X_j appears in decision tree i belong to set M, the importance of X_j in the i-th tree is:

B5, the total score over the RF is:

wherein n represents the number of decision trees;

B6, normalizing all the calculated importance scores:

In this embodiment, the RF (random forest) algorithm is used for feature selection: feature vector data are first constructed from the standardized influence factors described above and then used as the input of the random forest method, and the importance of each influence factor is obtained from the calculation result. The specific steps are as follows:

(A) Constructing feature vector data for carbon emission prediction from the collected historical data series of the influence factors; the standardized feature vector data are shown in Table 2 below:

(B) Selecting features by the random forest method, with the following specific steps:

(b1) With reference to FIG. 4, the bootstrap method is applied to the original training data set to randomly draw k new bootstrap sample sets with replacement, from which k classification and regression trees are constructed; the samples not drawn each time constitute k OOB data sets;

(b2) At each node of each tree, m_y features are randomly drawn as a randomly generated feature subset; by calculating the information content of each feature in the subset, the feature with the greatest classification ability is selected for node splitting, which gives the decision trees greater diversity;

(b3) The importance of the features is scored by the Gini index: if the nodes at which X_j appears in decision tree i belong to set M, the importance of X_j in the i-th tree is:

(b4) Let the RF have n trees in total; the overall score is then:

(b5) All the calculated importance scores are normalized as follows:

(C) Obtaining the feature importance matrix. With ε = 1 as the criterion, if VIM_j ≥ ε the influence factor is included as an input data type, and if VIM_j < ε the influence factor can be ignored; the influence factors exceeding 1 constitute the input matrix.
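Two parts of this procedure, the bootstrap draw with its out-of-bag samples and the final threshold test, can be sketched in Python. The sketch assumes the importance scores have already been normalized by the (not reproduced) normalization formula; the function names, the seed parameter and the scores in the usage note are hypothetical.

```python
import random


def bootstrap_with_oob(n_samples, k, seed=0):
    """Draw k bootstrap index sets with replacement, each with its OOB set."""
    rng = random.Random(seed)
    result = []
    for _ in range(k):
        in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
        drawn = set(in_bag)
        # Samples never drawn for this tree form its out-of-bag data.
        oob = [i for i in range(n_samples) if i not in drawn]
        result.append((in_bag, oob))
    return result


def select_features(vim, epsilon=1.0):
    """Keep the influence factors whose normalized score is at least epsilon."""
    return [name for name, score in vim.items() if score >= epsilon]
```

With illustrative scores `{"population": 1.4, "per_capita_gdp": 1.1, "emission_coefficient": 0.3}`, only the first two factors would enter the input matrix.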

The mathematical expression of the GRNN is as follows:

where y_j is the output of the output layer, and k is the dimension of the output vector;

The steps of the IFWA algorithm include:

A1, initializing the positions of a preset number of fireworks;

redefining the firework distance according to the escape fitness specifically comprises:

wherein f_i, i = 1, 2, …, N, represents the fitness of the i-th firework x_i; f_i', i = 1, 2, …, N, represents the escape fitness of the i-th firework x_i; N represents the total number of fireworks; f_min represents the minimum of the fitness values f_i of all fireworks, and f_max represents the maximum;

wherein x_i represents the i-th firework and x_j represents the j-th firework;

A3, normalizing the firework distance, and calculating and selecting the first n fireworks with the largest product of the escape fitness and the normalized firework distance to obtain the peak fireworks, wherein n is a preset selection number;

step A3 specifically comprises:

normalizing the firework distance δ_i:

γ_i = f_i' · δ_i';

wherein δ_min represents the minimum firework distance, δ_max represents the maximum firework distance, δ_i represents the firework distance of the i-th firework, and n ≤ N - 1;

A4, defining and searching for exploration fireworks, which together with the peak fireworks form the next-generation fireworks;

step A4 specifically comprises:

defining exploration fireworks, the calculation formula of which is as follows:

wherein x_i represents the i-th firework and x_j represents the j-th firework;

the peak fireworks and the exploration fireworks form the next-generation fireworks;

A5, iterating steps A3 and A4 up to the set maximum number of iterations, and selecting the firework with the highest fitness after the iteration finishes, thereby obtaining the smoothing factor of the GRNN;

in this embodiment, it is necessary to establish a medium-and-long-term carbon prediction model based on the IFWA-GRNN improved neural network.

(1) GRNN general mathematical model

Assume that x and y are two random variables with joint probability density f(x, y). If the observed value of x is x_0, the regression of y on x is:

where y(x_0) is the predicted output of y given the input x_0. Applying Parzen non-parametric estimation, the density function f(x_0, y) can be estimated from the sample data set as follows:

where n is the sample size and p is the dimension of the random variable x. α is called the smoothing factor, i.e. the standard deviation of the Gaussian function. Substituting equation (6) and exchanging the order of integration and summation yields:
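The three expressions this derivation refers to take the following form in the standard GRNN derivation. They are given here as a reference sketch under that assumption, with (x_i, y_i) denoting the i-th sample pair, p the dimension of x, and hats marking estimated quantities; the equation numbering of the original is not reproduced.

```latex
\hat{y}(x_0) = E[\,y \mid x_0\,]
  = \frac{\int_{-\infty}^{\infty} y\, f(x_0, y)\, dy}{\int_{-\infty}^{\infty} f(x_0, y)\, dy}

\hat{f}(x_0, y) = \frac{1}{n\,(2\pi)^{\frac{p+1}{2}}\,\alpha^{p+1}}
  \sum_{i=1}^{n}
  \exp\!\left[-\frac{(x_0 - x_i)^{T}(x_0 - x_i)}{2\alpha^{2}}\right]
  \exp\!\left[-\frac{(y - y_i)^{2}}{2\alpha^{2}}\right]

\hat{y}(x_0) =
  \frac{\sum_{i=1}^{n} y_i \exp\!\left[-\frac{(x_0 - x_i)^{T}(x_0 - x_i)}{2\alpha^{2}}\right]}
       {\sum_{i=1}^{n} \exp\!\left[-\frac{(x_0 - x_i)^{T}(x_0 - x_i)}{2\alpha^{2}}\right]}
```

The last line follows because the Gaussian factor in y integrates to a constant in the denominator and to that same constant times y_i in the numerator, so the normalizing constants cancel and only the distance-weighted average of the sample outputs remains.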

(2) Optimizing the smoothing factor α of the generalized regression neural network (GRNN) by using the IFWA algorithm, with the following specific steps:

(21) Initialize the positions of N sparks. Each spark represents a solution in the search space and is labeled by its position;

(22) Calculate the Gaussian function value between the input parameter X and each learning sample and perform a weighted summation, as shown in equation (9):

where X_i represents the learning sample corresponding to the i-th pattern-layer neuron, y_i represents the observed output corresponding to X_i, and m is the sample size.

(23) Calculate the number and value range of the sparks to generate new sparks. The fitness value of each firework is calculated from the fitness function, and sparks are generated according to the fitness value. Let f_i, i = 1, 2, …, n, represent the fitness value of the i-th firework x_i; normalizing the fitness values gives the escape fitness value f_i':

The spark distance δ_i is redefined according to the escape fitness, with the following calculation formula:

where x_i represents the i-th firework and x_j represents the j-th firework.

(24) Normalize the spark distance δ_i; for the spark whose escape fitness is 1, the distance calculation formula is:

(25) Calculate the product γ_i of the escape fitness value and the normalized distance, and mark the first n (n ≤ N - 1) sparks in descending order of γ_i as peak sparks. γ_i is calculated as follows:

γ_i = f_i' · δ_i' (13)

(26) Define spark x_i as an exploration spark; the calculation formula of the exploration spark x_i is:

The peak sparks and the exploration sparks are combined to form the next-generation fireworks;

(27) Set the maximum number of iterations K. If the iteration count k is less than K, repeat from step (23); finally, the individual with the maximum fitness value is selected as the current optimal smoothing factor, giving the optimal smoothing factor α of the GRNN.
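What "fitness of a candidate smoothing factor" can mean in practice is illustrated by the Python sketch below. It is not the IFWA search: it simply scores a fixed list of candidate α values and keeps the best one. The fitness function (negative leave-one-out mean squared error of a single-output GRNN) is an assumption, since the text does not state it, and all function names are hypothetical.

```python
import math


def grnn_predict(train_x, train_y, x, alpha):
    """Single-output GRNN: Gaussian-weighted average of training outputs."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2 * alpha ** 2))
        for xi in train_x
    ]
    total = sum(weights)
    if total == 0.0:
        # All kernels underflowed; fall back to the plain mean.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


def loo_fitness(train_x, train_y, alpha):
    """Assumed fitness: negative leave-one-out mean squared error."""
    squared_error = 0.0
    for i in range(len(train_x)):
        rest_x = train_x[:i] + train_x[i + 1:]
        rest_y = train_y[:i] + train_y[i + 1:]
        prediction = grnn_predict(rest_x, rest_y, train_x[i], alpha)
        squared_error += (prediction - train_y[i]) ** 2
    return -squared_error / len(train_x)


def best_alpha(train_x, train_y, candidates):
    """Pick the candidate smoothing factor with the highest fitness."""
    return max(candidates, key=lambda a: loo_fitness(train_x, train_y, a))
```

In an IFWA-style search each firework position would be one candidate α, and a function like `loo_fitness` would supply the fitness value that the escape fitness is computed from.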

The extracted time-series features are input into the improved generalized regression neural network (GRNN) by time step to obtain a preliminary prediction result. With reference to FIG. 5, the specific steps are as follows:

(41) The training sample set is {trx_1, trx_2, …, trx_n}, i.e. the feature vector data of the carbon emission influence factors obtained above; each sample has dimension m, i.e. trx_i = [trx_i1, trx_i2, …, trx_im];

(42) Divide the labeled set and the test set. The first 70% of the sample feature set is assigned to the labeled set used for training, whose label set is {y_1, y_2, …, y_n}, each label having dimension k; the remaining 30% of the sample feature set is used as test samples, giving the test sample set {tex_1, tex_2, …, tex_p}. The GRNN input layer normalizes the input vector set and passes it to the pattern layer; the normalization formula is as follows:

where x_i and x_i^* respectively represent the values of the data before and after normalization, and x_min and x_max respectively represent the minimum and maximum values in the sample data;
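The split and the normalization in step (42) can be sketched in Python. The min-max form x* = (x - x_min) / (x_max - x_min) is inferred from the symbol definitions above; the function names are hypothetical.

```python
def min_max_normalize(values):
    """Scale a series to [0, 1] using its own minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def chronological_split(samples, train_fraction=0.7):
    """Use the first 70% of the samples for training and the rest for testing."""
    cut = round(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]
```

With 20 yearly samples this gives 14 training samples and 6 test samples, kept in time order so that the test years follow the training years.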

(43) Set the number of pattern-layer neurons to n, each neuron corresponding to a different sample in the training set, and use the estimated α value as the smoothing factor of the GRNN. For the x-th test sample tex_x and the training sample trx_j corresponding to the j-th pattern-layer neuron, the calculation is given by formula (15), where P_j is the output of the j-th neuron:

where trx_ji is the i-th parameter of trx_j and tex_xi is the i-th parameter of tex_x;

(44) The summation layer has k + 1 nodes. The first neuron calculates the arithmetic sum S_D of the outputs of all pattern-layer neurons, as shown in formula (16). The connection weight between the g-th pattern-layer neuron and the j-th summation-layer neuron is the j-th parameter of the g-th output sample y_g, and the remaining k neurons calculate the weighted sums S_Nj of the pattern-layer outputs, as shown in formula (17):

where trx_gi is the i-th parameter of the training sample trx_g and tex_xi is the i-th parameter of tex_x.

(45) The output layer has k neurons. The output of the j-th neuron is the j-th weighted-sum node output S_Nj of the summation layer divided by the first node output S_D, as shown in formula (18):
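Steps (43) to (45) can be followed layer by layer in a short Python sketch. It assumes the standard Gaussian pattern-layer kernel with squared Euclidean distance; the function name `grnn_forward` and the variable names are hypothetical.

```python
import math


def grnn_forward(train_x, train_y, test_x, alpha):
    """Forward pass of a GRNN for one test sample.

    train_x: n training samples, each a list of m features.
    train_y: n labels, each a list of k outputs.
    """
    # Pattern layer: one Gaussian neuron per training sample.
    pattern = [
        math.exp(-sum((t - s) ** 2 for t, s in zip(test_x, trx))
                 / (2 * alpha ** 2))
        for trx in train_x
    ]
    # Summation layer, first node: arithmetic sum S_D.
    s_d = sum(pattern)
    if s_d == 0.0:
        raise ValueError("all pattern outputs underflowed; increase alpha")
    # Summation layer, remaining k nodes: weighted sums S_Nj.
    k = len(train_y[0])
    s_n = [sum(p * y[j] for p, y in zip(pattern, train_y)) for j in range(k)]
    # Output layer: each output is S_Nj divided by S_D.
    return [s / s_d for s in s_n]
```

A test point midway between two training samples receives the average of their labels, and a test point that coincides with a training sample reproduces that sample's label almost exactly when α is small.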

The change of the carbon emission influence factors is then simulated, and the carbon emissions under different scenarios are output. Before predicting future total carbon emissions and carbon emission intensity, the trend of the carbon emission influence factors is judged in combination with macroscopic development plans so that the change in the macroscopic data is described accurately.

Specifically, the policy plans related to each influence factor are consulted and, combined with an analysis of current policy trends, the variation of the macroscopic influence factors is set: the trends of population, carbon emission coefficient, per capita GDP, energy consumption and industrial structure are studied and judged to obtain the influence factor data of the new year, and the new year's influence factor data and carbon emission prediction result are added to the training set of the model so that prediction proceeds year by year. For example, if the secondary industry share in 2019 is 47.71% and it is set to fall by 1.5 percentage points by 2020, the secondary industry share in 2020 is 46.21%; with the remaining influence factors unchanged, the influence factor data of the new year and the carbon emission prediction result of 2019 are added to the training set, as shown in Table 3 below, and the carbon emission in 2020 is predicted, thereby realizing year-by-year prediction.
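The year-by-year procedure can be sketched as a simple loop in Python. The sketch is illustrative: `fit_and_predict` stands in for training and querying the IFWA-GRNN model and is a hypothetical callable, as are the other names, and the scenario is expressed as a fixed yearly change per influence factor.

```python
def predict_year_by_year(train_rows, last_factors, yearly_changes,
                         fit_and_predict, n_years):
    """Roll the scenario forward, feeding each forecast back into training.

    train_rows: list of (factors dict, emission) pairs for past years.
    yearly_changes: assumed change per year for each factor (missing = 0).
    fit_and_predict: hypothetical callable (rows, factors) -> emission.
    """
    rows = list(train_rows)
    factors = dict(last_factors)
    forecasts = []
    for _ in range(n_years):
        # Apply the scenario to obtain the new year's influence factors.
        factors = {name: value + yearly_changes.get(name, 0.0)
                   for name, value in factors.items()}
        emission = fit_and_predict(rows, factors)
        # Add the new year's factors and forecast to the training set.
        rows.append((factors, emission))
        forecasts.append(emission)
    return forecasts, rows
```

For the example above, a change of -1.5 applied to a secondary industry share of 47.71 yields 46.21 for the first forecast year, with the other factors carried over unchanged.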

TABLE 3

Finally, in this embodiment, relevant policy suggestions may be proposed based on the carbon emission prediction result. From the prediction result of the IFWA-GRNN model, it can be judged whether the carbon emission commitments and the goals of carbon peaking and carbon neutrality can be achieved. Reasonable suggestions are then made with respect to industrial structure, energy structure, emission-reduction technology innovation and other aspects.

Referring to fig. 2, a second embodiment of the present invention is:

A terminal 1 for carbon emission prediction includes a processor 2, a memory 3, and a computer program stored in the memory 3 and executable on the processor 2. When the processor 2 executes the computer program, the steps of the carbon emission prediction method of the first embodiment are implemented.

In summary, the invention provides a carbon emission prediction method in which the carbon emission influence factors are analyzed with a random forest method to obtain the input feature matrix of the carbon emission prediction model, an improved firework algorithm (IFWA) is introduced to optimize the smoothing factor in the GRNN and so improve prediction accuracy, and the resulting IFWA-GRNN neural network is used to predict the carbon emission accurately.

The main principles of the IFWA-GRNN-based medium- and long-term carbon emission prediction model established by the invention are as follows:

In terms of extracting carbon emission influence factors, there are many categories of influence factors, and each category contains different specific factors. To predict the carbon emission more accurately, the invention uses a random forest algorithm to screen the influence factors and judge their importance. The random forest algorithm eliminates factors that have little influence on carbon emission, which simplifies the model to a certain extent and improves prediction speed and accuracy. Existing approaches to extracting carbon emission influence factors cannot directly express how strongly each factor contributes. The random forest algorithm is therefore chosen to quantify the influence of each factor, which makes the description of the influence degree more accurate.
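The screening step can be sketched as follows, assuming the raw importance scores have already been produced by a random forest. The normalization by the sum of all scores, the threshold-based selection and all names and numbers in the example are assumptions for illustration. They are not taken from the formulas of the invention.

```python
def normalize_importance(raw_scores):
    """Scale raw importance scores so that they sum to 1 (assumed form)."""
    total = sum(raw_scores.values())
    if total <= 0:
        raise ValueError("importance scores must have a positive sum")
    return {name: score / total for name, score in raw_scores.items()}


def select_factors(normalized, threshold):
    """Keep factors whose normalized importance reaches the threshold,
    ordered from most to least important."""
    ranked = sorted(normalized.items(), key=lambda item: item[1], reverse=True)
    return [name for name, score in ranked if score >= threshold]


# Hypothetical raw scores, for illustration only.
raw = {"population": 3.0, "per_capita_gdp": 5.0, "minor_factor": 0.2}
importance = normalize_importance(raw)
selected = select_factors(importance, threshold=0.1)
```

The names in `selected` would then define the columns of the feature matrix passed to the prediction model.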

In terms of prediction methods, grey models, machine learning methods and scenario analysis methods are mainly used at present. A grey model fits nonlinear sequences less well than a machine learning algorithm, and many existing studies that choose machine learning methods achieve better prediction results. Typical methods include the back-propagation neural network (BPNN) algorithm and the extreme learning machine algorithm. However, the BPNN algorithm trains slowly, has a large prediction error and easily falls into local minima. The extreme learning machine alleviates these problems but still tends to fall into local optimal solutions. The invention therefore proposes a GRNN algorithm based on the improved firework algorithm to predict carbon emission, which reduces the prediction error and offers strong nonlinear approximation capability together with high fault tolerance and robustness.

GRNN is a variant of the radial basis function neural network based on mathematical statistics, and its theoretical basis is nonlinear regression analysis. It consists of four layers: an input layer, a pattern layer, a summation layer and an output layer. The number of input layer neurons equals the dimension of the input vector in the learning samples, and each neuron is a simple distribution unit that passes the input variable directly to the pattern layer. The number of pattern layer neurons equals the number n of learning samples, and each neuron corresponds to a different sample. The summation layer uses two types of neurons for summation. The number of output layer neurons equals the dimension k of the output vector in the learning samples. Each output neuron divides the corresponding outputs of the summation layer, and the output of neuron j corresponds to the j-th element of the estimation result Y(X). GRNN has strong nonlinear mapping capability and a fast learning speed, which gives it advantages over the standard radial basis function neural network, and it predicts well even when sample data are scarce.

On the basis of the GRNN algorithm, an improved firework algorithm is adopted to further improve the prediction accuracy. The firework algorithm is a swarm intelligence algorithm that introduces random factors and a selection strategy to form a parallel, explosion-style search, making it a global probabilistic search method capable of finding the optimal solution of complex optimization problems. However, its selection strategy uses only the relative positions of individuals as the selection probability and does not consider how good each individual's fitness value is. Individuals at neighbouring positions also have similar selection probabilities, which greatly limits the search efficiency of the algorithm. The invention therefore proposes an improved firework algorithm. It introduces the concept of an escape fitness value, redefines the distance for sparks whose escape fitness is 1, and selects the first n peak sparks in descending order of the product of escape fitness value and normalized distance. Both the fitness value and the relative position of each individual are thus taken into account, and selecting several offspring from the same region is avoided. On this basis, exploration sparks are defined. An exploration spark lies at the edge of the search region, and after it explodes the existing search region is extended and the search range is enlarged. Combining peak sparks with exploration sparks strengthens the global search capability of the improved firework algorithm, so that the algorithm escapes local optimal solutions more easily and reaches a better fitness value.
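The peak spark selection can be illustrated with the following Python sketch. Only the ranking by the product of escape fitness and normalized distance is taken from the description. The min-max form of the escape fitness (for a minimization problem), the "distance to the nearest better spark" definition, the normalization by the largest distance and all function names are assumptions made for the example.

```python
import math


def escape_fitness(fitness):
    """Min-max scale so the best (smallest) fitness maps to 1. ASSUMED form."""
    f_min, f_max = min(fitness), max(fitness)
    if f_max == f_min:
        return [1.0] * len(fitness)
    return [(f_max - f) / (f_max - f_min) for f in fitness]


def redefined_distance(positions, esc):
    """Distance from each spark to the nearest spark with a higher escape
    fitness (ASSUMED). A spark with no better neighbour, i.e. escape
    fitness 1, gets its largest distance to any other spark instead."""
    dists = []
    for i, xi in enumerate(positions):
        better = [math.dist(xi, xj) for j, xj in enumerate(positions) if esc[j] > esc[i]]
        if better:
            dists.append(min(better))
        else:
            dists.append(max(math.dist(xi, xj) for xj in positions))
    return dists


def select_peak_sparks(positions, fitness, n):
    """Return the indices of the n sparks with the largest gamma_i,
    where gamma_i = escape fitness * normalized distance."""
    esc = escape_fitness(fitness)
    delta = redefined_distance(positions, esc)
    d_max = max(delta)
    delta_norm = [d / d_max if d_max > 0 else 0.0 for d in delta]
    gamma = [e * d for e, d in zip(esc, delta_norm)]
    order = sorted(range(len(positions)), key=lambda i: gamma[i], reverse=True)
    return order[:n]
```

With sparks at positions 0.0, 0.1, 5.0 and 9.0 and fitness values 1.0, 1.1, 1.5 and 3.0, the two selected peak sparks are those at 0.0 and 5.0. The spark at 0.1 has the second-best fitness but sits next to the best spark, so its small distance pushes it down the ranking, which is the behaviour the description aims for.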

The method of the invention predicts carbon emission based on the IFWA-GRNN model. First, the influence factors are standardized. A screening and quantification algorithm for carbon emission influence factors based on the random forest method then yields the input data set of the carbon emission prediction model. An improved firework algorithm is proposed in which peak sparks and exploration sparks enhance the global search capability, so that the algorithm escapes local optimal solutions more easily and reaches a better fitness value. The IFWA algorithm is introduced to optimize the smoothing factor in the GRNN and so improve the prediction accuracy of the model, and the optimized GRNN algorithm is used to predict the carbon emission and the carbon emission intensity. Finally, changes in the carbon emission influence factors are simulated, the carbon emission under different scenarios is output, and relevant suggestions are provided to government departments for formulating and adjusting policies.
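How a candidate smoothing factor might be scored during optimization can be sketched as below. The leave-one-out mean squared error used as the fitness value, the Gaussian kernel and the plain scan over a candidate list are assumptions for illustration. In the method of the invention the candidates would be generated and refined by the IFWA rather than scanned from a fixed list.

```python
import math


def grnn_scalar(train_x, train_y, x, sigma):
    """Single-output GRNN prediction with an assumed Gaussian kernel."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, s)) / (2.0 * sigma ** 2))
        for s in train_x
    ]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: fall back to the mean training output.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


def loo_mse(train_x, train_y, sigma):
    """Leave-one-out mean squared error, used here as the fitness value
    of a smoothing factor (an assumption; lower is better)."""
    errors = []
    for i in range(len(train_x)):
        rest_x = train_x[:i] + train_x[i + 1:]
        rest_y = train_y[:i] + train_y[i + 1:]
        pred = grnn_scalar(rest_x, rest_y, train_x[i], sigma)
        errors.append((pred - train_y[i]) ** 2)
    return sum(errors) / len(errors)


def best_sigma(train_x, train_y, candidates):
    """Pick the candidate with the lowest fitness. This scan is a
    stand-in for the IFWA search, not the IFWA itself."""
    return min(candidates, key=lambda s: loo_mse(train_x, train_y, s))
```

Any optimizer that proposes smoothing factors, including the improved firework algorithm, could call `loo_mse` as its objective in the same way.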

The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all equivalent changes made by using the contents of the present specification and the drawings, or applied directly or indirectly to the related technical fields, are included in the scope of the present invention.

Claims

1. A method of carbon emission prediction, comprising the steps of:

2. The method of claim 1, wherein the GRNN has a mathematical expression of:

wherein P_i is the transfer function of the pattern layer neurons; X is the input variable, X = [x_1, x_2, …, x_n]^T; X_i is the learning sample corresponding to the i-th neuron; α is the smoothing factor; n is the number of pattern layer neurons; S_Nj is the transfer function of the summation layer; y_ij is the connection weight between the i-th pattern layer neuron and the j-th numerator summation neuron of the summation layer;

for the output layer output, k is the dimension of the output vector.

3. A method of carbon emissions prediction as claimed in claim 2 wherein the step of the IFWA algorithm comprises:

a1, initializing the positions of fireworks with a preset number;

a2, calculating the number and value range of the fireworks, calculating escape fitness, redefining the distance of the fireworks according to the escape fitness, and obtaining new fireworks;

4. The method for predicting carbon emissions according to claim 3, wherein the calculating of the escaping fitness and the redefining of the firework distance according to the escaping fitness are specifically:

wherein f_i, i = 1, 2, …, N, denotes the fitness of the i-th firework x_i; f_i', i = 1, 2, …, N, denotes the escape fitness of the i-th firework x_i; N denotes the total number of fireworks; f_min denotes the minimum of the fitness values f_i of all fireworks; and f_max denotes the maximum of the fitness values f_i of all fireworks;

wherein x_i denotes the i-th firework and x_j denotes the j-th firework.

5. A method for carbon emission prediction according to claim 3, wherein the step A3 is specifically:

normalizing the firework distance δ_i:

γ_i = f_i' · δ_i';

6. A method for carbon emission prediction according to claim 3, wherein the step A4 is specifically:

wherein x_i denotes the i-th firework and x_j denotes the j-th firework;

7. The method for predicting carbon emissions according to claim 1, wherein the step S1 is specifically:

8. The method for predicting carbon emissions according to claim 1, wherein the step S2 is specifically:

b1, constructing feature vector data aiming at the cleaned statistical data;

b3, at each node of each classification and regression tree, randomly extracting m_y features as a randomly generated feature subset, calculating the information content of each feature in the feature subset, and selecting one feature according to the information content to split the node, thereby obtaining a decision tree;

b5, total score of RF:

wherein n represents the number of decision trees;

b6, performing normalization processing on all the calculated importance scores:

9. A terminal for carbon emission prediction, comprising a processor, a memory and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, performs the steps of a method for carbon emission prediction as claimed in any one of the preceding claims 1 to 8.