CN115796358A - Carbon emission prediction method and terminal - Google Patents

Info

Publication number
CN115796358A
Authority
CN
China
Prior art keywords
fireworks
firework
carbon emission
fitness
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211503703.7A
Other languages
Chinese (zh)
Inventor
黄夏楠
杨丝雨
刘林
胡臻达
涂夏哲
洪居华
林伟伟
邹艺超
郑欢
张林垚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Fujian Electric Power Co Ltd
Economic and Technological Research Institute of State Grid Fujian Electric Power Co Ltd
Original Assignee
State Grid Fujian Electric Power Co Ltd
Economic and Technological Research Institute of State Grid Fujian Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Fujian Electric Power Co Ltd, Economic and Technological Research Institute of State Grid Fujian Electric Power Co Ltd filed Critical State Grid Fujian Electric Power Co Ltd
Priority to CN202211503703.7A priority Critical patent/CN115796358A/en
Publication of CN115796358A publication Critical patent/CN115796358A/en
Pending legal-status Critical Current

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/80: Management or planning
    • Y02P90/84: Greenhouse gas [GHG] management systems

Abstract

The invention discloses a carbon emission prediction method and terminal. The method acquires statistical data over past years from a statistical yearbook and cleans the data; rates the importance of the influencing factors with a random forest algorithm according to the cleaned data, selects influencing factors by importance, and generates a feature matrix; establishes a carbon emission prediction model based on an improved firework algorithm (IFWA) and a generalized regression neural network (GRNN); and trains the model on the feature matrix, after which carbon emissions are predicted from the input values of the influencing factors. To handle the complexity of the influencing factors, the invention uses a random forest algorithm to screen the factors and rate their importance. To address the tendency of the traditional firework algorithm to trap the model in a local optimum, it uses an improved firework algorithm, which effectively improves the prediction performance of the model and the accuracy of carbon emission prediction.

Description

Carbon emission prediction method and terminal
Technical Field
The invention relates to the technical field of carbon emission prediction, in particular to a carbon emission prediction method and a terminal.
Background
Climate warming caused by carbon emissions from fossil energy is a challenge shared by countries around the world. As the world's largest developing country and its largest carbon emitter, China has pledged to reduce its carbon emission intensity by 60-65% from the 2005 level by 2030, has set a target for total carbon emissions to peak by 2030 and as early as possible, and strives to achieve carbon neutrality before 2060.
In the face of unprecedented pressure for emission reduction and pressure for economic transformation, the exploration of an energy-saving and emission-reducing strategy suitable for the current national situation of China is of great importance. Therefore, the carbon emission amount of energy consumption in China is predicted based on the current policy, a more scientific and reasonable carbon emission reduction policy proposal is provided, and the method has very important practical significance for energy conservation and emission reduction work and sustainable development in China.
In pursuing the dual-carbon goals and building a low-carbon society, an important task is to predict carbon emission levels accurately, so that the relevant departments have a sound reference when formulating medium- and long-term economic development strategies and adjusting current policies. To design a targeted emission reduction policy that achieves the desired effect, it is necessary to know which factors drive the increase in carbon dioxide emissions, so that emissions can be reduced by controlling those factors; these factors change with policy and with the adoption of low-carbon technology. Although conventional multi-factor prediction models consider the influence of external variables, they predict carbon dioxide emissions on the premise that the influencing factors remain unchanged, which inevitably leads to error accumulation and inaccurate carbon predictions. Therefore, in place of the traditional prediction under "unchanged influencing factors", a medium- and long-term carbon prediction model with "changing influencing factors" needs to be sought by further incorporating policy trend analysis. In addition, the accuracy of the prediction model itself needs to be improved so as to improve the accuracy of carbon dioxide emission prediction.
For carbon prediction, the prior art mainly takes the following approaches. One approach proposes an extended gray prediction model that combines the gray model with an autoregressive integrated moving average model and a second-order polynomial regression model, and optimizes the model parameters by particle swarm optimization (PSO); it achieves fairly high carbon emission prediction accuracy, but it ignores the related factors that reduce prediction accuracy. Another approach uses an artificial neural network and a nonlinear gray multivariate model to study the relationship among population, GDP, oil trade, natural gas trade, and carbon dioxide emissions; although it considers multiple factors, it ignores changes in those factors, so errors accumulate and the prediction result is inaccurate. A third approach applies machine learning algorithms such as the BP neural network, extreme learning machine, and support vector machine to carbon prediction; these show good prediction results, but the prediction model is prone to falling into a local optimum. In summary, although some research results have been obtained for medium- and long-term carbon prediction, there is still room for improvement in prediction accuracy.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: the carbon emission prediction method and the terminal are provided, and the accuracy of carbon emission prediction can be effectively improved.
In order to solve the technical problems, the invention adopts the technical scheme that:
A carbon emission prediction method, comprising the following steps:
S1, acquiring statistical data over past years from a statistical yearbook, and cleaning the statistical data;
S2, rating the importance of the influencing factors with a random forest algorithm according to the cleaned statistical data, selecting influencing factors according to their importance, and generating a feature matrix;
S3, establishing a carbon emission prediction model based on an improved firework algorithm (IFWA) and a generalized regression neural network (GRNN);
S4, training the carbon emission prediction model on the feature matrix, and, after training, predicting carbon emissions from the input values of the influencing factors.
In order to solve the technical problem, the invention adopts another technical scheme as follows:
a terminal for carbon emission prediction, comprising a processor, a memory and a computer program stored in the memory and executable on the processor, the processor implementing the steps of one of the above methods for carbon emission prediction when executing the computer program.
The beneficial effects of the invention are as follows: in the carbon emission prediction method and terminal, a random forest algorithm screens the influencing factors and rates their importance to handle the complexity of those factors, which improves the accuracy of the prediction result; and an improved firework algorithm replaces the traditional firework algorithm, which tends to trap the model in a local optimum, thereby effectively improving the prediction performance of the model and the accuracy of carbon emission prediction.
Drawings
FIG. 1 is a flow chart of a method of carbon emissions prediction in accordance with an embodiment of the present invention;
fig. 2 is a block diagram of a carbon emission prediction terminal according to an embodiment of the present invention;
FIG. 3 is a detailed flow chart of a method of carbon emissions prediction according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a random forest feature selection method of a method for predicting carbon emissions according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a GRNN model of a method for carbon emission prediction according to an embodiment of the invention;
description of the reference symbols:
1. a terminal for carbon emission prediction; 2. a processor; 3. a memory.
Detailed Description
In order to explain technical contents, achieved objects, and effects of the present invention in detail, the following description is made with reference to the accompanying drawings in combination with the embodiments.
Referring to fig. 1 and 2, a method for predicting carbon emissions includes the steps of:
S1, acquiring statistical data over past years from a statistical yearbook, and cleaning the statistical data;
S2, rating the importance of the influencing factors with a random forest algorithm according to the cleaned statistical data, selecting influencing factors according to their importance, and generating a feature matrix;
S3, establishing a carbon emission prediction model based on an improved firework algorithm (IFWA) and a generalized regression neural network (GRNN);
S4, training the carbon emission prediction model on the feature matrix, and, after training, predicting carbon emissions from the input values of the influencing factors.
From the above description, the beneficial effects of the invention are as follows: in the carbon emission prediction method and terminal, a random forest algorithm screens the influencing factors and rates their importance to handle the complexity of those factors, which improves the accuracy of the prediction result; and an improved firework algorithm replaces the traditional firework algorithm, which tends to trap the model in a local optimum, thereby effectively improving the prediction performance of the model and the accuracy of carbon emission prediction.
Further, the mathematical expression of the GRNN is:
$$P_i = \exp\left[-\frac{(X - X_i)^T (X - X_i)}{2\alpha^2}\right], \quad i = 1, 2, \ldots, n$$
$$S_D = \sum_{i=1}^{n} P_i$$
$$S_{Nj} = \sum_{i=1}^{n} y_{ij} P_i, \quad j = 1, 2, \ldots, k$$
where $P_i$ is the pattern-layer neuron transfer function; $X$ is the input variable, $X = [x_1, x_2, \ldots, x_n]^T$; $X_i$ is the learning sample corresponding to the i-th neuron; $\alpha$ is the smoothing factor; $n$ is the number of pattern-layer neurons; $S_{Nj}$ is the summation-layer transfer function; $y_{ij}$ is the connection weight between the i-th neuron and the j-th numerator summation neuron in the summation layer;
$y_j = S_{Nj} / S_D$
is the output of the output layer, and $k$ is the dimension of the output vector.
The invention adopts the GRNN algorithm. The GRNN is a variant of the radial basis function (RBF) neural network; it has strong nonlinear mapping capability and a fast learning speed, offers advantages over the RBF network, and predicts well even when little sample data is available.
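As a minimal sketch of how such a network produces a prediction, the following Python function evaluates a Gaussian pattern layer, the two summation-layer sums, and their ratio for a single scalar output. The function name `grnn_predict`, the Gaussian kernel with the squared distance divided by 2α², and the restriction to one output dimension are illustrative assumptions, not details confirmed by the patent text.

```python
import math


def grnn_predict(x, samples, targets, alpha):
    """Predict one scalar output for input vector x (illustrative sketch).

    samples: learning samples X_i, one per pattern-layer neuron
    targets: scalar outputs y_i paired with each learning sample
    alpha:   smoothing factor (assumed to be the Gaussian kernel width)
    """
    # Squared Euclidean distance between x and every learning sample.
    sq_dist = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in samples]
    # Subtracting the smallest distance leaves the final ratio unchanged
    # but avoids a 0/0 when every kernel value would underflow.
    d_min = min(sq_dist)
    # Pattern layer: Gaussian activation per learning sample.
    p = [math.exp(-(d - d_min) / (2.0 * alpha ** 2)) for d in sq_dist]
    # Summation layer: plain sum and target-weighted sum.
    s_d = sum(p)
    s_n = sum(y * pi for y, pi in zip(targets, p))
    # Output layer: ratio of the two sums.
    return s_n / s_d
```

With a very small smoothing factor the prediction follows the nearest learning sample, and with a very large one it approaches the mean of the targets, which is why the choice of the smoothing factor matters for accuracy.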
Further, the steps of the IFWA algorithm comprise:
A1, initializing the positions of a preset number of fireworks;
A2, calculating the number and the value range of the fireworks, calculating the escape fitness, redefining the firework distance according to the escape fitness, and obtaining new fireworks;
A3, normalizing the firework distance, calculating the product of the escape fitness and the normalized firework distance, and selecting the n fireworks with the largest product as the peak fireworks, where n is a preset selection number;
A4, defining and searching for exploration fireworks, which together with the peak fireworks form the next generation of fireworks;
A5, iterating steps A3 and A4 up to the set maximum number of iterations, and selecting the firework with the highest fitness after the iteration ends to obtain the smoothing factor of the GRNN.
As described above, the invention provides an improved firework algorithm in which peak sparks and exploration sparks strengthen the global search capability of the algorithm, so that it escapes local optima more easily and attains a better fitness value.
Further, calculating the escape fitness and redefining the firework distance according to the escape fitness specifically comprises:
normalizing the fitness of all fireworks to obtain the escape fitness of each firework:
$$f_i' = \frac{f_i - f_{\min}}{f_{\max} - f_{\min}}$$
where $f_i$, $i = 1, 2, \ldots, N$, denotes the fitness of the i-th firework $x_i$; $f_i'$, $i = 1, 2, \ldots, N$, denotes the escape fitness of the i-th firework $x_i$; $N$ denotes the total number of fireworks; $f_{\min}$ denotes the minimum of the fitness values $f_i$ over all fireworks; and $f_{\max}$ denotes their maximum;
redefining the firework distance $\delta_i$ for the fireworks whose escape fitness is 1, with the following calculation formula:
Figure BDA0003967346320000051
where $x_i$ denotes the i-th firework and $x_j$ denotes the j-th firework.
From the above description, the invention provides the concept of the escape fitness value, the distance between sparks with the escape fitness of 1 is re-determined, and the top n peak sparks are selected from large to small according to the product of the escape fitness value and the normalized distance, so that the relative positions of the fitness value and the individual are considered, and a plurality of filial generations are prevented from being selected in the same region.
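The escape fitness described above is a min-max normalization of the raw fitness values, which can be sketched in Python as follows. The function name `escape_fitness` and the handling of the degenerate case where all fitness values are equal are assumptions made for illustration; the patent text does not address that case.

```python
def escape_fitness(fitness):
    """Min-max normalize raw fitness values into the range [0, 1].

    The firework with the largest fitness receives escape fitness 1,
    the one with the smallest receives 0.
    """
    f_min, f_max = min(fitness), max(fitness)
    if f_max == f_min:
        # Assumption: when every firework has the same fitness,
        # treat all of them as equally good.
        return [1.0] * len(fitness)
    return [(f - f_min) / (f_max - f_min) for f in fitness]
```

Because the result is scale-free, it can be multiplied directly with a normalized distance when ranking candidate peaks.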
Further, step A3 specifically comprises:
normalizing the firework distance $\delta_i$:
$$\delta_i' = \frac{\delta_i - \delta_{\min}}{\delta_{\max} - \delta_{\min}}$$
calculating the product $\gamma_i$ of the escape fitness and the normalized firework distance $\delta_i'$:
$$\gamma_i = f_i' \cdot \delta_i'$$
sorting the fireworks by $\gamma_i$ in descending order and taking the first n fireworks, which are recorded as the peak fireworks;
where $\delta_{\min}$ denotes the minimum firework distance, $\delta_{\max}$ denotes the maximum firework distance, $\delta_i$ denotes the firework distance of the i-th firework, and $n \le N - 1$.
As can be seen from the above description, the above is a part of the specific steps of the IFWA algorithm of the present invention.
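The peak selection of step A3 can be sketched in Python as below. The firework distances are passed in as plain numbers because their defining formula is not reproduced in this text; the function names, the tie-breaking by original index, and the treatment of identical distances are illustrative assumptions.

```python
def min_max_normalize(values):
    """Scale values into [0, 1]; identical values all map to 1.0 (assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def select_peak_fireworks(escape_fit, distances, n):
    """Return the indices of the n fireworks with the largest gamma.

    escape_fit: escape fitness f_i' of each firework (already in [0, 1])
    distances:  firework distance delta_i of each firework
    n:          preset number of peak fireworks to keep
    """
    norm_dist = min_max_normalize(distances)
    # gamma_i is the product of escape fitness and normalized distance.
    gamma = [f * d for f, d in zip(escape_fit, norm_dist)]
    # Sort indices by gamma in descending order; ties keep index order.
    order = sorted(range(len(gamma)), key=lambda i: gamma[i], reverse=True)
    return order[:n]
```

For example, a firework with high escape fitness but the smallest distance gets a product of zero and is passed over in favor of a weaker but more isolated one, which is the mechanism that keeps several offspring from being chosen in the same region.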
Further, the step A4 specifically includes:
defining exploration fireworks, wherein the calculation formula of the exploration fireworks is as follows:
Figure BDA0003967346320000053
where $x_i$ denotes the i-th firework and $x_j$ denotes the j-th firework;
and the peak fireworks and the exploration fireworks form the next-generation fireworks.
As can be seen from the above description, the present invention defines the exploration spark, the location of which is at the edge of the search area, and the search area can be expanded after explosion, thereby expanding the search range. The peak spark and the exploration spark are combined, so that the improved firework algorithm can enhance the global search capability of the algorithm, the algorithm can easily jump out of a local optimal solution, and the improved firework algorithm has a better fitness value.
Further, the step S1 specifically includes:
acquiring statistical data over past years from a statistical yearbook, handling missing values and out-of-range outliers in the statistical data, and standardizing the processed samples.
As described above, handling the missing values and out-of-range outliers in the statistical data improves the validity of the data.
Further, the step S2 specifically includes:
B1, constructing feature vector data from the cleaned statistical data;
B2, from the feature vector data, randomly drawing k bootstrap sample sets with replacement using the bootstrap method, and building k classification and regression trees from the bootstrap sample sets, the samples not drawn in each round forming k sets of out-of-bag (OOB) data;
B3, at each node of each classification and regression tree, randomly drawing $m_y$ features as a randomly generated feature subset, calculating the information content of each feature in the subset, and selecting one feature according to the information content for node splitting, to obtain a decision tree;
B4, scoring the importance of the features with the Gini index; if the nodes at which feature $X_j$ appears in decision tree i form the set M, the importance of $X_j$ in the i-th tree is:
Figure BDA0003967346320000061
b5, total score of RF:
Figure BDA0003967346320000062
wherein n represents the number of decision trees;
b6, normalizing all the calculated importance scores:
Figure BDA0003967346320000063
The features with $VIM_j \ge \varepsilon$ constitute the feature matrix, where $\varepsilon = 1$.
According to the description, the method adopts the random forest algorithm to screen the carbon emission influence factors and judge the importance degree, and eliminates the factors with small influence on the carbon emission, so that the model is simplified to a certain extent, and the prediction speed and precision are improved.
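To illustrate the kind of quantity a Gini-based importance score accumulates, the sketch below computes the Gini impurity of a node and the impurity decrease produced by one split. The patent's own formulas are not reproduced in this text, so the size-weighted decrease used here is one common convention and is an assumption, as are the function names and the use of categorical labels.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity decrease of one split, weighting children by their size.

    Assumption: children are weighted by their share of the parent's
    samples; the source may define the decrease differently.
    """
    n = len(parent)
    weighted_children = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted_children
```

Summing such decreases over every node where a feature is used for splitting, and then over every tree, gives a per-feature score of the kind that steps B4 and B5 describe.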
Referring to fig. 2, a terminal for carbon emission prediction includes a processor, a memory, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the above method for carbon emission prediction when executing the computer program.
From the above description, the beneficial effects of the invention are as follows: in the carbon emission prediction method and terminal, a random forest algorithm screens the influencing factors and rates their importance to handle the complexity of those factors, which improves the accuracy of the prediction result; and an improved firework algorithm replaces the traditional firework algorithm, which tends to trap the model in a local optimum, thereby effectively improving the prediction performance of the model and the accuracy of carbon emission prediction.
The carbon emission prediction method and the terminal are suitable for predicting carbon emission and can be suitable for predicting carbon emission in medium and long periods.
Referring to fig. 1 and fig. 3 to 5, a first embodiment of the present invention is:
a method of carbon emission prediction comprising the steps of:
S1, acquiring statistical data over past years from a statistical yearbook, and cleaning the statistical data;
step S1 specifically comprises:
acquiring statistical data over past years from a statistical yearbook, handling missing values and out-of-range outliers in the statistical data, and standardizing the processed samples.
In this embodiment, the input data needs to be standardized: a time series of the carbon emission influencing factor variables is selected first, identifiable errors in the historical data set (missing values, out-of-range outliers, and the like) are then corrected, and the influencing factor data are finally standardized. The specific operations are as follows:
By consulting statistical yearbooks or the literature, the resident population, energy consumption, per capita GDP, carbon emission coefficient, industrial share, and carbon emissions over the past 20 years are collected as raw data. Taking a certain region as an example, the data are shown in Table 1 below:
TABLE 1
Figure BDA0003967346320000071
Figure BDA0003967346320000081
Wherein, the carbon emission coefficient refers to the carbon emission quantity generated by unit energy in the combustion or use process of each energy, and reflects the technical level.
Handling missing values:
When the missing proportion is small (< 5%) and the attribute series is not strongly correlated with the target prediction series (the Pearson correlation coefficient r between the series and the target prediction series satisfies |r| < 0.8), the gaps are filled with the median of the series. The Pearson correlation coefficient r is expressed as:
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
where $x_i$ is the attribute series, i.e. an influencing factor related to carbon emissions; $y_i$ is the target prediction series, i.e. carbon emissions; and $\bar{x}$ and $\bar{y}$ are the means of the two series.
When the missing proportion is high (> 95%) and the attribute is of low importance (the Pearson correlation coefficient r between the series and the target prediction series satisfies |r| < 0.3), the attribute is deleted directly.
When the missing proportion is high and the attribute is of high importance (the Pearson correlation coefficient r between the series and the target prediction series satisfies 0.8 ≤ |r| < 1), hot-deck imputation is used: a time series similar to the sample containing the missing value (a matching time series) is found among the non-missing time series, and its observed values are used to fill in the missing value.
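The three missing-value rules above can be sketched as a small decision function in Python. The function names and the returned strategy labels are hypothetical. The text does not quantify a "high" missing proportion for the hot-deck case, so the sketch assumes it means anything that is not below 5%, and it returns "unspecified" for combinations the text does not cover.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


def missing_strategy(missing_ratio, r):
    """Pick a missing-value treatment from the missing ratio and |r|."""
    strength = abs(r)
    if missing_ratio < 0.05 and strength < 0.8:
        return "median_fill"
    if missing_ratio > 0.95 and strength < 0.3:
        return "drop_attribute"
    # Assumption: "high" missing proportion means not below 5%.
    # The source states 0.8 <= |r| < 1; |r| = 1 is included here as well.
    if missing_ratio >= 0.05 and strength >= 0.8:
        return "hot_deck"
    return "unspecified"
```

For instance, `missing_strategy(0.02, 0.5)` selects median filling, while a series with 40% of its values missing and |r| = 0.5 falls outside the stated rules and would need a separate decision.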
Handling out-of-range outliers:
Outliers that are clearly identifiable and few in number are deleted directly;
when outliers are few (< 5%) and the time series is of high importance (the Pearson correlation coefficient r between the series and the target prediction series satisfies 0.8 ≤ |r| < 1), they are replaced with the mean of the time series;
when outliers make up a high proportion (> 95%), the time series is treated as invalid and is handled by the missing-value method.
Standardization:
The data are standardized to eliminate differences in order of magnitude between data of different dimensions and to avoid the large output errors such differences can cause. The Z-score method is used to standardize the data of each influencing factor, with the following formula:
$$x^* = \frac{x - \mu}{\sigma}$$
where $x$ is the original sample value and $x^*$ is the standardized value; $\mu$ is the mean of all sample data; $\sigma$ is the standard deviation of all sample data.
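A minimal Python sketch of Z-score standardization for one influencing-factor series follows. The function name `z_score` is hypothetical, and the use of the population standard deviation (dividing by the number of samples) and the all-zeros result for a constant series are assumptions, since the text does not specify either.

```python
import math


def z_score(values):
    """Standardize a series to zero mean and unit standard deviation."""
    n = len(values)
    mu = sum(values) / n
    # Assumption: population standard deviation (divide by n, not n - 1).
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    if sigma == 0.0:
        # A constant series carries no variation; map it to zeros.
        return [0.0] * n
    return [(v - mu) / sigma for v in values]
```

Applying this to each factor separately puts quantities such as population and per capita GDP on a comparable scale before they are fed to the model.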
S2, judging the importance degree of the influence factors by a random forest algorithm according to the cleaned statistical data, selecting the influence factors according to the importance degree, and generating a feature matrix;
the step S2 specifically comprises the following steps:
B1, constructing feature vector data from the cleaned statistical data;
B2, from the feature vector data, randomly drawing k bootstrap sample sets with replacement using the bootstrap method, and building k classification and regression trees from the bootstrap sample sets, the samples not drawn in each round forming k sets of out-of-bag (OOB) data;
B3, at each node of each classification and regression tree, randomly drawing $m_y$ features as a randomly generated feature subset, calculating the information content of each feature in the subset, and selecting one feature according to the information content for node splitting, to obtain a decision tree;
B4, scoring the importance of the features with the Gini index; if the nodes at which feature $X_j$ appears in decision tree i form the set M, the importance of $X_j$ in the i-th tree is:
Figure BDA0003967346320000101
b5, total score of RF:
Figure BDA0003967346320000102
wherein n represents the number of decision trees;
b6, normalizing all the calculated importance scores:
Figure BDA0003967346320000103
The features with $VIM_j \ge \varepsilon$ constitute the feature matrix, where $\varepsilon = 1$.
In this embodiment, the random forest (RF) algorithm is used for feature selection. First, feature vector data are constructed from the standardized influencing factors described above; the feature vector data are then used as the input of the random forest method, and the importance of each influencing factor is obtained from the calculation result. The specific steps are as follows:
(A) Constructing feature vector data for carbon emission prediction according to the collected influence factor historical data sequence, wherein the normalized feature vector data is shown in the following table 2:
Figure BDA0003967346320000104
Figure BDA0003967346320000111
(B) The random forest feature selection proceeds as follows:
b1) With reference to FIG. 4, k new bootstrap sample sets are drawn randomly with replacement from the original training data set using the bootstrap method, and k classification and regression trees are built from them; the samples not drawn in each round form k sets of out-of-bag (OOB) data;
b2) At each node of each tree, $m_y$ features are drawn at random as a randomly generated feature subset; the information content of each feature in the subset is calculated, and the feature with the strongest classification ability is selected for node splitting, which gives the decision trees greater diversity;
b3) The importance of each feature is scored with the Gini index:
Figure BDA0003967346320000112
If the nodes at which feature $X_j$ appears in decision tree i form the set M, the importance of $X_j$ in the i-th tree is:
Figure BDA0003967346320000113
b4) If the RF has n trees in total, the overall score is:
Figure BDA0003967346320000121
b5) All calculated importance scores are normalized as follows:
Figure BDA0003967346320000122
(C) The feature importance matrix is obtained. With $\varepsilon = 1$ as the criterion, an influencing factor with $VIM_j \ge \varepsilon$ is classified as input data, and an influencing factor with $VIM_j < \varepsilon$ can be ignored; the influencing factors whose score reaches 1 constitute the input matrix.
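The thresholding in step (C) can be sketched in Python as follows. The sketch takes already-normalized importance scores as input, because the normalization formula is not reproduced in this text. The function names, the dictionary-based record layout, and the example scores in the usage note are hypothetical.

```python
def select_features(vim, epsilon=1.0):
    """Return the names of factors whose normalized score is >= epsilon.

    vim: mapping from factor name to its normalized importance score.
    """
    return [name for name, score in vim.items() if score >= epsilon]


def build_feature_matrix(records, selected):
    """Keep only the selected factors, one row per yearly record.

    records: list of dicts mapping factor name to its standardized value.
    """
    return [[row[name] for name in selected] for row in records]
```

As a usage example with made-up scores, `select_features({"population": 1.3, "industrial_share": 0.4})` keeps only `"population"`, and `build_feature_matrix` then extracts that column from each yearly record to form the model input.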
S3, establishing a carbon emission prediction model based on an improved firework algorithm IFWA and a generalized regression neural network GRNN;
the mathematical expression of the GRNN is as follows:
$$P_i = \exp\left[-\frac{(X - X_i)^T (X - X_i)}{2\alpha^2}\right], \quad i = 1, 2, \ldots, n$$
$$S_D = \sum_{i=1}^{n} P_i$$
$$S_{Nj} = \sum_{i=1}^{n} y_{ij} P_i, \quad j = 1, 2, \ldots, k$$
where $P_i$ is the pattern-layer neuron transfer function; $X$ is the input variable, $X = [x_1, x_2, \ldots, x_n]^T$; $X_i$ is the learning sample corresponding to the i-th neuron; $\alpha$ is the smoothing factor; $n$ is the number of pattern-layer neurons; $S_{Nj}$ is the summation-layer transfer function; $y_{ij}$ is the connection weight between the i-th neuron and the j-th numerator summation neuron in the summation layer;
$y_j = S_{Nj} / S_D$
is the output of the output layer, and $k$ is the dimension of the output vector;
the steps of the IFWA algorithm include:
A1, initializing the positions of a preset number of fireworks;
A2, calculating the number and the value range of the fireworks, calculating the escape fitness, redefining the firework distance according to the escape fitness, and obtaining new fireworks;
redefining the firework distance according to the escape fitness specifically comprises:
normalizing the fitness of all fireworks to obtain the escape fitness of each firework:
$$f_i' = \frac{f_i - f_{\min}}{f_{\max} - f_{\min}}$$
where $f_i$, $i = 1, 2, \ldots, N$, denotes the fitness of the i-th firework $x_i$; $f_i'$, $i = 1, 2, \ldots, N$, denotes the escape fitness of the i-th firework $x_i$; $N$ denotes the total number of fireworks; $f_{\min}$ denotes the minimum of the fitness values $f_i$ over all fireworks; and $f_{\max}$ denotes their maximum;
redefining the firework distance $\delta_i$ for the fireworks whose escape fitness is 1, with the following calculation formula:
Figure BDA0003967346320000132
where $x_i$ denotes the i-th firework and $x_j$ denotes the j-th firework;
A3, normalizing the firework distance, calculating the product of the escape fitness and the normalized firework distance, and selecting the n fireworks with the largest product as the peak fireworks, where n is a preset selection number;
step A3 specifically comprises:
normalizing the firework distance $\delta_i$:
$$\delta_i' = \frac{\delta_i - \delta_{\min}}{\delta_{\max} - \delta_{\min}}$$
calculating the product $\gamma_i$ of the escape fitness and the normalized firework distance $\delta_i'$:
$$\gamma_i = f_i' \cdot \delta_i'$$
sorting the fireworks by $\gamma_i$ in descending order and taking the first n fireworks, which are recorded as the peak fireworks;
where $\delta_{\min}$ denotes the minimum firework distance, $\delta_{\max}$ denotes the maximum firework distance, $\delta_i$ denotes the firework distance of the i-th firework, and $n \le N - 1$;
A4, defining and searching for exploration fireworks, which together with the peak fireworks form the next generation of fireworks;
step A4 specifically comprises:
defining exploration fireworks, wherein the calculation formula of the exploration fireworks is as follows:
Figure BDA0003967346320000134
where $x_i$ denotes the i-th firework and $x_j$ denotes the j-th firework;
the peak fireworks and the exploration fireworks form the next generation of fireworks;
A5, iterating steps A3 and A4 up to the set maximum number of iterations, and selecting the firework with the highest fitness after the iteration ends to obtain the smoothing factor of the GRNN;
In this embodiment, a medium- and long-term carbon prediction model is established based on the IFWA-GRNN improved neural network.
(1) GRNN general mathematical model
Assuming that x and y are two random variables with joint probability density f(x, y), and that the observed value of x is x_0, the regression of y on x is:
Figure BDA0003967346320000141
y(x_0) is the predicted output of y when the input is x_0. Applying Parzen non-parametric estimation, from the sample data set
Figure BDA0003967346320000142
the density function f(x_0, y) is estimated as follows:
Figure BDA0003967346320000143
Figure BDA0003967346320000144
where n is the sample size and P is the dimension of the random variable x. α is called the smoothing factor, i.e., the standard deviation of the Gaussian function. Substituting equation (6) and swapping the order of integration and summation yields:
Figure BDA0003967346320000145
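For reference, the derivation outlined above can be written in the standard textbook form of the generalized regression neural network. The equations in this text survive only as figure placeholders, so the block below is the usual formulation using the symbols defined above (n, P, α, x_0) rather than a transcription of the figures; the sample notation (x_i, y_i) is an assumption.

```latex
% Regression of y on x at the observed value x_0
\hat{y}(x_0) = \frac{\int_{-\infty}^{\infty} y\, f(x_0, y)\, dy}
                    {\int_{-\infty}^{\infty} f(x_0, y)\, dy}

% Parzen estimate of the joint density from samples (x_i, y_i), i = 1, ..., n
\hat{f}(x_0, y) = \frac{1}{n\,(2\pi)^{\frac{P+1}{2}}\,\alpha^{P+1}}
  \sum_{i=1}^{n}
  \exp\!\left(-\frac{(x_0 - x_i)^{\mathsf{T}}(x_0 - x_i)}{2\alpha^{2}}\right)
  \exp\!\left(-\frac{(y - y_i)^{2}}{2\alpha^{2}}\right)

% Result after substitution and exchanging integration and summation
\hat{y}(x_0) = \frac{\sum_{i=1}^{n} y_i
  \exp\!\left(-\frac{(x_0 - x_i)^{\mathsf{T}}(x_0 - x_i)}{2\alpha^{2}}\right)}
  {\sum_{i=1}^{n}
  \exp\!\left(-\frac{(x_0 - x_i)^{\mathsf{T}}(x_0 - x_i)}{2\alpha^{2}}\right)}
```

The last line shows why α is the only free parameter: the prediction is a Gaussian-weighted average of the sample outputs, and α alone controls how quickly the weights decay with distance.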
(2) Optimizing the smoothing factor α in the generalized regression neural network GRNN using the IFWA algorithm, with the following specific steps:
(21) Initializing the positions of the N sparks. Each spark represents a solution in the solution space and is identified by its spark position;
(22) Calculating the Gaussian function values between the input parameter X and the learning samples, and performing weighted summation, as shown in equation (9):
Figure BDA0003967346320000146
in the formula, X_i represents the learning sample corresponding to the ith neuron of the pattern layer, y_i represents the output value corresponding to X_i, and m is the sample size.
(23) Calculating the number and value range of the sparks to generate new sparks. The fitness value of each firework is calculated according to the fitness function, and sparks are generated according to the fitness value. Let f_i, i = 1, 2, …, N, denote the fitness value of the ith firework x_i; normalizing the fitness value gives the escape fitness value f_i':
Figure BDA0003967346320000151
redefining the spark distance δ_i according to the escape fitness, calculated as follows:
Figure BDA0003967346320000152
in the formula, x_i represents the ith firework and x_j represents the jth firework.
(24) Normalizing the spark distance δ_i, i.e., the distance redefined for the spark whose escape fitness is 1; the calculation formula is:
Figure BDA0003967346320000153
(25) Calculating the product γ_i of the escape fitness value and the normalized distance, and marking the first n (n ≤ N − 1) sparks, ranked by γ_i from large to small, as peak sparks; γ_i is calculated as follows:
γ_i = f_i' * δ_i' (13)
(26) Defining spark x_i as an exploration spark; the calculation formula of the exploration spark x_i is:
Figure BDA0003967346320000154
combining the peak sparks and the exploration sparks to form the next generation of fireworks;
(27) Setting the maximum number of iterations K. If the iteration count k is less than K, repeating from step (23); otherwise, selecting the individual with the largest fitness value as the current optimal smoothing factor, thereby obtaining the optimal smoothing factor α of the GRNN.
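The steps above score each firework, that is, each candidate smoothing factor α, with a fitness function that this text does not spell out. The sketch below shows one plausible choice, stated as an assumption: the negative leave-one-out mean squared error of a scalar-output GRNN on the training data. The names `grnn_predict` and `loo_fitness` and the toy data are hypothetical.

```python
import math


def grnn_predict(train_x, train_y, query, alpha):
    """Scalar-output GRNN estimate: Gaussian-kernel weighted mean of train_y."""
    weights = []
    for x in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-sq_dist / (2.0 * alpha ** 2)))
    total = sum(weights)
    if total == 0.0:                  # all kernels underflowed to zero
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


def loo_fitness(train_x, train_y, alpha):
    """Assumed fitness of a candidate alpha: negative leave-one-out
    mean squared error, so that a higher value is better."""
    errors = []
    for i in range(len(train_x)):
        rest_x = train_x[:i] + train_x[i + 1:]
        rest_y = train_y[:i] + train_y[i + 1:]
        pred = grnn_predict(rest_x, rest_y, train_x[i], alpha)
        errors.append((pred - train_y[i]) ** 2)
    return -sum(errors) / len(errors)


# Toy data on the line y = 2x; candidate smoothing factors act as fireworks.
xs = [[0.0], [0.25], [0.5], [0.75], [1.0]]
ys = [0.0, 0.5, 1.0, 1.5, 2.0]
candidates = [0.05, 0.2, 1.0, 5.0]
scores = {a: loo_fitness(xs, ys, a) for a in candidates}
best_alpha = max(scores, key=scores.get)
print(best_alpha)
```

In the IFWA loop these scores would feed the escape-fitness normalization of step (23); here the candidates are simply compared directly to keep the example short.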
S4, training the carbon emission prediction model according to the feature matrix, and after training, predicting the carbon emission according to the input values of all the influence factors.
The extracted time-series features are input into the improved generalized regression neural network GRNN time step by time step to obtain a preliminary prediction result. Referring to FIG. 5, the specific steps are as follows:
(41) The training sample set is {trx_1, trx_2, …, trx_n}, i.e., the feature vector data of the carbon emission influence factors obtained above; each sample has dimension m, i.e., trx_i = [trx_i1, trx_i2, …, trx_im];
(42) Dividing the label set and the test set: the first 70% of the sample feature set forms the training samples, whose label set is {y_1, y_2, …, y_n}, each label having dimension k; the remaining 30% of the sample feature set is used as test samples, the test sample set being {tex_1, tex_2, …, tex_p}. The GRNN input layer normalizes the input vector set and passes it to the pattern layer; the normalization formula is:
Figure BDA0003967346320000161
in the formula, x_i and x_i* respectively represent the values of the data before and after normalization, and x_min and x_max respectively represent the minimum and maximum values in the sample data;
(43) Setting the number of neurons in the pattern layer to n, each neuron corresponding to a different training sample, and using the estimated α value as the smoothing factor of the GRNN; the xth test sample tex_x and the training sample trx_j corresponding to the jth neuron of the pattern layer are evaluated as in formula (15), P_j being the output of the jth neuron:
Figure BDA0003967346320000162
in the formula, trx_ji is the ith parameter of trx_j, and tex_xi is the ith parameter of tex_x;
(44) The number of nodes of the summation layer is k + 1. The first neuron calculates the arithmetic sum S_D of all neuron outputs of the pattern layer, as shown in equation (16). The connection weight between the gth neuron of the pattern layer and the jth neuron of the summation layer is the jth parameter of the gth output sample y_g; the remaining k neurons calculate the weighted sums S_Nj of the pattern layer outputs, as shown in formula (17):
Figure BDA0003967346320000163
Figure BDA0003967346320000164
in the formula, trx_gi is the ith parameter of the training sample trx_g, and tex_xi is the ith parameter of tex_x.
(45) The number of neurons in the output layer is k. The output of each neuron, S_j, is the output S_Nj of the jth weighted-sum node of the summation layer divided by the output S_D of the first node, as shown in equation (18):
Figure BDA0003967346320000165
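Steps (41) to (45) can be sketched as a single forward pass. This is an illustrative reading under stated assumptions: the pattern-layer function is taken to be the usual Gaussian exp(−d²/(2α²)) of the squared Euclidean distance, the normalization is column-wise min-max as described for step (42), and the helper names `fit_min_max`, `apply_min_max` and `grnn_forward` are hypothetical.

```python
import math


def fit_min_max(rows):
    """Return per-column (min, max) bounds taken from the sample data."""
    return [(min(col), max(col)) for col in zip(*rows)]


def apply_min_max(row, bounds):
    """Scale one sample to [0, 1] per column; constant columns map to 0."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(row, bounds)]


def grnn_forward(train_x, train_y, test_x, alpha):
    """Pattern layer -> summation layer (S_D, S_Nj) -> output layer."""
    # Pattern layer: one Gaussian neuron per training sample.
    p = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(trx, test_x))
                 / (2.0 * alpha ** 2))
        for trx in train_x
    ]
    # Summation layer: arithmetic sum S_D plus k weighted sums S_Nj,
    # where the weights are the label components y_gj.
    s_d = sum(p)
    k = len(train_y[0])
    s_n = [sum(p_g * y_g[j] for p_g, y_g in zip(p, train_y)) for j in range(k)]
    # Output layer: each of the k neurons divides S_Nj by S_D.
    return [s / s_d for s in s_n]


raw_train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
labels = [[1.0], [2.0], [3.0]]          # k = 1 output per sample
bounds = fit_min_max(raw_train)
train = [apply_min_max(r, bounds) for r in raw_train]
query = apply_min_max([2.0, 20.0], bounds)
prediction = grnn_forward(train, labels, query, alpha=0.3)
print(prediction)
```

The test sample is scaled with the bounds fitted on the training data, so that training and test inputs share one scale. The sketch does not guard against S_D underflowing to zero, which can happen for very small α.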
Simulating changes in the carbon emission influence factors and outputting the carbon emission under different scenarios. Before the future total carbon emission and carbon emission intensity are predicted, the trend of each carbon emission influence factor is assessed in light of macroscopic development plans, so that changes in the macroscopic data are described accurately.
Specifically, the policy plans related to each influence factor are consulted and combined with an analysis of current policy trends to set the changes in the macroscopic influence factors; the trends of population, carbon emission coefficient, per capita GDP, energy consumption and industrial structure are assessed to obtain the influence factor data for the new year; and the new year's influence factor data and the carbon emission prediction result are added to the training set of the model so that prediction proceeds year by year. For example, if the proportion of secondary industry in 2019 is 47.71% and is set to decrease by 1.5 percentage points by 2020, the proportion in 2020 is 46.21%; with the remaining influence factors unchanged, the influence factor data of the new year and the 2019 carbon emission prediction result are added to the training set, as shown in Table 3 below, and the carbon emission in 2020 is predicted, thereby realizing year-by-year prediction.
TABLE 3
Figure BDA0003967346320000171
Figure BDA0003967346320000181
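The year-by-year scenario prediction described above can be sketched as a rolling loop. This is a simplified illustration: the factor name, the helper names `next_year_factors` and `roll_forward`, and the stand-in `mean_predictor` are all hypothetical, and any trained model with the same call signature could replace the stand-in.

```python
def next_year_factors(current, changes):
    """Apply additive scenario changes (for example percentage-point shifts)
    to a dictionary of influence factors; unchanged factors are copied."""
    return {name: value + changes.get(name, 0.0)
            for name, value in current.items()}


def roll_forward(train_x, train_y, scenario_rows, predict):
    """Predict year by year, appending each new row and its prediction
    to a copy of the training set before predicting the next year."""
    train_x, train_y = list(train_x), list(train_y)   # leave inputs untouched
    forecasts = []
    for row in scenario_rows:
        y_hat = predict(train_x, train_y, row)
        forecasts.append(y_hat)
        train_x.append(row)
        train_y.append(y_hat)
    return forecasts


def mean_predictor(train_x, train_y, row):
    """Placeholder model: ignores the inputs and returns the mean label."""
    return sum(train_y) / len(train_y)


# Scenario from the text: secondary-industry share falls by 1.5 points.
factors_2019 = {"secondary_industry_share": 47.71}
factors_2020 = next_year_factors(factors_2019,
                                 {"secondary_industry_share": -1.5})
print(round(factors_2020["secondary_industry_share"], 2))

history_x, history_y = [[0.0], [1.0]], [1.0, 3.0]
forecasts = roll_forward(history_x, history_y, [[2.0], [3.0]], mean_predictor)
print(forecasts)
```

Because each forecast is fed back as a training label, errors can accumulate over long horizons, which is one reason to revisit the scenario settings each year.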
In this embodiment, relevant policy suggestions may finally be proposed based on the carbon emission prediction result. The prediction result of the IFWA-GRNN model indicates whether the carbon emission commitment and the goals of carbon peaking and carbon neutrality can be achieved. Reasonable suggestions are provided in terms of industrial structure, energy structure, emission-reduction technology innovation and the like.
Referring to fig. 2, a second embodiment of the present invention is:
a terminal 1 for carbon emission prediction includes a processor 2, a memory 3, and a computer program stored in the memory 3 and operable on the processor 2, wherein the processor 2 executes the computer program to implement the steps of a method for carbon emission prediction according to the first embodiment.
The invention relates to a carbon emission prediction method in which carbon emission influence factors are analyzed based on a random forest method to obtain the input feature matrix of a carbon prediction model, an improved firework algorithm (IFWA) is introduced to optimize the smoothing factor in the GRNN so as to improve the model prediction accuracy, and the IFWA-GRNN improved neural network is used to predict the carbon emission accurately.
The main principle of the IFWA-GRNN-based medium and long term carbon prediction model established by the invention is as follows:
In extracting the carbon emission influence factors, the factors fall into many categories, and each category contains different specific factors. To predict the carbon emission more accurately, this work uses a random forest algorithm to screen the carbon emission influence factors and judge their importance. The random forest algorithm eliminates factors that have little influence on carbon emission, which simplifies the model to a certain extent and improves prediction speed and accuracy. Prior-art extraction of carbon emission influence factors cannot express the degree of influence of different factors directly, so the random forest algorithm is chosen to quantify the degree of influence of each factor and describe it more accurately.
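The screening idea can be illustrated with a small sketch that sums per-tree importance scores, normalizes them and keeps the factors above a threshold. The exact scoring and normalization formulas are given only as figures in this text, so dividing by the total score is an assumption, and the factor names, the scores, the threshold value and the function name `screen_features` are all hypothetical.

```python
def screen_features(per_tree_scores, epsilon):
    """per_tree_scores maps each feature name to its importance score in
    every tree. Returns the normalised scores and the selected features."""
    # Total score of each feature over all trees in the forest.
    totals = {name: sum(scores) for name, scores in per_tree_scores.items()}
    grand_total = sum(totals.values())
    # Assumed normalisation: each total divided by the sum of all totals.
    normalised = {name: t / grand_total for name, t in totals.items()}
    # Keep the features whose normalised importance reaches the threshold.
    selected = [name for name, v in normalised.items() if v >= epsilon]
    return normalised, selected


# Hypothetical importance scores from a forest of two trees.
per_tree_scores = {
    "population": [0.3, 0.5],
    "gdp_per_capita": [0.6, 0.4],
    "weak_factor": [0.1, 0.1],
}
normalised, selected = screen_features(per_tree_scores, epsilon=0.2)
print(selected)
```

With this normalization the scores sum to one, so the threshold directly expresses the minimum share of total importance a factor must carry to enter the feature matrix.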
In terms of prediction methods, gray models, machine learning methods and scenario analysis methods are mainly used at present. The gray model fits nonlinear sequences less well than machine learning algorithms, and many existing studies that chose machine learning methods achieved better prediction results; typical methods are the BPNN algorithm, the extreme learning machine algorithm and the like. However, the BPNN algorithm trains slowly, has large prediction errors and easily falls into local minima. The extreme learning machine mitigates these problems but still tends to fall into locally optimal solutions. Therefore, the invention provides a GRNN algorithm based on the improved firework algorithm to predict the carbon emission, which reduces the prediction error and improves nonlinear approximation capability, fault tolerance and robustness.
GRNN is a variant of the radial basis function neural network and consists of four layers: an input layer, a pattern layer, a summation layer and an output layer. The number of input layer neurons is equal to the dimension of the input vector in the learning sample, and each neuron is a simple distribution unit that passes the input variable directly to the pattern layer. The number of neurons in the pattern layer is equal to the number n of learning samples, and each neuron corresponds to a different sample. Two types of neurons are used in the summation layer for summation. The number of neurons in the output layer is equal to the dimension k of the output vector in the learning sample; each neuron divides the outputs of the summation layer, and the output of neuron j corresponds to the jth element of the estimation result Y(X). GRNN is a radial basis function network based on mathematical statistics, and its theoretical basis is nonlinear regression analysis. It has strong nonlinear mapping capability and a fast learning speed, which give it advantages over the ordinary radial basis function neural network, and its prediction performance remains good even when sample data are scarce.
On the basis of the GRNN algorithm, this work adopts an improved firework algorithm to further improve the prediction accuracy. The firework algorithm is a swarm intelligence algorithm that introduces random factors and a selection strategy to form a parallel, explosion-style search, making it a global probabilistic search method capable of finding optimal solutions to complex optimization problems. However, its selection strategy uses only the relative positions of individuals as the selection probability and ignores how good their fitness values are; individuals at neighboring positions therefore have similar selection probabilities, which greatly limits the search efficiency of the algorithm. This work therefore proposes an improved firework algorithm: it introduces the concept of the escape fitness value, redefines the distance of the spark whose escape fitness is 1, and selects the first n peak sparks by the product of the escape fitness value and the normalized distance in descending order, so that both the fitness value and the relative position of an individual are taken into account and multiple offspring are prevented from being selected in the same region. On this basis, exploration sparks are defined; they lie at the edge of the search region, so that after explosion they extend the existing search region and widen the search range. Combining the peak sparks with the exploration sparks strengthens the global search capability of the improved firework algorithm, so that the algorithm escapes locally optimal solutions more easily and reaches better fitness values.
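How a distance term can keep several offspring from being selected in the same region is illustrated below. The patent's distance formula is available here only as a figure, so this sketch borrows the rule used in density-peak clustering as an explicit assumption: each firework's distance is its gap to the nearest firework with a higher escape fitness, and the best firework (escape fitness 1) instead receives its largest gap to any other firework. Positions are one-dimensional because the optimized quantity is a single smoothing factor; the function name `firework_distances` is hypothetical.

```python
def firework_distances(positions, escape_fitness):
    """Assumed distance rule (borrowed from density-peak clustering):
    delta_i is the gap to the nearest firework with higher escape fitness;
    a firework with no better neighbour gets its largest gap instead."""
    n = len(positions)
    deltas = []
    for i in range(n):
        better = [abs(positions[i] - positions[j])
                  for j in range(n)
                  if escape_fitness[j] > escape_fitness[i]]
        if better:
            deltas.append(min(better))
        else:
            # No firework is better: this is the best one (escape fitness 1).
            deltas.append(max(abs(positions[i] - positions[j])
                              for j in range(n) if j != i))
    return deltas


# Candidate smoothing factors and their escape fitness values.
positions = [0.1, 0.12, 0.9, 0.5]
escape_fitness = [1.0, 0.9, 0.6, 0.0]
deltas = firework_distances(positions, escape_fitness)
print(deltas)
```

Under this rule the second candidate, which sits right next to the best one, receives a very small distance despite its high fitness, so its product with the escape fitness is small and a more distant candidate is preferred as the next peak.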
The method of the present invention performs carbon emission prediction based on the IFWA-GRNN model. First, the various influence factors are standardized; a random-forest-based algorithm for screening and quantifying carbon emission influence factors yields the input data set of the carbon emission prediction model; an improved firework algorithm is proposed, in which peak sparks and exploration sparks strengthen the global search capability, so that the algorithm escapes locally optimal solutions more easily and reaches better fitness values; and the IFWA algorithm is introduced to optimize the smoothing factor in the GRNN to improve the model prediction accuracy. The optimized GRNN algorithm is then used to predict the carbon emission and the carbon emission intensity. Finally, changes in the carbon emission influence factors are simulated, the carbon emission under different scenarios is output, and suggestions are provided to the relevant government departments for making and adjusting policies.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all equivalent changes made by using the contents of the present specification and the drawings, or applied directly or indirectly to the related technical fields, are included in the scope of the present invention.

Claims (9)

1. A method of carbon emission prediction, comprising the steps of:
S1, acquiring statistical data of past years from statistical yearbooks, and performing data cleaning on the statistical data;
S2, judging the importance of the influence factors by a random forest algorithm according to the cleaned statistical data, selecting the influence factors according to their importance, and generating a feature matrix;
S3, establishing a carbon emission prediction model based on an improved firework algorithm IFWA and a generalized regression neural network GRNN;
S4, training the carbon emission prediction model according to the feature matrix, and after training, predicting the carbon emission according to the input values of all the influence factors.
2. The method of claim 1, wherein the GRNN has a mathematical expression of:
Figure FDA0003967346310000011
Figure FDA0003967346310000012
Figure FDA0003967346310000013
wherein P_i is the pattern-layer neuron transfer function; X is the input variable, X = [x_1, x_2, …, x_n]^T; X_i is the learning sample corresponding to the ith neuron; α is the smoothing factor; n is the number of pattern-layer neurons; S_Nj is the summation-layer transfer function; y_ij is the connection weight between the ith pattern-layer neuron and the jth numerator summation neuron in the summation layer;
Figure FDA0003967346310000014
is the output of the output layer, and k is the dimension of the output vector.
3. A method of carbon emissions prediction as claimed in claim 2 wherein the step of the IFWA algorithm comprises:
A1, initializing the positions of a preset number of fireworks;
A2, calculating the number and value range of the fireworks, calculating the escape fitness, and redefining the firework distance according to the escape fitness to obtain new fireworks;
A3, normalizing the firework distance, and selecting the n fireworks with the largest product of the escape fitness and the normalized firework distance as peak fireworks, wherein n is a preset selection number;
A4, defining exploration fireworks, which together with the peak fireworks form the next generation of fireworks;
A5, iterating steps A3 and A4 until the set maximum number of iterations is reached, and selecting the firework with the highest fitness after the iteration ends, so as to obtain the smoothing factor of the GRNN.
4. The method for predicting carbon emissions according to claim 3, wherein the calculating of the escape fitness and the redefining of the firework distance according to the escape fitness are specifically:
carrying out normalization processing on the fitness of all the fireworks to obtain the escape fitness of all the fireworks:
Figure FDA0003967346310000021
wherein f_i, i = 1, 2, …, N, represents the fitness of the ith firework x_i; f_i', i = 1, 2, …, N, represents the escape fitness of the ith firework x_i; N represents the total number of fireworks; f_min represents the minimum of the fitness values f_i of all fireworks; and f_max represents the maximum of the fitness values f_i of all fireworks;
for the firework whose escape fitness is 1, redefining the firework distance δ_i, calculated as follows:
Figure FDA0003967346310000022
wherein x_i represents the ith firework and x_j represents the jth firework.
5. A method for carbon emission prediction according to claim 3, wherein the step A3 is specifically:
normalizing the firework distance δ_i:
Figure FDA0003967346310000023
calculating the product γ_i of the escape fitness f_i' and the normalized firework distance δ_i':
γ_i = f_i' * δ_i';
sorting the fireworks by γ_i in descending order and marking the first n fireworks as peak fireworks;
wherein δ_min represents the minimum firework distance, δ_max represents the maximum firework distance, δ_i represents the firework distance of the ith firework, and n ≤ N − 1.
6. A method for carbon emission prediction according to claim 3, wherein the step A4 is specifically:
defining exploration fireworks, wherein a calculation formula of the exploration fireworks is as follows:
Figure FDA0003967346310000024
wherein x_i represents the ith firework and x_j represents the jth firework;
and the peak fireworks and the exploration fireworks form the next-generation fireworks.
7. The method for predicting carbon emissions according to claim 1, wherein the step S1 is specifically:
acquiring statistical data of past years from statistical yearbooks, performing missing-value handling and out-of-range anomaly handling on the statistical data, and standardizing the processed samples.
8. The method for predicting carbon emissions according to claim 1, wherein the step S2 is specifically:
B1, constructing feature vector data from the cleaned statistical data;
B2, according to the feature vector data, randomly drawing k bootstrap sample sets with replacement using the bootstrap method, and constructing k classification and regression trees from the bootstrap sample sets, the samples not drawn each time forming k out-of-bag (OOB) data sets;
B3, at each node of each classification and regression tree, randomly extracting m_y features as a randomly generated feature subset, calculating the information content of each feature in the feature subset, and selecting one feature according to the information content for node splitting, to obtain a decision tree;
B4, scoring the importance of the features using the Gini index; if the nodes at which feature X_j appears in decision tree i form the set M, the importance of X_j in the ith tree is:
Figure FDA0003967346310000031
B5, total score of RF:
Figure FDA0003967346310000032
wherein n represents the number of decision trees;
B6, normalizing all the calculated importance scores:
Figure FDA0003967346310000033
features with VIM_j ≥ ε constitute the feature matrix, where ε = 1.
9. A terminal for carbon emission prediction, comprising a processor, a memory and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, performs the steps of a method for carbon emission prediction as claimed in any one of the preceding claims 1 to 8.
CN202211503703.7A 2022-11-28 2022-11-28 Carbon emission prediction method and terminal Pending CN115796358A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211503703.7A CN115796358A (en) 2022-11-28 2022-11-28 Carbon emission prediction method and terminal

Publications (1)

Publication Number Publication Date
CN115796358A true CN115796358A (en) 2023-03-14

Family

ID=85442336

Country Status (1)

Country Link
CN (1) CN115796358A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116611576A (en) * 2023-06-06 2023-08-18 中国科学院空天信息创新研究院 Carbon discharge prediction method and device
CN116611576B (en) * 2023-06-06 2023-10-03 中国科学院空天信息创新研究院 Carbon discharge prediction method and device
CN116882790A (en) * 2023-09-06 2023-10-13 北京建工环境修复股份有限公司 Carbon emission equipment management method and system for mine ecological restoration area
CN116882790B (en) * 2023-09-06 2023-11-21 北京建工环境修复股份有限公司 Carbon emission equipment management method and system for mine ecological restoration area

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination