CN117390550A - Low-carbon park carbon emission dynamic prediction method and system considering emission training set - Google Patents

Low-carbon park carbon emission dynamic prediction method and system considering emission training set

Info

Publication number
CN117390550A
Authority
CN
China
Prior art keywords
carbon
data
emission
park
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311080887.5A
Other languages
Chinese (zh)
Inventor
谈竹奎
王扬
吴鹏
肖小兵
蔡永翔
付宇
郝树青
徐敏
周科
苏立
李跃
郑友卓
刘安茳
王卓月
苗宇
班诗雪
乔镖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Power Grid Co Ltd
Original Assignee
Guizhou Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Power Grid Co Ltd filed Critical Guizhou Power Grid Co Ltd
Priority to CN202311080887.5A priority Critical patent/CN117390550A/en
Publication of CN117390550A publication Critical patent/CN117390550A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/08 Construction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Development Economics (AREA)
  • Primary Health Care (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Educational Administration (AREA)
  • Medical Informatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method and a system for dynamically predicting the carbon emissions of a low-carbon park in consideration of an emission training set, relating to the technical field of data analysis and machine learning. The method comprises the following steps: defining the park carbon emission prediction calculation range; constructing an online prediction model based on the dynamic characteristics of the low-carbon park; and constructing a carbon emission prediction model based on a random forest and the park carbon emission training set. Compared with a support vector machine model, the proposed dynamic prediction technique has better generality, and its prediction accuracy is 20.6 percent higher. With the random forest model as its algorithmic basis, the method is well suited to predicting the carbon emissions of low-carbon park buildings. Dynamic training on the building-simulation carbon emission data set effectively improves the accuracy with which the random forest model predicts the carbon emissions of low-carbon park buildings.

Description

Low-carbon park carbon emission dynamic prediction method and system considering emission training set
Technical Field
The invention relates to the technical field of data analysis and machine learning, in particular to a method and a system for dynamically predicting carbon emission of a low-carbon park by considering emission training sets.
Background
Energy and carbon emission management in a low-carbon intelligent park involves connecting a large number of metering and data acquisition devices, together with the external interface protocols of energy consumption meters and various sensors. The communication links are complex, so collecting and centrally managing the energy and carbon emission data of each building in the park is one of the key points in building a low-carbon park energy and carbon emission management platform.
The energy-saving and emission-reduction data server of a low-carbon intelligent park processes the massive energy consumption data of each unit in the park and provides the basic data for year-on-year and period-on-period analysis of energy consumption across units and across enterprises in the same industry. The history archiving server of the management platform must offer efficient storage compression so that massive data can be stored at high resolution and high precision, disk space is saved, and the database service runs stably over the long term.
Energy consumption analysis covers different units and different energy media in the park. Primary equipment such as water, electricity, and gas meters provides only real-time energy consumption data, whereas energy consumption must be analyzed by year, month, and day, so the energy management platform has to compute statistics from the real-time data and store them in the energy data server. In addition, the calculation and statistical methods for related data, such as carbon measurements and standard coal equivalents, must be corrected continuously as energy media conditions and park enterprises change. These requirements call for a comprehensive low-carbon intelligent park management platform with a powerful statistical calculation function.
The user group involved in park energy consumption and carbon emission management is very large, and the data used in a comprehensive analysis may span one month, one year, or even several years, so the system needs a high-performance data retrieval response to provide a foundation for comprehensive data analysis of the park. Through park energy and carbon emission control, the energy consumption and carbon emission metering system is applied across the whole park and monitors the energy consumption of the main buildings online, completing the park's basic energy management data system. Once the energy consumption data of park enterprises has been collected, reported, summarized, and analyzed, a scientific and complete energy-use supervision and evaluation scheme is established according to energy conservation regulations and energy conservation monitoring standards and methods. An expert management system analyzes and diagnoses the energy consumption data of industrial enterprises and helps units in the park use energy scientifically. This raises the energy consumption monitoring and energy efficiency management level of the whole park, supports the construction of an environment-friendly and resource-saving park, and promotes sustainable development.
Carbon emission prediction is the basis both for developing low-carbon parks and for supervising their carbon emissions. To improve the accuracy of fuzzy prediction models in practice, domestic experts have gradually carried out systematic theoretical research on various new modeling methods for fuzzy prediction, including data mining, wavelet matrix analysis, artificial neural networks, and support vector machine theory. For example, existing research on short-term power load prediction and optimization based on a parallel random forest regression algorithm uses a distributed computing platform integrated with Spark to parallelize and improve the algorithm, and the improved model is more robust. Some prediction models correct the random forest regression model with gray correlation analysis, which improves prediction performance relative to both a support vector machine model and a conventional random forest model. An improved random forest prediction method based on gray projection yields errors clearly smaller than those of the unimproved random forest algorithm and the support vector machine method, strengthening both the prediction accuracy and the robustness of the model. A feature selection method for short-term load prediction with random forests shows that a random forest model built on the optimal prediction feature subset is more accurate than the original model. Comparisons of the random forest model with CART, ARIMA, and neural networks, at home and abroad, show that the random forest model achieves higher accuracy.
Hybrid prediction approaches have also been proposed. For example, fuzzy clustering is combined with random forest training: fuzzy clustering first selects similar-day samples, and the random forest prediction model is then trained on the samples with the highest similarity coefficients, which improves the accuracy of the prediction model.
Disclosure of Invention
The present invention has been made in view of the above-described problems.
Therefore, the technical problems solved by the invention are as follows: existing research is typically based on a particular building or building type and cannot cover the park level. A small number of studies have established carbon emission simulation models for building types such as offices, hotels, residences, and hospitals, but these models do not reflect the diverse variation characteristics of park carbon emissions well. Buildings in a low-carbon park place high accuracy requirements on carbon emission prediction when pursuing low-carbon targets, which existing park-level prediction cannot meet, and the influence of dynamic emissions is not considered.
In order to solve the above technical problems, the invention provides the following technical scheme: a method for dynamically predicting carbon emissions in a low-carbon park in consideration of an emission training set, comprising the following steps:
defining the park carbon emission prediction calculation range; constructing an online prediction model based on the dynamic characteristics of the low-carbon park; and constructing a carbon emission prediction model based on a random forest and the park carbon emission training set.
As a preferred scheme of the low-carbon park carbon emission dynamic prediction method considering the emission training set: constructing the online prediction model comprises data preprocessing, model construction, model training, and model evaluation.
As a preferred scheme of the low-carbon park carbon emission dynamic prediction method considering the emission training set: the data preprocessing comprises partial autocorrelation function analysis, sliding window processing, and normalization.
The partial autocorrelation function analysis comprises calculating the autocovariance of the data, in which M is the number of data points, p(i) is the data value at the i-th moment, m is an arbitrary time lag, i is the i-th moment, and ρ is the constructed partial autocorrelation function.
The sliding window processing comprises cutting the original data set with a sliding window algorithm and converting the data into a supervised form composed of features and labels. The size of the sliding window is set to N, that is, the (N+1)-th sample value is predicted from the previous N pieces of historical data. After sliding window processing, the number of data sets becomes M-N+1, in the format $\{(x_1, \dots, x_{N+1}), \dots, (x_{N+1}, \dots, x_{2N+1}), \dots\}$.
The normalization comprises normalizing the data collected at the edge using the Min-Max normalization method.
The Min-Max normalization method is expressed as,
$$Z_{norm} = \frac{Z - Z_{min}}{Z_{max} - Z_{min}}$$
where $Z_{norm}$ is the normalized data set, $Z$ is the original data, $Z_{max}$ represents the maximum value in the data set, and $Z_{min}$ represents the minimum value in the data set.
As a preferred scheme of the low-carbon park carbon emission dynamic prediction method considering the emission training set: the model construction comprises feeding the preprocessed carbon emission data into a neural network. The reset gate controls how much of the previous state information is written into the current candidate set; if the reset gate is 0, the candidate set retains only the input information of the current sequence. The update gate controls the degree to which the state information of the previous moment is carried into the current state, yielding the output hidden-layer information of the current sequence.
The reset gate is expressed as,
$$r_t = \sigma(W_r \cdot [h_{t-1}, x_t] + b_r)$$
where $h_{t-1}$ represents the input from the previous moment, $x_t$ represents the input at time t, $\sigma(\cdot)$ denotes the sigmoid function, and $W_r$ and $b_r$ respectively represent the weight matrix and bias matrix of the operation.
The candidate set is expressed as,
$$\tilde{h}_t = \tanh(W \cdot [r_t \odot h_{t-1}, x_t] + b_t)$$
where $\tanh(\cdot)$ denotes the hyperbolic tangent function, and $W$ and $b_t$ are the weight matrix and bias matrix of the operation.
The update gate is expressed as,
$$z_t = \sigma(W_z \cdot [h_{t-1}, x_t] + b_z)$$
where $W_z$ and $b_z$ respectively represent the weight matrix and bias matrix of the operation.
The output hidden-layer information is expressed as,
$$h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t$$
The model training comprises evaluating network performance using the mean squared error (MSE) as the loss function, the MSE being calculated as,
$$MSE = \frac{1}{M}\sum_{i=1}^{M}(y_i - \hat{y}_i)^2$$
where $y_i$ is the i-th actual data value, $\hat{y}_i$ is the i-th predicted data value, and M represents the amount of training data.
In the network training process, ReLU is selected as the activation function of the fully connected layer, and the Nadam algorithm is used to optimize the training of the network.
The model evaluation comprises using the root mean square error (RMSE), the mean absolute error (MAE), and the training time as model performance evaluation indexes.
The root mean square error is expressed as,
$$RMSE = \sqrt{\frac{1}{M}\sum_{i=1}^{M}(f_i - \hat{f}_i)^2}$$
The mean absolute error is expressed as,
$$MAE = \frac{1}{M}\sum_{i=1}^{M}\left|f_i - \hat{f}_i\right|$$
where $f_i$ represents the i-th actual data value, $\hat{f}_i$ represents the i-th predicted data value, and M represents the total amount of data.
As a preferred scheme of the low-carbon park carbon emission dynamic prediction method considering the emission training set: constructing the carbon emission prediction model comprises generating the training sets of the random forest by the Bagging method, using classification and regression trees (CART) as unit classifiers, and combining them into an ensemble classifier whose prediction result is the arithmetic mean of the outputs of all unit classifiers.
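The probabilistic mathematical expression of the Bagging method, that is, the probability that a given sample is never drawn in n rounds of sampling with replacement, is,
$$\lim_{n \to \infty}\left(1 - \frac{1}{n}\right)^{n} = \frac{1}{e} \approx 0.368$$
where n is the sample size.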
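The Bagging step draws n samples with replacement from a set of size n, so a particular sample is never drawn with probability (1 - 1/n)^n, which tends to 1/e ≈ 0.368 as n grows. The following minimal sketch compares that closed form with a simulated bootstrap draw; the function names and the seed are illustrative assumptions, not part of the patent.

```python
import random

def never_drawn_probability(n):
    """Probability that one fixed sample is missed in n draws with replacement."""
    return (1.0 - 1.0 / n) ** n

def out_of_bag_fraction(n, rng):
    """Simulate one bootstrap sample and return the share of indices never drawn."""
    drawn = {rng.randrange(n) for _ in range(n)}
    return 1.0 - len(drawn) / n

rng = random.Random(0)                      # fixed seed, chosen arbitrarily
theoretical = never_drawn_probability(10_000)
simulated = out_of_bag_fraction(10_000, rng)
print(f"closed form: {theoretical:.4f}, simulated: {simulated:.4f}")
```

Roughly a third of the samples are therefore left out of each tree's training set, which is what makes out-of-bag error estimation possible.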
As a preferred scheme of the low-carbon park carbon emission dynamic prediction method considering the emission training set: the carbon emission prediction model is constructed with the CART decision tree algorithm, through which the generalization error of the random forest is obtained. The CART decision tree algorithm uses binary recursive partitioning to split each sample subset in two, so that every non-leaf node has two child nodes, and the node splitting of the sample set follows the minimum Gini index principle.
The Gini index is expressed as,
$$Gini(p) = \sum_{k=1}^{K} p_k (1 - p_k) = 1 - \sum_{k=1}^{K} p_k^2$$
where K is the total number of sample classes at the node and $p_k$ is the probability of the k-th class at the node.
The Gini index of sample set D is expressed as,
$$Gini(D) = 1 - \sum_{k=1}^{K}\left(\frac{|C_k|}{|D|}\right)^2$$
where $C_k$ is the subset of D belonging to the k-th class.
The Gini index of each node partition is expressed as,
$$Gini_{split}(D) = \frac{|D_1|}{|D|}Gini(D_1) + \frac{|D_2|}{|D|}Gini(D_2)$$
where $D_1$ and $D_2$ are the subsets of set D.
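The Gini index of a sample set and of a two-way node partition can be computed directly from class counts. The sketch below is a minimal illustration with hypothetical function names and toy labels; it assumes the standard definition, one minus the sum of squared class proportions, with the partition score weighted by subset size.

```python
from collections import Counter

def gini(labels):
    """Gini index of a label collection: 1 - sum of squared class proportions."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

def gini_split(left, right):
    """Size-weighted Gini index of a binary partition into `left` and `right`."""
    total = len(left) + len(right)
    return len(left) / total * gini(left) + len(right) / total * gini(right)

mixed = ["high", "high", "low", "low"]           # hypothetical emission classes
print(gini(mixed))                               # impurity of the unsplit node
print(gini_split(["high", "high"], ["low", "low"]))   # a split that separates the classes
print(gini_split(["high", "low"], ["high", "low"]))   # a split that separates nothing
```

A node is split at the candidate with the smallest partition score. For a continuous target such as emissions, CART implementations commonly split on squared-error (variance) reduction instead; the Gini form here follows the text.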
The generalization error of the random forest is expressed as,
$$P_e = P_{X,Y}(K(X,Y) < 0)$$
where X is the input vector, Y is the classification vector, and $P_{X,Y}$ is the classification error function over X and Y.
The maximum value of the generalization error of the random forest is expressed as,
$$P_e \le \frac{\bar{\rho}\,(1 - s^2)}{s^2}$$
where $\bar{\rho}$ is the correlation coefficient and s is the average strength.
As a preferred scheme of the low-carbon park carbon emission dynamic prediction method considering the emission training set: the specific steps of constructing the carbon emission prediction model comprise normalizing the historical simulation data, which include the historical carbon emissions and the model input variables.
Initializing the model parameters.
Selecting N samples by random sampling with replacement using the Bootstrap method.
Randomly selecting m of the M features of a single tree as candidate features for node splitting.
Forming a random forest from $n_{tree}$ decision trees, the prediction result of the random forest model being the average of the predicted values of the individual trees.
Predicting again using the measured data of the previous three days, and taking the resulting prediction as a verification comparison against the simulation.
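The core of these steps (Bootstrap sampling, a random feature subset per tree, and averaging over the trees) can be sketched end to end in plain Python. This is a minimal illustration under stated assumptions, not the patent's implementation: depth-one regression trees (stumps) that split on least squared error stand in for full CART trees, normalization and the re-prediction with measured data are omitted, and `fit_stump`, `fit_forest`, `predict`, and the toy data are hypothetical.

```python
import random

def fit_stump(rows, targets, feature_ids):
    """Best single-threshold split (least squared error) over the given features."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            left = [t for row, t in zip(rows, targets) if row[f] <= threshold]
            right = [t for row, t in zip(rows, targets) if row[f] > threshold]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, f, threshold, lm, rm)
    if best is None:
        # No valid split in this sample: fall back to predicting the mean.
        mean = sum(targets) / len(targets)
        return (0, float("inf"), mean, mean)
    return best[1:]

def fit_forest(rows, targets, n_tree, m_features, seed=0):
    """Train n_tree stumps, each on a Bootstrap sample and m random features."""
    rng = random.Random(seed)
    n, total_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_tree):
        idx = [rng.randrange(n) for _ in range(n)]              # sampling with replacement
        feats = rng.sample(range(total_features), m_features)   # m of the M features
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Average the predictions of all trees."""
    votes = [lm if row[f] <= thr else rm for f, thr, lm, rm in forest]
    return sum(votes) / len(votes)

# Toy data (hypothetical): feature 0 drives the target, feature 1 is noise.
rows = [[x / 10, (x * 7) % 3] for x in range(10)]
targets = [1.0 if x < 5 else 5.0 for x in range(10)]
forest = fit_forest(rows, targets, n_tree=25, m_features=1)
low, high = predict(forest, [0.0, 0.0]), predict(forest, [0.9, 0.0])
print(low, high)
```

Trees that happen to draw only the noise feature contribute little, and averaging over the forest still separates the two regimes, which is the point of combining many weak, decorrelated trees.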
Another object of the present invention is to provide a low-carbon park carbon emission dynamic prediction system considering an emission training set. Existing approaches predict with historical carbon emission data as the training set, either improving a basic algorithm or combining it with other algorithms, and therefore predict poorly when historical data are insufficient, as in the initial and development stages of buildings and parks. The system solves this problem.
In order to solve the technical problems, the invention provides the following technical scheme: a low carbon park carbon emission dynamic prediction system that considers an emission training set, comprising: the system comprises a data collection and processing module, a model selection and training module and a prediction and evaluation module.
The data collection and processing module is responsible for collecting relevant data of the low-carbon park; the model selection and training module is used for selecting a proper prediction model to dynamically predict the carbon emission; the prediction and assessment module predicts future carbon emissions using a trained model.
A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor when executing the computer program implements the steps of a low carbon park carbon emission dynamic prediction method taking into account emission training sets as described above.
A computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the steps of the low-carbon park carbon emission dynamic prediction method considering emission training sets as described above.
The invention has the following beneficial effects: the invention designs a low-carbon park carbon emission dynamic prediction technique considering the emission training set which, compared with a support vector machine model, has better generality and a prediction accuracy that is 20.6 percent higher. With the random forest model as its algorithmic basis, the method is well suited to predicting the carbon emissions of low-carbon park buildings. Dynamic training on the building-simulation carbon emission data set effectively improves the accuracy with which the random forest model predicts the carbon emissions of low-carbon park buildings. Taking a half-month test set as an example, the prediction accuracy can be improved by 32.4%. The dynamic-training-set carbon emission prediction method based on the random forest algorithm is therefore considerably more accurate, and its predictions become more accurate as the prediction process proceeds. The method is a significant advance for carbon emission prediction in the initial and development stages of buildings and parks for which some data are insufficient.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art. Wherein:
Fig. 1 is an overall flowchart of a low-carbon park carbon emission dynamic prediction method considering an emission training set according to a first embodiment of the present invention.
Fig. 2 is an overall framework diagram of a low-carbon park carbon emission dynamic prediction system considering an emission training set according to a second embodiment of the present invention.
Fig. 3 is a graph showing the accuracy performance of the two models for the low-carbon park carbon emission dynamic prediction method considering an emission training set according to a third embodiment of the present invention.
Fig. 4 is a diagram showing the predictions of the two models for the low-carbon park carbon emission dynamic prediction method considering an emission training set according to the third embodiment of the present invention.
Fig. 5 is a diagram showing the carbon emission prediction result for August 16 for the low-carbon park carbon emission dynamic prediction method considering an emission training set according to the third embodiment of the present invention.
Fig. 6 is a diagram showing the carbon emission prediction result for August 17 for the low-carbon park carbon emission dynamic prediction method considering an emission training set according to the third embodiment of the present invention.
Fig. 7 is a diagram showing the RMSE of the low-carbon park carbon emission dynamic prediction method considering an emission training set according to the third embodiment of the present invention.
Detailed Description
So that the manner in which the above recited objects, features and advantages of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to the embodiments, some of which are illustrated in the appended drawings. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways other than those described herein, and persons skilled in the art will readily appreciate that the present invention is not limited to the specific embodiments disclosed below.
Further, reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic can be included in at least one implementation of the invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
While the embodiments of the present invention have been illustrated and described in detail in the drawings, the cross-sectional view of the device structure is not to scale in the general sense for ease of illustration, and the drawings are merely exemplary and should not be construed as limiting the scope of the invention. In addition, the three-dimensional dimensions of length, width and depth should be included in actual fabrication.
Also in the description of the present invention, it should be noted that the orientation or positional relationship indicated by the terms "upper, lower, inner and outer", etc. are based on the orientation or positional relationship shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the apparatus or elements referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first, second, or third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
The terms "mounted, connected, and coupled" should be construed broadly in this disclosure unless otherwise specifically indicated and defined, such as: can be fixed connection, detachable connection or integral connection; it may also be a mechanical connection, an electrical connection, or a direct connection, or may be indirectly connected through an intermediate medium, or may be a communication between two elements. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.
Example 1
Referring to Fig. 1, an embodiment of the present invention provides a low-carbon park carbon emission dynamic prediction method considering an emission training set, comprising:
S1: defining the park carbon emission prediction calculation range.
S2: constructing an online prediction model based on the dynamic characteristics of the low-carbon park.
Further, constructing the online prediction model comprises data preprocessing, model construction, model training and model evaluation.
Still further, data preprocessing includes partial autocorrelation function analysis, sliding window processing, and normalization.
Partial autocorrelation function analysis comprises calculating the autocovariance of the data, in which M is the number of data points, p(i) is the data value at the i-th moment, m is an arbitrary time lag, i is the i-th moment, and ρ is the constructed partial autocorrelation function.
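As an illustration of the quantity involved, the sketch below computes the sample autocorrelation of a series at a given lag, that is, the lag-m autocovariance divided by the lag-0 autocovariance. This particular estimator is an assumption and may differ from the patent's own expression; the function name and the toy series are hypothetical. Partial autocorrelations are typically derived from these values, for example with the Durbin-Levinson recursion, which is not shown here.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer `lag`."""
    count = len(series)
    if not 0 <= lag < count:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / count
    # Lag-0 autocovariance (times count), used as the normalizing term.
    denom = sum((value - mean) ** 2 for value in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Lag-`lag` autocovariance (times count) over the overlapping part.
    numer = sum((series[i] - mean) * (series[i + lag] - mean)
                for i in range(count - lag))
    return numer / denom

emissions = [1.0, 2.0, 3.0, 4.0, 5.0]        # hypothetical readings
for lag in range(3):
    print(lag, autocorrelation(emissions, lag))
```

Lags with large values indicate how far back the history remains informative, which is one plausible way to guide the choice of the sliding window size.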
The sliding window processing comprises cutting the original data set with a sliding window algorithm and converting the data into a supervised form composed of features and labels. The size of the sliding window is set to N, that is, the (N+1)-th sample value is predicted from the previous N pieces of historical data. After sliding window processing, the number of data sets becomes M-N+1, in the format $\{(x_1, \dots, x_{N+1}), \dots, (x_{N+1}, \dots, x_{2N+1}), \dots\}$.
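The sliding window conversion can be sketched as follows. The function name and the sample values are hypothetical; each output pair holds N consecutive values as features and the next value as the label.

```python
def make_supervised(series, window):
    """Turn a 1-D series into (features, label) pairs with a sliding window."""
    if window < 1 or window >= len(series):
        raise ValueError("window must satisfy 1 <= window < len(series)")
    features, labels = [], []
    for start in range(len(series) - window):
        features.append(list(series[start:start + window]))  # N historical values
        labels.append(series[start + window])                # the (N+1)-th value
    return features, labels

emissions = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0]   # hypothetical hourly values
X, y = make_supervised(emissions, window=3)
for row, label in zip(X, y):
    print(row, "->", label)
```

With N features plus one label per pair, a series of M points yields M - N pairs in this sketch. That is one fewer than the M - N + 1 windows of length N, because the last window has no following value to serve as its label.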
Normalization comprises normalizing the data collected at the edge using the Min-Max normalization method.
The Min-Max normalization method is expressed as,
$$Z_{norm} = \frac{Z - Z_{min}}{Z_{max} - Z_{min}}$$
where $Z_{norm}$ is the normalized data set, $Z$ is the original data, $Z_{max}$ represents the maximum value in the data set, and $Z_{min}$ represents the minimum value in the data set.
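Min-Max normalization maps each value to the range [0, 1] using the minimum and maximum of the data set. In the minimal sketch below, the function names are hypothetical, and returning the minimum and maximum so that predictions can be mapped back to the original scale is an added convenience rather than something the text specifies.

```python
def min_max_normalize(values):
    """Scale values to [0, 1]; also return the min and max used for scaling."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: map a constant series to zeros to avoid division by zero.
        return [0.0 for _ in values], lo, hi
    return [(v - lo) / (hi - lo) for v in values], lo, hi

def denormalize(scaled, lo, hi):
    """Invert min_max_normalize for values scaled with the same lo and hi."""
    return [s * (hi - lo) + lo for s in scaled]

scaled, lo, hi = min_max_normalize([10.0, 15.0, 20.0])
print(scaled)
print(denormalize(scaled, lo, hi))
```

In practice the minimum and maximum would be taken from the training data only and reused for new data, so that the test period does not leak into the scaling.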
The model construction comprises that the pretreated carbon emission is sent into a neural network, the quantity of the previous state information is written into the current candidate set by a reset gate, if the reset gate is 0, the candidate set only keeps the input information of the current sequence, and the state information of the previous moment is controlled by an update gate to be brought into the current stateThe degree in the state gets the output hidden layer information of the current sequence, the smaller the reset gate, the smaller the information indicating the previous state is written, if r t 0 is then represented byOnly the input information of the current sequence is retained.
The reset gate is expressed as,
$$r_t = \sigma(W_r \cdot [h_{t-1}, x_t] + b_r)$$
wherein $h_{t-1}$ represents the input passed on from the previous moment, $x_t$ represents the input at time t, $\sigma(\cdot)$ represents the sigmoid function, and $W_r$ and $b_r$ respectively represent the weight matrix and the bias matrix of the operation.
The candidate set is expressed as,
$$\tilde{h}_t = \tanh(W \cdot [r_t \odot h_{t-1}, x_t] + b_t)$$
wherein $\tanh(\cdot)$ represents the hyperbolic tangent function, and $W$ and $b_t$ are the weight matrix and the bias matrix of the operation.
The update gate controls the extent to which the state information of the previous moment is brought into the current state; a larger update gate value means that more of the previous state information is carried over.
The update gate is expressed as,
$$z_t = \sigma(W_z \cdot [h_{t-1}, x_t] + b_z)$$
wherein $W_z$ and $b_z$ respectively represent the weight matrix and the bias matrix of the operation.
The output hidden layer information is expressed as,
$$h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t$$
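The reset gate, candidate set, update gate and hidden layer output described above correspond to one step of a gated recurrent unit (GRU). The following is a minimal plain-Python sketch of such a step, not the patent's implementation. It assumes the variant in which a larger update gate keeps more of the previous state, as the description of the update gate states; the names `gru_step`, `affine` and the `params` dictionary layout are illustrative, and `params["b"]` stands for the candidate-set bias.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(weights, bias, vector):
    """Compute W·v + b for a row-major weight matrix (list of rows)."""
    return [
        sum(w * v for w, v in zip(row, vector)) + b
        for row, b in zip(weights, bias)
    ]


def gru_step(h_prev, x_t, params):
    """One GRU step; h_prev and x_t are lists of floats."""
    concat = h_prev + x_t  # [h_{t-1}, x_t]
    r_t = [sigmoid(v) for v in affine(params["W_r"], params["b_r"], concat)]
    z_t = [sigmoid(v) for v in affine(params["W_z"], params["b_z"], concat)]
    # Candidate set: the reset gate scales the previous state before mixing.
    gated = [r * h for r, h in zip(r_t, h_prev)] + x_t
    h_cand = [math.tanh(v) for v in affine(params["W"], params["b"], gated)]
    # Assumed convention: larger z_t carries over more of the previous state.
    return [z * h + (1.0 - z) * c for z, h, c in zip(z_t, h_prev, h_cand)]
```

With a hidden size of H and an input size of D, each weight matrix has H rows and H + D columns.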
Model training involves evaluating network performance using the mean squared error (MSE) as the loss function. The MSE is calculated as,
$$MSE = \frac{1}{M}\sum_{i=1}^{M}(y_i - \hat{y}_i)^2$$
wherein $y_i$ is the i-th actual data, $\hat{y}_i$ is the i-th predicted data, and M represents the training data amount.
In the network training process, ReLU is selected as the activation function of the fully connected layer, and the Nadam algorithm is adopted to optimize the training of the network.
Model evaluation adopts the root mean square error (RMSE), the mean absolute error (MAE) and the training time as model performance evaluation indexes.
The root mean square error is expressed as,
$$RMSE = \sqrt{\frac{1}{M}\sum_{i=1}^{M}(f_i - \hat{f}_i)^2}$$
The mean absolute error is expressed as,
$$MAE = \frac{1}{M}\sum_{i=1}^{M}\left|f_i - \hat{f}_i\right|$$
wherein $f_i$ represents the i-th actual data, $\hat{f}_i$ represents the i-th predicted data, and M represents the total data amount. The greater the values of RMSE and MAE, the poorer the performance of the network. The training time mainly considers the pre-training time of the carbon emission prediction model, i.e. the time required for model training.
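The three error measures used for training and evaluation can be sketched in plain Python as follows; the function names are illustrative.

```python
import math


def _check(actual, predicted):
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")


def mse(actual, predicted):
    """Mean squared error, used as the training loss."""
    _check(actual, predicted)
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


def mae(actual, predicted):
    """Mean absolute error."""
    _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

RMSE penalizes large individual errors more strongly than MAE, which is why the two are usually reported together.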
S3: Construct a carbon emission prediction model based on the random forest park carbon emission training set.
Further, constructing the carbon emission prediction model based on the random forest park carbon emission training set comprises the following: the random forest generates its training sets by the Bagging method, classification and regression trees (CART) serve as unit classifiers that are combined into an integrated classifier, and the prediction result is the arithmetic mean of the outputs of all unit classifiers.
The Bagging method is a random sampling technique with replacement, i.e. sampling that can be repeated without limit; a random sampling algorithm is applied to the initial data set to obtain each random sample set. Through Bootstrap resampling, training samples $(d_1, d_2, \ldots, d_n)$ are randomly drawn with replacement from the original training sample set, yielding $n_{tree}$ training subsets for the $n_{tree}$ trees. When a training subset is generated, a portion of the samples is never drawn; the probability that a given sample is not drawn by the Bagging method is expressed as,
$$P = \left(1 - \frac{1}{n}\right)^{n}$$
where n is the sample size.
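A short sketch of Bootstrap resampling and of the probability that a given sample is never drawn, which is $(1 - 1/n)^n$ for n draws with replacement from n samples and tends to $1/e \approx 0.368$ as n grows. The function names are illustrative.

```python
import random


def prob_never_drawn(n):
    """Probability that one given sample is missed in n draws with replacement."""
    return (1.0 - 1.0 / n) ** n


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    n = len(data)
    return [data[rng.randrange(n)] for _ in range(n)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
subset = bootstrap_sample(list(range(100)), rng)
missed_fraction = 1.0 - len(set(subset)) / 100  # fraction left out of this subset
```

The samples left out of a subset are commonly called out-of-bag samples and can be used to check the tree trained on that subset.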
Furthermore, constructing the carbon emission prediction model based on the random forest park carbon emission training set further comprises the following: the CART decision tree algorithm divides each sample subset into two or more parts by recursive binary partitioning, so that each non-leaf node has two child nodes, and node splitting of the sample set follows the minimum Gini index principle.
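The Gini index is expressed as,
$$Gini(P) = \sum_{k=1}^{K} P_k(1 - P_k) = 1 - \sum_{k=1}^{K} P_k^2$$
where K is the total number of sample classes at the node and $P_k$ is the probability of the k-th class at the node.
The Gini index of sample set D is expressed as,
$$Gini(D) = 1 - \sum_{k=1}^{K}\left(\frac{|C_k|}{|D|}\right)^2$$
wherein $C_k$ is the subset of D belonging to the k-th class.
The Gini index of each node partition is expressed as,
$$Gini(D, A) = \frac{|D_1|}{|D|}Gini(D_1) + \frac{|D_2|}{|D|}Gini(D_2)$$
wherein $D_1$ and $D_2$ are the subsets into which set D is partitioned.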
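The minimum Gini index principle can be illustrated with a small sketch that scores every candidate threshold of one numeric feature and keeps the split with the lowest weighted Gini index. The function names are illustrative, and the single-feature search is a simplification of what a full CART implementation does at each node.

```python
from collections import Counter


def gini(labels):
    """Gini index of a set of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_split(left, right):
    """Weighted Gini index of a binary partition into `left` and `right`."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_threshold(values, labels):
    """Return (threshold, weighted Gini) of the best split `value <= threshold`."""
    best = None
    for t in sorted(set(values))[:-1]:  # the largest value would leave `right` empty
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        score = gini_split(left, right)
        if best is None or score < best[1]:
            best = (t, score)
    return best
```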
Let the random forest consist of a set of CART decision trees $\{h(X, \theta_k),\ k = 1, 2, \ldots, n\}$, whose margin function is expressed as,
$$K(X, Y) = a_k I(h(X, \theta_k) = Y) - \max_{j \neq Y} a_k I(h(X, \theta_k) = j)$$
wherein X is an input vector containing at most j classes, j is a class, Y is the correct classification vector, $I(\cdot)$ is the indicator function, and $a_k$ is the mean function.
The generalization error of the forest is expressed as,
$$P_e = P_{X,Y}(K(X, Y) < 0)$$
wherein X is an input vector, Y is a classification vector, and $P_{X,Y}$ is the classification error function of X.
The maximum value of the generalization error of the random forest is expressed as,
$$P_{e,max} = \frac{\bar{\rho}(1 - s^2)}{s^2}$$
wherein $\bar{\rho}$ is the correlation coefficient and s is the average strength.
The smaller the $P_{e,max}$ of the random forest, the better its generalization. The maximum value of the generalization error of the random forest is positively correlated with the average correlation coefficient between the individual decision trees and negatively correlated with their average strength; this rule is an important guide for improving the prediction accuracy of the random forest.
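The relationship between correlation, strength and the maximum generalization error can be checked numerically with a small sketch. It assumes the usual random forest bound, mean correlation times $(1 - s^2)/s^2$; the function name and the sample values are illustrative only.

```python
def max_generalization_error(mean_correlation, strength):
    """Upper bound on the generalization error: rho * (1 - s^2) / s^2."""
    if strength == 0:
        raise ValueError("strength must be non-zero")
    return mean_correlation * (1.0 - strength ** 2) / strength ** 2


baseline = max_generalization_error(0.2, 0.5)
more_correlated = max_generalization_error(0.4, 0.5)   # trees more alike
stronger_trees = max_generalization_error(0.2, 0.8)    # individual trees stronger
```

Raising the correlation between trees raises the bound, while raising the strength of the individual trees lowers it.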
The specific steps of constructing the carbon emission prediction model include normalizing historical simulation data, the historical simulation data including historical carbon emissions and model input variables.
Model parameters are initialized.
A. Select n samples by random sampling with replacement using the Bootstrap method.
B. Randomly select m of the M features of a single tree as candidate features for node splitting, and search for the feature with the best splitting ability.
C. To ensure that the model captures the dynamic carbon emission characteristics as fully as possible, the model is not simplified: each decision tree is split down to its leaf nodes without pruning.
D. A random forest is formed from $n_{tree}$ decision trees, and the prediction result of the random forest model is the average of the predicted values of all trees.
E. Predict again using the measured data of the previous three days, and use the resulting prediction as a verification comparison against the simulation.
F. As the building is operated and used, the amount of measured data increases; newly obtained measured data replace the earliest data in the original historical simulation training set, so that the training set is dynamic.
G. Dynamically update the data of the random forest model according to steps A-F, and output the resulting prediction as the final result.
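Steps D and F can be sketched as follows. This is an illustration under stated assumptions rather than the patent's implementation: the class `DynamicTrainingSet`, the function `forest_predict`, and the choice of one record per day are all hypothetical, and each tree is represented simply as a callable that maps features to a predicted value.

```python
from collections import deque


class DynamicTrainingSet:
    """Fixed-size training set in which new measured days push out the oldest days."""

    def __init__(self, simulated_days):
        # maxlen keeps the set size constant, as in step F.
        self._days = deque(simulated_days, maxlen=len(simulated_days))

    def add_measured_day(self, day_record):
        # Appending to a full deque silently drops the oldest (leftmost) record.
        self._days.append(day_record)

    def records(self):
        return list(self._days)


def forest_predict(trees, features):
    """Step D: average the predictions of all trees."""
    return sum(tree(features) for tree in trees) / len(trees)


training = DynamicTrainingSet(["sim-1", "sim-2", "sim-3"])
training.add_measured_day("measured-1")
# training.records() == ["sim-2", "sim-3", "measured-1"]
```

After each update the forest would be retrained on `training.records()` (steps A to D) before the next day is predicted.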
Example 2
Referring to fig. 2, for one embodiment of the present invention, a system for a low carbon campus carbon emission dynamic prediction method that considers an emission training set is provided, the low carbon campus carbon emission dynamic prediction system that considers an emission training set including a data collection and processing module, a model selection and training module, and a prediction and assessment module.
The data collection and processing module is responsible for collecting relevant data of the low-carbon park; the model selection and training module is used for selecting a proper prediction model to dynamically predict the carbon emission; the prediction and assessment module predicts future carbon emissions using a trained model.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or various other media capable of storing program code.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer readable medium may even be paper or other suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques, which are well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
Example 3
In this embodiment, to verify the beneficial effects of the present invention, scientific demonstration is carried out through economic benefit calculation and simulation experiments, comparing the conventional method with the method of the present invention.
A low-carbon park is selected for application verification. Taking the prediction of the carbon emission of an office building in the park as an example, a short-term building carbon emission prediction method based on a dynamic training set combining simulated and measured data is studied in order to predict the hourly carbon emission of the coming day. The building area is 1210 m², the building meets the 65% building energy-saving standard, and the daily cooling period is 8:00-17:00.
The Transient System Simulation Program (TRNSYS) is regarded as the most powerful and most flexible transient system simulation software in the current energy field, and can simulate the carbon emission of various buildings in real time and with high precision. With TRNSYS as the simulation platform and the building as the object, a building model is established and the carbon emission over the whole cooling season is dynamically simulated; the simulation result is used as the initial database for the initial prediction of the building's carbon emission.
To verify the prediction accuracy of the random forest algorithm in the aforementioned building data, weather factors, temperature, humidity and solar radiation are added successively, and the prediction accuracy of the random forest and a Support Vector Machine (SVM) is compared and analyzed. The invention analyzes the predicted data comprehensively and strictly. The accuracy of both algorithms is shown in figure 3 when the training samples are relatively small.
As can be seen from Fig. 3, when the training sample is small, the mean total prediction error of the random forest (RFR) is about 2.65, while that of the support vector machine is about 3.37. Numerically, the RFR prediction is therefore more accurate than the SVM prediction, and the monthly RFR error curve lies almost entirely below the SVM curve. The random forest algorithm thus performs better in building carbon emission prediction and is selected as the basic algorithm of the carbon emission prediction model.
Short-term carbon emission prediction algorithms currently used for low-carbon park buildings can be roughly divided into two types. The first, because of factors such as limited actual storage space, generally uses the data of the 3 days before the prediction day as the training set; the second performs carbon emission prediction based entirely on a simulated training set. As an example, the prediction accuracy of the models based on the simulation training set and on the measured training set is compared using the random forest algorithm.
The prediction results of the two models are shown in Fig. 4. The figure shows that neither model predicts accurately, but the overall trend of both agrees with the actual values, which indicates that the simulated training set captures the distribution pattern of the building's carbon emission. The root mean square error (RMSE) of the two predictions is large, 13.31 and 17.46 respectively, so the prediction accuracy needs further optimization. In summary, when carbon emission is predicted for a building in a low-carbon park with insufficient historical data, the accuracy based on the simulation training set is mediocre, and the accuracy based on the measured training set of the previous 3 days, which is commonly used in existing research, is poor. The invention therefore combines the simulation training set with the measured training set and proposes the concept of a dynamic training set.
The invention applies a dynamic training set to the carbon emission of office buildings in the park. Because the carbon emission depends on parameter variables such as temperature, the carbon emission value, time, and external factor data such as outdoor temperature, relative humidity and solar radiation are used as input variables. In addition, building characteristics such as the heat transfer coefficients of the roof and walls must be given. Error points where the carbon emission changes abruptly and unreasonably are replaced by the average of the preceding and following points, and the processed training set is denoted training set A1.
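The replacement of abrupt error points by the average of their neighbours can be sketched as follows. The detection rule is an assumption of this sketch, since the text does not specify one: a point is treated as an error point when it differs from both neighbours by more than a hypothetical `threshold`. The name `repair_spikes` is likewise illustrative.

```python
def repair_spikes(values, threshold):
    """Replace isolated spikes with the mean of the preceding and following points."""
    repaired = list(values)
    for i in range(1, len(values) - 1):
        prev_v, next_v = values[i - 1], values[i + 1]
        # Assumed rule: a spike deviates strongly from BOTH of its neighbours.
        if abs(values[i] - prev_v) > threshold and abs(values[i] - next_v) > threshold:
            repaired[i] = (prev_v + next_v) / 2
    return repaired


cleaned = repair_spikes([10.0, 11.0, 50.0, 12.0, 13.0], threshold=20.0)
# cleaned == [10.0, 11.0, 11.5, 12.0, 13.0]
```

Neighbours are read from the original series, so one repaired point does not influence the detection of the next.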
The prediction results of the models are measured by adopting a Root Mean Square Error (RMSE) in the carbon emission prediction process, and the smaller the root mean square error is, the more accurate the prediction results are.
The main parameters of the low-carbon park building prediction model based on the dynamic training set are set as follows: the number of decision trees in the random forest is 1000, and the number of split features is 11. The simulated carbon emission data of the building from June 15 to August 15 of the cooling season are selected as the initial training set; the dynamic training set is then updated and predictions are made day by day according to the dynamic training set procedure, so as to predict the carbon emission distribution for August 16 and 17. The prediction results are shown in Fig. 5.
The above process reproduces how the model is continuously updated and tuned after the building starts operating, beginning from simulated data, when historical data are insufficient during the initial and development stages of the building and the park. Over time, the dynamic training set contains more and more measured data, the predictions fit the measured data increasingly well, and the characteristics and carbon emission patterns of the low-carbon park building are revealed more clearly. The RMSE of the predictions over several days is collated in Fig. 6.
The RMSE of the first prediction, for the carbon emission on August 16, was 11.59; the RMSE of the last prediction, for August 28, was 7.84, which is 32.4% lower than the first and 53.9% lower than the comparative example in which the measured values of the previous 3 days were used as the training set. Dynamic training set processing on the basis of the simulated carbon emission data set of the low-carbon park building can therefore effectively improve the accuracy of random forest carbon emission prediction for office buildings. In addition, the prediction error falls fastest in the first few days, most visibly from the 16th to the 17th. This indicates that, owing to the actual utilization rate of the building and actual conditions in the low-carbon park, the actual carbon emission differs considerably from the simulated situation; once measured data replace part of the training set, this difference is effectively reduced and remains low, and the subsequent prediction process is stable.
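The relative reduction between the first and last prediction can be reproduced with a one-line calculation; the function name `relative_reduction` is illustrative.

```python
def relative_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100.0


first_rmse, last_rmse = 11.59, 7.84  # August 16 and August 28 predictions
reduction = relative_reduction(first_rmse, last_rmse)
print(f"{reduction:.1f}%")  # prints "32.4%"
```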
It should be noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that the technical solution of the present invention may be modified or substituted without departing from the spirit and scope of the technical solution of the present invention, which is intended to be covered in the scope of the claims of the present invention.

Claims (10)

1. The method for dynamically predicting the carbon emission of the low-carbon park in consideration of the emission training set is characterized by comprising the following steps of:
defining a park carbon emission prediction calculation range;
constructing an online prediction model based on dynamic characteristics of a low-carbon park;
and constructing a carbon emission prediction model based on the random forest park carbon emission training set.
2. The low carbon park carbon emission dynamic prediction method considering emission training set as claimed in claim 1, wherein: the construction of the online prediction model comprises data preprocessing, model construction, model training and model evaluation.
3. The low carbon park carbon emission dynamic prediction method considering emission training set as claimed in claim 2, wherein: the data preprocessing comprises partial autocorrelation function analysis, sliding window processing and standardization;
the partial autocorrelation function analysis includes calculating an autocovariance of the data;
the auto-covariance of the data is expressed as,
wherein M is the number of data points, p(i) is the data value at the i-th moment, m represents an arbitrary time lag, i represents the i-th moment, and ρ represents the constructed partial autocorrelation function;
the sliding window processing comprises cutting the original data set with a sliding window algorithm and converting the data into a supervised form composed of features and labels; the size of the sliding window is set to N, i.e. the (N+1)-th sample value is predicted from the previous N pieces of historical data, and after sliding window processing the number of data sets becomes M-N+1, in the format $\{(x_1, \ldots, x_{N+1}), \ldots, (x_{N+1}, \ldots, x_{2N+1}), \ldots\}$;
the normalization includes normalizing the data collected at the edge using the Min-Max Normalization method;
the Min-Max Normalization method is expressed as,
$$Z_{norm} = \frac{Z - Z_{min}}{Z_{max} - Z_{min}}$$
wherein $Z_{norm}$ is the normalized data set, $Z_{max}$ represents the maximum value in the dataset, and $Z_{min}$ represents the minimum value in the dataset.
4. The low carbon park carbon emission dynamic prediction method considering emission training set as claimed in claim 3, wherein: the model construction comprises feeding the preprocessed carbon emission data into a neural network, wherein a reset gate controls how much of the previous state information is written into the current candidate set, a reset gate of 0 indicating that the candidate set retains only the input information of the current sequence, and an update gate controls the degree to which the state information of the previous moment is brought into the current state, so as to obtain the output hidden layer information of the current sequence;
the reset gate is expressed as,
$$r_t = \sigma(W_r \cdot [h_{t-1}, x_t] + b_r)$$
wherein $h_{t-1}$ represents the input passed on from the previous moment, $x_t$ represents the input at time t, $\sigma(\cdot)$ represents the sigmoid function, and $W_r$ and $b_r$ respectively represent the weight matrix and the bias matrix of the operation;
the candidate set is expressed as,
$$\tilde{h}_t = \tanh(W \cdot [r_t \odot h_{t-1}, x_t] + b_t)$$
wherein $\tanh(\cdot)$ represents the hyperbolic tangent function, and $W$ and $b_t$ are the weight matrix and the bias matrix of the operation;
the update gate is expressed as,
$$z_t = \sigma(W_z \cdot [h_{t-1}, x_t] + b_z)$$
wherein $W_z$ and $b_z$ respectively represent the weight matrix and the bias matrix of the operation;
the output hidden layer information is expressed as,
$$h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t$$
the model training includes evaluating network performance using the mean squared error (MSE) as the loss function, the MSE being calculated as,
$$MSE = \frac{1}{M}\sum_{i=1}^{M}(y_i - \hat{y}_i)^2$$
wherein $y_i$ is the i-th actual data, $\hat{y}_i$ is the i-th predicted data, and M represents the training data amount;
in the network training process, ReLU is selected as the activation function of the fully connected layer, and the Nadam algorithm is adopted to optimize the training of the network;
the model evaluation adopts the root mean square error, the mean absolute error and the training time as model performance evaluation indexes;
the root mean square error is expressed as,
$$RMSE = \sqrt{\frac{1}{M}\sum_{i=1}^{M}(f_i - \hat{f}_i)^2}$$
the mean absolute error is expressed as,
$$MAE = \frac{1}{M}\sum_{i=1}^{M}\left|f_i - \hat{f}_i\right|$$
wherein $f_i$ represents the i-th actual data, $\hat{f}_i$ represents the i-th predicted data, and M represents the total data amount.
5. The low carbon park carbon emission dynamic prediction method considering emission training set as claimed in claim 4, wherein: the random forest generates its training sets by the Bagging method, classification and regression trees (CART) serve as unit classifiers that are combined into an integrated classifier, and the prediction result is the arithmetic mean of the outputs of all unit classifiers;
the probability that a given sample is not drawn by the Bagging method is expressed as,
$$P = \left(1 - \frac{1}{n}\right)^{n}$$
where n is the sample size.
6. The low carbon park carbon emission dynamic prediction method considering emission training set as claimed in claim 5, wherein: the carbon emission prediction model is constructed by adopting a CART decision tree algorithm, and the generalization error of the random forest is obtained through the CART decision tree algorithm; the CART decision tree algorithm divides each sample subset into two or more parts by recursive binary partitioning, so that each non-leaf node has two child nodes, and node splitting of the sample set follows the minimum Gini index principle;
the Gini index is expressed as,
$$Gini(P) = \sum_{k=1}^{K} P_k(1 - P_k) = 1 - \sum_{k=1}^{K} P_k^2$$
where K is the total number of sample classes at the node and $P_k$ is the probability of the k-th class at the node;
the Gini index of sample set D is expressed as,
$$Gini(D) = 1 - \sum_{k=1}^{K}\left(\frac{|C_k|}{|D|}\right)^2$$
wherein $C_k$ is the subset of D belonging to the k-th class;
the Gini index of each node partition is expressed as,
$$Gini(D, A) = \frac{|D_1|}{|D|}Gini(D_1) + \frac{|D_2|}{|D|}Gini(D_2)$$
wherein $D_1$ and $D_2$ are the subsets into which set D is partitioned;
the generalization error of the forest is expressed as,
$$P_e = P_{X,Y}(K(X, Y) < 0)$$
wherein X is an input vector, Y is a classification vector, and $P_{X,Y}$ is the classification error function of X;
the maximum value of the generalization error of the random forest is expressed as,
$$P_{e,max} = \frac{\bar{\rho}(1 - s^2)}{s^2}$$
wherein $\bar{\rho}$ is the correlation coefficient and s is the average strength.
7. The low carbon park carbon emission dynamic prediction method considering emission training set as claimed in claim 6, wherein: the specific steps of constructing the carbon emission prediction model comprise normalizing historical simulation data, wherein the historical simulation data comprises historical carbon emission and model input variables;
initializing model parameters;
selecting n samples by random sampling with replacement using the Bootstrap method;
randomly selecting m of the M features of a single tree as candidate features for node splitting;
forming a random forest from $n_{tree}$ decision trees, the prediction result of the random forest model being the average of the predicted values of all the trees;
predicting again using measured data of a specific time, and using the resulting prediction as a verification comparison against the simulation.
8. A system employing the low carbon park carbon emission dynamic prediction method considering emission training set as set forth in any one of claims 1-7, characterized by: the system comprises a data collection and processing module, a model selection and training module and a prediction and evaluation module;
the data collection and processing module is responsible for collecting relevant data of the low-carbon park;
the model selection and training module is used for selecting a proper prediction model to dynamically predict the carbon emission;
the prediction and assessment module predicts future carbon emissions using a trained model.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the low carbon park carbon emission dynamic prediction method taking into account emission training sets of any of claims 1 to 7.
10. A computer readable storage medium having stored thereon a computer program, which when executed by a processor, implements the steps of the low carbon park carbon emission dynamic prediction method taking into account emission training sets of any of claims 1 to 7.
CN202311080887.5A 2023-08-25 2023-08-25 Low-carbon park carbon emission dynamic prediction method and system considering emission training set Pending CN117390550A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311080887.5A CN117390550A (en) 2023-08-25 2023-08-25 Low-carbon park carbon emission dynamic prediction method and system considering emission training set

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311080887.5A CN117390550A (en) 2023-08-25 2023-08-25 Low-carbon park carbon emission dynamic prediction method and system considering emission training set

Publications (1)

Publication Number Publication Date
CN117390550A true CN117390550A (en) 2024-01-12

Family

ID=89461988

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311080887.5A Pending CN117390550A (en) 2023-08-25 2023-08-25 Low-carbon park carbon emission dynamic prediction method and system considering emission training set

Country Status (1)

Country Link
CN (1) CN117390550A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117892638A (en) * 2024-03-14 2024-04-16 河海大学 Drought formation time prediction method and system using conditional probability function
CN117892638B (en) * 2024-03-14 2024-05-17 河海大学 Drought formation time prediction method and system using conditional probability function

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination