CN117172347A - Carbon emission prediction method based on energy big data

Publication number
CN117172347A
Authority
CN
China
Prior art keywords
carbon emission
prediction
carbon
energy
influencing factors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310814459.4A
Other languages
Chinese (zh)
Inventor
马瑞
朱东歌
韩红卫
沙江波
刘佳
康文妮
丁茂生
夏绪卫
张爽
李兴华
闫振华
柴育峰
郭飞
吴旻荣
王峰
李晓龙
高博
张庆平
王亮
苏望
万鹏
蔡冰
段文齐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electric Power Research Institute of State Grid Ningxia Electric Power Co Ltd
Original Assignee
Electric Power Research Institute of State Grid Ningxia Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electric Power Research Institute of State Grid Ningxia Electric Power Co Ltd
Priority to CN202310814459.4A
Publication of CN117172347A

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a carbon emission prediction method based on energy big data, in the technical field of carbon emission prediction. The method comprises: S1, integrating data on carbon emission influencing factors; S2, constructing a carbon emission prediction model; S3, training the prediction model and simulating impacts; S4, analyzing the impacts on the carbon emission prediction; and S5, predicting carbon emissions and correcting the result. The method studies the relationship between the preliminary carbon emission prediction and the impact prediction, establishes a mathematical equation relating the two, computes a preliminary prediction, and then corrects that prediction using the equation. It thus not only builds a carbon emission prediction model but also simulates the impact of each influencing factor on the model under different conditions. This keeps differences in influencing factors across fields and equipment from interfering with the prediction, and improves the accuracy and reliability of the result.

Description

A carbon emission prediction method based on energy big data

Technical field

The invention relates to the technical field of carbon emission prediction, and specifically to a carbon emission prediction method based on energy big data.

Background

Energy big data refers to the technologies and applications for comprehensively collecting, processing, analyzing, and applying energy data (electricity, gas, oil, and so on) together with data from related fields such as the economy, population, geography, and climate. As 5G, AI, and Internet of Things technologies mature, and building on the mobile Internet and enterprise informatization, technologies such as big data and blockchain make it possible to collect, analyze, and fuse structured and unstructured information, so that massive data can be collected in a timely, accurate, trustworthy, usable, and traceable way. Energy big data is not only a deep integration of big data concepts with energy production, management, consumption, and the related technological revolution; it can also point to new directions for the development of the energy industry and for business model innovation.

For example, the patent document with application number 202111516267.2 discloses a method for predicting carbon emissions in the energy industry based on electric power data. It constructs a conversion relationship from electric power data to energy consumption data to carbon emissions, and uses the power data of energy-consuming industries to improve carbon emission management. It also predicts and gives early warning of carbon emissions for key energy-consuming enterprises, with the aim of data-driven, multi-industry panoramic forecasting of energy consumption and carbon emissions, while helping those enterprises tap their emission reduction potential and save energy.

However, carbon emission prediction methods like the one in that document still have the following shortcoming:

Existing methods generally build a carbon emission prediction model and compute the prediction by substituting the relevant parameters and variables, so the accuracy and reliability of the result depend directly on the model. Because the factors that influence carbon emissions differ between fields, and even between pieces of equipment, the model is easily affected by these differences, which interferes with the prediction and degrades its accuracy and reliability.

This shortcoming urgently needs to be addressed. The present invention therefore studies and improves on the existing approach and its deficiencies, and provides a carbon emission prediction method based on energy big data.

Summary of the invention

The purpose of the present invention is to provide a carbon emission prediction method based on energy big data that solves the problem raised in the background above.

To achieve this purpose, the present invention provides the following technical solution: a carbon emission prediction method based on energy big data, comprising the following steps:

S1. Data integration of carbon emission influencing factors:

Analyze and determine the factors that affect carbon emissions in the energy industry; connect to the energy information database and retrieve the data on those factors; build an energy information data set; and divide the data set into multiple subsets, placing the data for each category of influencing factor in its own information subset.

S2. Construction of the carbon emission prediction model:

Clean and correct the data in each subset; use the random forest algorithm to assess the importance of each influencing factor; rank the factors by importance; select the main influencing factors; generate a feature matrix; and build the carbon emission prediction model.

S3. Prediction model training and impact simulation:

Train the carbon emission prediction model on the feature matrix in order to validate it; after training, make a preliminary carbon emission prediction from the input values of the influencing factors; then adjust the parameters of the influencing factors to simulate impacts under different scenarios, and predict carbon emissions for each scenario.

S4. Impact analysis of the carbon emission prediction:

Study the relationship between the preliminary prediction and the impact prediction; analyze, using the energy data, how the influencing factors are impacted and how they change; determine the internal correlation between the two predictions through cluster analysis and association rule analysis; draw a causal diagram; and establish a mathematical equation relating them.

S5. Carbon emission prediction and result correction:

Retrieve energy data from the energy information database and substitute it into the carbon emission prediction model to obtain a preliminary prediction; then correct the preliminary prediction using the mathematical equation relationship, removing the interference that impacts on the influencing factors introduce, to obtain the final carbon emission prediction.

Further, in step S1 the energy information data set is {Edb_Co2} = {ΔCo2_1} + {ΔCo2_2} + {ΔCo2_3} + … + {ΔCo2_n}, where Co2_1, Co2_2, Co2_3, …, Co2_n denote the different carbon emission influencing factors, and {ΔCo2_1}, {ΔCo2_2}, {ΔCo2_3}, …, {ΔCo2_n} denote the information subsets corresponding to those factors.

Further, in step S1 the data within any one information subset are mutually independent and do not interfere with one another, and the data in different information subsets neither overlap nor interact.
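
The partition described above can be sketched in Python. This is a minimal illustration, assuming each record is identified by a string key; the function name `build_energy_dataset`, the factor labels, and the sample records are hypothetical and do not come from the patent. The sketch forms the data set as the union of the per-factor subsets and rejects any record that appears in more than one subset, which is one way to enforce the no-overlap condition.

```python
from typing import Dict, Set


def build_energy_dataset(subsets: Dict[str, Set[str]]) -> Set[str]:
    """Union the per-factor subsets after checking they are pairwise disjoint."""
    seen: Set[str] = set()
    for factor, records in subsets.items():
        overlap = seen & records
        if overlap:
            # A record shared by two subsets violates the no-overlap condition.
            raise ValueError(
                f"subset {factor!r} overlaps earlier subsets: {sorted(overlap)}"
            )
        seen |= records
    return seen


# Hypothetical subsets, one per influencing factor (Co2_1, Co2_2, Co2_3).
subsets = {
    "Co2_1": {"coal_2021", "coal_2022"},
    "Co2_2": {"gas_2021", "gas_2022"},
    "Co2_3": {"oil_2021"},
}
edb_co2 = build_energy_dataset(subsets)
```

Using sets makes the overlap check a single intersection per subset; a real database-backed implementation would more likely enforce the same condition with a uniqueness constraint.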

Further, in step S2 the random forest algorithm combines multiple decision trees; for each tree the data subset is drawn at random with replacement, and a random subset of the features is selected as input.

Further, in step S2 the random forest algorithm is a Bagging algorithm that uses decision trees as its estimators. The final result is aggregated by a combiner: for classification the combiner takes the majority class as the final result, and for regression it takes the mean of the individual regression results.
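
The sampling and combining steps of Bagging can be sketched in plain Python. This is an illustration only, not the patent's implementation: it omits the decision trees themselves, and the function names, sample rows, and feature names are hypothetical. It shows drawing a bootstrap sample with replacement, picking a random feature subset for one tree, and the two combiner rules (majority vote and mean).

```python
import random
from collections import Counter
from statistics import mean


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows at random WITH replacement (a bootstrap sample)."""
    return rng.choices(rows, k=len(rows))


def random_feature_subset(features, m, rng):
    """Pick m distinct features at random as the input for one tree."""
    return rng.sample(features, m)


def combine_classification(votes):
    """Majority vote; on a tie, the label encountered first wins."""
    return Counter(votes).most_common(1)[0][0]


def combine_regression(outputs):
    """Mean of the individual trees' regression outputs."""
    return mean(outputs)


rng = random.Random(0)  # seeded only so the sketch is repeatable
rows = [("r1", 10.0), ("r2", 12.5), ("r3", 9.0), ("r4", 11.0)]
sample = bootstrap_sample(rows, rng)
features = random_feature_subset(["coal", "gas", "oil", "gdp"], 2, rng)
label = combine_classification(["high", "low", "high"])
value = combine_regression([10.0, 12.0, 14.0])
```

Because sampling is with replacement, `sample` has the same length as `rows` but may repeat some rows and omit others, which is what makes the individual trees differ.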

Further, in step S2 the influencing factors are ranked by importance in descending order: the higher a factor ranks, the more important it is, and the lower it ranks, the less important it is.

Further, in step S2 the main influencing factors are selected on the basis of this ranking: they are the factors that rank near the top and have high importance.
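
Ranking and selection can be sketched as follows. The patent does not say how many factors count as "main", so the `top_k` cut-off here is an assumption, as are the function name and the importance scores, which are made-up values standing in for the output of the random forest step.

```python
def select_main_factors(importances, top_k):
    """Rank factors by importance, largest first, and keep the top_k names."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _score in ranked[:top_k]]


# Hypothetical importance scores for five candidate factors.
importances = {
    "population": 0.07,
    "coal": 0.52,
    "oil": 0.17,
    "temperature": 0.03,
    "gas": 0.21,
}
main_factors = select_main_factors(importances, top_k=3)
```

An alternative selection rule with the same ranking would keep every factor whose score exceeds a threshold; either rule satisfies "ranked near the top and of high importance".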

Further, the carbon emission prediction formula in step S3 is as follows:

where C is the predicted carbon emission, i = 1, 2, 3, …, n, k is the importance parameter, P_i is the carbon emission coefficient of the i-th main influencing factor, and D_i is the consumption of the i-th main influencing factor.
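
The formula itself does not appear in this text. Purely as an illustration, the sketch below assumes the form C = k · Σ P_i · D_i, with a single scalar k multiplying the sum over the n main factors; the patent's actual formula may differ, for example in where k enters. The function names, the numeric values, and the modelling of an impact as scaling one factor's consumption are all assumptions.

```python
def predict_emissions(k, coefficients, consumption):
    """ASSUMED form: C = k * sum(P_i * D_i) over the main influencing factors."""
    if len(coefficients) != len(consumption):
        raise ValueError("need one coefficient per consumption value")
    return k * sum(p * d for p, d in zip(coefficients, consumption))


def apply_impact(consumption, index, factor):
    """Simulate an impact by scaling the consumption of one factor (assumption)."""
    changed = list(consumption)
    changed[index] *= factor
    return changed


# Hypothetical emission coefficients P_i and consumption values D_i.
P = [2.0, 3.0, 0.5]
D = [100.0, 40.0, 10.0]

preliminary = predict_emissions(1.0, P, D)                     # 200 + 120 + 5
impacted = predict_emissions(1.0, P, apply_impact(D, 1, 1.5))  # 200 + 180 + 5
```

Comparing `preliminary` with `impacted` across many simulated scenarios yields the paired data that step S4 analyzes.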

Further, in step S4 the cluster analysis groups the preliminary-prediction and impact-prediction data into classes, or clusters, and the grouping is then analyzed; the classes are not known in advance.

Further, in step S4 the association rule analysis searches the preliminary-prediction and impact-prediction data for frequent patterns, associations, correlations, or causal structures between the two.
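
The patent does not name a clustering algorithm. As one possible instance, the sketch below runs a one-dimensional k-means on the deviation between impact and preliminary predictions; the choice of k-means, the use of the deviation as the clustered quantity, the function name, and the sample values are all assumptions.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Plain 1-D k-means; initial centroids are spread over the sorted data."""
    if k < 2 or k > len(values):
        raise ValueError("need 2 <= k <= len(values)")
    ordered = sorted(values)
    centroids = [ordered[(len(ordered) - 1) * j // (k - 1)] for j in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        updated = [
            sum(c) / len(c) if c else centroids[j] for j, c in enumerate(clusters)
        ]
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return centroids, clusters


# Hypothetical deviations (impact prediction minus preliminary prediction).
deviations = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centroids, clusters = kmeans_1d(deviations, k=2)
```

With these values the scenarios separate into a small-deviation group and a large-deviation group without either group having been defined in advance.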
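
Association rule analysis is usually expressed through the support and confidence of a rule. The sketch below computes both over a handful of simulated scenarios; the discretized labels such as `coal_up` and `emissions_higher`, the function names, and the data are hypothetical, and the patent does not specify which mining algorithm or thresholds it uses.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) as a ratio of supports."""
    base = support(transactions, antecedent)
    if base == 0:
        raise ValueError("antecedent never occurs")
    return support(transactions, set(antecedent) | set(consequent)) / base


# Each transaction describes one simulated scenario with discretized labels.
transactions = [
    {"coal_up", "emissions_higher"},
    {"coal_up", "emissions_higher"},
    {"coal_up", "emissions_unchanged"},
    {"gas_up", "emissions_unchanged"},
]
rule_support = support(transactions, {"coal_up", "emissions_higher"})
rule_confidence = confidence(transactions, {"coal_up"}, {"emissions_higher"})
```

Here the rule "coal_up implies emissions_higher" holds in two of the four scenarios and in two of the three scenarios where coal consumption rises. Note that a high confidence indicates association, not causation.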

The present invention provides a carbon emission prediction method based on energy big data, with the following beneficial effects. The invention determines by analysis the factors that affect carbon emissions in the energy industry and retrieves data from the energy information database to build an energy information data set with one information subset per category of influencing factor. It uses the random forest algorithm to assess the importance of each factor, ranks the factors, selects the main ones, generates a feature matrix, and builds and trains the carbon emission prediction model. After training, it makes a preliminary prediction, then adjusts the parameters of the influencing factors to simulate impacts and predicts again. It studies the relationship between the preliminary prediction and the impact prediction, analyzes how the influencing factors are impacted and how they change, and, after cluster analysis and association rule analysis, establishes a mathematical equation relationship. Energy data are then substituted into the model to obtain a preliminary prediction, which is corrected with that relationship to give the final prediction. The improved method therefore not only builds a carbon emission prediction model but also simulates the impact of each influencing factor on the model under different conditions. This keeps differences in influencing factors across fields and equipment from interfering with the prediction, and improves the accuracy and reliability of the result.

Brief description of the drawings

Figure 1 is a schematic flow diagram of the overall operation of the carbon emission prediction method based on energy big data according to the present invention;

Figure 2 is a schematic flow diagram of step S1 of the method;

Figure 3 is a schematic flow diagram of step S2 of the method;

Figure 4 is a schematic flow diagram of step S3 of the method;

Figure 5 is a schematic flow diagram of step S4 of the method;

Figure 6 is a schematic flow diagram of step S5 of the method.

Detailed description

Embodiments of the present invention are described in further detail below with reference to the drawings and examples. The following examples illustrate the invention and are not intended to limit its scope.

As shown in Figures 1 to 6, a carbon emission prediction method based on energy big data comprises the following steps:

S1. Data integration of carbon emission influencing factors:

Analyze and determine the factors that affect carbon emissions in the energy industry; connect to the energy information database and retrieve the data on those factors; build an energy information data set; and divide the data set into multiple subsets, placing the data for each category of influencing factor in its own information subset. The data within any one information subset are mutually independent and do not interfere with one another, and the data in different information subsets neither overlap nor interact.

The energy information data set is {Edb_Co2} = {ΔCo2_1} + {ΔCo2_2} + {ΔCo2_3} + … + {ΔCo2_n}, where Co2_1, Co2_2, Co2_3, …, Co2_n denote the different carbon emission influencing factors, and {ΔCo2_1}, {ΔCo2_2}, {ΔCo2_3}, …, {ΔCo2_n} denote the information subsets corresponding to those factors.

S2. Construction of the carbon emission prediction model:

Clean and correct the data in each subset; use the random forest algorithm to assess the importance of each influencing factor; rank the factors by importance; select the main influencing factors; generate a feature matrix; and build the carbon emission prediction model.

The random forest algorithm combines multiple decision trees; for each tree the data subset is drawn at random with replacement, and a random subset of the features is selected as input. It is a Bagging algorithm that uses decision trees as its estimators. The final result is aggregated by a combiner: for classification the combiner takes the majority class as the final result, and for regression it takes the mean of the individual regression results.

The influencing factors are ranked by importance in descending order: the higher a factor ranks, the more important it is, and the lower it ranks, the less important it is. The main influencing factors are selected on the basis of this ranking: they are the factors that rank near the top and have high importance.

S3. Prediction model training and impact simulation:

Train the carbon emission prediction model on the feature matrix in order to validate it; after training, make a preliminary carbon emission prediction from the input values of the influencing factors; then adjust the parameters of the influencing factors to simulate impacts under different scenarios, and predict carbon emissions for each scenario.

The carbon emission prediction formula is as follows:

where C is the predicted carbon emission, i = 1, 2, 3, …, n, k is the importance parameter, P_i is the carbon emission coefficient of the i-th main influencing factor, and D_i is the consumption of the i-th main influencing factor.

S4. Impact analysis of the carbon emission prediction:

Study the relationship between the preliminary prediction and the impact prediction; analyze, using the energy data, how the influencing factors are impacted and how they change; determine the internal correlation between the two predictions through cluster analysis and association rule analysis; draw a causal diagram; and establish a mathematical equation relating them.

The cluster analysis groups the preliminary-prediction and impact-prediction data into classes, or clusters, and the grouping is then analyzed; the classes are not known in advance. The association rule analysis searches the preliminary-prediction and impact-prediction data for frequent patterns, associations, correlations, or causal structures between the two.

S5. Carbon emission prediction and result correction:

Retrieve energy data from the energy information database and substitute it into the carbon emission prediction model to obtain a preliminary prediction; then correct the preliminary prediction using the mathematical equation relationship, removing the interference that impacts on the influencing factors introduce, to obtain the final carbon emission prediction.
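
The patent states neither the form of the mathematical equation relationship nor the correction it drives. As a sketch under a strong assumption, suppose the impact prediction y is approximately a linear function of the preliminary prediction x, y ≈ a·x + b, fitted by ordinary least squares over simulated scenarios. A prediction believed to carry that impact could then be corrected by inverting the line. The linear form, the inversion, the function names, and the data are all hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def remove_impact(value, slope, intercept):
    """Invert the fitted line to estimate the impact-free value (assumption)."""
    return (value - intercept) / slope


# Hypothetical paired predictions from step S3; here y = 1.1 * x + 2 exactly.
preliminary = [100.0, 200.0, 300.0, 400.0]
impacted = [112.0, 222.0, 332.0, 442.0]

slope, intercept = fit_line(preliminary, impacted)
corrected = remove_impact(332.0, slope, intercept)
```

A real relationship may well be nonlinear or depend on which factor was perturbed, in which case a separate fit per cluster from step S4 would be a natural extension.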

In summary, and with reference to Figures 1 to 6, the carbon emission prediction method based on energy big data comprises the following steps when in use:

S1. Data integration of carbon emission influencing factors:

Analyze and determine the factors that affect carbon emissions in the energy industry; connect to the energy information database and retrieve the data on those factors; build an energy information data set; and divide the data set into multiple subsets, placing the data for each category of influencing factor in its own information subset. The data within any one information subset are mutually independent and do not interfere with one another, and the data in different information subsets neither overlap nor interact.

The energy information data set is {Edb_Co2} = {ΔCo2_1} + {ΔCo2_2} + {ΔCo2_3} + … + {ΔCo2_n}, where Co2_1, Co2_2, Co2_3, …, Co2_n denote the different carbon emission influencing factors, and {ΔCo2_1}, {ΔCo2_2}, {ΔCo2_3}, …, {ΔCo2_n} denote the information subsets corresponding to those factors.

S2. Construction of the carbon emission prediction model:

Clean and correct the data in each subset, and use the random forest algorithm to assess the importance of each influencing factor. The random forest algorithm combines multiple decision trees; for each tree the data subset is drawn at random with replacement, and a random subset of the features is selected as input. It is a Bagging algorithm that uses decision trees as its estimators, and the final result is aggregated by a combiner: for classification the combiner takes the majority class as the final result, and for regression it takes the mean of the individual regression results. Rank the influencing factors by importance in descending order, so that a higher rank indicates greater importance and a lower rank indicates less. Select the main influencing factors on the basis of this ranking, namely the factors that rank near the top and have high importance. Then generate a feature matrix and build the carbon emission prediction model.

S3. Prediction model training and impact simulation:

Train the carbon emission prediction model on the feature matrix in order to validate it; after training, make a preliminary carbon emission prediction from the input values of the influencing factors; then adjust the parameters of the influencing factors to simulate impacts under different scenarios, and predict carbon emissions for each scenario.

The carbon emission prediction formula is as follows:

where C is the predicted carbon emission, i = 1, 2, 3, …, n, k is the importance parameter, P_i is the carbon emission coefficient of the i-th main influencing factor, and D_i is the consumption of the i-th main influencing factor.

S4. Impact analysis of the carbon emission prediction:

Study the relationship between the preliminary prediction and the impact prediction, and analyze, using the energy data, how the influencing factors are impacted and how they change. Apply cluster analysis and association rule analysis. The cluster analysis groups the preliminary-prediction and impact-prediction data into classes, or clusters, and the grouping is then analyzed; the classes are not known in advance. The association rule analysis searches the preliminary-prediction and impact-prediction data for frequent patterns, associations, correlations, or causal structures between the two. From these analyses, determine the internal correlation between the two predictions, draw a causal diagram, and establish a mathematical equation relating them.

S5. Carbon emission prediction and result correction:

Retrieve energy data from the energy information database and substitute it into the carbon emission prediction model to obtain a preliminary prediction; then correct the preliminary prediction using the mathematical equation relationship, removing the interference that impacts on the influencing factors introduce, to obtain the final carbon emission prediction.

The invention determines, through analysis, the factors in the energy industry that affect carbon emissions and connects to the energy information database to retrieve data, establishing an energy information data set and information subsets for the different categories of carbon emission influencing factors. A random forest algorithm judges the importance of the influencing factors, which are then ranked so that the main influencing factors can be selected, a feature matrix generated, and a carbon emission prediction model constructed. The model is trained, and after training a preliminary carbon emission prediction is made. The parameters of the influencing factors are then adjusted to simulate impacts, and carbon emissions are predicted again. The relationship between the preliminary prediction and the impact prediction is studied, the impact behavior and variation patterns of the influencing factors are analyzed, and a mathematical equation relationship is established after cluster analysis and association rule analysis. Energy data are substituted into the carbon emission prediction model to obtain a preliminary prediction result, which is corrected using the mathematical equation relationship to give the final carbon emission prediction result. The improved method therefore not only constructs a carbon emission prediction model but also simulates the impact that each influencing factor has on the model under different conditions. This avoids the effect that differences in carbon emission influencing factors across fields and equipment have on the prediction model, prevents those differences from interfering with the prediction, and improves the accuracy and reliability of the carbon emission prediction results.
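The impact simulation described above (adjust one influencing factor, predict again, compare with the preliminary prediction) can be sketched as follows. This is a minimal illustration, not the patented model: `toy_model`, its coefficients, the factor names, and the scale factors are all hypothetical stand-ins.

```python
def simulate_impacts(model, baseline, scale_factors):
    """Return the baseline prediction and the change caused by each impact.

    model:         callable mapping a dict of factor values to a prediction
    baseline:      dict of factor values used for the preliminary prediction
    scale_factors: dict mapping a factor name to the multiplier applied to it
    """
    base_prediction = model(baseline)
    deltas = {}
    for name, scale in scale_factors.items():
        scenario = dict(baseline)  # copy so the baseline stays unchanged
        scenario[name] = baseline[name] * scale
        deltas[name] = model(scenario) - base_prediction
    return base_prediction, deltas


def toy_model(factors):
    # Hypothetical stand-in for the trained carbon emission prediction model.
    return 2.0 * factors["coal"] + 1.0 * factors["gas"]


baseline = {"coal": 10.0, "gas": 5.0}
base, deltas = simulate_impacts(toy_model, baseline, {"coal": 1.1, "gas": 0.8})
```

The per-factor differences in `deltas` are the kind of data that the later cluster analysis and association rule analysis would take as input.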

The embodiments of the present invention are given for the purposes of illustration and description and are not intended to be exhaustive or to limit the invention to the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen and described to best explain the principles and practical application of the invention, and to enable those of ordinary skill in the art to understand the invention and design various embodiments, with various modifications, suited to a particular use.

Claims (10)

1. A carbon emission prediction method based on energy big data, characterized by comprising the following specific steps:

S1. Integration of carbon emission influencing factor data: analyze and determine the factors in the energy industry that affect carbon emissions, connect to the energy information database, retrieve the information data relating to carbon emission influencing factors from the database, and establish an energy information data set; set up multiple subsets in the energy information data set and place the information data of the different carbon emission influencing factors in these subsets, establishing information subsets for the different categories of carbon emission influencing factors;

S2. Construction of the carbon emission prediction model: clean and correct the data in the data subsets, judge the importance of the influencing factors with a random forest algorithm, rank the influencing factors by importance, select the main influencing factors, generate a feature matrix, and construct the carbon emission prediction model;

S3. Prediction model training and impact simulation: train the carbon emission prediction model on the feature matrix in order to test the model; after training, make a preliminary carbon emission prediction from the input values of the influencing factors; then adjust the parameters of the influencing factors to simulate the impact of different conditions and predict carbon emissions again;

S4. Impact analysis of the carbon emission prediction: study the relationship between the preliminary prediction and the impact prediction of carbon emissions, analyze the impact behavior and variation patterns of the influencing factors in combination with the energy data, determine the internal correlation between the two through cluster analysis and association rule analysis, draw a causal relationship diagram, and establish a mathematical equation relationship;

S5. Carbon emission prediction and result correction: retrieve the energy data in the energy information database and substitute them into the carbon emission prediction model to calculate a preliminary carbon emission prediction result; then correct the preliminary prediction result using the mathematical equation relationship, removing the interference that the impact of the influencing factors causes in the prediction, and calculate the final carbon emission prediction result.

2. The carbon emission prediction method based on energy big data according to claim 1, characterized in that in step S1 the energy information data set is {Edb_Co2}={ΔCo2_1}+{ΔCo2_2}+{ΔCo2_3}+…+{ΔCo2_n}, where Co2_1, Co2_2, Co2_3, …, Co2_n denote the different carbon emission influencing factors and {ΔCo2_1}, {ΔCo2_2}, {ΔCo2_3}, …, {ΔCo2_n} denote the information subsets corresponding to those influencing factors.

3. The carbon emission prediction method based on energy big data according to claim 1, characterized in that in step S1 the carbon emission influencing factor information data within the same information subset are mutually independent and do not interfere with one another, and the data in different information subsets neither overlap nor interact.

4. The carbon emission prediction method based on energy big data according to claim 1, characterized in that the random forest algorithm in step S2 combines multiple decision trees; each data subset is drawn at random with replacement, and a random subset of the features is selected as input.

5. The carbon emission prediction method based on energy big data according to claim 4, characterized in that the random forest algorithm in step S2 is a Bagging algorithm with decision trees as estimators; the final result is aggregated by a combiner, which takes the majority classification result as the final result in classification problems and the average of the individual regression results as the final result in regression problems.

6. The carbon emission prediction method based on energy big data according to claim 1, characterized in that in step S2 the influencing factors are ranked by importance in descending order; the higher an influencing factor ranks, the more important it is, and the lower it ranks, the less important it is.

7. The carbon emission prediction method based on energy big data according to claim 6, characterized in that in step S2 the main influencing factors are selected on the basis of the ranking, and the main influencing factors are those that rank near the top and have high importance.

8. The carbon emission prediction method based on energy big data according to claim 1, characterized in that the carbon emission prediction formula in step S3 is as follows:

[formula not reproduced in the source text]

where C is the carbon emission prediction result, i = 1, 2, 3, …, n, k is the importance parameter, P_i is the carbon emission coefficient of the i-th main influencing factor, and D_i is the consumption of the i-th main influencing factor.

9. The carbon emission prediction method based on energy big data according to claim 1, characterized in that the cluster analysis in step S4 sorts the preliminary prediction and impact prediction data into different classes or clusters, which are then analyzed together with the classification results; the classes to be formed by the clustering are not known in advance.

10. The carbon emission prediction method based on energy big data according to claim 1, characterized in that the association rule analysis in step S4 searches the preliminary prediction and impact prediction data for frequent patterns, associations, correlations, or causal structures that exist between the preliminary prediction and the impact prediction.
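The sampling and combining operations that claims 4 and 5 attribute to the random forest can be sketched with the Python standard library alone. This is a minimal illustration that omits the decision trees themselves; the helper names, the sample rows, and the feature names are hypothetical.

```python
import random
from collections import Counter
from statistics import mean


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows at random with replacement (claim 4)."""
    return rng.choices(rows, k=len(rows))


def random_feature_subset(feature_names, n_features, rng):
    """Pick n_features distinct features at random as tree input (claim 4)."""
    return rng.sample(feature_names, n_features)


def combine_classification(votes):
    """Combiner for classification: the majority label wins (claim 5)."""
    return Counter(votes).most_common(1)[0][0]


def combine_regression(outputs):
    """Combiner for regression: average the per-tree outputs (claim 5)."""
    return mean(outputs)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical rows: (influencing factor, observed consumption).
rows = [("coal", 12.0), ("gas", 7.5), ("oil", 9.1), ("grid", 4.2)]
sample = bootstrap_sample(rows, rng)
features = random_feature_subset(["coal", "gas", "oil", "grid"], 2, rng)

label = combine_classification(["high", "low", "high"])
value = combine_regression([10.0, 12.0, 14.0])
```

In a full implementation each tree would be trained on its own `sample` and `features`, and the combiner would be applied to the per-tree predictions.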
CN202310814459.4A 2023-07-04 2023-07-04 Carbon emission prediction method based on energy big data Pending CN117172347A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310814459.4A CN117172347A (en) 2023-07-04 2023-07-04 Carbon emission prediction method based on energy big data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310814459.4A CN117172347A (en) 2023-07-04 2023-07-04 Carbon emission prediction method based on energy big data

Publications (1)

Publication Number Publication Date
CN117172347A true CN117172347A (en) 2023-12-05

Family

ID=88932555

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310814459.4A Pending CN117172347A (en) 2023-07-04 2023-07-04 Carbon emission prediction method based on energy big data

Country Status (1)

Country Link
CN (1) CN117172347A (en)

Similar Documents

Publication Publication Date Title
Zhu et al. Application of machine learning techniques for predicting the consequences of construction accidents in China
Huang et al. An empirical analysis of data preprocessing for machine learning-based software cost estimation
CN101556553B (en) Defect prediction method and system based on requirement change
WO2023142424A1 (en) Power financial service risk control method and system based on gru-lstm neural network
CN103744928B (en) A kind of network video classification method based on history access record
CN112132233A (en) Criminal personnel dangerous behavior prediction method and system based on effective influence factors
Wang et al. Practical and white-box anomaly detection through unsupervised and active learning
CN113807645A (en) Industrial chain risk deduction method based on open source information
Gu et al. [Retracted] Application of Fuzzy Decision Tree Algorithm Based on Mobile Computing in Sports Fitness Member Management
CN114764682B (en) Rice safety risk assessment method based on multi-machine learning algorithm fusion
Yu et al. Article citation contribution indicator: application in energy and environment
CN114936307A (en) A Normalized Graph Model Construction Method
CN118552063A (en) Heating furnace energy-saving benefit management method based on comprehensive fuel consumption
CN116108963A (en) Electric power carbon emission prediction method and equipment based on integrated learning module
CN110362911A (en) A kind of agent model selection method of Design-Oriented process
CN118041683B (en) Malicious traffic detection method based on structure embedded bidirectional reconstruction network
CN118569634A (en) Granary risk event intelligent early warning and processing method based on integrated learning and timing diagram neural network
CN117172347A (en) Carbon emission prediction method based on energy big data
Wu et al. Association rule mining with a special rule coding and dynamic genetic algorithm for air quality impact factors in Beijing, China
CN118037304A (en) A financial risk level labeling method and system based on data mining
CN115063251B (en) Social propagation dynamic network representation method based on relationship strength and feedback mechanism
Zhou et al. Prediction of silicon content of molten iron in blast furnace based on particle swarm-random forest
CN116797096A (en) Fuzzy comprehensive evaluation method for toughness level of supply chain based on AHP-entropy weight method
Karimi et al. Analyzing the results of buildings energy audit by using grey incidence analysis
Ma et al. Study on preliminary performance of algorithms for network traffic identification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination