CN115952915A - Energy consumption prediction optimization method using fuzzy entropy classification - Google Patents

Publication number
CN115952915A
CN115952915A CN202310026470.4A CN202310026470A CN115952915A CN 115952915 A CN115952915 A CN 115952915A CN 202310026470 A CN202310026470 A CN 202310026470A CN 115952915 A CN115952915 A CN 115952915A
Authority
CN
China
Prior art keywords
energy consumption
frequency component
components
fuzzy entropy
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310026470.4A
Other languages
Chinese (zh)
Inventor
谭志
焦英浩
王闯胜
李翔宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Hezhong Huineng Technology Co ltd
Beijing University of Civil Engineering and Architecture
Original Assignee
Beijing Hezhong Huineng Technology Co ltd
Beijing University of Civil Engineering and Architecture
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Hezhong Huineng Technology Co ltd and Beijing University of Civil Engineering and Architecture
Priority to CN202310026470.4A
Publication of CN115952915A
Legal status: Pending

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides an energy consumption prediction optimization method using fuzzy entropy classification. The method comprises the following steps: decomposing original building energy consumption data into a series of components by a time series decomposition method, and calculating the fuzzy entropy of each component; sequentially calculating the fuzzy entropy difference between each pair of adjacent components, and dividing all the components into high-frequency components and low-frequency components at the pair of adjacent components with the largest fuzzy entropy difference; predicting the high-frequency components with a random forest (RF) method to obtain a high-frequency component prediction result, and predicting the low-frequency components with a CNN-GRU model optimized by an attention mechanism to obtain a low-frequency component prediction result; and reconstructing the high-frequency and low-frequency component prediction results to obtain the energy consumption prediction result for the original building energy consumption data. The method thus combines predictors: the RF method predicts the high-frequency signals, a hybrid deep learning model predicts the low-frequency signals, and the final prediction is obtained by superposition and reconstruction, so as to reduce the error of building energy consumption prediction.

Description

Energy consumption prediction optimization method using fuzzy entropy classification
Technical Field
The invention relates to the technical field of building energy consumption prediction, in particular to an energy consumption prediction optimization method using fuzzy entropy classification.
Background
According to the 2022 global status report on buildings and construction, carbon dioxide emissions from building operation reached an all-time high in 2021, about 5% above the previous year. With the rapid growth of the urban population, energy consumption and carbon dioxide emissions keep rising, which means that measures for saving energy, reducing emissions and improving the efficiency of building energy use are being implemented too slowly. Accurate building energy consumption prediction is the basis for formulating building energy-saving strategies, so establishing an efficient and accurate building energy consumption prediction model is of great practical significance.
Common approaches to building energy consumption prediction include physical models and data-driven methods. Physical models predict building energy consumption from the principles of building thermophysics and are highly interpretable, but in practice they are complex to operate and demand substantial theoretical knowledge. Machine learning is a typical data-driven method: it can achieve good results using only historical data combined with some feature engineering and other processing, and it is widely applied in this field. In recent years, the development of neural networks has streamlined the feature engineering steps, simplifying the modeling process and further improving the nonlinear fitting ability and prediction accuracy of prediction models. However, energy consumption prediction remains a complex process: the complex nonlinearity and instability of energy consumption data, together with the insufficient feature extraction capability and feature utilization of neural networks, make accurate prediction difficult. To meet these challenges, an accurate and efficient energy consumption prediction model is needed.
To address the instability and nonlinearity of energy consumption data, some studies decompose the original energy consumption sequence using additive decomposition, seasonal decomposition, empirical mode decomposition or similar methods. However, the prediction accuracy after decomposition may not improve effectively. On the one hand, the reconstruction property of the decomposition may be difficult to guarantee, so the overall prediction error may be unstable when the prediction results of all decomposition components are integrated. On the other hand, existing methods work backwards from the results, that is, the prediction model for each decomposition component is selected according to the quality of its prediction result, which is unsuitable for practical engineering. Therefore, the invention proposes calculating the fuzzy entropy of the decomposition components with the fuzzy entropy method, dividing the components into high-frequency and low-frequency signals at the point where the fuzzy entropy change between adjacent components is largest, predicting them with Random Forest (RF) and the proposed deep learning model respectively, and finally obtaining the final prediction value through superposition and reconstruction.
Currently, building energy consumption prediction methods in the prior art include time series decomposition methods represented by Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN). After the original signal is decomposed by CEEMDAN, the prior art often uses a crude decompose-predict-reconstruct approach in which the prediction model for each decomposition component is selected according to the quality of its prediction result, which is not suitable for actual engineering tasks.
The disadvantages of the building energy consumption prediction methods in the prior art include the following: they rely on a single prediction model that can make effective predictions only with detailed information such as building physical parameters and environmental parameters, and such rich information is difficult to collect in practical applications.
They do not integrate the information extracted from the data, and more important information is not given more weight, so an ideal prediction effect is difficult to maintain. After the building energy consumption data are decomposed by a time series decomposition method, the prediction model for each decomposition component is inferred backwards from the quality of the prediction results, which is not suitable for practical engineering application.
Disclosure of Invention
The embodiment of the invention provides an energy consumption prediction optimization method using fuzzy entropy classification, so as to effectively reduce errors of building energy consumption prediction.
In order to achieve the purpose, the invention adopts the following technical scheme.
A method for energy consumption prediction optimization using fuzzy entropy classification, comprising:
decomposing original building energy consumption data into a series of components by a time sequence decomposition method, and calculating the value of the fuzzy entropy of each component according to a fuzzy entropy method;
sequentially calculating fuzzy entropy difference values of two adjacent components, and dividing all the components into high-frequency components and low-frequency components based on the two components with the largest fuzzy entropy difference value change;
predicting the high-frequency component by using a random forest RF method to obtain a high-frequency component prediction result, and predicting the low-frequency component by using a CNN-GRU model based on self-attention mechanism optimization to obtain a low-frequency component prediction result;
and reconstructing the high-frequency component prediction result and the low-frequency component prediction result to obtain an energy consumption prediction result of the original building energy consumption data.
Preferably, the original building energy consumption data is decomposed into a series of components by a time sequence decomposition method, including:
the columns of timestamps and energy consumption values in the original public building energy consumption data set are retained, the rows for the selected timestamps are extracted, and the processed public building energy consumption data set is used as training data and test data; the public building energy consumption data of the training set is decomposed into a series of components using a time series decomposition method.
Preferably, the calculating the fuzzy entropy value of each component according to the fuzzy entropy method includes:
calculating the fuzzy entropy of each component sequence according to the principle and the calculation formula of the fuzzy entropy, wherein the definition and the calculation of the fuzzy entropy are carried out according to the following rules:
first, the sequence is defined: given the mode dimension m, a set of m-dimensional vectors X(i) is constructed, defined as in equation (1):
$$X(i) = [x(i), x(i+1), \ldots, x(i+m-1)] - x_0(i) \tag{1}$$
in the formula, $x_0(i)$ represents the mean of the m successive values starting at x(i), i.e.
$$x_0(i) = \frac{1}{m} \sum_{j=0}^{m-1} x(i+j)$$
second, the distance between sequences is defined: the distance $d_{ij}^m$ between X(i) and X(j) is the maximum absolute difference between their corresponding elements, as in equation (2):
$$d_{ij}^m = \max_{k \in \{0, \ldots, m-1\}} \left| \left( x(i+k) - x_0(i) \right) - \left( x(j+k) - x_0(j) \right) \right| \tag{2}$$
third, the similarity of the sequences is defined: a new variable n is introduced through a fuzzy function $\mu(d_{ij}^m, n, r)$, and the similarity $D_{ij}^m$ between X(i) and X(j) is calculated as in equation (3):
$$D_{ij}^m = \mu(d_{ij}^m, n, r) \tag{3}$$
the fuzzy function $\mu(d_{ij}^m, n, r)$ is calculated as in equation (4):
$$\mu(d_{ij}^m, n, r) = \exp\left( -\frac{(d_{ij}^m)^n}{r} \right) \tag{4}$$
fourth, all membership degrees except the self-match are averaged, as in equation (5), where N is the length of the sequence:
$$\phi^m(n, r) = \frac{1}{N-m} \sum_{i=1}^{N-m} \left( \frac{1}{N-m-1} \sum_{j=1, j \neq i}^{N-m} D_{ij}^m \right) \tag{5}$$
the dimension is increased from m to m+1 and the steps are repeated to obtain $\phi^{m+1}(n, r)$, as in equation (6):
$$\phi^{m+1}(n, r) = \frac{1}{N-m} \sum_{i=1}^{N-m} \left( \frac{1}{N-m-1} \sum_{j=1, j \neq i}^{N-m} D_{ij}^{m+1} \right) \tag{6}$$
as an index of the time series, FuzzyEn(m, n, r) is defined as the negative natural logarithm of the deviation of $\phi^m(n, r)$ from $\phi^{m+1}(n, r)$, as shown in equation (7):
$$\mathrm{FuzzyEn}(m, n, r) = \ln \phi^m(n, r) - \ln \phi^{m+1}(n, r) \tag{7}$$
in the above formulas, m is the mode dimension or embedding dimension; r represents the width of the fuzzy function boundary; n determines the gradient of the similarity tolerance boundary and acts as a weight in the calculation of the similarity between fuzzy entropy vectors.
Preferably, the sequentially calculating the fuzzy entropy difference values of two adjacent components, and dividing all the components into a high-frequency component and a low-frequency component based on the two components with the largest change of the fuzzy entropy difference values includes:
for components $F_1 \sim F_k$, where k represents the total number of components, the fuzzy entropy difference between adjacent components is calculated using equation (8):
$$\Delta FE_i = FE_i - FE_{i+1}, \quad i = 1, 2, \ldots, k-1 \tag{8}$$
in the formula, $\Delta FE_i$ is the fuzzy entropy difference between two adjacent components $F_i$ and $F_{i+1}$, $FE_i$ is the fuzzy entropy of component $F_i$, $FE_{i+1}$ is the fuzzy entropy of component $F_{i+1}$, and k is the total number of components;
the fuzzy entropy differences $\Delta FE_i$ between adjacent components are compared, and the two components $F_i$ and $F_{i+1}$ with the largest fuzzy entropy difference are selected, as shown in equation (9):
$$\Delta FE_{\max} = \max \{ \Delta FE_1, \Delta FE_2, \ldots, \Delta FE_{k-1} \} \tag{9}$$
components $F_1 \sim F_i$ are defined as high-frequency components, and components $F_{i+1} \sim F_k$ are defined as low-frequency components.
Preferably, the predicting the high frequency component by using the RF method to obtain a high frequency component prediction result includes:
for each component in the high-frequency components obtained by decomposition, a sliding window is moved sequentially down the data until the whole training set has been traversed; the 24 data points in each window are taken as an input sample of the training set, and the energy consumption value in the row following each window is taken as the corresponding label; the test set is divided into input samples and labels in the same way as the training set;
the input samples and labels of the training set are input into an RF model, which is trained to obtain a trained RF model; the input samples and labels of the test set are then input into the trained RF model to obtain a high-frequency component prediction result.
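The windowing described above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function name `make_windows` is hypothetical, and the step of 1 is taken from the description of the low-frequency case and assumed to apply here as well.

```python
def make_windows(values, window=24, step=1):
    """Split a 1-D component into (input sample, label) pairs.

    Each sample holds `window` consecutive values; its label is the value
    that immediately follows the window.
    """
    samples, labels = [], []
    # Stop early enough that every window still has a following value.
    for start in range(0, len(values) - window, step):
        samples.append(values[start:start + window])
        labels.append(values[start + window])
    return samples, labels


# Example: 30 hourly readings give 30 - 24 = 6 training pairs.
samples, labels = make_windows(list(range(30)))
print(len(samples), labels)
```

The same function would be applied separately to the training and test portions of each component so that no window straddles the split.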
Preferably, the predicting the low-frequency component by using the CNN-GRU model optimized based on the attention mechanism to obtain a low-frequency component prediction result includes:
for each component in the low-frequency components obtained by decomposition, a sliding window is moved sequentially down the data until the whole training set has been traversed, where every 24 adjacent data points form a window and the step length is 1; the 24 data points in each window are taken as an input sample of the training set, and the energy consumption value in the row following each window is taken as the corresponding label; the test set is divided into input samples and labels in the same way as the training set;
a self-attention layer is added after the GRU layer to obtain a CNN-GRU model optimized by a self-attention mechanism; the input samples and labels of the training set are input into this model, which is trained to obtain a trained CNN-GRU model optimized by a self-attention mechanism; the input samples and labels of the test set are then input into the trained model to obtain a low-frequency component prediction result.
Preferably, the reconstructing the high-frequency component prediction result and the low-frequency component prediction result to obtain the energy consumption prediction result of the original building energy consumption data includes:
and integrating and reconstructing the high-frequency component prediction result and the low-frequency component prediction result by utilizing the principle of the adopted time sequence decomposition method to obtain the energy consumption prediction result of the original building energy consumption data.
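As a hedged illustration of the reconstruction step: if the decomposition is additive, as with EMD-type methods such as CEEMDAN where the components sum back to the original signal, the final prediction is the element-wise sum of the component predictions. The function name `reconstruct` and the sample numbers are hypothetical.

```python
def reconstruct(component_predictions):
    """Sum per-component predictions time step by time step.

    Assumes an additive decomposition, so the original series equals the
    sum of its components.
    """
    lengths = {len(p) for p in component_predictions}
    if len(lengths) != 1:
        raise ValueError("all component predictions must have the same length")
    return [sum(step) for step in zip(*component_predictions)]


# Two high-frequency predictions and one low-frequency prediction (made-up values).
print(reconstruct([[1.0, 2.0], [0.5, 0.5], [10.0, 20.0]]))
```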
As can be seen from the technical solutions provided by the embodiments of the present invention, the embodiments of the present invention use a combined prediction method: the high-frequency signal is predicted by the RF method, the low-frequency signal is predicted by the mixed deep learning model, the final prediction result is obtained by superposition and reconstruction, and the advantages of each model are reasonably exerted so as to reduce the error of building energy consumption prediction.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a process flow diagram of a method for optimizing energy consumption prediction using fuzzy entropy classification according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a modeling process of an RF model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a modeling process of a CNN-GRU model based on self-attention mechanism optimization for predicting low-frequency components according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the prediction errors obtained for each decomposition component when an RF model and a CNN-GRU model optimized based on a self-attention mechanism are applied to the CEEMDAN decomposition sequence of the UnivDorm building data set according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
For the convenience of understanding the embodiments of the present invention, the following description will be further explained by taking several specific embodiments as examples in conjunction with the drawings, and the embodiments are not to be construed as limiting the embodiments of the present invention.
The deep learning model provided by the invention uses a Convolutional Neural Network (CNN) and a Gated Recurrent Unit (GRU) to extract data features, and then optimizes them through a self-attention layer, thereby achieving accurate prediction. The method exploits the respective advantages, for energy consumption prediction, of a signal processing method from information theory (the fuzzy entropy method) and of data-driven methods (RF, CNN, GRU, self-attention), so that this theoretical knowledge is better combined with building energy consumption data, the error of energy consumption prediction is effectively reduced, and the relevant personnel are helped to improve energy utilization and reduce carbon emissions.
After the building energy consumption data is decomposed by the time series decomposition method, the fuzzy entropy of each component is calculated with the fuzzy entropy method, and the components are divided into high-frequency and low-frequency signals at the point where the fuzzy entropy difference between adjacent components is largest, which makes the method more suitable for practical engineering application.
The embodiment of the invention provides a hybrid deep learning model that uses a CNN layer to extract spatio-temporal features of the building energy consumption data, then uses a GRU layer to extract time series features, and optimizes the result through a self-attention layer that distributes the information weights reasonably, thereby improving the prediction accuracy.
A combined prediction method is used: the RF method predicts the high-frequency signals, the proposed hybrid deep learning model predicts the low-frequency signals, and the final prediction result is obtained through superposition and reconstruction, so that the advantages of each model are put to good use and the error of building energy consumption prediction is reduced.
The processing flow of the energy consumption prediction optimization method using fuzzy entropy classification provided by the embodiment of the invention is shown in fig. 1, and comprises the following processing procedures: the method comprises the steps of decomposing original building energy consumption data into a series of components through a time sequence decomposition method, calculating the value of the fuzzy entropy of each component according to a fuzzy entropy method, calculating the fuzzy entropy difference value of two adjacent components in sequence after calculation, finding out the two components with the largest fuzzy entropy difference value change, and dividing each component into a high-frequency component and a low-frequency component. Then, the high frequency components are predicted using an RF method that is more robust to processing high frequency signals, and the low frequency components are predicted using the proposed CNN-GRU model optimized based on the self-attention mechanism. And finally, reconstructing the predicted values of the high-frequency component and the low-frequency component to be used as a final predicted result, and evaluating the final predicted result according to the real energy consumption data.
(1) Fuzzy entropy method and component division method
Entropy was originally a thermodynamic concept, a measure describing the degree of disorder of a thermodynamic system. With the development of information theory, methods for measuring the complexity of time series appeared, represented by approximate entropy. Approximate entropy is a dynamical parameter that can be estimated from relatively short data; sample entropy improves on approximate entropy, no longer depends on the data length, and has higher precision. Fuzzy entropy introduces the concept of similarity and improves on sample entropy, computing faster while keeping the precision. The complexity of each component sequence is calculated with the fuzzy entropy method: the higher the fuzzy entropy of a sequence, the more disordered its waveform and the higher its frequency. The fuzzy entropy method is defined as follows:
first, the sequence is defined: given the mode dimension m, a set of m-dimensional vectors X(i) is constructed, defined as in equation (1):
$$X(i) = [x(i), x(i+1), \ldots, x(i+m-1)] - x_0(i) \tag{1}$$
in the formula, $x_0(i)$ represents the mean of the m successive values starting at x(i), i.e.
$$x_0(i) = \frac{1}{m} \sum_{j=0}^{m-1} x(i+j)$$
next, the distance between sequences is defined: the distance $d_{ij}^m$ between X(i) and X(j) is the maximum absolute difference between their corresponding elements, as in equation (2):
$$d_{ij}^m = \max_{k \in \{0, \ldots, m-1\}} \left| \left( x(i+k) - x_0(i) \right) - \left( x(j+k) - x_0(j) \right) \right| \tag{2}$$
then, the similarity of the sequences is defined: a new variable n is introduced through a fuzzy function $\mu(d_{ij}^m, n, r)$, and the similarity $D_{ij}^m$ between X(i) and X(j) is calculated as in equation (3):
$$D_{ij}^m = \mu(d_{ij}^m, n, r) \tag{3}$$
the fuzzy function $\mu(d_{ij}^m, n, r)$ is calculated as in equation (4):
$$\mu(d_{ij}^m, n, r) = \exp\left( -\frac{(d_{ij}^m)^n}{r} \right) \tag{4}$$
then, all membership degrees except the self-match are averaged, as in equation (5), where N is the length of the sequence:
$$\phi^m(n, r) = \frac{1}{N-m} \sum_{i=1}^{N-m} \left( \frac{1}{N-m-1} \sum_{j=1, j \neq i}^{N-m} D_{ij}^m \right) \tag{5}$$
the dimension is increased from m to m+1 and the steps are repeated to obtain $\phi^{m+1}(n, r)$, as in equation (6):
$$\phi^{m+1}(n, r) = \frac{1}{N-m} \sum_{i=1}^{N-m} \left( \frac{1}{N-m-1} \sum_{j=1, j \neq i}^{N-m} D_{ij}^{m+1} \right) \tag{6}$$
after the above preparation, the time series index FuzzyEn(m, n, r) is defined as the negative natural logarithm of the deviation of $\phi^m(n, r)$ from $\phi^{m+1}(n, r)$, as shown in equation (7):
$$\mathrm{FuzzyEn}(m, n, r) = \ln \phi^m(n, r) - \ln \phi^{m+1}(n, r) \tag{7}$$
in the above formulas, m is the mode dimension or embedding dimension, generally m = 2; r represents the width of the fuzzy function boundary: if it is too large, statistical information is lost, and if it is too small, the estimated statistics are poor and the sensitivity to noise increases, so r is generally taken as 0.1 to 0.25 times the standard deviation of the sequence; n determines the gradient of the similarity tolerance boundary (the larger n is, the steeper the gradient) and acts as a weight in the calculation of the similarity between fuzzy entropy vectors; it generally takes a small integer value such as 2 or 3.
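The definition above can be turned into a short pure-Python sketch. It is an illustration only and rests on stated assumptions: the fuzzy function is taken to be the exponential form exp(-d^n / r) that is common in the fuzzy entropy literature (the patent's own equation images are not available in this text), r is set to `r_factor` times the population standard deviation of the sequence, and the names `fuzzy_entropy` and `r_factor` are hypothetical.

```python
import math


def fuzzy_entropy(series, m=2, n=2, r_factor=0.2):
    """Fuzzy entropy of a 1-D sequence, following steps (1) to (7) above."""
    N = len(series)
    mean = sum(series) / N
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / N)
    if std == 0:
        raise ValueError("constant series: r would be zero")
    r = r_factor * std  # boundary width, 0.1 to 0.25 times the std per the text

    def phi(dim):
        count = N - m  # same number of vectors for dim = m and dim = m + 1
        vectors = []
        for i in range(count):
            window = series[i:i + dim]
            baseline = sum(window) / dim  # x0(i): local mean removed, eq. (1)
            vectors.append([v - baseline for v in window])
        total = 0.0
        for i in range(count):
            for j in range(count):
                if i == j:
                    continue  # skip the self-match
                # Maximum absolute element-wise difference, eq. (2).
                d = max(abs(a - b) for a, b in zip(vectors[i], vectors[j]))
                # Assumed exponential fuzzy membership, eqs. (3)-(4).
                total += math.exp(-(d ** n) / r)
        return total / (count * (count - 1))  # averaging, eqs. (5)-(6)

    return math.log(phi(m)) - math.log(phi(m + 1))  # eq. (7)
```

Under these assumptions a smooth periodic sequence should score lower than white noise of the same length, which matches the text's reading that higher fuzzy entropy indicates a more disordered, higher-frequency waveform. The double loop is quadratic in the sequence length, so a vectorized implementation would be preferable for long series.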
After the fuzzy entropy of each component is calculated, the following component division method is adopted to divide each component into a high-frequency component and a low-frequency component so as to facilitate further prediction and fully exert the advantages of each model.
For components $F_1 \sim F_k$ (k represents the total number of components), the fuzzy entropy of each component and the fuzzy entropy difference between adjacent components are calculated, as shown in equation (8):
$$\Delta FE_i = FE_i - FE_{i+1}, \quad i = 1, 2, \ldots, k-1 \tag{8}$$
In the formula, $\Delta FE_i$ is the fuzzy entropy difference between two adjacent components $F_i$ and $F_{i+1}$, $FE_i$ is the fuzzy entropy of component $F_i$, $FE_{i+1}$ is the fuzzy entropy of component $F_{i+1}$, and k is the total number of components.
The fuzzy entropy differences $\Delta FE_i$ of adjacent components are compared, and the two components $F_i$ and $F_{i+1}$ with the largest fuzzy entropy difference are selected, as shown in equation (9):
$$\Delta FE_{\max} = \max \{ \Delta FE_1, \Delta FE_2, \ldots, \Delta FE_{k-1} \} \tag{9}$$
Components $F_1 \sim F_i$ are defined as high-frequency components, and components $F_{i+1} \sim F_k$ are defined as low-frequency components.
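The division rule can be sketched as follows. This is a minimal illustration that assumes the difference is taken as the entropy of a component minus the entropy of its successor; the function name `split_components` and the entropy values in the example are made up.

```python
def split_components(entropies):
    """Return (high_indices, low_indices) as 0-based component indices.

    The cut is placed between the two adjacent components whose fuzzy
    entropy difference is largest.
    """
    if len(entropies) < 2:
        raise ValueError("need at least two components")
    diffs = [entropies[i] - entropies[i + 1] for i in range(len(entropies) - 1)]
    cut = max(range(len(diffs)), key=diffs.__getitem__)
    high = list(range(cut + 1))                   # F_1 .. F_i
    low = list(range(cut + 1, len(entropies)))    # F_i+1 .. F_k
    return high, low


# Hypothetical entropies for six components, ordered as produced by the decomposition.
high, low = split_components([1.9, 1.5, 1.4, 0.4, 0.3, 0.1])
print(high, low)
```

Because the model for each group is fixed in advance, the split depends only on the entropies and not on any prediction results, which is the property the text contrasts with result-driven model selection.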
(2) Random forest method
Embodiments of the present invention predict the high-frequency components using the RF method, which is good at processing high-frequency signals. A random forest is an ensemble method based on decision tree models. When it receives the training set, it repeatedly draws a random subset of the input features (here, the historical energy consumption data) and generates a group of tree structures in a random manner, automatically deriving decision rules from the features and labels of the training set. These tree structures are decision trees; each decision tree model is independent, and their generation processes do not interfere with one another. In the final energy consumption prediction, the method gives equal weight to the prediction results of the mutually independent decision tree models and takes the average of the decision trees' predictions as the final prediction value. In short, the random forest method randomly and repeatedly builds multiple decision trees from the training data and aggregates the results of the individual decision tree models, obtaining better prediction performance than a single decision tree model.
FIG. 2 is a schematic diagram of the modeling process of the RF model according to an embodiment of the present invention. As a representative ensemble method, the random forest is based on decision trees: each time, a subset of the features is drawn at random and a decision tree is generated in a random manner; this process is repeated to generate multiple decision trees, and their outputs are finally averaged to obtain the final prediction result, yielding regression or classification performance better than that of a single model.
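To make the three ingredients concrete (bootstrap sampling, random feature subsets, equal-weight averaging), the following toy sketch builds a forest of depth-1 regression trees in pure Python. It is deliberately simplified and is not the patent's model: real random forests grow deeper trees, and all names here (`fit_stump`, `fit_forest`, `predict_forest`) are hypothetical.

```python
import random


def fit_stump(X, y, feature_ids):
    """Fit a depth-1 regression tree using the split with the lowest squared error."""
    best = None
    for f in feature_ids:
        # Candidate thresholds: every observed value except the largest,
        # so both sides of the split are non-empty.
        for threshold in sorted({row[f] for row in X})[:-1]:
            left = [t for row, t in zip(X, y) if row[f] <= threshold]
            right = [t for row, t in zip(X, y) if row[f] > threshold]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = (sum((t - left_mean) ** 2 for t in left)
                   + sum((t - right_mean) ** 2 for t in right))
            if best is None or sse < best[0]:
                best = (sse, f, threshold, left_mean, right_mean)
    if best is None:  # every candidate feature was constant in this sample
        mean = sum(y) / len(y)
        return lambda row: mean
    _, f, threshold, left_mean, right_mean = best
    return lambda row: left_mean if row[f] <= threshold else right_mean


def fit_forest(X, y, n_trees=50, n_features=2, seed=0):
    """Train independent stumps on bootstrap samples and random feature subsets."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]              # bootstrap sample
        feats = rng.sample(range(len(X[0])), n_features)      # random feature subset
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return trees


def predict_forest(trees, row):
    """Equal-weight average of the individual tree predictions."""
    return sum(tree(row) for tree in trees) / len(trees)
```

In the setting of this document each row of `X` would be one 24-value window and `y` the value that follows it; a library implementation would normally be used in practice.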
In the invention, MSE (Mean Square Error) is selected as the index for measuring the split quality of the random forest model; the MSE is defined in equation (10), and the modeling aim is to obtain the minimum MSE.
$$MSE = \frac{1}{M} \sum_{m=1}^{M} \left( y_m - \hat{y}_m \right)^2 \tag{10}$$
In the formula, $y_m$ is the value of the original energy consumption data, $\hat{y}_m$ is the value of the predicted energy consumption data, and M is the total number of predicted samples.
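The mean square error of equation (10) is straightforward to compute; a minimal sketch with a hypothetical function name and made-up numbers:

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# One prediction is off by 2, the other two are exact: (0 + 0 + 4) / 3.
print(mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```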
(3) CNN-GRU model based on self-attention mechanism optimization
Fig. 3 is a schematic diagram of a modeling process of a CNN-GRU model based on self-attention mechanism optimization for predicting a low-frequency component according to an embodiment of the present invention. The invention provides a self-attention mechanism optimization-based CNN-GRU model to predict low-frequency components of building energy consumption data. Spatial relation among different characteristic values in the data is extracted by adopting the one-dimensional convolution layer, so that the defect that a model based on an RNN (Recurrent Neural Network) cannot capture data space-time characteristics is overcome, and the time sequence of the characteristics extracted after the one-dimensional convolution is still kept. Then, the time precedence order relation and long-term dependence are captured by selecting a GRU network which is a variant based on the RNN, the GRU optimizes the problem of gradient disappearance which possibly occurs in the original RNN, the calculation parameters are less than those of the LSTM, the convergence speed is higher, but the GRU network can only generate components with fixed length for input, and the importance degree of information cannot be distinguished. Therefore, the invention also provides a self-attention layer added behind the GRU layer, and the self-attention improves the effect of important time steps in the GRU network by transforming the output sequence of the GRU network, thereby further reducing the model prediction error.
The self-attention layer retains the intermediate outputs of the GRU encoder for the input sequence. The model is then trained to attend selectively to these inputs and to associate the output sequence with them when producing its output. In this way the dependence across the whole sequence is captured and attention feature extraction is completed; the output of the layer is a weighted average of the output components of the GRU network. Finally, a simple fully connected layer maps the output of the whole neural network to the specified dimension.
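The "weighted average of the output components of the GRU network" can be sketched as a simplified attention pooling step. The scoring function is not specified in the text, so the dot product with a scoring vector `w` (assumed to be learned during training) is an assumption, as are the function names and the toy values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, w):
    """Weighted average of per-time-step vectors.

    hidden_states: list of T vectors (stand-ins for the GRU outputs).
    w: scoring vector, assumed learned; hypothetical, not from the patent.
    """
    # One scalar score per time step, then normalise the scores to weights.
    scores = [sum(h_k * w_k for h_k, w_k in zip(h, w)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(a * h[k] for a, h in zip(weights, hidden_states))
        for k in range(dim)
    ]
    return context, weights


# Three time steps with 2-dimensional outputs; the values are made up.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context, weights = attention_pool(states, w=[0.5, 0.5])
```

Time steps with a higher score receive a larger weight, so they contribute more to the vector passed on to the fully connected layer.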
(4) Preprocessing of raw building energy consumption data
Data normalization is the most common scaling technique in machine learning applications; commonly used methods include Min-Max normalization and Z-score normalization. The invention uses the Z-score normalization method to scale the raw building energy consumption data, standardizing the features by removing the mean and scaling to unit variance. Each feature is centered and scaled independently according to the mean and standard deviation of the sample data in the training set, and the stored mean and standard deviation are then used to transform subsequent data. The calculation formula of Z-score normalization is shown as formula (11).
For a sequence $x_1, x_2, \ldots, x_n$:
$$x_i' = \frac{x_i - \bar{x}}{\sigma} \tag{11}$$
where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is the mean of the sequence and $\sigma$ is the standard deviation of the sequence. After calculation, a new scaled sequence $x_1', x_2', \ldots, x_n'$ is obtained.
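A minimal sketch of formula (11) follows, showing a scaler that is fitted on training data and then reused for other data, together with the inverse transformation. The class name `ZScoreScaler`, the use of the population standard deviation, and the sample values are assumptions made for illustration.

```python
import statistics


class ZScoreScaler:
    """Z-score scaling with statistics taken from the training data only."""

    def fit(self, values):
        self.mean = statistics.fmean(values)
        # Population standard deviation; the choice of divisor is an assumption.
        self.std = statistics.pstdev(values)
        if self.std == 0:
            raise ValueError("a constant sequence cannot be scaled to unit variance")
        return self

    def transform(self, values):
        return [(v - self.mean) / self.std for v in values]

    def inverse_transform(self, values):
        return [v * self.std + self.mean for v in values]


# Made-up training and test values.
train = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
test = [6.0, 3.0]
scaler = ZScoreScaler().fit(train)
train_scaled = scaler.transform(train)
test_scaled = scaler.transform(test)  # reuses the stored training mean and std
restored = scaler.inverse_transform(test_scaled)
```

Because the test data is transformed with the stored training statistics, no information from the test set enters the fitting step.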
The specific implementation process of the energy consumption prediction optimization method using fuzzy entropy classification provided by the embodiment of the invention comprises the following steps:
The first step: perform preliminary processing on the public building energy consumption data set (hourly energy consumption data). Columns irrelevant to the experiment are removed, only the timestamp and energy consumption value columns are kept, and the rows whose timestamps fall in the three months of March, April and May are extracted.
The second step: save the processed data set as a csv file to serve as the data set used in the experiment.
The third step: divide the data into training data and test data, with the first 80% of the data as the training set and the last 20% as the test set.
The fourth step: the experiment is carried out with the PyCharm tool in a Python 3.8 environment. The energy consumption sequence data of the training set is decomposed by a time series decomposition method to obtain a series of components.
The fifth step: calculate the fuzzy entropy of each component and the fuzzy entropy difference of adjacent components according to the fuzzy entropy method.
The sixth step: compare the fuzzy entropy differences of adjacent components and select the two components with the largest fuzzy entropy difference (denoted $F_i$ and $F_{i+1}$). Components $F_1 \sim F_i$ are classified as high-frequency components, and components $F_{i+1} \sim F_k$ ($k$ is the total number of components) are classified as low-frequency components.
The seventh step: for each component obtained by decomposition, a sliding window is applied. Every 24 adjacent data points form a "window", the step length is 1, and the window slides downwards until the whole training set has been traversed. The 24 data points of each window are used as one input sample of the training set, and the energy consumption value in the row following each window is used as its label. The test set is processed in the same way and divided into input samples and labels.
The eighth step: initialize two scalers, one for scaling the input sample data and one for scaling the label data. The scalers are fitted on the training set data only and are then used to scale both the training set data and the test set data, which prevents data leakage. (Data leakage refers to test set information being used when fitting the model, which leads to an overly optimistic and therefore incorrect conclusion.)
The ninth step (high-frequency components): for each high-frequency component, an RF model is trained on the input samples and labels of that component's training set, and predictions are then made on the test set.
The ninth step (low-frequency components): for each low-frequency component, the CNN-GRU model optimized based on the self-attention mechanism provided by the invention learns the effective information in that component's training set data, and predictions are then made on the test set.
The tenth step: inversely scale the predicted values of each component's prediction model using the previously fitted scaler, and record each inversely scaled value as the prediction result of the model.
The eleventh step: reconstruct the recorded prediction results (for example, if the sequence was decomposed using an additive model, sum the prediction results of all components) to obtain the final prediction result.
The twelfth step: compare the prediction result with the true labels of the test set and evaluate the prediction performance of the model, which completes the procedure.
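The data handling in the third, seventh and eleventh steps can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, the toy series stands in for one decomposed component, and additive reconstruction is assumed as in the example given in the eleventh step.

```python
def chronological_split(series, train_ratio=0.8):
    """First 80% of the samples for training, the remaining 20% for testing."""
    cut = int(len(series) * train_ratio)
    return series[:cut], series[cut:]


def sliding_windows(series, window=24, step=1):
    """Each run of `window` values is a sample; the value after it is the label."""
    samples, labels = [], []
    for start in range(0, len(series) - window, step):
        samples.append(series[start:start + window])
        labels.append(series[start + window])
    return samples, labels


def reconstruct(component_predictions):
    """Additive reconstruction: sum the component predictions point by point."""
    return [sum(values) for values in zip(*component_predictions)]


# Toy component of 100 hourly values (0, 1, ..., 99), for illustration only.
component = [float(t) for t in range(100)]
train, test = chronological_split(component)
x_train, y_train = sliding_windows(train)

# Three made-up component predictions for two time steps.
final = reconstruct([[1.0, 2.0], [0.5, 0.5], [-0.2, 0.1]])
```

The split is chronological rather than shuffled, so every window and label in the training set precedes the test period in time.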
To verify the effectiveness of the present invention, the following experiment was performed. Taking UnivDorm, an energy consumption data set of a student dormitory, as an example, the CEEMDAN method is adopted as the time series decomposition method of the experiment. The fuzzy entropy differences of adjacent components are calculated in sequence according to the component division method of the invention, and the largest difference is selected; it lies between $\mathrm{IMF}_1$ and $\mathrm{IMF}_2$.
Thus $\mathrm{IMF}_1$ is designated as the high-frequency component, and $\mathrm{IMF}_2 \sim \mathrm{IMF}_9$ are designated as the low-frequency components.
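The division rule can be illustrated with a short sketch. The entropy values below are made up and are not the UnivDorm results; the function name is hypothetical, and the use of the signed difference between a component and its successor is an assumption.

```python
def split_by_entropy_gap(entropies):
    """Return (high, low) lists of 0-based component indices.

    entropies[i] is the fuzzy entropy of the (i+1)-th component, in
    decomposition order. The split falls between the two adjacent components
    with the largest difference entropies[i] - entropies[i + 1].
    """
    if len(entropies) < 2:
        raise ValueError("at least two components are needed")
    gaps = [entropies[i] - entropies[i + 1] for i in range(len(entropies) - 1)]
    split = max(range(len(gaps)), key=gaps.__getitem__)
    high = list(range(split + 1))
    low = list(range(split + 1, len(entropies)))
    return high, low


# Made-up fuzzy entropy values for nine components.
fuzzy_entropies = [1.90, 0.80, 0.62, 0.45, 0.30, 0.21, 0.12, 0.05, 0.01]
high, low = split_by_entropy_gap(fuzzy_entropies)
```

With these illustrative values the largest gap is between the first and second components, so only the first component is treated as high-frequency, which mirrors the shape of the division described for UnivDorm.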
Then, in line with the conclusions of previous research, the RF model, which is better suited to this task, is selected for predicting the high-frequency component; for the low-frequency components, the CNN-GRU model optimized based on the self-attention mechanism described above is used for prediction. To verify the effectiveness of the method, both models are used to predict every component obtained by decomposing the UnivDorm data set, and the results are shown in Fig. 4.
It can be seen that the RF model has the lower prediction error for the $\mathrm{IMF}_1$ component, while the CNN-GRU model optimized based on the self-attention mechanism has the lower prediction error for the $\mathrm{IMF}_2 \sim \mathrm{IMF}_9$ components. The experimental result agrees with the division obtained by the component division principle, which demonstrates the rationality of the fuzzy entropy classification method and the component division principle provided by the invention, and also demonstrates the effectiveness of the proposed combined prediction method.
TABLE 1
Figure SMS_53
Figure SMS_54
Further experiments compared the prediction errors of the proposed CNN-GRU model optimized based on the self-attention mechanism with those of the autoregressive model (AR), the seasonal autoregressive model (SAR), the RF model based on CEEMDAN decomposition, and the GRU model based on CEEMDAN decomposition on three building energy consumption data sets. The evaluation indexes are the mean absolute percentage error (MAPE), the mean absolute error (MAE), and the root mean square error (RMSE). The experimental results are shown in Table 1. The model provided by the invention outperforms the AR and SAR models and has the lowest prediction error, which shows that the model learns the information in the data well and effectively improves the accuracy of building energy consumption prediction.
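The three evaluation indexes named above follow their standard definitions and can be sketched in plain Python. The function names and the sample values are illustrative assumptions and do not reproduce Table 1.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(
        sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)
    )


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; undefined for zero true values."""
    if any(y == 0 for y in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    return 100.0 * sum(
        abs((y - p) / y) for y, p in zip(y_true, y_pred)
    ) / len(y_true)


# Made-up true and predicted energy values.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
scores = {
    "MAE": mae(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "MAPE": mape(actual, predicted),
}
```

MAPE is scale-free, which makes it convenient for comparing buildings with different consumption levels, whereas MAE and RMSE are expressed in the units of the energy data.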
In summary, the method of the embodiment of the invention has the following beneficial effects:
Extensive experiments on three data sets show that the method provides excellent generalization capability and prediction performance using only historical energy consumption data.
Prediction experiments carried out on the UnivDorm data set with the RF model and the CNN-GRU model optimized based on the self-attention mechanism, respectively, show that the combined prediction method provided by the invention is reasonable and effective.
The fuzzy entropy method, together with the component division principle based on the maximum fuzzy entropy change between adjacent components, makes it easy to exploit the advantages of each prediction model and better suits practical engineering applications.
Those of ordinary skill in the art will understand that: the figures are schematic representations of one embodiment, and the blocks or processes shown in the figures are not necessarily required to practice the present invention.
From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments.
The embodiments in the present specification are described in a progressive manner; the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, apparatus or system embodiments are described relatively briefly because they are substantially similar to the method embodiments, and reference may be made to the descriptions of the method embodiments for related points. The above-described embodiments of the apparatus and system are merely illustrative: the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (7)

1. A method for energy consumption prediction optimization using fuzzy entropy classification, comprising:
decomposing original building energy consumption data into a series of components by a time sequence decomposition method, and calculating the value of the fuzzy entropy of each component according to a fuzzy entropy method;
sequentially calculating fuzzy entropy difference values of two adjacent components, and dividing all the components into high-frequency components and low-frequency components based on the two components with the largest fuzzy entropy difference value change;
predicting the high-frequency component by using a random forest RF method to obtain a high-frequency component prediction result, and predicting the low-frequency component by using a CNN-GRU model based on self-attention mechanism optimization to obtain a low-frequency component prediction result;
and reconstructing the high-frequency component prediction result and the low-frequency component prediction result to obtain an energy consumption prediction result of the original building energy consumption data.
2. The method of claim 1, wherein the decomposing the raw building energy consumption data into a series of components by a time series decomposition method comprises:
the method comprises the steps of reserving columns of time stamps and energy consumption values in original public building energy consumption data sets, taking out row data corresponding to the time stamp columns, and taking the processed public building energy consumption data sets as training data and testing data; and decomposing the public building energy consumption data of the training set into a series of components by using a time sequence decomposition method.
3. The method according to claim 2, wherein the calculating of the fuzzy entropy value of each component according to the fuzzy entropy method comprises:
calculating the fuzzy entropy of each component sequence according to the principle of the fuzzy entropy and a calculation formula, wherein the definition and the calculation of the fuzzy entropy are carried out according to the following rules:
first, the sequence is defined: given a mode dimension $m$, a set of $m$-dimensional vectors $X(i)$ is constructed, which is defined as formula (1):
$$X(i) = [x(i), x(i+1), \ldots, x(i+m-1)] - x_0(i) \tag{1}$$
in the formula, $x_0(i)$ represents the mean of $m$ successive $x(i)$, i.e.
$$x_0(i) = \frac{1}{m}\sum_{j=0}^{m-1} x(i+j)$$
second, the distance between sequences is defined: the distance $d_{ij}^{m}$ between $X(i)$ and $X(j)$ is the maximum absolute difference between their corresponding elements, as shown in formula (2):
$$d_{ij}^{m} = d[X(i), X(j)] = \max_{k \in [0,\, m-1]} \left| \big(x(i+k) - x_0(i)\big) - \big(x(j+k) - x_0(j)\big) \right| \tag{2}$$
third, the similarity of sequences is defined: a new variable $n$ is introduced through the fuzzy function $\mu(d_{ij}^{m}, n, r)$, and the similarity $D_{ij}^{m}$ between $X(i)$ and $X(j)$ is calculated as in formula (3):
$$D_{ij}^{m} = \mu\left(d_{ij}^{m}, n, r\right) \tag{3}$$
the calculation formula of the fuzzy function $\mu(d_{ij}^{m}, n, r)$ is as in formula (4):
$$\mu\left(d_{ij}^{m}, n, r\right) = \exp\left(-\frac{\left(d_{ij}^{m}\right)^{n}}{r}\right) \tag{4}$$
fourth, all membership degrees except those of the vectors with themselves are averaged, as in formula (5), where $N$ is the length of the sequence:
$$\varphi^{m}(n, r) = \frac{1}{N-m}\sum_{i=1}^{N-m}\left(\frac{1}{N-m-1}\sum_{j=1,\, j \neq i}^{N-m} D_{ij}^{m}\right) \tag{5}$$
the dimension is increased from $m$ to $m+1$, and the above steps are repeated to obtain $\varphi^{m+1}(n, r)$, as in formula (6):
$$\varphi^{m+1}(n, r) = \frac{1}{N-m}\sum_{i=1}^{N-m}\left(\frac{1}{N-m-1}\sum_{j=1,\, j \neq i}^{N-m} D_{ij}^{m+1}\right) \tag{6}$$
as an index characterizing the time series, $\mathrm{FuzzyEn}(m, n, r)$ is the negative natural logarithm of the deviation of $\varphi^{m}(n, r)$ from $\varphi^{m+1}(n, r)$, as shown in formula (7):
$$\mathrm{FuzzyEn}(m, n, r) = \ln \varphi^{m}(n, r) - \ln \varphi^{m+1}(n, r) \tag{7}$$
In the above formulas, $m$ is the mode dimension or embedding dimension; $r$ represents the width of the fuzzy function boundary; $n$ determines the gradient of the similarity tolerance boundary and acts as a weight in calculating the similarity between the vectors.
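The procedure of formulas (1) to (7) can be sketched in plain Python as below. This is an illustrative sketch, not the claimed implementation: the exponential form of the fuzzy function, the use of N − m vectors for both dimensions, and the default parameter values are assumptions, and `r` is used as given (in practice it is often tied to the standard deviation of the sequence).

```python
import math


def fuzzy_entropy(series, m=2, n=2, r=0.2):
    """FuzzyEn(m, n, r) of a 1-D sequence, following formulas (1) to (7)."""
    N = len(series)
    if N < m + 2:
        raise ValueError("sequence too short for the chosen m")

    def phi(dim):
        # Build N - m mean-removed vectors of length `dim`.
        count = N - m
        vectors = []
        for i in range(count):
            window = series[i:i + dim]
            base = sum(window) / dim
            vectors.append([v - base for v in window])
        total = 0.0
        for i in range(count):
            for j in range(count):
                if i == j:
                    continue  # self-matches are excluded from the average
                # Maximum absolute difference of corresponding elements.
                d = max(abs(a - b) for a, b in zip(vectors[i], vectors[j]))
                # Exponential fuzzy membership; this form is an assumption.
                total += math.exp(-(d ** n) / r)
        return total / (count * (count - 1))

    return math.log(phi(m)) - math.log(phi(m + 1))


# A smooth made-up signal; any list of floats can be passed.
signal = [math.sin(0.3 * t) for t in range(60)]
value = fuzzy_entropy(signal)
```

A constant or strictly linear sequence yields an entropy of zero, because every mean-removed vector is identical; irregular sequences are expected to yield larger values.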
4. The method according to claim 3, wherein the sequentially calculating the fuzzy entropy difference values of two adjacent components, and dividing all the components into high-frequency components and low-frequency components based on the two components with the largest fuzzy entropy difference value variation comprises:
for components $F_1 \sim F_k$, where $k$ represents the total number of components, the fuzzy entropy difference between adjacent components is calculated using formula (8):
$$\Delta FE_i = FE_i - FE_{i+1}, \quad i = 1, 2, \ldots, k-1 \tag{8}$$
in the formula, $\Delta FE_i$ is the fuzzy entropy difference of two adjacent components $F_i$ and $F_{i+1}$, $FE_i$ is the fuzzy entropy of component $F_i$, $FE_{i+1}$ is the fuzzy entropy of component $F_{i+1}$, and $k$ is the total number of components;
the fuzzy entropy differences $\Delta FE_i$ between adjacent components are compared, and the two components $F_i$ and $F_{i+1}$ with the maximum fuzzy entropy difference are selected, as shown in formula (9):
$$\Delta FE_{\max} = \max\left\{\Delta FE_1, \Delta FE_2, \ldots, \Delta FE_{k-1}\right\} \tag{9}$$
components $F_1 \sim F_i$ are defined as high-frequency components, and components $F_{i+1} \sim F_k$ are defined as low-frequency components.
5. The method according to claim 4, wherein the predicting the high frequency component by using the RF method to obtain the high frequency component prediction result comprises:
for each component data in the high-frequency component obtained by decomposition, adopting a sliding window mode, sequentially sliding downwards until the whole training set finishes sliding, taking 24 data of each window as an input sample of the training set, taking energy consumption value data of the next row of each window as a label of the training set, and dividing the test set into the input sample and the label according to the processing process of the training set;
inputting the input samples and the labels of the training set into an RF method model, learning the RF method model to obtain a well-learned RF method model, and inputting the input samples and the labels of the testing set into the RF method model to obtain a high-frequency component prediction result.
6. The method according to claim 4, wherein the predicting the low frequency component using the CNN-GRU model optimized based on the attention mechanism to obtain a low frequency component prediction result comprises:
for each component data in the low-frequency component obtained by decomposition, adopting a sliding window mode, sequentially sliding downwards until the whole training set finishes sliding, taking 24 data of each window as an input sample of the training set, taking energy consumption value data of the next row of each window as a label of the training set, and dividing the test set into the input sample and the label according to the processing process of the training set;
adding a self-attention layer behind the GRU layer to obtain a CNN-GRU model based on self-attention mechanism optimization, inputting an input sample and a label of the training set into the CNN-GRU model based on self-attention mechanism optimization, learning the CNN-GRU model based on self-attention mechanism optimization to obtain a well-learned CNN-GRU model based on self-attention mechanism optimization, inputting an input sample and a label of the test set into the well-learned CNN-GRU model based on self-attention mechanism optimization, and obtaining a low-frequency component prediction result.
7. The method of claim 6, wherein reconstructing the high frequency component prediction and the low frequency component prediction to obtain the energy consumption prediction of the original building energy consumption data comprises:
and performing integrated reconstruction on the high-frequency component prediction result and the low-frequency component prediction result by utilizing the principle of the adopted time sequence decomposition method to obtain the energy consumption prediction result of the original building energy consumption data.
CN202310026470.4A 2023-01-09 2023-01-09 Energy consumption prediction optimization method using fuzzy entropy classification Pending CN115952915A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310026470.4A CN115952915A (en) 2023-01-09 2023-01-09 Energy consumption prediction optimization method using fuzzy entropy classification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310026470.4A CN115952915A (en) 2023-01-09 2023-01-09 Energy consumption prediction optimization method using fuzzy entropy classification

Publications (1)

Publication Number Publication Date
CN115952915A true CN115952915A (en) 2023-04-11

Family

ID=87287585

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310026470.4A Pending CN115952915A (en) 2023-01-09 2023-01-09 Energy consumption prediction optimization method using fuzzy entropy classification

Country Status (1)

Country Link
CN (1) CN115952915A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112098874A (en) * 2020-08-21 2020-12-18 杭州电子科技大学 Lithium ion battery electric quantity prediction method considering aging condition
CN112116080A (en) * 2020-09-24 2020-12-22 中国科学院沈阳计算技术研究所有限公司 CNN-GRU water quality prediction method integrated with attention mechanism
CN112766078A (en) * 2020-12-31 2021-05-07 辽宁工程技术大学 Power load level prediction method of GRU-NN based on EMD-SVR-MLR and attention mechanism
CN113850438A (en) * 2021-09-29 2021-12-28 西安建筑科技大学 Public building energy consumption prediction method, system, equipment and medium
CN115169232A (en) * 2022-07-11 2022-10-11 山东科技大学 Daily peak load prediction method, computer equipment and readable storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
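于军琪 et al.: "Hybrid prediction model of building energy consumption based on neural network", Journal of Zhejiang University (Engineering Science), pages 1 - 12 *
李青; 李军; 马昊: "Short-term power load forecasting based on complementary ensemble empirical mode decomposition-fuzzy entropy and echo state network", Journal of Computer Applications, no. 12, pages 1 - 6 *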

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117349478A (en) * 2023-10-08 2024-01-05 国网江苏省电力有限公司经济技术研究院 Resource data reconstruction integration system based on digital transformation enterprise
CN117349478B (en) * 2023-10-08 2024-05-24 国网江苏省电力有限公司经济技术研究院 Resource data reconstruction integration system based on digital transformation enterprise
CN118627125A (en) * 2024-08-12 2024-09-10 山东赛宝信息技术咨询有限公司 Database information leakage detection system based on fuzzy entropy algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20230411