CN115952915A - Energy consumption prediction optimization method using fuzzy entropy classification
- Publication number: CN115952915A
- Application number: CN202310026470.4A
- Authority: CN (China)
- Prior art keywords
- energy consumption
- frequency component
- components
- fuzzy entropy
- data
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention provides an energy consumption prediction optimization method using fuzzy entropy classification. The method comprises the following steps: decomposing original building energy consumption data into a series of components by a time sequence decomposition method, and calculating the fuzzy entropy of each component by the fuzzy entropy method; calculating the fuzzy entropy differences of adjacent components in sequence, and dividing all the components into high-frequency components and low-frequency components at the pair of adjacent components with the largest fuzzy entropy difference; predicting the high-frequency components with a random forest (RF) method to obtain a high-frequency component prediction result, and predicting the low-frequency components with a CNN-GRU model optimized by an attention mechanism to obtain a low-frequency component prediction result; and reconstructing the high-frequency and low-frequency component prediction results to obtain the energy consumption prediction result for the original building energy consumption data. The method is a combined prediction method: high-frequency signals are predicted by the RF method, low-frequency signals are predicted by a hybrid deep learning model, and the final prediction result is obtained by superposition and reconstruction, so as to reduce the error of building energy consumption prediction.
Description
Technical Field
The invention relates to the technical field of building energy consumption prediction, in particular to an energy consumption prediction optimization method using fuzzy entropy classification.
Background
According to the 2022 global status report on buildings and construction, carbon dioxide emissions from building operations reached a record high in 2021, about 5% higher than in the previous year. With the rapid growth of the urban population, energy consumption and carbon dioxide emissions keep increasing, which indicates that measures to save energy, reduce emissions, and improve the efficiency of building energy use are being implemented too slowly. Accurate building energy consumption prediction is the basis for formulating building energy-saving strategies, so establishing an efficient and accurate building energy consumption prediction model has great practical significance.
Common approaches to building energy consumption prediction include physical models and data-driven methods. A prediction model based on a physical model predicts building energy consumption from the thermophysical principles of the building and is easy to interpret, but in practical applications it is complex to operate and requires extensive theoretical knowledge. Machine learning is a typical data-driven method: it can achieve good results using only historical data combined with feature engineering and similar processing, and it is widely applied in this field. In recent years, the development of neural networks has streamlined the feature engineering steps, which simplifies the modeling process and further improves the nonlinear fitting ability and prediction accuracy of prediction models. However, predicting energy consumption data remains a complex process. The complex nonlinearity and instability of energy consumption data, together with the insufficient feature extraction capability and feature utilization of neural networks, make accurate prediction difficult. To meet these challenges, an accurate and efficient energy consumption prediction model is needed.
To address the instability and nonlinearity of energy consumption data, some studies decompose the original energy consumption sequence using additive decomposition, seasonal decomposition, empirical mode decomposition, and similar methods. However, the prediction accuracy after decomposition is not necessarily improved. On the one hand, the reconstructability of the decomposition may be difficult to guarantee, so the overall prediction error may be unstable when the prediction results of the decomposition components are integrated. On the other hand, existing methods work backward from the results: the prediction model for each decomposition component is selected according to the quality of its prediction result, which is impractical in real engineering practice. Therefore, the invention proposes to calculate the fuzzy entropy of the decomposition components with the fuzzy entropy method, to divide the components into high-frequency and low-frequency signals according to the principle of the largest fuzzy entropy change between adjacent components, to predict them with Random Forest (RF) and the proposed deep learning model respectively, and finally to obtain the final prediction value through superposition and reconstruction.
Currently, building energy consumption prediction methods in the prior art include time sequence decomposition methods, represented by Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN). After the original signal is decomposed by CEEMDAN, the prior art often uses a coarse decomposition-reconstruction prediction approach, in which the prediction model for each decomposition component is selected according to the quality of its prediction result; this approach is not suitable for actual engineering tasks.
The disadvantages of the prior-art building energy consumption prediction methods include the following. They rely on a single prediction model, which can make effective predictions only when detailed information such as building physical parameters and environmental parameters is available, yet such rich information is difficult to collect in practical applications.
They do not integrate the information extracted from the data, and more important information is not given more weight, so it is difficult to maintain an ideal prediction effect. After the building energy consumption data are decomposed by a time sequence decomposition method, the prediction model for each decomposition component is inferred backward from the quality of the prediction results, which is not suitable for practical engineering applications.
Disclosure of Invention
The embodiment of the invention provides an energy consumption prediction optimization method using fuzzy entropy classification, so as to effectively reduce errors of building energy consumption prediction.
In order to achieve the purpose, the invention adopts the following technical scheme.
A method for energy consumption prediction optimization using fuzzy entropy classification, comprising:
decomposing original building energy consumption data into a series of components by a time sequence decomposition method, and calculating the value of the fuzzy entropy of each component according to a fuzzy entropy method;
sequentially calculating fuzzy entropy difference values of two adjacent components, and dividing all the components into high-frequency components and low-frequency components based on the two components with the largest fuzzy entropy difference value change;
predicting the high-frequency component by using a random forest RF method to obtain a high-frequency component prediction result, and predicting the low-frequency component by using a CNN-GRU model based on self-attention mechanism optimization to obtain a low-frequency component prediction result;
and reconstructing the high-frequency component prediction result and the low-frequency component prediction result to obtain an energy consumption prediction result of the original building energy consumption data.
Preferably, the original building energy consumption data is decomposed into a series of components by a time sequence decomposition method, including:
retaining the timestamp and energy consumption value columns of the original public building energy consumption data set, extracting the row data corresponding to the timestamp column, and using the processed public building energy consumption data set as training data and test data; and decomposing the public building energy consumption data of the training set into a series of components by using a time sequence decomposition method.
Preferably, the calculating the fuzzy entropy value of each component according to the fuzzy entropy method includes:
calculating the fuzzy entropy of each component sequence according to the principle and the calculation formula of the fuzzy entropy, wherein the definition and the calculation of the fuzzy entropy are carried out according to the following rules:
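first, the sequence is defined: given the mode dimension m, a set of m-dimensional vectors X(i) is constructed, as in equation (1):
$$X(i) = [x(i), x(i+1), \ldots, x(i+m-1)] - x_0(i) \tag{1}$$
where $x_0(i) = \frac{1}{m}\sum_{j=0}^{m-1} x(i+j)$ is the mean of the m elements;
second, the distance between sequences is defined: the distance $d_{ij}^{m}$ between X(i) and X(j) is the maximum absolute difference of their corresponding elements, as in equation (2):
$$d_{ij}^{m} = \max_{k \in \{0, \ldots, m-1\}} \left| \left(x(i+k) - x_0(i)\right) - \left(x(j+k) - x_0(j)\right) \right| \tag{2}$$
third, the similarity of the sequences is defined: a new variable n is introduced through the fuzzy function $\mu$, and the similarity $D_{ij}^{m}$ between X(i) and X(j) is calculated from $d_{ij}^{m}$, as in formula (3):
$$D_{ij}^{m} = \mu\left(d_{ij}^{m}, n, r\right) = \exp\left(-\frac{\left(d_{ij}^{m}\right)^{n}}{r}\right) \tag{3}$$
fourth, all membership degrees except the self-match are averaged, as in equation (5), where N is the length of the sequence:
$$\phi^{m}(n, r) = \frac{1}{N-m}\sum_{i=1}^{N-m}\left(\frac{1}{N-m-1}\sum_{j=1,\, j \neq i}^{N-m} D_{ij}^{m}\right) \tag{5}$$
as an index of the time series, FuzzyEn(m, n, r) is defined as the negative natural logarithm of the ratio of $\phi^{m+1}(n, r)$ to $\phi^{m}(n, r)$, as shown in equation (7):
$$\mathrm{FuzzyEn}(m, n, r) = \ln \phi^{m}(n, r) - \ln \phi^{m+1}(n, r) \tag{7}$$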
in the above formulas, m is the mode dimension or embedding dimension; r represents the width of the fuzzy function boundary; n determines the gradient of the similarity tolerance boundary and acts as a weight in the calculation of the similarity between fuzzy entropy vectors.
Preferably, the sequentially calculating the fuzzy entropy difference values of two adjacent components, and dividing all the components into a high-frequency component and a low-frequency component based on the two components with the largest change of the fuzzy entropy difference values includes:
for components $F_1$ to $F_k$, where k represents the total number of components, the fuzzy entropy difference between adjacent components is calculated using formula (8):
$$\Delta E_{i,i+1} = E_i - E_{i+1}, \quad i = 1, \ldots, k-1 \tag{8}$$
in the formula, $\Delta E_{i,i+1}$ is the fuzzy entropy difference of the two adjacent components $F_i$ and $F_{i+1}$, $E_i$ is the fuzzy entropy of component $F_i$, $E_{i+1}$ is the fuzzy entropy of component $F_{i+1}$, and k is the total number of components;
the fuzzy entropy differences between adjacent components are compared, and the two components $F_i$ and $F_{i+1}$ with the largest fuzzy entropy difference are selected, as shown in equation (9):
$$\Delta E_{\max} = \max_{1 \le i \le k-1} \Delta E_{i,i+1} \tag{9}$$
components $F_1$ to $F_i$ are defined as high-frequency components, and components $F_{i+1}$ to $F_k$ are defined as low-frequency components.
Preferably, the predicting the high frequency component by using the RF method to obtain a high frequency component prediction result includes:
for the data of each high-frequency component obtained by decomposition, sliding a window downward over the training set until the whole training set has been traversed, taking the 24 data points in each window as an input sample of the training set and the energy consumption value in the row immediately following each window as the corresponding label, and dividing the test set into input samples and labels in the same way as the training set;
inputting the input samples and labels of the training set into an RF model, training the RF model to obtain a trained RF model, and inputting the input samples and labels of the test set into the trained RF model to obtain the high-frequency component prediction result.
Preferably, the predicting the low-frequency component by using the CNN-GRU model optimized based on the attention mechanism to obtain a low-frequency component prediction result includes:
for the data of each low-frequency component obtained by decomposition, sliding a window downward over the training set until the whole training set has been traversed, taking the 24 data points in each window as an input sample of the training set and the energy consumption value in the row immediately following each window as the corresponding label, and dividing the test set into input samples and labels in the same way as the training set, wherein every 24 adjacent data points form a window and the step length is 1;
adding a self-attention layer after the GRU layer to obtain a CNN-GRU model optimized by a self-attention mechanism, inputting the input samples and labels of the training set into this model, training it to obtain a trained CNN-GRU model optimized by a self-attention mechanism, and inputting the input samples and labels of the test set into the trained model to obtain the low-frequency component prediction result.
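The sliding-window sample construction described above (window of 24 values, step length 1, label taken from the row immediately after the window) can be sketched as follows. This is a minimal illustration only; the function name `make_windows` and the toy component values are hypothetical and do not come from the patent.

```python
def make_windows(series, window=24, step=1):
    """Split one component series into (input samples, labels).

    Each input sample holds `window` consecutive values; its label is the
    value in the row immediately following that window.
    """
    inputs, labels = [], []
    for start in range(0, len(series) - window, step):
        inputs.append(series[start:start + window])
        labels.append(series[start + window])
    return inputs, labels


# Hypothetical component values, used only to show the shapes involved.
component = [float(v) for v in range(30)]
X, y = make_windows(component)
print(len(X), len(X[0]), y[0], y[-1])
```

The same function would be applied separately to the training set and the test set of every component, so that both sets are divided in the same way.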
Preferably, the reconstructing the high-frequency component prediction result and the low-frequency component prediction result to obtain the energy consumption prediction result of the original building energy consumption data includes:
and integrating and reconstructing the high-frequency component prediction result and the low-frequency component prediction result by utilizing the principle of the adopted time sequence decomposition method to obtain the energy consumption prediction result of the original building energy consumption data.
As can be seen from the technical solutions provided by the embodiments of the present invention, the embodiments use a combined prediction method: the high-frequency signals are predicted by the RF method, the low-frequency signals are predicted by the hybrid deep learning model, and the final prediction result is obtained by superposition and reconstruction, so that the strengths of each model are put to good use and the error of building energy consumption prediction is reduced.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a process flow diagram of a method for optimizing energy consumption prediction using fuzzy entropy classification according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a modeling process of an RF model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a modeling process of a CNN-GRU model based on self-attention mechanism optimization for predicting low-frequency components according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the prediction errors of each decomposition component obtained with an RF model and with a CNN-GRU model optimized based on a self-attention mechanism on the CEEMDAN decomposition sequence of the UnivDorm building data set according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
For the convenience of understanding the embodiments of the present invention, the following description will be further explained by taking several specific embodiments as examples in conjunction with the drawings, and the embodiments are not to be construed as limiting the embodiments of the present invention.
The deep learning model provided by the invention uses a Convolutional Neural Network (CNN) and a Gated Recurrent Unit (GRU) to extract data features, and then optimizes the result through a self-attention layer, thereby achieving accurate prediction. The method explores the advantages, for energy consumption prediction, of a signal processing method from information theory (the fuzzy entropy method) and of data-driven methods (RF, CNN, GRU, self-attention), so that this theoretical knowledge is better combined with building energy consumption data. Errors of energy consumption prediction can thus be effectively reduced, helping relevant personnel improve energy utilization and reduce carbon emissions.
After the building energy consumption data are decomposed by the time sequence decomposition method, the fuzzy entropy of each component is calculated with the fuzzy entropy method, and the components are divided into high-frequency and low-frequency signals according to the principle of the largest fuzzy entropy difference between adjacent components, which makes the method more suitable for practical engineering application.
The embodiment of the invention provides a hybrid deep learning model that uses a CNN layer to extract spatiotemporal features of the building energy consumption data, then uses a GRU layer to extract time sequence features, and optimizes the result through a self-attention layer that distributes the information weights reasonably, thereby improving the prediction accuracy.
A combined prediction method is used: the RF method predicts the high-frequency signals, the proposed hybrid deep learning model predicts the low-frequency signals, and the final prediction result is obtained through superposition and reconstruction, so that the strengths of each model are put to good use and the error of building energy consumption prediction is reduced.
The processing flow of the energy consumption prediction optimization method using fuzzy entropy classification provided by the embodiment of the invention is shown in FIG. 1 and comprises the following steps. The original building energy consumption data are decomposed into a series of components by a time sequence decomposition method, and the fuzzy entropy of each component is calculated by the fuzzy entropy method. The fuzzy entropy differences of adjacent components are then calculated in sequence, the two adjacent components with the largest fuzzy entropy difference are found, and the components are divided into high-frequency and low-frequency components at that point. Next, the high-frequency components are predicted using the RF method, which is more robust when processing high-frequency signals, and the low-frequency components are predicted using the proposed CNN-GRU model optimized by a self-attention mechanism. Finally, the predicted values of the high-frequency and low-frequency components are reconstructed into the final prediction result, which is evaluated against the real energy consumption data.
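The final reconstruction step can be illustrated with a short sketch. It assumes an additive decomposition (as with CEEMDAN, where the signal is the sum of its components), so that superposition means summing the per-component predictions at each time step. The function name `reconstruct` and all numbers below are hypothetical placeholders, not results from the patent.

```python
def reconstruct(component_predictions):
    """Superpose per-component predictions into one predicted series.

    Assumes an additive decomposition: the predicted energy consumption at
    each time step is the sum of the predictions of all components.
    """
    lengths = {len(p) for p in component_predictions}
    if len(lengths) != 1:
        raise ValueError("all component predictions must have the same length")
    return [sum(values) for values in zip(*component_predictions)]


# Hypothetical outputs: two high-frequency components (e.g. from RF) and one
# low-frequency component (e.g. from the CNN-GRU model).
high_freq_preds = [[0.5, -0.25, 0.0], [1.0, 1.5, -0.5]]
low_freq_preds = [[10.0, 10.5, 11.0]]
total = reconstruct(high_freq_preds + low_freq_preds)
print(total)
```

If a different decomposition method were used, the reconstruction would have to follow that method's own recombination rule instead of a plain sum.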
(1) Fuzzy entropy method and component division method
Entropy was originally a thermodynamic concept, a measure of the degree of disorder of a thermodynamic system. With the development of information theory, methods for measuring the complexity of time series, represented by approximate entropy, appeared. Approximate entropy is a dynamic parameter that can be estimated from relatively short data; sample entropy improves on approximate entropy, no longer depends on the data length, and has higher precision. Fuzzy entropy introduces the concept of similarity and improves on sample entropy, giving a higher calculation speed while keeping the precision. The invention uses the fuzzy entropy method to calculate the complexity of each component sequence: the higher the fuzzy entropy of a sequence, the more disordered its waveform and the higher its frequency. The fuzzy entropy method is defined as follows:
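first, the sequence is defined: given the mode dimension m, a set of m-dimensional vectors X(i) is constructed, as in equation (1):
$$X(i) = [x(i), x(i+1), \ldots, x(i+m-1)] - x_0(i) \tag{1}$$
where $x_0(i) = \frac{1}{m}\sum_{j=0}^{m-1} x(i+j)$ is the mean of the m elements.
Next, the distance between sequences is defined: the distance $d_{ij}^{m}$ between X(i) and X(j) is the maximum absolute difference of their corresponding elements, as in equation (2):
$$d_{ij}^{m} = \max_{k \in \{0, \ldots, m-1\}} \left| \left(x(i+k) - x_0(i)\right) - \left(x(j+k) - x_0(j)\right) \right| \tag{2}$$
Then, the similarity of the sequences is defined: a new variable n is introduced through the fuzzy function $\mu$, and the similarity $D_{ij}^{m}$ between X(i) and X(j) is calculated from $d_{ij}^{m}$, as in formula (3):
$$D_{ij}^{m} = \mu\left(d_{ij}^{m}, n, r\right) = \exp\left(-\frac{\left(d_{ij}^{m}\right)^{n}}{r}\right) \tag{3}$$
Then, all membership degrees except the self-match are averaged, as in equation (5), where N is the length of the sequence:
$$\phi^{m}(n, r) = \frac{1}{N-m}\sum_{i=1}^{N-m}\left(\frac{1}{N-m-1}\sum_{j=1,\, j \neq i}^{N-m} D_{ij}^{m}\right) \tag{5}$$
After the above preparation, the time series index FuzzyEn(m, n, r) is defined as the negative natural logarithm of the ratio of $\phi^{m+1}(n, r)$ to $\phi^{m}(n, r)$, as shown in equation (7):
$$\mathrm{FuzzyEn}(m, n, r) = \ln \phi^{m}(n, r) - \ln \phi^{m+1}(n, r) \tag{7}$$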
In the above formulas, m is the mode dimension or embedding dimension, and generally m = 2. r represents the width of the fuzzy function boundary: if it is too large, statistical information is lost, and if it is too small, the estimated statistical properties are poor and the sensitivity to noise increases, so r is generally taken as 0.1 to 0.25 times the standard deviation of the sequence. n determines the gradient of the similarity tolerance boundary (the larger n is, the steeper the gradient) and acts as a weight in the calculation of the similarity between fuzzy entropy vectors; it is generally taken as a small integer such as 2 or 3.
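As a rough illustration of the definition above, the following standard-library sketch computes a fuzzy entropy value for a finite series. It assumes the commonly used exponential membership function exp(-d^n / r) and takes r as `r_factor` times the standard deviation of the series; the function names, the default parameters, and the two test signals are illustrative assumptions rather than the patent's implementation.

```python
import math
import random


def _phi(x, dim, count, n, r):
    """Mean similarity over all pairs of distinct dim-dimensional vectors."""
    vectors = []
    for i in range(count):
        segment = x[i:i + dim]
        baseline = sum(segment) / dim              # x_0(i): local mean removed
        vectors.append([v - baseline for v in segment])
    total = 0.0
    for i in range(count):
        for j in range(count):
            if i != j:                             # exclude self-matches
                d = max(abs(a - b) for a, b in zip(vectors[i], vectors[j]))
                total += math.exp(-(d ** n) / r)   # assumed membership function
    return total / (count * (count - 1))


def fuzzy_entropy(x, m=2, n=2, r_factor=0.2):
    """FuzzyEn(m, n, r) = ln(phi^m) - ln(phi^(m+1)) for a finite series."""
    if len(x) < m + 2:
        raise ValueError("series too short for the chosen m")
    mean = sum(x) / len(x)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    if std == 0:
        raise ValueError("constant series: r would be zero")
    r = r_factor * std
    count = len(x) - m                             # N - m vectors for both dimensions
    return math.log(_phi(x, m, count, n, r)) - math.log(_phi(x, m + 1, count, n, r))


# Illustrative signals only: a slow sine wave and uniform random noise.
rng = random.Random(0)
smooth = [math.sin(2 * math.pi * t / 50) for t in range(150)]
noisy = [rng.uniform(-1.0, 1.0) for _ in range(150)]
fe_smooth = fuzzy_entropy(smooth)
fe_noisy = fuzzy_entropy(noisy)
print(fe_smooth, fe_noisy)
```

The noisy series should yield the larger value, which matches the statement that a higher fuzzy entropy indicates a more disordered, higher-frequency waveform. The double loop is quadratic in the series length, so a vectorized implementation would be preferable for long energy consumption records.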
After the fuzzy entropy of each component is calculated, the following component division method is adopted to divide each component into a high-frequency component and a low-frequency component so as to facilitate further prediction and fully exert the advantages of each model.
For components $F_1$ to $F_k$ (k represents the total number of components), the fuzzy entropy of each component and the fuzzy entropy difference of adjacent components are calculated, as shown in formula (8):
$$\Delta E_{i,i+1} = E_i - E_{i+1}, \quad i = 1, \ldots, k-1 \tag{8}$$
In the formula, $\Delta E_{i,i+1}$ is the fuzzy entropy difference of the two adjacent components $F_i$ and $F_{i+1}$, $E_i$ is the fuzzy entropy of component $F_i$, $E_{i+1}$ is the fuzzy entropy of component $F_{i+1}$, and k is the total number of components.
The fuzzy entropy differences of adjacent components are compared, and the two components $F_i$ and $F_{i+1}$ with the largest fuzzy entropy difference are selected, as shown in equation (9):
$$\Delta E_{\max} = \max_{1 \le i \le k-1} \Delta E_{i,i+1} \tag{9}$$
Components $F_1$ to $F_i$ are defined as high-frequency components, and components $F_{i+1}$ to $F_k$ are defined as low-frequency components.
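The division rule can be sketched in a few lines. The example assumes the difference is taken as the entropy of a component minus that of its successor, which is one reading of the text; the function name `split_components` and the entropy values are hypothetical.

```python
def split_components(entropies):
    """Return (i, diffs): components F_1..F_i are high frequency, the rest low.

    `entropies` lists the fuzzy entropy of F_1..F_k in decomposition order.
    Assumption: the difference is E_i - E_(i+1); an absolute difference
    would be a different reading of the same rule.
    """
    if len(entropies) < 2:
        raise ValueError("need at least two components")
    diffs = [entropies[i] - entropies[i + 1] for i in range(len(entropies) - 1)]
    split = max(range(len(diffs)), key=diffs.__getitem__) + 1   # 1-based i
    return split, diffs


# Hypothetical fuzzy entropy values for six components F_1..F_6.
entropies = [1.20, 0.95, 0.90, 0.30, 0.12, 0.05]
split, diffs = split_components(entropies)
high = [f"F{i}" for i in range(1, split + 1)]
low = [f"F{i}" for i in range(split + 1, len(entropies) + 1)]
print(split, high, low)
```

With these placeholder values the largest drop lies between the third and fourth components, so F1 to F3 are treated as high frequency and F4 to F6 as low frequency.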
(2) Random forest method
Embodiments of the present invention predict the high-frequency components using the RF method, which is good at processing high-frequency signals. A random forest is an ensemble method based on decision tree models. Given the training set, it repeatedly draws a random subset of the input features (that is, the historical energy consumption data) and generates a group of tree-shaped structures in a randomized way, automatically deriving decision rules from the features and labels of the training set. These tree-shaped structures are decision trees; each decision tree model is independent, and the generation processes do not interfere with one another. In the final energy consumption prediction, the method gives the same weight to the prediction results of the mutually independent decision tree models and takes the average of the decision tree predictions as the final predicted value. In short, the random forest method repeatedly and randomly constructs multiple decision trees from the training data and aggregates the modeling results of the individual trees, so as to obtain better prediction performance than a single decision tree model.
FIG. 2 is a schematic diagram of the modeling process of the RF model according to an embodiment of the present invention. As a representative ensemble method, the random forest is built on decision trees: each time, a random subset of the features is drawn and a decision tree is generated in a randomized way; this process is repeated to generate multiple decision trees, and their outputs are finally averaged to obtain the final prediction result, which gives better regression or classification performance than a single model.
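The bagging-and-averaging idea can be sketched with a deliberately simplified forest. In this sketch each "tree" is only a depth-1 regression stump trained on a bootstrap sample and a single randomly chosen feature, which is far simpler than a full random forest; the function names and the toy data are hypothetical, and a production system would more likely rely on an established library implementation.

```python
import random


def fit_stump(X, y, feature):
    """Fit a depth-1 regression tree on one feature by minimising squared error."""
    best = None
    values = sorted({row[feature] for row in X})
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [t for row, t in zip(X, y) if row[feature] <= threshold]
        right = [t for row, t in zip(X, y) if row[feature] > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:                       # constant feature: predict the mean
        mean = sum(y) / len(y)
        return lambda row: mean
    _, threshold, left_mean, right_mean = best
    return lambda row: left_mean if row[feature] <= threshold else right_mean


def fit_forest(X, y, n_trees=25, seed=0):
    """Train independent stumps on bootstrap samples with random features."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]       # bootstrap sample
        feature = rng.randrange(len(X[0]))             # random feature choice
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return trees


def predict_forest(trees, row):
    """Average the equally weighted predictions of all trees."""
    return sum(tree(row) for tree in trees) / len(trees)


# Toy data: the target depends on the first feature only.
X = [[float(v), float(v % 3)] for v in range(20)]
y = [0.0 if v < 10 else 10.0 for v in range(20)]
trees = fit_forest(X, y)
low_pred = predict_forest(trees, [2.0, 2.0])
high_pred = predict_forest(trees, [17.0, 2.0])
print(low_pred, high_pred)
```

Each stump is trained independently, and the forest prediction is the unweighted mean of the stump outputs, mirroring the equal weighting described in the text.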
In the invention, MSE (Mean Square Error) is selected as the index for measuring the split quality of the random forest model. MSE is defined in formula (10), and the modeling goal is to obtain the minimum MSE.
$$MSE = \frac{1}{M}\sum_{m=1}^{M}\left(y_m - \hat{y}_m\right)^2 \tag{10}$$
In the formula, $y_m$ is the value of the original energy consumption data, $\hat{y}_m$ is the value of the predicted energy consumption data, and M is the total number of predicted samples.
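A small worked example of the MSE calculation follows; the four actual and predicted values are made up for illustration.

```python
def mean_squared_error(actual, predicted):
    """Mean of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical energy consumption values and predictions.
actual = [10.0, 12.0, 11.0, 13.0]
predicted = [9.0, 12.0, 13.0, 13.0]
mse = mean_squared_error(actual, predicted)   # (1 + 0 + 4 + 0) / 4
print(mse)
```

Here the squared errors are 1, 0, 4, and 0, so the MSE is 1.25.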
(3) CNN-GRU model based on self-attention mechanism optimization
Fig. 3 is a schematic diagram of the modeling process of the CNN-GRU model optimized with a self-attention mechanism, which is used to predict the low-frequency components according to an embodiment of the present invention. The invention provides this model to predict the low-frequency components of building energy consumption data. A one-dimensional convolution layer extracts the spatial relations among different feature values in the data, which overcomes the inability of RNN (Recurrent Neural Network)-based models to capture the spatio-temporal characteristics of the data, while the features extracted by the one-dimensional convolution still retain their temporal order. A GRU network, a variant of the RNN, is then used to capture the temporal ordering and long-term dependencies. The GRU mitigates the vanishing-gradient problem that may occur in the original RNN, has fewer parameters than the LSTM, and converges faster; however, a GRU network can only produce fixed-length output vectors for its input and cannot distinguish how important each piece of information is. Therefore, the invention further adds a self-attention layer after the GRU layer. By transforming the output sequence of the GRU network, self-attention strengthens the influence of the important time steps, thereby further reducing the model prediction error.
The self-attention layer retains the intermediate outputs of the GRU encoder for the input sequence; the model is then trained to attend selectively to these inputs and to associate the output sequence with them when producing its output. In this way the dependencies across the whole sequence are captured and the attention feature extraction is completed; the output is a weighted average of the output components of the GRU network. Finally, a simple fully connected layer maps the output of the whole neural network to the specified dimension.
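The idea that the attention output is a weighted average of the GRU outputs can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the scoring function, so scaled dot-product scores without learned projection matrices are assumed here, and `gru_outputs` holds hypothetical values rather than the output of a trained network.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(hidden_states):
    """Scaled dot-product self-attention without learned projections.

    hidden_states: list of T vectors, one GRU output per time step.
    Returns (contexts, weights); contexts[t] is a weighted average of all
    hidden states, and each row weights[t] sums to 1.
    """
    dim = len(hidden_states[0])
    contexts, weights = [], []
    for query in hidden_states:
        # Similarity of this time step to every time step in the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in hidden_states]
        w = softmax(scores)
        contexts.append([sum(w_t * h[d] for w_t, h in zip(w, hidden_states))
                         for d in range(dim)])
        weights.append(w)
    return contexts, weights


# Hypothetical GRU outputs for three time steps with hidden size 2.
gru_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contexts, weights = self_attention(gru_outputs)
for row in weights:
    print([round(w, 3) for w in row])
```

A real model would learn query, key and value projections and feed the attended sequence into the fully connected layer; those parts are omitted here.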
(4) Preprocessing of raw building energy consumption data
Data normalization is the most common scaling technique in machine learning applications; commonly used methods include Min-Max normalization and Z-score normalization. The invention uses the Z-score normalization method to scale the raw building energy consumption data, standardizing the features by removing the mean and scaling to unit variance. Each feature is centered and scaled independently according to the mean and standard deviation of the sample data in the training set, and the stored mean and standard deviation are then used to transform subsequent data. The calculation formula of Z-score normalization is shown in formula (11).
For the sequence $x_1, x_2, \ldots, x_n$:
$$x_i'=\frac{x_i-\bar{x}}{\sigma},\quad i=1,2,\ldots,n \tag{11}$$
where $\bar{x}$ is the mean of the sequence and $\sigma$ is the standard deviation of the sequence. After calculation, the new scaled sequence $x_1', x_2', \ldots, x_n'$ is obtained.
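A minimal sketch of Z-score scaling with stored statistics is given below. The class name `ZScoreScaler` is hypothetical, the sample values are invented for illustration, and the use of the population standard deviation is an assumption, since the text does not say which estimator is intended.

```python
from statistics import fmean, pstdev


class ZScoreScaler:
    """Stores the training mean and standard deviation for later reuse."""

    def fit(self, values):
        self.mean_ = fmean(values)
        # Fall back to 1.0 so a constant sequence does not divide by zero.
        self.std_ = pstdev(values) or 1.0
        return self

    def transform(self, values):
        return [(v - self.mean_) / self.std_ for v in values]

    def inverse_transform(self, values):
        return [v * self.std_ + self.mean_ for v in values]


# Hypothetical training and test values.
train = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
test = [5.0, 11.0]

scaler = ZScoreScaler().fit(train)        # statistics come from training data only
train_scaled = scaler.transform(train)
test_scaled = scaler.transform(test)      # test data reuses the stored statistics
restored = scaler.inverse_transform(test_scaled)
print(test_scaled, restored)
```

Fitting on the training data alone and reusing the stored statistics for the test data keeps test information out of the scaling step.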
The specific implementation process of the energy consumption prediction optimization method using fuzzy entropy classification provided by the embodiment of the invention comprises the following steps:
The first step: perform preliminary processing on the public building energy consumption data set (exported hourly energy consumption data): remove the columns irrelevant to the experiment, keep only the timestamp and energy consumption value columns, and extract the rows whose timestamps fall in the three months of March, April and May.
The second step: save the processed data set as a CSV file to serve as the data set used in the experiment.
The third step: divide the data into training data and test data, with the first 80% of the data as the training set and the last 20% as the test set.
The fourth step: the experiment is carried out with the PyCharm tool in a Python 3.8 environment. The energy consumption sequence data of the training set is decomposed by a time series decomposition method to obtain a series of components.
The fifth step: calculate the fuzzy entropy of each component and the fuzzy entropy difference between adjacent components according to the fuzzy entropy method.
The sixth step: compare the fuzzy entropy differences of adjacent components, select the two components with the largest fuzzy entropy difference (denoted $F_i$ and $F_{i+1}$), and classify $F_1 \sim F_i$ as high-frequency components and $F_{i+1} \sim F_k$ ($k$ is the total number of components) as low-frequency components.
The seventh step: for each component obtained by decomposition, apply a sliding window in which every 24 adjacent data points form one "window" and the step length is 1, sliding downward until the whole training set has been traversed. The 24 data points of each window are used as one input sample of the training set, and the energy consumption value in the row immediately following each window is used as its label. The test set is processed in the same way and divided into input samples and labels.
The eighth step: initialize two scalers, one for the input sample data and one for the label data. The scalers are fitted on the training set data only and are then used to scale both the training set data and the test set data, which prevents data leakage. (Data leakage refers to information from the test set being seen in advance, which leads to incorrect conclusions.)
The ninth step: for the high-frequency components, the RF model is trained on the input samples and labels of each component's training set and then makes predictions on the test set; for the low-frequency components, the CNN-GRU model optimized with the self-attention mechanism provided by the invention learns the useful information in each component's training set data and makes predictions on the test set.
The tenth step: inverse-scale the predicted values of each component's prediction model with the previously fitted scaler, and record each inverse-scaled value as the prediction result of that model.
The eleventh step: reconstruct the recorded predictions (for example, if the sequence was decomposed with an additive model, sum the predictions of all components) to obtain the final prediction.
The twelfth step: compare the prediction result with the true labels of the test set and evaluate the prediction performance of the model. This completes the procedure.
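The data handling in the steps above (chronological 80/20 split, sliding windows of 24 points with step 1, and additive reconstruction) can be sketched as follows. This is a minimal illustration, not the experiment code: the function names are hypothetical, the series is synthetic, and the decomposition and model training steps are left out.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a series in time order: first part for training, rest for testing."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def sliding_windows(series, window=24, step=1):
    """Turn a series into (input sample, label) pairs.

    Each sample holds `window` consecutive points; its label is the value
    that immediately follows the window.
    """
    samples, labels = [], []
    for start in range(0, len(series) - window, step):
        samples.append(series[start:start + window])
        labels.append(series[start + window])
    return samples, labels


def reconstruct_additive(component_predictions):
    """Sum the per-component predictions, as for an additive decomposition."""
    return [sum(values) for values in zip(*component_predictions)]


# Synthetic hourly series standing in for one decomposed component.
series = [float(h) for h in range(200)]
train, test = chronological_split(series)
X_train, y_train = sliding_windows(train)
X_test, y_test = sliding_windows(test)
print(len(X_train), len(X_test))

# Hypothetical predictions of two components for two time steps.
final_prediction = reconstruct_additive([[1.0, 2.0], [0.5, 0.5]])
print(final_prediction)
```

With a window of 24 hourly points, each input sample covers one day and the label is the following hour.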
To verify the effectiveness of the present invention, the following experiment was performed. Taking the student dormitory energy consumption data set UnivDorm as an example, the CEEMDAN method was adopted as the time series decomposition method of the experiment. The fuzzy entropy differences of adjacent components were calculated in sequence according to the component division method of the invention, and the largest difference was selected, namely the difference between IMF$_1$ and IMF$_2$. Thus IMF$_1$ is designated as the high-frequency component, and IMF$_2$ ~ IMF$_9$ are designated as the low-frequency components.
Then, in line with the conclusions of previous research, the more suitable RF model is selected for predicting the high-frequency component; for the low-frequency components, the CNN-GRU model optimized with the self-attention mechanism described above is used. To verify the effectiveness of this choice, both models were used to predict every component obtained by decomposing the UnivDorm data set, and the results are shown in Fig. 4.
It can be seen that the RF model gives the lower prediction error on the IMF$_1$ component, while the CNN-GRU model optimized with the self-attention mechanism gives the lower prediction error on the IMF$_2$ ~ IMF$_9$ components. This experimental result agrees with the division produced by the component division principle, which demonstrates the rationality of the fuzzy entropy classification method and the component division principle provided by the invention, as well as the effectiveness of the proposed combined prediction method.
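The component division used in this experiment can be sketched in plain Python. Several points are assumptions rather than details taken from the text: the baseline subtracted from each vector is taken to be the window mean, the fuzzy function is taken to be exp(-(d^n)/r), the defaults m=2, n=2, r=0.2 are illustrative, and the signed difference FE(F_i) - FE(F_{i+1}) is used on the assumption that entropy falls from high-frequency to low-frequency components. The entropy values in the example are hypothetical and are not results from the UnivDorm data set.

```python
import math


def fuzzy_entropy(series, m=2, n=2, r=0.2):
    """FuzzyEn(m, n, r) of a finite series (simple O(N^2) version)."""
    count = len(series) - m  # same number of vectors for dimensions m and m + 1
    if count < 2:
        raise ValueError("series is too short for the chosen m")

    def phi(dim):
        # Build vectors of length `dim` with their own mean removed.
        vectors = []
        for i in range(count):
            window = series[i:i + dim]
            base = sum(window) / dim
            vectors.append([v - base for v in window])
        total = 0.0
        for i in range(count):
            for j in range(count):
                if i == j:
                    continue  # self-matches are excluded from the average
                d = max(abs(a - b) for a, b in zip(vectors[i], vectors[j]))
                total += math.exp(-(d ** n) / r)
        return total / (count * (count - 1))

    return math.log(phi(m)) - math.log(phi(m + 1))


def split_components(entropies):
    """Return (i, diffs): F_1..F_i are high-frequency, the rest low-frequency."""
    diffs = [entropies[i] - entropies[i + 1] for i in range(len(entropies) - 1)]
    i_star = max(range(len(diffs)), key=diffs.__getitem__)
    return i_star + 1, diffs


# Hypothetical fuzzy entropy values for five components F_1..F_5.
entropies = [1.9, 0.7, 0.5, 0.3, 0.1]
split_index, diffs = split_components(entropies)
print(split_index)  # number of components treated as high-frequency
```

A perfectly regular series such as a constant or a linear ramp yields an entropy of zero under this definition, which gives a quick sanity check for an implementation.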
TABLE 1
Further experiments compared the prediction errors of the proposed CNN-GRU model optimized with the self-attention mechanism against the autoregressive model (AR), the seasonal autoregressive model (SAR), the RF model based on CEEMDAN decomposition, and the GRU model based on CEEMDAN decomposition on three building energy consumption data sets. The evaluation indexes are Mean Absolute Percentage Error (MAPE), Mean Absolute Error (MAE), and Root Mean Square Error (RMSE). The results of the experiment are shown in Table 1. The model provided by the invention outperforms the AR and SAR models and has the lowest prediction error, which shows that the model learns the information in the data well and effectively improves the accuracy of building energy consumption prediction.
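The three evaluation indexes can be computed as in the following sketch. The values are hypothetical and serve only to show the formulas; they are not taken from Table 1. MAPE is expressed in percent here, which is an assumption about the reporting convention.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent; undefined for zero targets."""
    if any(y == 0 for y in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical true and predicted energy consumption values.
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 190.0, 400.0]
print(mape(y_true, y_pred), mae(y_true, y_pred), rmse(y_true, y_pred))
```

MAPE is scale-free, so it allows comparison across buildings with different consumption levels, while MAE and RMSE stay in the units of the data.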
In summary, the method of the embodiment of the invention has the following beneficial effects:
Extensive experiments on three data sets show that the method provides excellent generalization capability and prediction performance using only historical energy consumption data.
Prediction experiments carried out on the UnivDorm data set with the RF model and the CNN-GRU model optimized with the self-attention mechanism show that the combined prediction method provided by the invention is rational and effective.
The fuzzy entropy method, together with the component division principle based on the maximum fuzzy entropy change between adjacent components, makes it easy to exploit the strengths of each prediction model and is well suited to practical engineering applications.
Those of ordinary skill in the art will understand that: the figures are schematic representations of one embodiment, and the blocks or processes shown in the figures are not necessarily required to practice the present invention.
From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, and which includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the method according to the embodiments or some parts of the embodiments.
The embodiments in the present specification are described in a progressive manner; for the parts that are the same or similar among the embodiments, reference may be made from one embodiment to another, and each embodiment focuses on its differences from the other embodiments. In particular, apparatus or system embodiments are substantially similar to the method embodiments and are therefore described relatively briefly; for related points, reference may be made to the descriptions of the method embodiments. The above-described embodiments of the apparatus and system are merely illustrative: the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (7)
1. A method for energy consumption prediction optimization using fuzzy entropy classification, comprising:
decomposing original building energy consumption data into a series of components by a time sequence decomposition method, and calculating the value of the fuzzy entropy of each component according to a fuzzy entropy method;
sequentially calculating fuzzy entropy difference values of two adjacent components, and dividing all the components into high-frequency components and low-frequency components based on the two components with the largest fuzzy entropy difference value change;
predicting the high-frequency component by using a random forest RF method to obtain a high-frequency component prediction result, and predicting the low-frequency component by using a CNN-GRU model based on self-attention mechanism optimization to obtain a low-frequency component prediction result;
and reconstructing the high-frequency component prediction result and the low-frequency component prediction result to obtain an energy consumption prediction result of the original building energy consumption data.
2. The method of claim 1, wherein the decomposing the raw building energy consumption data into a series of components by a time series decomposition method comprises:
the method comprises the steps of reserving columns of time stamps and energy consumption values in original public building energy consumption data sets, taking out row data corresponding to the time stamp columns, and taking the processed public building energy consumption data sets as training data and testing data; and decomposing the public building energy consumption data of the training set into a series of components by using a time sequence decomposition method.
3. The method according to claim 2, wherein the calculating of the fuzzy entropy value of each component according to the fuzzy entropy method comprises:
calculating the fuzzy entropy of each component sequence according to the principle of the fuzzy entropy and a calculation formula, wherein the definition and the calculation of the fuzzy entropy are carried out according to the following rules:
first, the sequence is defined: given a mode dimension $m$, a set of $m$-dimensional vectors $X(i)$ is constructed, defined as in equation (1):
$$X(i)=[x(i),x(i+1),\ldots,x(i+m-1)]-x_0(i) \tag{1}$$
second, the distance between sequences is defined: the distance $d_{ij}^{m}$ between $X(i)$ and $X(j)$ is the maximum absolute difference between their corresponding elements, as shown in equation (2):
$$d_{ij}^{m}=\max_{k\in\{0,\ldots,m-1\}}\left|\left(x(i+k)-x_0(i)\right)-\left(x(j+k)-x_0(j)\right)\right| \tag{2}$$
third, the similarity of the sequences is defined: a new variable $n$ is introduced through a fuzzy function $\mu$, and the similarity $D_{ij}^{m}$ between $X(i)$ and $X(j)$ is calculated as in equation (3):
$$D_{ij}^{m}=\mu\left(d_{ij}^{m},n,r\right) \tag{3}$$
fourth, all membership degrees except those of each vector with itself are averaged, as in equation (5), where $N$ is the length of the sequence:
$$\phi^{m}(n,r)=\frac{1}{N-m}\sum_{i=1}^{N-m}\left(\frac{1}{N-m-1}\sum_{j=1,\,j\neq i}^{N-m}D_{ij}^{m}\right) \tag{5}$$
as an index for characterizing the time series, FuzzyEn(m, n, r) is the negative natural logarithm of the deviation of $\phi^{m+1}$ from $\phi^{m}$, as shown in equation (7):
$$\mathrm{FuzzyEn}(m,n,r)=\ln\phi^{m}(n,r)-\ln\phi^{m+1}(n,r) \tag{7}$$
In the above formulas, $m$ is the mode dimension or embedding dimension; $r$ represents the width of the fuzzy function boundary; $n$ determines the gradient of the similarity tolerance boundary and acts as a weight in the calculation of the similarity between the fuzzy entropy vectors.
4. The method according to claim 3, wherein the sequentially calculating the fuzzy entropy difference values of two adjacent components, and dividing all the components into high-frequency components and low-frequency components based on the two components with the largest fuzzy entropy difference value variation comprises:
for components $F_1 \sim F_k$, where $k$ represents the total number of components, the fuzzy entropy difference between adjacent components is calculated using formula (8):
$$\Delta FE_i=FE(F_i)-FE(F_{i+1}),\quad i=1,2,\ldots,k-1 \tag{8}$$
in the formula: $\Delta FE_i$ is the fuzzy entropy difference of the two adjacent components $F_i$ and $F_{i+1}$, $FE(F_i)$ is the fuzzy entropy of component $F_i$, $FE(F_{i+1})$ is the fuzzy entropy of component $F_{i+1}$, and $k$ is the total number of components;
comparing the fuzzy entropy differences $\Delta FE_i$ between adjacent components, and selecting the two components $F_i$ and $F_{i+1}$ with the maximum fuzzy entropy difference, as shown in equation (9):
$$\Delta FE_{\max}=\max\left\{\Delta FE_1,\Delta FE_2,\ldots,\Delta FE_{k-1}\right\} \tag{9}$$
components $F_1 \sim F_i$ are defined as high-frequency components, and components $F_{i+1} \sim F_k$ are defined as low-frequency components.
5. The method according to claim 4, wherein the predicting the high frequency component by using the RF method to obtain the high frequency component prediction result comprises:
for each component data in the high-frequency component obtained by decomposition, adopting a sliding window mode, sequentially sliding downwards until the whole training set finishes sliding, taking 24 data of each window as an input sample of the training set, taking energy consumption value data of the next row of each window as a label of the training set, and dividing the test set into the input sample and the label according to the processing process of the training set;
inputting the input samples and the labels of the training set into an RF method model, learning the RF method model to obtain a well-learned RF method model, and inputting the input samples and the labels of the testing set into the RF method model to obtain a high-frequency component prediction result.
6. The method according to claim 4, wherein the predicting the low frequency component using the CNN-GRU model optimized based on the attention mechanism to obtain a low frequency component prediction result comprises:
for each component data in the low-frequency component obtained by decomposition, adopting a sliding window mode, sequentially sliding downwards until the whole training set finishes sliding, taking 24 data of each window as an input sample of the training set, taking energy consumption value data of the next row of each window as a label of the training set, and dividing the test set into the input sample and the label according to the processing process of the training set;
adding a self-attention layer behind the GRU layer to obtain a CNN-GRU model based on self-attention mechanism optimization, inputting an input sample and a label of the training set into the CNN-GRU model based on self-attention mechanism optimization, learning the CNN-GRU model based on self-attention mechanism optimization to obtain a well-learned CNN-GRU model based on self-attention mechanism optimization, inputting an input sample and a label of the test set into the well-learned CNN-GRU model based on self-attention mechanism optimization, and obtaining a low-frequency component prediction result.
7. The method of claim 6, wherein reconstructing the high frequency component prediction and the low frequency component prediction to obtain the energy consumption prediction of the original building energy consumption data comprises:
and performing integrated reconstruction on the high-frequency component prediction result and the low-frequency component prediction result by utilizing the principle of the adopted time sequence decomposition method to obtain the energy consumption prediction result of the original building energy consumption data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310026470.4A CN115952915A (en) | 2023-01-09 | 2023-01-09 | Energy consumption prediction optimization method using fuzzy entropy classification |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115952915A (en) | 2023-04-11 |
Family
ID=87287585
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310026470.4A Pending CN115952915A (en) | 2023-01-09 | 2023-01-09 | Energy consumption prediction optimization method using fuzzy entropy classification |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115952915A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112098874A (en) * | 2020-08-21 | 2020-12-18 | 杭州电子科技大学 | Lithium ion battery electric quantity prediction method considering aging condition |
CN112116080A (en) * | 2020-09-24 | 2020-12-22 | 中国科学院沈阳计算技术研究所有限公司 | CNN-GRU water quality prediction method integrated with attention mechanism |
CN112766078A (en) * | 2020-12-31 | 2021-05-07 | 辽宁工程技术大学 | Power load level prediction method of GRU-NN based on EMD-SVR-MLR and attention mechanism |
CN113850438A (en) * | 2021-09-29 | 2021-12-28 | 西安建筑科技大学 | Public building energy consumption prediction method, system, equipment and medium |
CN115169232A (en) * | 2022-07-11 | 2022-10-11 | 山东科技大学 | Daily peak load prediction method, computer equipment and readable storage medium |
Non-Patent Citations (2)
Title |
---|
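Yu Junqi et al.: "Hybrid prediction model of building energy consumption based on neural networks", Journal of Zhejiang University (Engineering Science), pages 1 - 12 * |
Li Qing; Li Jun; Ma Hao: "Short-term power load forecasting based on complementary ensemble empirical mode decomposition-fuzzy entropy and echo state network", Journal of Computer Applications, no. 12, pages 1 - 6 * |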
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117349478A (en) * | 2023-10-08 | 2024-01-05 | 国网江苏省电力有限公司经济技术研究院 | Resource data reconstruction integration system based on digital transformation enterprise |
CN117349478B (en) * | 2023-10-08 | 2024-05-24 | 国网江苏省电力有限公司经济技术研究院 | Resource data reconstruction integration system based on digital transformation enterprise |
CN118627125A (en) * | 2024-08-12 | 2024-09-10 | 山东赛宝信息技术咨询有限公司 | Database information leakage detection system based on fuzzy entropy algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20230411 |