CN115952915A

CN115952915A - Energy consumption prediction optimization method using fuzzy entropy classification

Info

Publication number: CN115952915A
Application number: CN202310026470.4A
Authority: CN
Inventors: 谭志; 焦英浩; 王闯胜; 李翔宇
Original assignee: Beijing Hezhong Huineng Technology Co ltd; Beijing University of Civil Engineering and Architecture
Current assignee: Beijing Hezhong Huineng Technology Co ltd; Beijing University of Civil Engineering and Architecture
Priority date: 2023-01-09
Filing date: 2023-01-09
Publication date: 2023-04-11

Abstract

The invention provides an energy consumption prediction optimization method using fuzzy entropy classification. The method comprises the following steps: decomposing original building energy consumption data into a series of components by a time sequence decomposition method, and calculating the value of the fuzzy entropy of each component according to a fuzzy entropy method; calculating fuzzy entropy difference values of two adjacent components in sequence, and dividing all the components into high-frequency components and low-frequency components based on the two components with the largest fuzzy entropy difference value change; predicting the high-frequency component by using an RF method to obtain a high-frequency component prediction result, and predicting the low-frequency component by using a CNN-GRU model optimized based on an attention mechanism to obtain a low-frequency component prediction result; and reconstructing the high-frequency component prediction result and the low-frequency component prediction result to obtain the energy consumption prediction result of the original building energy consumption data. The method uses a combined prediction method, predicts high-frequency signals by using an RF method, predicts low-frequency signals by using a mixed deep learning model, and obtains a final prediction result by superposition and reconstruction so as to reduce errors of building energy consumption prediction.

Description

Energy consumption prediction optimization method using fuzzy entropy classification

Technical Field

The invention relates to the technical field of building energy consumption prediction, in particular to an energy consumption prediction optimization method using fuzzy entropy classification.

Background

According to the 2022 global status report on buildings and construction, carbon dioxide emissions from building operations reached an all-time high in 2021, about 5% above the previous year. With the rapid growth of the urban population, energy consumption and carbon dioxide emissions keep rising, which indicates that measures for saving energy, reducing emissions, and improving building energy efficiency are being implemented too slowly. Accurate building energy consumption prediction is the basis for formulating building energy-saving strategies, so establishing an efficient and accurate prediction model is of great practical significance.

Common methods for building energy consumption prediction include physical models and data-driven methods. Prediction models based on physical models predict building energy consumption from building thermophysical principles. They are highly interpretable, but in practice they are complex to operate and demand substantial theoretical knowledge. Machine learning is a typical data-driven method: it can achieve good results using only historical data combined with feature engineering and similar processing, and it is widely applied in this field. In recent years, the development of neural networks has simplified the feature engineering step, which makes modeling simpler and further improves the nonlinear fitting ability and prediction accuracy of prediction models. However, predicting energy consumption data remains a complex process. The complex nonlinearity and instability of energy consumption data, together with the insufficient feature extraction capability and feature utilization of neural networks, make accurate prediction difficult. To meet these challenges, an accurate and efficient energy consumption prediction model is needed.

To address the instability and nonlinearity of energy consumption data, some studies decompose the original energy consumption sequence using additive decomposition, seasonal decomposition, empirical mode decomposition, or similar methods. However, decomposition may not effectively improve prediction accuracy. On the one hand, exact reconstruction from the decomposition may be hard to guarantee, so the overall prediction error may be unstable when the prediction results of all decomposition components are integrated. On the other hand, existing methods work backward from the results, that is, the prediction model for each decomposition component is selected according to the quality of its prediction result, which is unsuitable for practical engineering. Therefore, the invention proposes using the fuzzy entropy method to calculate the fuzzy entropy of the decomposition components, dividing the components into high-frequency and low-frequency signals at the point where the fuzzy entropy change between adjacent components is largest, predicting them with Random Forest (RF) and the proposed deep learning model respectively, and finally obtaining the final prediction value through superposition and reconstruction.

Currently, building energy consumption prediction methods in the prior art include time sequence decomposition methods, represented by Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN). After the original signal is decomposed by CEEMDAN, the prior art often uses a coarse decompose-predict-reconstruct approach in which the prediction model for each decomposition component is selected according to the quality of its prediction result. This approach is not suitable for actual engineering tasks.

The disadvantages of the building energy consumption prediction methods in the prior art include the following. They rely on a single prediction model that can make effective predictions only with detailed information such as building physical parameters and environmental parameters, and such rich information is difficult to collect in practical applications.

They do not integrate the information extracted from the data, and more important information does not receive more weight, so it is difficult to maintain an ideal prediction effect. After the building energy consumption data are decomposed by a time sequence decomposition method, the prediction model for each decomposition component is chosen by working backward from the quality of the prediction results, which is not suitable for practical engineering applications.

Disclosure of Invention

The embodiment of the invention provides an energy consumption prediction optimization method using fuzzy entropy classification, so as to effectively reduce errors of building energy consumption prediction.

In order to achieve the purpose, the invention adopts the following technical scheme.

A method for energy consumption prediction optimization using fuzzy entropy classification, comprising:

decomposing original building energy consumption data into a series of components by a time sequence decomposition method, and calculating the value of the fuzzy entropy of each component according to a fuzzy entropy method;

sequentially calculating fuzzy entropy difference values of two adjacent components, and dividing all the components into high-frequency components and low-frequency components based on the two components with the largest fuzzy entropy difference value change;

predicting the high-frequency component by using a random forest RF method to obtain a high-frequency component prediction result, and predicting the low-frequency component by using a CNN-GRU model based on self-attention mechanism optimization to obtain a low-frequency component prediction result;

and reconstructing the high-frequency component prediction result and the low-frequency component prediction result to obtain an energy consumption prediction result of the original building energy consumption data.

Preferably, the original building energy consumption data is decomposed into a series of components by a time sequence decomposition method, including:

keeping only the timestamp and energy consumption value columns of the original public building energy consumption data set, extracting the rows of the selected time period according to the timestamp column, and dividing the processed data set into training data and test data; and decomposing the training-set energy consumption data into a series of components by using a time sequence decomposition method.

Preferably, the calculating the fuzzy entropy value of each component according to the fuzzy entropy method includes:

calculating the fuzzy entropy of each component sequence according to the principle and the calculation formula of the fuzzy entropy, wherein the definition and the calculation of the fuzzy entropy are carried out according to the following rules:

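first, the sequence is defined: for a time series $x(1), x(2), \ldots, x(N)$ of length $N$ and a given mode dimension $m$, a set of $m$-dimensional vectors $X(i)$ is constructed, defined as in equation (1):

$$X(i) = [x(i), x(i+1), \ldots, x(i+m-1)] - x_0(i) \tag{1}$$

in the formula, $x_0(i)$ represents the mean of the $m$ consecutive values starting at $x(i)$, i.e. $x_0(i) = \frac{1}{m}\sum_{j=0}^{m-1} x(i+j)$;

second, the distance between sequences is defined: the distance $d_{ij}^{m}$ between $X(i)$ and $X(j)$ is the maximum absolute difference of their corresponding elements, as in equation (2):

$$d_{ij}^{m} = \max_{0 \le p \le m-1} \left| \big(x(i+p) - x_0(i)\big) - \big(x(j+p) - x_0(j)\big) \right| \tag{2}$$

thirdly, the similarity of the sequences is defined: a new variable $n$ is introduced through the fuzzy function $\mu(d_{ij}^{m}, n, r)$, and the similarity $D_{ij}^{m}$ between $X(i)$ and $X(j)$ is calculated as in formula (3):

$$D_{ij}^{m} = \mu(d_{ij}^{m}, n, r) \tag{3}$$

the calculation formula of the fuzzy function $\mu(d_{ij}^{m}, n, r)$ is as in formula (4):

$$\mu(d_{ij}^{m}, n, r) = \exp\left(-\frac{(d_{ij}^{m})^{n}}{r}\right) \tag{4}$$

fourth, all membership degrees except that of each vector with itself are averaged, as in equation (5):

$$\phi^{m}(n, r) = \frac{1}{N-m}\sum_{i=1}^{N-m}\left(\frac{1}{N-m-1}\sum_{j=1,\, j \ne i}^{N-m} D_{ij}^{m}\right) \tag{5}$$

the dimension is increased from $m$ to $m+1$, and the above steps are repeated to obtain $\phi^{m+1}(n, r)$, as in equation (6):

$$\phi^{m+1}(n, r) = \frac{1}{N-m}\sum_{i=1}^{N-m}\left(\frac{1}{N-m-1}\sum_{j=1,\, j \ne i}^{N-m} D_{ij}^{m+1}\right) \tag{6}$$

as an index of the time series, FuzzyEn(m, n, r) is defined as the negative natural logarithm of the ratio of $\phi^{m+1}(n, r)$ to $\phi^{m}(n, r)$, as shown in equation (7):

$$\mathrm{FuzzyEn}(m, n, r) = \ln \phi^{m}(n, r) - \ln \phi^{m+1}(n, r) \tag{7}$$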

in the above formulas, m is the mode dimension (embedding dimension); r represents the width of the fuzzy function boundary; n determines the gradient of the similarity tolerance boundary and acts as a weight when the similarity between vectors is calculated.

Preferably, the sequentially calculating the fuzzy entropy difference values of two adjacent components, and dividing all the components into a high-frequency component and a low-frequency component based on the two components with the largest change of the fuzzy entropy difference values includes:

for the components $F_1$ to $F_k$, where $k$ represents the total number of components, the fuzzy entropy difference between adjacent components is calculated by using formula (8):

$$\Delta_i = \mathrm{FE}(F_i) - \mathrm{FE}(F_{i+1}), \quad i = 1, 2, \ldots, k-1 \tag{8}$$

in the formula, $\Delta_i$ is the fuzzy entropy difference of the two adjacent components $F_i$ and $F_{i+1}$, $\mathrm{FE}(F_i)$ is the fuzzy entropy of component $F_i$, $\mathrm{FE}(F_{i+1})$ is the fuzzy entropy of component $F_{i+1}$, and $k$ is the total number of components;

the fuzzy entropy differences $\Delta_1, \ldots, \Delta_{k-1}$ between adjacent components are compared, and the two components $F_i$ and $F_{i+1}$ with the maximum fuzzy entropy difference are selected, as shown in equation (9):

$$\Delta_i = \max\{\Delta_1, \Delta_2, \ldots, \Delta_{k-1}\} \tag{9}$$

the components $F_1$ to $F_i$ are defined as high-frequency components, and the components $F_{i+1}$ to $F_k$ are defined as low-frequency components.

Preferably, the predicting the high frequency component by using the RF method to obtain a high frequency component prediction result includes:

for the data of each high-frequency component obtained by decomposition, a sliding window is moved down the training set one step at a time until the whole training set has been covered; the 24 data points in each window are taken as an input sample of the training set, and the energy consumption value in the row following each window is taken as the label of that sample; the test set is divided into input samples and labels in the same way;

inputting the input samples and the labels of the training set into an RF method model, learning the RF method model to obtain a well-learned RF method model, and inputting the input samples and the labels of the testing set into the RF method model to obtain a high-frequency component prediction result.

Preferably, the predicting the low-frequency component by using the CNN-GRU model optimized based on the attention mechanism to obtain a low-frequency component prediction result includes:

for the data of each low-frequency component obtained by decomposition, a sliding window is moved down the training set until the whole training set has been covered, where every 24 adjacent data points form a window and the step length is 1; the 24 data points in each window are taken as an input sample of the training set, and the energy consumption value in the row following each window is taken as the label of that sample; the test set is divided into input samples and labels in the same way;

adding a self-attention layer behind the GRU layer to obtain a CNN-GRU model based on self-attention mechanism optimization, inputting an input sample and a label of the training set into the CNN-GRU model based on self-attention mechanism optimization, learning the CNN-GRU model based on self-attention mechanism optimization to obtain a well-learned CNN-GRU model based on self-attention mechanism optimization, inputting an input sample and a label of the test set into the well-learned CNN-GRU model based on self-attention mechanism optimization, and obtaining a low-frequency component prediction result.

Preferably, the reconstructing the high-frequency component prediction result and the low-frequency component prediction result to obtain the energy consumption prediction result of the original building energy consumption data includes:

and integrating and reconstructing the high-frequency component prediction result and the low-frequency component prediction result by utilizing the principle of the adopted time sequence decomposition method to obtain the energy consumption prediction result of the original building energy consumption data.
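A minimal sketch of the reconstruction step, assuming an additive decomposition (such as CEEMDAN) in which the original series equals the sum of its components, so the prediction is recovered by adding the component predictions time step by time step. The function name `reconstruct` and the numbers are illustrative and do not come from the patent.

```python
def reconstruct(component_predictions):
    """Sum per-component predictions element-wise (additive decomposition assumed)."""
    lengths = {len(p) for p in component_predictions}
    if len(lengths) != 1:
        raise ValueError("all component predictions must have the same length")
    # zip(*...) groups the values of all components that belong to the same time step
    return [sum(values) for values in zip(*component_predictions)]


# Illustrative predictions: one high-frequency and two low-frequency components
high_freq = [[1.0, -1.0, 0.5]]
low_freq = [[10.0, 11.0, 12.0], [0.5, 0.5, 0.5]]
final_prediction = reconstruct(high_freq + low_freq)
```

A decomposition that is not additive would need its own inverse operation in place of the plain sum.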

As can be seen from the technical solutions provided by the embodiments of the present invention, the embodiments of the present invention use a combined prediction method: the high-frequency signal is predicted by the RF method, the low-frequency signal is predicted by the mixed deep learning model, the final prediction result is obtained by superposition and reconstruction, and the advantages of each model are reasonably exerted so as to reduce the error of building energy consumption prediction.

Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.

FIG. 1 is a process flow diagram of a method for optimizing energy consumption prediction using fuzzy entropy classification according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a modeling process of an RF model according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a modeling process of a CNN-GRU model based on self-attention mechanism optimization for predicting low-frequency components according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of the prediction errors of each decomposition component when an RF model and a CNN-GRU model optimized based on a self-attention mechanism are applied to the CEEMDAN decomposition sequence of the UnivDorm building data set according to an embodiment of the present invention.

Detailed Description

Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.

As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.

It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

For the convenience of understanding the embodiments of the present invention, the following description will be further explained by taking several specific embodiments as examples in conjunction with the drawings, and the embodiments are not to be construed as limiting the embodiments of the present invention.

The deep learning model provided by the invention uses a Convolutional Neural Network (CNN) and a Gated Recurrent Unit (GRU) to extract data features and then optimizes the result through a self-attention layer, thereby achieving accurate prediction. The method explores the advantages of a signal processing method from information theory (the fuzzy entropy method) and of data-driven methods (RF, CNN, GRU, self-attention) for energy consumption prediction, so that these theoretical techniques are better combined with building energy consumption data. This can effectively reduce energy consumption prediction errors, help practitioners improve energy utilization, and reduce carbon emissions.

After the building energy consumption data are decomposed by the time sequence decomposition method, the fuzzy entropy of each component is calculated by the fuzzy entropy method, and the components are divided into high-frequency and low-frequency signals at the point where the fuzzy entropy difference between adjacent components is largest, which makes the method more suitable for practical engineering application.

The embodiment of the invention provides a hybrid deep learning model that uses a CNN layer to extract spatio-temporal features of building energy consumption data, then uses a GRU layer to extract time sequence features, and optimizes the result through a self-attention layer that distributes information weights reasonably, thereby improving prediction precision.

A combined prediction method is used: the RF method predicts the high-frequency signals, the proposed hybrid deep learning model predicts the low-frequency signals, and the final prediction result is obtained through superposition and reconstruction, so that the strengths of each model are put to use to reduce building energy consumption prediction errors.

The processing flow of the energy consumption prediction optimization method using fuzzy entropy classification provided by the embodiment of the invention is shown in FIG. 1 and comprises the following procedures. The original building energy consumption data are decomposed into a series of components by a time sequence decomposition method, and the fuzzy entropy of each component is calculated according to the fuzzy entropy method. The fuzzy entropy differences of adjacent components are then calculated in sequence, the two components with the largest fuzzy entropy difference are found, and the components are divided into high-frequency components and low-frequency components. Next, the high-frequency components are predicted using the RF method, which is more robust on high-frequency signals, and the low-frequency components are predicted using the proposed CNN-GRU model optimized based on the self-attention mechanism. Finally, the predicted values of the high-frequency and low-frequency components are reconstructed into the final prediction result, which is evaluated against the real energy consumption data.

(1) Fuzzy entropy method and component division method

Entropy was originally a thermodynamic concept, a measure describing the degree of disorder of a thermodynamic system. With the development of information theory, methods for measuring the complexity of time series appeared, represented by approximate entropy. Approximate entropy is a dynamic parameter that can be estimated from relatively short data. Sample entropy improves on approximate entropy: it no longer depends on the data length and has higher precision. Fuzzy entropy introduces the concept of similarity degree and improves on sample entropy, offering higher calculation speed while keeping the precision. The complexity of each component sequence is calculated by the fuzzy entropy method: the higher the fuzzy entropy of a sequence, the more disordered its waveform and the higher its frequency. The fuzzy entropy method is defined as follows:

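First, the sequence is defined: for a time series $x(1), x(2), \ldots, x(N)$ of length $N$ and a given mode dimension $m$, a set of $m$-dimensional vectors $X(i)$ is constructed, defined as in equation (1):

$$X(i) = [x(i), x(i+1), \ldots, x(i+m-1)] - x_0(i) \tag{1}$$

In the formula, $x_0(i)$ represents the mean of the $m$ consecutive values starting at $x(i)$, i.e. $x_0(i) = \frac{1}{m}\sum_{j=0}^{m-1} x(i+j)$.

Next, the distance between sequences is defined: the distance $d_{ij}^{m}$ between $X(i)$ and $X(j)$ is the maximum absolute difference of their corresponding elements, as in equation (2):

$$d_{ij}^{m} = \max_{0 \le p \le m-1} \left| \big(x(i+p) - x_0(i)\big) - \big(x(j+p) - x_0(j)\big) \right| \tag{2}$$

Then, the similarity of the sequences is defined: a new variable $n$ is introduced through the fuzzy function $\mu(d_{ij}^{m}, n, r)$, and the similarity $D_{ij}^{m}$ between $X(i)$ and $X(j)$ is calculated as in formula (3):

$$D_{ij}^{m} = \mu(d_{ij}^{m}, n, r) \tag{3}$$

The calculation formula of the fuzzy function $\mu(d_{ij}^{m}, n, r)$ is as in formula (4):

$$\mu(d_{ij}^{m}, n, r) = \exp\left(-\frac{(d_{ij}^{m})^{n}}{r}\right) \tag{4}$$

Then, all membership degrees except that of each vector with itself are averaged, as in equation (5):

$$\phi^{m}(n, r) = \frac{1}{N-m}\sum_{i=1}^{N-m}\left(\frac{1}{N-m-1}\sum_{j=1,\, j \ne i}^{N-m} D_{ij}^{m}\right) \tag{5}$$

The dimension is increased from $m$ to $m+1$, and the above steps are repeated to obtain $\phi^{m+1}(n, r)$, as in equation (6):

$$\phi^{m+1}(n, r) = \frac{1}{N-m}\sum_{i=1}^{N-m}\left(\frac{1}{N-m-1}\sum_{j=1,\, j \ne i}^{N-m} D_{ij}^{m+1}\right) \tag{6}$$

After the above preparation, the time series index FuzzyEn(m, n, r) is defined as the negative natural logarithm of the ratio of $\phi^{m+1}(n, r)$ to $\phi^{m}(n, r)$, as shown in equation (7):

$$\mathrm{FuzzyEn}(m, n, r) = \ln \phi^{m}(n, r) - \ln \phi^{m+1}(n, r) \tag{7}$$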

In the above formulas, m is the mode dimension or embedding dimension, and generally m = 2. r represents the width of the fuzzy function boundary: if it is too large, statistical information is lost; if it is too small, the estimated statistics are poor and the sensitivity to noise increases, so r is generally taken as 0.1 to 0.25 times the standard deviation of the sequence. n determines the gradient of the similarity tolerance boundary: the larger n is, the steeper the gradient. n acts as a weight when the similarity between vectors is calculated and is generally a small integer such as 2 or 3.
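The definition in equations (1) to (7) can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the patent's implementation: the fuzzy function is assumed to be the exponential form exp(-d^n / r), both dimensions use the same N - m vectors, and the default r = 0.2 times the standard deviation is simply one value inside the 0.1 to 0.25 range mentioned above. The name `fuzzy_entropy` and the two test sequences are hypothetical.

```python
import math


def fuzzy_entropy(x, m=2, n=2, r=None):
    """FuzzyEn(m, n, r) of a 1-D sequence (exponential fuzzy function assumed)."""
    N = len(x)
    if r is None:
        mean = sum(x) / N
        std = math.sqrt(sum((v - mean) ** 2 for v in x) / N)
        r = 0.2 * std  # assumed default, within the 0.1-0.25 range in the text

    def phi(dim):
        count = N - m  # same number of vectors for dim = m and dim = m + 1
        vectors = []
        for i in range(count):
            segment = x[i:i + dim]
            baseline = sum(segment) / dim  # local mean removed from each vector
            vectors.append([v - baseline for v in segment])
        total = 0.0
        for i in range(count):
            for j in range(count):
                if i == j:
                    continue  # self-matches are excluded from the average
                # distance: largest absolute difference of corresponding elements
                d = max(abs(a - b) for a, b in zip(vectors[i], vectors[j]))
                # similarity through the assumed fuzzy function exp(-d^n / r)
                total += math.exp(-(d ** n) / r)
        return total / (count * (count - 1))

    return math.log(phi(m)) - math.log(phi(m + 1))


# A linear ramp: every mean-removed vector is identical, so the entropy is zero
ramp = list(range(50))
# A chaotic logistic-map sequence as a stand-in for an irregular component
irregular = [0.3]
for _ in range(119):
    irregular.append(3.99 * irregular[-1] * (1 - irregular[-1]))

ramp_entropy = fuzzy_entropy(ramp)
irregular_entropy = fuzzy_entropy(irregular)
```

The double loop is quadratic in the sequence length, which is acceptable for a sketch; a vectorized version would be preferable for long series. A constant sequence has zero standard deviation, so r must then be supplied explicitly.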

After the fuzzy entropy of each component is calculated, the following component division method is adopted to divide each component into a high-frequency component and a low-frequency component so as to facilitate further prediction and fully exert the advantages of each model.

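For the components $F_1$ to $F_k$ ($k$ represents the total number of components), the fuzzy entropy of each component and the fuzzy entropy difference of adjacent components are calculated, as shown in formula (8):

$$\Delta_i = \mathrm{FE}(F_i) - \mathrm{FE}(F_{i+1}), \quad i = 1, 2, \ldots, k-1 \tag{8}$$

In the formula, $\Delta_i$ is the fuzzy entropy difference of the two adjacent components $F_i$ and $F_{i+1}$, $\mathrm{FE}(F_i)$ is the fuzzy entropy of component $F_i$, $\mathrm{FE}(F_{i+1})$ is the fuzzy entropy of component $F_{i+1}$, and $k$ is the total number of components.

The fuzzy entropy differences $\Delta_1, \ldots, \Delta_{k-1}$ of adjacent components are compared, and the two components $F_i$ and $F_{i+1}$ with the maximum fuzzy entropy difference are selected, as shown in equation (9):

$$\Delta_i = \max\{\Delta_1, \Delta_2, \ldots, \Delta_{k-1}\} \tag{9}$$

The components $F_1$ to $F_i$ are defined as high-frequency components, and the components $F_{i+1}$ to $F_k$ are defined as low-frequency components.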
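The division rule can be sketched as a small helper that takes the fuzzy entropies of the components in decomposition order and returns the indices of the two groups. The signed difference between consecutive entropies is used here, which is an assumption; the entropy values are invented for illustration, and indices are zero-based, so index 0 corresponds to the first component.

```python
def split_components(entropies):
    """Split component indices at the largest drop in fuzzy entropy."""
    if len(entropies) < 2:
        raise ValueError("need at least two components")
    # Difference between each component and the next one
    diffs = [entropies[i] - entropies[i + 1] for i in range(len(entropies) - 1)]
    # Zero-based position of the largest difference
    cut = max(range(len(diffs)), key=diffs.__getitem__)
    high = list(range(cut + 1))                  # components up to the cut
    low = list(range(cut + 1, len(entropies)))   # components after the cut
    return high, low


# Invented fuzzy entropy values for six components
entropies = [1.42, 1.10, 0.95, 0.31, 0.18, 0.05]
high, low = split_components(entropies)
```

Here the largest drop lies between the third and fourth values, so the first three components form the high-frequency group and the remaining three the low-frequency group.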

(2) Random forest method

Embodiments of the present invention predict the high-frequency components using the RF method, which is good at processing high-frequency signals. A random forest is an ensemble method based on decision tree models. When it receives the training set data, it repeatedly draws a random subset of the input features (namely, historical energy consumption data) and generates a group of tree-structured models in a random manner, automatically deriving decision rules from the features and labels of the training set. These tree structures are decision trees; each decision tree model is independent, and their generation processes do not interfere with each other. In the final energy consumption prediction, the method gives equal weight to the predictions of the mutually independent decision tree models and takes the average of their predictions as the final prediction value. In short, the random forest method randomly and repeatedly constructs a number of decision trees from the training set data and aggregates the results of the individual trees to obtain better prediction performance than a single decision tree model.

FIG. 2 is a schematic diagram of a modeling process of an RF model according to an embodiment of the present invention. As a representative ensemble method, the random forest is based on decision trees: each time, a subset of the features is drawn at random and a decision tree is generated in a random manner. This process is repeated to generate a number of decision trees, and their outputs are finally averaged to obtain the final prediction result, which gives better regression or classification performance than a single model.

In the invention, MSE (Mean Square Error) is selected as the index for measuring the split quality of the random forest model. MSE is defined in formula (10), and the modeling aim is to obtain the minimum MSE.

$$\mathrm{MSE} = \frac{1}{M}\sum_{m=1}^{M}\left(y_m - \hat{y}_m\right)^2 \tag{10}$$

In the formula, $y_m$ is the value of the original energy consumption data, $\hat{y}_m$ is the value of the predicted energy consumption data, and $M$ is the total number of predicted samples.
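To illustrate how MSE can serve as a split criterion, the sketch below scores every candidate threshold on a single feature by the size-weighted MSE of the two resulting child nodes, each predicting its own mean, and keeps the best one. This is the usual regression-tree criterion written in plain Python; the helper names and the data are hypothetical, and a real random forest would additionally use bootstrap samples and random feature subsets.

```python
def mse(actual, predicted):
    """Mean square error between two equally long sequences."""
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)


def best_split(feature, target):
    """Return (threshold, score) minimizing the weighted MSE of the child nodes."""
    best = None
    # The largest value is skipped so that the right child is never empty
    for threshold in sorted(set(feature))[:-1]:
        left = [t for f, t in zip(feature, target) if f <= threshold]
        right = [t for f, t in zip(feature, target) if f > threshold]
        score = 0.0
        for side in (left, right):
            mean = sum(side) / len(side)  # each child predicts its mean
            score += len(side) / len(target) * mse(side, [mean] * len(side))
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


# Invented data with an obvious break between feature values 3 and 10
feature = [1, 2, 3, 10, 11, 12]
target = [5.0, 5.0, 5.0, 20.0, 20.0, 20.0]
threshold, score = best_split(feature, target)
```

With these values the split at 3 separates the two target levels perfectly, so both child nodes have zero error.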

(3) CNN-GRU model based on self-attention mechanism optimization

FIG. 3 is a schematic diagram of a modeling process of a CNN-GRU model based on self-attention mechanism optimization for predicting a low-frequency component according to an embodiment of the present invention. The invention provides a CNN-GRU model optimized based on a self-attention mechanism to predict the low-frequency components of building energy consumption data. A one-dimensional convolution layer is used to extract the spatial relations among different feature values in the data, which compensates for the inability of models based on an RNN (Recurrent Neural Network) to capture the spatio-temporal features of the data, while the features extracted by the one-dimensional convolution still keep their time order. A GRU network, a variant of the RNN, is then used to capture temporal order relations and long-term dependencies. The GRU alleviates the vanishing gradient problem that may occur in the original RNN, and it has fewer parameters and converges faster than the LSTM. However, the GRU network can only produce fixed-length components for its input and cannot distinguish the importance of information. Therefore, the invention also adds a self-attention layer after the GRU layer. By transforming the output sequence of the GRU network, self-attention strengthens the effect of important time steps, which further reduces the model prediction error.

The added self-attention layer keeps the intermediate outputs of the GRU encoder over the input sequence. The model is then trained to attend selectively to these inputs and to associate the output sequence with them when producing its output. In this way the full dependence across the sequence is captured and attention feature extraction is completed; the output is a weighted average of the output components of the GRU network. Finally, a simple fully connected layer maps the output of the whole neural network to the specified dimension.
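The weighted-average behavior of the self-attention layer can be sketched without a deep learning framework. The example below applies scaled dot-product self-attention directly to a list of hidden-state vectors standing in for the GRU outputs. It is a simplification under stated assumptions: the learned query, key, and value projections of a trained layer are omitted, and the hidden states are invented numbers.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(hidden_states):
    """Each output is a softmax-weighted average of all hidden states."""
    dim = len(hidden_states[0])
    outputs = []
    for query in hidden_states:
        # Scaled dot product between this time step and every time step
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in hidden_states
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * h[c] for w, h in zip(weights, hidden_states))
            for c in range(dim)
        ])
    return outputs


# Invented GRU outputs: three time steps with two hidden units each
gru_outputs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
attended = self_attention(gru_outputs)
```

Because the weights of each time step are positive and sum to one, every attended vector stays within the range spanned by the GRU outputs; time steps that are more similar to the current one receive more weight.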

(4) Preprocessing of raw building energy consumption data

Data normalization is the most common scaling technique in machine learning applications; commonly used methods include Min-Max normalization and Z-score normalization. The invention uses the Z-score normalization method to scale the raw building energy consumption data, standardizing the features by removing the mean and scaling to unit variance. Each feature is centered and scaled independently according to the mean and standard deviation of the sample data in the training set, and the stored mean and standard deviation are then used to transform subsequent data. The calculation formula of Z-score normalization is shown in formula (11).

For the sequence $x_1, x_2, \ldots, x_n$:

$$x_i' = \frac{x_i - \bar{x}}{\sigma}, \quad i = 1, 2, \ldots, n \tag{11}$$

where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is the average of the sequence and $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$ is the standard deviation of the sequence. After calculation, the new scaled sequence $x_1', x_2', \ldots, x_n'$ is obtained.
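A short sketch of Z-score normalization in which the mean and standard deviation are computed on the training data only and then reused for the test data. The population standard deviation (dividing by n) is an assumption, and the function names and values are illustrative.

```python
import math


def fit_scaler(train):
    """Return the mean and population standard deviation of the training data."""
    mean = sum(train) / len(train)
    std = math.sqrt(sum((v - mean) ** 2 for v in train) / len(train))
    return mean, std


def transform(values, mean, std):
    """Apply Z-score scaling with previously stored statistics."""
    return [(v - mean) / std for v in values]


def inverse_transform(values, mean, std):
    """Map scaled values back to the original units."""
    return [v * std + mean for v in values]


# Invented energy values; the statistics come from the training part only
train = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
test = [3.0, 11.0]
mean, std = fit_scaler(train)
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)
```

Predictions made on scaled labels can be converted back to energy units with `inverse_transform` using the same stored statistics.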

The specific implementation process of the energy consumption prediction optimization method using fuzzy entropy classification provided by the embodiment of the invention comprises the following steps:

The first step: The public building energy consumption data set (hourly energy consumption data) is given a preliminary processing: columns irrelevant to the experiment are removed, only the timestamp and energy consumption value columns are kept, and the rows whose timestamps fall in the three months of March, April, and May are extracted.

The second step: The processed data set is stored as a CSV file and used as the data set for the experiment.

The third step: at this stage, the data is divided into training data and test data, with the first 80% of the data being the training set and the last 20% being the test set.

The fourth step: the experiment is carried out in PyCharm with a Python 3.8 environment. The energy consumption sequence data of the training set is decomposed with a time series decomposition method to obtain a series of components.

The fifth step: the fuzzy entropy of each component, and the fuzzy entropy difference between adjacent components, are calculated according to the fuzzy entropy method.

The sixth step: the fuzzy entropy differences of adjacent components are compared, and the two components with the largest fuzzy entropy difference (denoted F_i and F_{i+1}) are selected. F_1 to F_i are classified as high-frequency components, and F_{i+1} to F_k (where k is the total number of components) as low-frequency components.

The seventh step: each component obtained by decomposition is processed with a sliding window. Every 24 adjacent data points form one "window"; the window slides forward with a step length of 1 until the whole training set has been traversed. The 24 data points of each window are used as one input sample of the training set, and the energy consumption value immediately following each window is used as its label. The test set is processed in the same way and divided into input samples and labels.

The eighth step: two scalers are initialized, one for the input sample data and one for the label data. The scalers are fitted on the training set data only, and the training set data and test set data are then scaled with them, which prevents data leakage. (Data leakage refers to information from the test set entering the training process, which leads to incorrect conclusions.)

The ninth step: for the high-frequency components, an RF model is trained on the input samples and labels of the training set of each component and then makes predictions on the test set. For the low-frequency components, the CNN-GRU model optimized by the self-attention mechanism provided by the invention learns the effective information in the training set data of each component and makes predictions on the test set.

The tenth step: the predicted values of the prediction model of each component are inversely scaled with the scaler fitted earlier, and each inversely scaled value is recorded as the prediction result of the model.

The eleventh step: the recorded prediction results are reconstructed to give the final prediction result (for example, if the sequence was decomposed with an additive model, the prediction results of the components are summed).

The twelfth step: the prediction result is compared with the true labels of the test set to evaluate the prediction performance of the model, which completes the procedure.
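The data handling in the third, seventh and eleventh steps can be sketched as below. The function names and the toy series are hypothetical, the sketch operates on a single component series, and the reconstruction assumes an additive decomposition, in which the components sum to the original series.

```python
def chronological_split(series, train_ratio=0.8):
    """Split a series in time order: the first part for training, the rest for testing."""
    cut = int(len(series) * train_ratio)
    return series[:cut], series[cut:]

def sliding_windows(series, window=24, step=1):
    """Build (input window, next-value label) pairs from one component series."""
    samples, labels = [], []
    for start in range(0, len(series) - window, step):
        samples.append(series[start:start + window])
        labels.append(series[start + window])  # value right after the window
    return samples, labels

def reconstruct_additive(component_predictions):
    """Sum per-component predictions element-wise (additive decomposition assumed)."""
    return [sum(values) for values in zip(*component_predictions)]

# Toy series of 200 points, used only to show the shapes involved.
series = [float(i) for i in range(200)]
train, test = chronological_split(series)
x_train, y_train = sliding_windows(train)
x_test, y_test = sliding_windows(test)

# Toy predictions for three components over two time steps.
final = reconstruct_additive([[1.0, 2.0], [10.0, 20.0], [100.0, 200.0]])
```

With a window of 24 and a step of 1, a series of length L yields L - 24 samples, so the test portion must contain more than 24 points to produce any sample at all.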

To verify the effectiveness of the invention, the following experiment was performed. Taking the student dormitory energy consumption data set UnivDorm as an example, CEEMDAN is adopted as the time series decomposition method of the experiment. The fuzzy entropy differences of adjacent components are calculated in sequence according to the component division method of the invention, and the largest difference is selected, which lies between IMF₁ and IMF₂.

Thus IMF₁ is designated as the high-frequency component and IMF₂ to IMF₉ are designated as the low-frequency components.
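The division rule used in this experiment can be illustrated with a small pure-Python sketch. Several points are assumptions rather than details from the embodiment: the exponential membership function exp(-d^n / r), the parameter values m = 2, n = 2 and r = 0.2, the difference taken as the entropy of F_i minus that of F_{i+1}, and the three synthetic signals that stand in for real CEEMDAN components. The function names are hypothetical.

```python
import math
import random

def fuzzy_entropy(series, m=2, n=2, r=0.2):
    """Fuzzy entropy of a 1-D sequence; parameter defaults are illustrative."""
    def phi(dim):
        count = len(series) - m  # same number of vectors for dim = m and dim = m + 1
        vectors = []
        for i in range(count):
            segment = series[i:i + dim]
            baseline = sum(segment) / dim  # local mean removed from each vector
            vectors.append([v - baseline for v in segment])
        total = 0.0
        for i in range(count):
            for j in range(count):
                if i != j:  # self-matches are excluded
                    dist = max(abs(a - b) for a, b in zip(vectors[i], vectors[j]))
                    total += math.exp(-(dist ** n) / r)  # assumed membership function
        return total / (count * (count - 1))
    return math.log(phi(m)) - math.log(phi(m + 1))

def split_by_largest_entropy_gap(entropies):
    """Return (high, low): component indices before and after the largest gap."""
    # Gap taken as entropy of F_i minus entropy of F_{i+1} (an assumption).
    gaps = [entropies[i] - entropies[i + 1] for i in range(len(entropies) - 1)]
    cut = max(range(len(gaps)), key=gaps.__getitem__)
    indices = list(range(len(entropies)))
    return indices[:cut + 1], indices[cut + 1:]

# Synthetic stand-ins for decomposed components, ordered from irregular to smooth.
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(100)]
fast = [math.sin(0.3 * i) for i in range(100)]
slow = [math.sin(0.05 * i) for i in range(100)]

entropies = [fuzzy_entropy(component) for component in (noise, fast, slow)]
high, low = split_by_largest_entropy_gap(entropies)
```

Irregular signals yield larger fuzzy entropy than smooth ones, so the largest gap tends to separate the noisy leading components from the smoother remaining ones. In practice the components would come from a decomposition such as CEEMDAN rather than from synthetic signals.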

Then, in line with the conclusions of previous research, the RF model, which is better suited to high-frequency components, is selected to predict the high-frequency component; for the low-frequency components, the CNN-GRU model optimized by the self-attention mechanism described above is used. To verify the effectiveness of the method, both models are used to predict every component obtained by decomposing the UnivDorm data set, and the results are shown in Fig. 4.

It can be seen that the RF model has the lower prediction error on the IMF₁ component, whereas the CNN-GRU model optimized by the self-attention mechanism has the lower prediction error on the IMF₂ to IMF₉ components. The experimental result agrees with the division given by the component division principle, which demonstrates the rationality of the fuzzy entropy classification method and the component division principle proposed by the invention, as well as the effectiveness of the proposed combined prediction method.

TABLE 1

Further experiments compare the prediction errors of the proposed CNN-GRU model optimized by the self-attention mechanism with those of the autoregressive model (AR), the seasonal autoregressive model (SAR), the RF model based on CEEMDAN decomposition, and the GRU model based on CEEMDAN decomposition on three building energy consumption data sets. The evaluation indexes are the mean absolute percentage error (MAPE), the mean absolute error (MAE), and the root mean square error (RMSE). The results of the experiment are shown in Table 1. The model provided by the invention outperforms the AR and SAR models and has the lowest prediction error, which indicates that the model learns the information in the data well and effectively improves the accuracy of building energy consumption prediction.
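For reference, the three evaluation indexes can be computed as in the following sketch. The function names and the toy numbers are illustrative only and are unrelated to the values in Table 1.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent; undefined if an actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

actual = [100.0, 200.0, 400.0]       # toy true values
predicted = [110.0, 190.0, 400.0]    # toy predictions

scores = {
    "MAE": mae(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "MAPE": mape(actual, predicted),
}
```

MAE and RMSE are expressed in the units of the energy consumption values, while MAPE is scale-free, which makes it convenient for comparing errors across data sets of different magnitude.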

In summary, the method of the embodiment of the invention has the following beneficial effects:

Extensive experiments on three data sets show that the method provides excellent generalization capability and prediction performance using only historical energy consumption data.

Prediction experiments carried out on the UnivDorm data set with the RF model and with the CNN-GRU model optimized by the self-attention mechanism show that the combined prediction method provided by the invention is reasonable and effective.

The fuzzy entropy method, together with the component division principle based on the maximum fuzzy entropy change between adjacent components, makes it easier to exploit the advantages of each prediction model and is better suited to practical engineering applications.

Those of ordinary skill in the art will understand that: the figures are schematic representations of one embodiment, and the blocks or processes shown in the figures are not necessarily required to practice the present invention.

From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by software plus a necessary general-purpose hardware platform. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, and which includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the method according to the embodiments or some parts of the embodiments.

The embodiments in this specification are described in a progressive manner; for the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, apparatus or system embodiments are described relatively briefly because they are substantially similar to the method embodiments, and reference may be made to the descriptions of the method embodiments for the related points. The above-described embodiments of the apparatus and system are merely illustrative: the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A method for energy consumption prediction optimization using fuzzy entropy classification, comprising:

2. The method of claim 1, wherein the decomposing the raw building energy consumption data into a series of components by a time series decomposition method comprises:

3. The method according to claim 2, wherein the calculating of the fuzzy entropy value of each component according to the fuzzy entropy method comprises:

calculating the fuzzy entropy of each component sequence according to the principle of the fuzzy entropy and a calculation formula, wherein the definition and the calculation of the fuzzy entropy are carried out according to the following rules:

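first, the sequence is defined: given a mode dimension m, a set of m-dimensional vectors X(i) is constructed, as defined in formula (1):

$$X(i) = [x(i), x(i+1), \ldots, x(i+m-1)] - x_0(i) \qquad (1)$$

in the formula, $x_0(i)$ represents the mean of the m successive values starting at x(i), i.e. $x_0(i) = \frac{1}{m}\sum_{j=0}^{m-1} x(i+j)$;

second, the distance between sequences is defined: the distance $d_{ij}^{m}$ between X(i) and X(j) is the maximum absolute difference between their corresponding elements, as shown in formula (2):

$$d_{ij}^{m} = d[X(i), X(j)] = \max_{k \in \{0, \ldots, m-1\}} \left| \big(x(i+k) - x_0(i)\big) - \big(x(j+k) - x_0(j)\big) \right| \qquad (2)$$

third, the similarity of sequences is defined: a new variable n is introduced through a fuzzy function $\mu(d_{ij}^{m}, n, r)$, and the similarity $D_{ij}^{m}$ between X(i) and X(j) is calculated as in formula (3):

$$D_{ij}^{m} = \mu(d_{ij}^{m}, n, r) \qquad (3)$$

the fuzzy function $\mu(d_{ij}^{m}, n, r)$ is calculated as in formula (4):

$$\mu(d_{ij}^{m}, n, r) = \exp\left(-\frac{(d_{ij}^{m})^{n}}{r}\right) \qquad (4)$$

fourth, all membership degrees except self-matches are averaged, as in formula (5), where N is the length of the sequence:

$$\phi^{m}(n, r) = \frac{1}{N-m}\sum_{i=1}^{N-m}\left(\frac{1}{N-m-1}\sum_{j=1,\, j \neq i}^{N-m} D_{ij}^{m}\right) \qquad (5)$$

the dimension m is increased by 1 to m+1, and the above steps are repeated to obtain $\phi^{m+1}(n, r)$, as in formula (6):

$$\phi^{m+1}(n, r) = \frac{1}{N-m}\sum_{i=1}^{N-m}\left(\frac{1}{N-m-1}\sum_{j=1,\, j \neq i}^{N-m} D_{ij}^{m+1}\right) \qquad (6)$$

as an index characterizing the time series, FuzzyEn(m, n, r) is defined as the negative natural logarithm of the deviation of $\phi^{m+1}(n, r)$ from $\phi^{m}(n, r)$, as shown in formula (7):

$$FuzzyEn(m, n, r) = \ln \phi^{m}(n, r) - \ln \phi^{m+1}(n, r) \qquad (7)$$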

4. The method according to claim 3, wherein the sequentially calculating the fuzzy entropy difference values of two adjacent components, and dividing all the components into high-frequency components and low-frequency components based on the two components with the largest fuzzy entropy difference value variation comprises:

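$$\Delta FE_i = FE_i - FE_{i+1}, \quad i = 1, 2, \ldots, k-1$$

in the formula, $\Delta FE_i$ is the fuzzy entropy difference between two adjacent components F_i and F_{i+1}, $FE_i$ is the fuzzy entropy of component F_i, $FE_{i+1}$ is the fuzzy entropy of component F_{i+1}, and k is the total number of components;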

comparing fuzzy entropy difference values between adjacent components

5. The method according to claim 4, wherein the predicting the high frequency component by using the RF method to obtain the high frequency component prediction result comprises:

6. The method according to claim 4, wherein the predicting the low frequency component using the CNN-GRU model optimized based on the attention mechanism to obtain a low frequency component prediction result comprises:

for each component in the low-frequency components obtained by decomposition, a sliding window is applied and moved forward in sequence until the whole training set has been traversed; the 24 data points of each window are taken as an input sample of the training set, and the energy consumption value immediately following each window is taken as the label of the training set; the test set is divided into input samples and labels in the same way as the training set;

7. The method of claim 6, wherein reconstructing the high frequency component prediction and the low frequency component prediction to obtain the energy consumption prediction of the original building energy consumption data comprises:

and performing integrated reconstruction on the high-frequency component prediction result and the low-frequency component prediction result by utilizing the principle of the adopted time sequence decomposition method to obtain the energy consumption prediction result of the original building energy consumption data.