CN115860260A - Resident air conditioner load prediction model considering frequency domain data characteristic decomposition - Google Patents
- Publication number
- CN115860260A CN202211700889.5A
- Authority
- CN
- China
- Prior art keywords
- model
- algorithm
- air conditioner
- feature
- decomposition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention relates to a residential air conditioner load prediction model considering frequency domain data characteristic decomposition, which comprises the following steps: step 1: use the complete ensemble empirical mode decomposition with adaptive noise algorithm, namely the CEEMDAN algorithm, to convert the original air conditioner load data into components with different fluctuation periods; step 2: introduce the time-series permutation entropy, namely the PE algorithm, into time-series air conditioner load prediction, and merge and reconstruct the modal subcomponents obtained from the frequency domain decomposition; step 3: apply a data information optimization and extraction algorithm based on CatBoost feature selection; step 4: taking into account the influence of high-dimensional external features on the air conditioner load pattern, construct an air conditioner load prediction model based on the XGBoost algorithm. The invention decomposes the data features of air conditioner load data in the frequency domain and provides a time-series load permutation entropy algorithm, a data information optimization and extraction algorithm, and an air conditioner load prediction model.
Description
Technical Field
The invention belongs to the technical field of air conditioner load prediction, and particularly relates to a residential air conditioner load prediction model considering frequency domain data characteristic decomposition.
Background
In recent years, with the rapid development of China's economy and society, residents' living standards have risen steadily, residential electricity consumption has grown, and the peak load of the power grid has repeatedly broken historical records. The surge in air conditioner load caused by extreme high temperatures is a main reason the electricity load in China keeps reaching new highs.

Among load prediction methods, those represented by the extreme learning machine (ELM), deep belief network (DBN), SVM, BP neural network and random forest (RF) are widely used for short-term load prediction. These methods convert the long-sequence time-dependence problem into a static modeling problem and, through repeated training, establish a nonlinear mapping between the load inputs and outputs of the current period. However, the air conditioner load time series is nonlinear, non-stationary, highly random and strongly fluctuating, so prediction accuracy is low and the prediction model performs poorly. For this reason, with frequency domain data characteristics in mind, the air conditioner load sequence is decomposed into several subcomponents by empirical mode decomposition, which effectively separates the intrinsic modes, divides the signal in the frequency domain, and makes the load prediction model more robust.

Regarding data information optimization and extraction, the factors that influence short-term load prediction are numerous and complex, and many features are redundant or irrelevant. Feature selection can effectively reduce the feature dimensionality and improve algorithm efficiency. However, the existing filter-type feature selection method based on minimum redundancy and maximum relevance evaluates only single features and does not consider the quality of the feature set: it computes a statistical index for each variable independently, judges the relative importance of the variables from that index, and eliminates the relatively unimportant ones. In addition, the existing filter-type method first selects features on the data set and then trains the learner, so the feature selection process is independent of the subsequent learner; as a result, the model generalizes poorly and overfits easily.

Regarding the air conditioner load prediction model, conventional methods often use a linear regression model or a neural network model. Linear regression has difficulty modeling nonlinear data or polynomial relationships among data features, cannot capture feature interactions in the data set, and cannot represent highly complex data well, so its prediction accuracy is low. Neural networks have complex model parameters, which leads to difficult hyperparameter tuning, exploding gradients, overfitting and similar problems. Moreover, the local-search optimization methods commonly used in neural networks must find the global extremum of a complex nonlinear function, so the algorithm may fall into a local extremum and training may fail. Finally, there is no unified and complete theoretical guidance for choosing the neural network structure, which generally can only be chosen by experience, and this limits the prediction accuracy of the air conditioner load.

Therefore, it is necessary to provide a residential air conditioner load prediction model considering frequency domain data characteristic decomposition, which decomposes the data features of air conditioner load data in the frequency domain and provides a time-series load permutation entropy algorithm, a data information optimization and extraction algorithm, and an air conditioner load prediction model.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a residential air conditioner load prediction model considering frequency domain data characteristic decomposition, which is used for carrying out data characteristic decomposition on air conditioner load data in a frequency domain, providing a time sequence load permutation entropy algorithm, providing a data information optimization extraction algorithm and providing an air conditioner load prediction model.
The purpose of the invention is realized by the following steps: a residential air conditioner load prediction model considering frequency domain data characteristic decomposition comprises the following steps:
step 1: use the complete ensemble empirical mode decomposition with adaptive noise algorithm, namely the CEEMDAN algorithm, to convert the original air conditioner load data into components with different fluctuation periods;
step 2: introduce the time-series permutation entropy, namely the PE algorithm, into time-series air conditioner load prediction, and merge and reconstruct the modal subcomponents obtained from the frequency domain decomposition;
step 3: to reduce overfitting and strengthen the generalization capability of the model, apply a data information optimization and extraction algorithm based on CatBoost feature selection;
step 4: taking into account the influence of high-dimensional external features on the air conditioner load pattern, construct an air conditioner load prediction model based on the XGBoost algorithm.
The CEEMDAN algorithm in step 1 is specifically as follows: it improves the traditional EMD algorithm by adding Gaussian white noise. Several groups of adaptive Gaussian white noise are added, and the results are averaged to obtain the IMF components. The unique residual calculation makes the decomposition complete, alleviates the mode aliasing inherent in the existing EMD, and greatly reduces the reconstruction error, so that the reconstructed signal is almost identical to the original signal. Define $L(t)$ as the original load sequence; $E_i(\cdot)$ as the operator that extracts the $i$-th component of an EMD decomposition; $w_i(t)$ as a group of Gaussian white noise whose length equals that of the original load $L(t)$; $\varepsilon_i$ as the white noise amplitude coefficient of the $i$-th stage; and $I_k$ as the $k$-th component of the CEEMDAN decomposition.
The specific flow of the fully integrated empirical mode decomposition algorithm of the adaptive noise in the step 1 comprises the following steps:
step 1.1: generate $M$ groups of Gaussian noise $\{w_1(t), w_2(t), \ldots, w_M(t)\}$ and obtain the noise-superimposed load curves $\{L(t)+\varepsilon_0 w_1(t), L(t)+\varepsilon_0 w_2(t), \ldots, L(t)+\varepsilon_0 w_M(t)\}$; decompose each curve by EMD to obtain the first IMF components $\{I_{1,1}, I_{2,1}, \ldots, I_{M,1}\}$, and take their mean to obtain the first CEEMDAN component, i.e.:
$$I_1 = \frac{1}{M}\sum_{i=1}^{M} I_{i,1} \quad (1)$$
step 1.2: compute the first residual:
$$r_1(t) = L(t) - I_1 \quad (2)$$
step 1.3: decompose by EMD the $M$ sequences $\{r_1(t)+\varepsilon_1 E_1(w_i(t)),\ i=1,2,\ldots,M\}$; each sequence stops decomposing once its first IMF component is obtained, and the second component $I_2$ is obtained by averaging the $M$ IMF components, i.e.:
$$I_2 = \frac{1}{M}\sum_{i=1}^{M} E_1\big(r_1(t)+\varepsilon_1 E_1(w_i(t))\big) \quad (3)$$
step 1.4: for the $k$-th stage, the residual and the next component are obtained by equations (4)-(5):
$$r_k(t) = r_{k-1}(t) - I_k \quad (4)$$
$$I_{k+1} = \frac{1}{M}\sum_{i=1}^{M} E_1\big(r_k(t)+\varepsilon_k E_k(w_i(t))\big) \quad (5)$$
step 1.5: repeat step 1.4 until the number of extrema $n$ of the residual sequence $r_k(t)$ is less than a threshold, typically set to 2; the CEEMDAN decomposition is then complete, and $L(t)$ has been decomposed into a series of modal components $I_k$ and a residual $R(t)$, i.e.:
$$L(t) = \sum_{k} I_k + R(t) \quad (6)$$
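The control flow of steps 1.1 to 1.5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `first_mode` is a hypothetical stand-in for the EMD operator $E_1(\cdot)$ (signal minus a moving-average trend, not real EMD sifting), a single amplitude `eps` is used for every stage, and `max_modes` is an added safety cap. What the sketch does show faithfully is the ensemble averaging over noisy copies and the residual bookkeeping that makes the decomposition complete.

```python
import numpy as np

def first_mode(x, window=5):
    """Stand-in for E_1(.): signal minus its moving-average trend.
    This is NOT real EMD sifting; window is assumed to be odd."""
    kernel = np.ones(window) / window
    padded = np.pad(x, window // 2, mode="edge")
    return x - np.convolve(padded, kernel, mode="valid")

def kth_mode(x, k, window=5):
    """Stand-in for E_k(.): peel off k-1 modes, then extract the next one."""
    residual = np.array(x, dtype=float)
    for _ in range(k - 1):
        residual = residual - first_mode(residual, window)
    return first_mode(residual, window)

def count_extrema(x):
    """Count interior local maxima and minima (sign changes of the difference)."""
    d = np.diff(x)
    return int(np.sum(d[:-1] * d[1:] < 0))

def ceemdan_sketch(load, n_noise=20, eps=0.2, max_modes=8, seed=0):
    load = np.asarray(load, dtype=float)
    rng = np.random.default_rng(seed)
    noises = rng.standard_normal((n_noise, load.size))  # w_1(t), ..., w_M(t)
    # First component: mean of the first modes of the noisy copies of L(t).
    modes = [np.mean([first_mode(load + eps * w) for w in noises], axis=0)]
    residual = load - modes[0]  # first residual r_1(t)
    # Later stages: stop once the residual has fewer than 2 extrema.
    while count_extrema(residual) >= 2 and len(modes) < max_modes:
        k = len(modes)
        nxt = np.mean(
            [first_mode(residual + eps * kth_mode(w, k)) for w in noises], axis=0
        )
        modes.append(nxt)
        residual = residual - nxt
    return np.array(modes), residual

# Synthetic "load": slow cycle + fast cycle + trend (illustrative data only).
t = np.linspace(0, 1, 200)
load = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t) + t
modes, residual = ceemdan_sketch(load)
print(modes.shape[0], "components; reconstruction exact:",
      np.allclose(modes.sum(axis=0) + residual, load))
```

Because each residual is defined by subtracting the component just extracted, the components and the final residual always sum back to the original load, whatever operator plays the role of $E_1(\cdot)$. In practice a real EMD sifting routine would replace `first_mode`.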
The time-series permutation entropy algorithm in step 2 is specifically as follows: the complexity of a time series is measured by its permutation entropy, which serves as the basis for merging and recombining the subcomponents. When there are many subcomponents, those whose permutation entropy values differ only slightly have similar load fluctuation patterns and can be merged and recombined, which saves computing resources and time in the subsequent prediction work.
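The merging rule described above can be sketched as follows. The text gives no numeric threshold, so the tolerance `tol`, the neighbor-to-neighbor comparison on sorted entropy values, and the function name `merge_by_entropy` are all assumptions made for illustration.

```python
def merge_by_entropy(components, entropies, tol=0.1):
    """Group components whose permutation entropies are close, then sum each group.

    components: list of equal-length sequences (the decomposed subcomponents)
    entropies:  one permutation entropy value per component
    tol:        hypothetical closeness threshold (not specified in the source)
    """
    if not components:
        return [], []
    # Sort component indices by entropy so that similar ones become neighbors.
    order = sorted(range(len(components)), key=lambda i: entropies[i])
    groups = [[order[0]]]
    for idx in order[1:]:
        if entropies[idx] - entropies[groups[-1][-1]] < tol:
            groups[-1].append(idx)   # close to the previous one: same group
        else:
            groups.append([idx])     # entropy gap too large: start a new group
    # A merged component is the element-wise sum of its group members.
    merged = [
        [sum(vals) for vals in zip(*(components[i] for i in group))]
        for group in groups
    ]
    return groups, merged

groups, merged = merge_by_entropy(
    [[1, 2], [3, 4], [10, 20]], entropies=[0.90, 0.95, 0.30]
)
print(groups)  # [[2], [0, 1]]
print(merged)  # [[10, 20], [4, 6]]
```

Summing the members of a group keeps the overall reconstruction unchanged while reducing the number of series that need their own prediction model.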
The specific flow of the time-series permutation entropy algorithm in step 2 comprises the following steps:
step 2.1: for the one-dimensional time-series load $x_{Load}=\{x_1, x_2, \ldots, x_N\}$, reconstruct its phase space into a two-dimensional matrix $X$, i.e.:
$$X = \begin{bmatrix} x_1 & x_{1+\tau} & \cdots & x_{1+(L-1)\tau} \\ x_2 & x_{2+\tau} & \cdots & x_{2+(L-1)\tau} \\ \vdots & \vdots & & \vdots \\ x_{N-(L-1)\tau} & x_{N-(L-2)\tau} & \cdots & x_N \end{bmatrix} \quad (7)$$
where $L$ is the embedding dimension, which determines the number of samples in each row vector, and $\tau$ is the sampling interval;
step 2.2: sort the elements of each row vector $X_i$ of the reconstructed $X$ in descending order to obtain the set of matrix element coordinate indexes $\{(i,j_1),(i,j_2),\ldots,(i,j_L)\}$ such that $x_{i+(j_1-1)\tau} \ge x_{i+(j_2-1)\tau} \ge \cdots \ge x_{i+(j_L-1)\tau}$; when element values are equal, the element with the larger column index is ranked higher;
step 2.3: for any row vector $X_i$, define the corresponding load fluctuation pattern $S_i=\{j_1, j_2, \ldots, j_L\}$; there are $L!$ possible patterns in total. Count the probabilities $\{P_1, P_2, \ldots, P_C\}$ of all fluctuation patterns in $X$, and define the permutation entropy $H(L)$ of the time-series load $x_{Load}=\{x_1, x_2, \ldots, x_N\}$ as:
$$H(L) = -\sum_{c=1}^{C} P_c \ln P_c \quad (8)$$
The permutation entropy is normalized to between 0 and 1 by formula (9):
$$H = \frac{H(L)}{\ln(L!)} \quad (9)$$
The closer $H$ is to 1, the richer the fluctuation patterns; the closer $H$ is to 0, the more monotonous the fluctuation patterns.
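Steps 2.1 to 2.3 can be sketched in a few lines of plain Python. The function name and the default values `dim=3` and `tau=1` are illustrative assumptions; the descending sort and the tie rule (larger column index first) follow step 2.2.

```python
import math
from collections import Counter

def permutation_entropy(series, dim=3, tau=1):
    """Normalized permutation entropy of a 1-D series, in [0, 1].

    dim is the embedding dimension L and tau the sampling interval.
    """
    if dim < 2:
        raise ValueError("dim must be at least 2")
    n_rows = len(series) - (dim - 1) * tau
    if n_rows < 1:
        raise ValueError("series too short for the chosen dim and tau")
    patterns = Counter()
    for i in range(n_rows):
        # Row i of the phase-space matrix: x_i, x_{i+tau}, ..., x_{i+(dim-1)tau}.
        row = [series[i + j * tau] for j in range(dim)]
        # Descending by value; on ties the larger column index ranks first.
        order = tuple(sorted(range(dim), key=lambda j: (-row[j], -j)))
        patterns[order] += 1
    probs = [count / n_rows for count in patterns.values()]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(math.factorial(dim))

# A monotone series shows a single pattern, so its entropy is zero.
print(abs(permutation_entropy(list(range(20)))) < 1e-12)  # True
```

An alternating series with `dim=2` uses both possible patterns equally often and therefore reaches the upper bound of 1.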
The specific flow of the data information optimization and extraction algorithm based on CatBoost feature selection in step 3 comprises the following steps:
step 3.1: compared with the traditional gradient boosting decision tree algorithm, CatBoost handles categorical features such as time and weather better. The traditional Greedy TBS uses the mean label value of each category as the criterion for node splitting; when the training and test data sets are distributed differently, conditional shift easily occurs. CatBoost effectively reduces the influence of noise and low-frequency category data on the data distribution by adding a prior distribution, as shown in formula (10):
$$\hat{x}_{\sigma_p,k} = \frac{\sum_{j=1}^{p-1}\big[x_{\sigma_j,k}=x_{\sigma_p,k}\big]\,Y_{\sigma_j} + a\cdot P}{\sum_{j=1}^{p-1}\big[x_{\sigma_j,k}=x_{\sigma_p,k}\big] + a} \quad (10)$$
where $\sigma=\{\sigma_1,\sigma_2,\ldots,\sigma_n\}$ is an ordering of the samples; $[\cdot]=1$ when $x_{\sigma_j,k}=x_{\sigma_p,k}$, otherwise $[\cdot]=0$; $Y_{\sigma_j}$ is the category label value; $P$ is the prior; $a$ is the weight coefficient; $\hat{x}_{\sigma_p,k}$ is the average label value on the training set; $x_{\sigma_p,k}$ is the categorical feature value in the training set;
step 3.2: CatBoost can evaluate feature importance during training, and several feature selection strategies can be built on it. PVC (prediction values change) represents the average change in the predicted value of the CatBoost model when the feature value changes by one unit; the more important a feature is to the model, the larger its PVC, as shown in formula (11). LFC (loss function change) reflects how much a feature accelerates model convergence by comparing the CatBoost loss function with and without the feature, as shown in formula (13):
$$PVC = \sum_{\text{trees, splits}} \Big[(V_l - avr)^2\, W_l + (V_r - avr)^2\, W_r\Big] \quad (11)$$
$$avr = \frac{V_l W_l + V_r W_r}{W_l + W_r} \quad (12)$$
where $W_l$, $V_l$, $W_r$, $V_r$ are the weight and target value of the left leaf and the weight and target value of the right leaf, respectively;
$$LFC = L(X) - L(X_i) \quad (13)$$
where $X$ is the input set $\{x_1, x_2, \ldots, x_N\}$ with $N$ feature components; $X_i$ is the input set $\{x_1, x_2, \ldots, x_{i-1}, x_{i+1}, \ldots, x_N\}$ with $N-1$ feature components; $L(\cdot)$ is the loss function value of the model for the given input features. The evaluation index $I$, obtained as a weighted combination of PVC and LFC, combines the advantages of both in different application scenarios and gives a comprehensive measure of feature importance, as shown in formula (14):
$$I = a\cdot PVC + b\cdot LFC \quad (14)$$
where $a$ and $b$ are weight coefficients; adjusting $a$ and $b$ strengthens the influence of the PVC or LFC index to suit different application scenarios;
step 3.3: complete the feature selection with a recursive feature reduction method: search for the optimal feature subset with a greedy strategy, repeatedly building the model and removing the least important feature.
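The prior-smoothed category encoding of step 3.1 (formula (10)) can be illustrated with a small pure-Python sketch. It follows the ordered target statistic as commonly described for CatBoost: each sample is encoded using only the labels of samples that precede it in a given ordering. The function name, the argument layout, and the choice of the global label mean as the prior are assumptions for illustration.

```python
def ordered_target_statistic(categories, labels, permutation, prior, a=1.0):
    """Encode a categorical feature with a prior-smoothed running label mean.

    categories:  category value of each sample
    labels:      numeric label of each sample
    permutation: the order in which samples are visited
    prior:       prior value P; a is its weight coefficient
    """
    encoded = [0.0] * len(categories)
    label_sums, counts = {}, {}
    for idx in permutation:
        cat = categories[idx]
        s = label_sums.get(cat, 0.0)
        c = counts.get(cat, 0)
        # Only samples seen earlier in the ordering contribute.
        encoded[idx] = (s + a * prior) / (c + a)
        label_sums[cat] = s + labels[idx]
        counts[cat] = c + 1
    return encoded

weather = ["sunny", "rainy", "sunny", "sunny"]
load = [1.0, 0.0, 3.0, 2.0]
prior = sum(load) / len(load)  # global label mean, an assumed choice of prior
print(ordered_target_statistic(weather, load, [0, 1, 2, 3], prior))
```

The first occurrence of every category is encoded as the prior itself, which is what keeps rare categories from being encoded by a single noisy label.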
The recursive feature reduction method in step 3.3 specifically includes the following steps:
step 3.31: initialize the parameters: take the input load data and related influencing features $X=\{x_1, x_2, \ldots, x_N\}$ as the independent variables and the prediction data $Y=\{y_1, y_2, \ldots, y_M\}$ as the dependent variable;
step 3.32: generate the CatBoost model: the first stage generates a regression tree with a greedy algorithm, computing the MSE for the different features $x_i$ in the feature set $X$ and selecting the feature $x_i$ with the minimum MSE to construct the optimal tree model; the second stage is gradient boosting, in which new regression trees are built successively in the gradient descent direction of the current regression tree, and the regression trees are finally integrated to obtain the CatBoost gradient boosting regression tree model;
step 3.33: remove features: compute the importance measures $\{I_1, I_2, \ldots, I_N\}$ of all features $\{x_1, x_2, \ldots, x_N\}$ by formula (14) and sort them in descending order of importance, $I_{k_1} \ge I_{k_2} \ge \cdots \ge I_{k_N}$, to obtain the ranking $\{x_{k_1}, x_{k_2}, \ldots, x_{k_N}\}$; record the prediction accuracy $p_N$ of the current model on the test set and the feature combination $\{x_{k_1}, x_{k_2}, \ldots, x_{k_N}\}$; remove the feature with the lowest importance, $x_{k_N}$; check whether the number of remaining features $\{x_{k_1}, x_{k_2}, \ldots, x_{k_{N-1}}\}$ equals 1; if not, input them into a new CatBoost model for training and repeat steps 3.32 to 3.33; if so, go to step 3.34;
step 3.34: select the optimal feature subset: sort the load prediction accuracies $p_N, p_{N-1}, \ldots, p_1$ obtained with the different numbers of features in descending order; if the highest prediction accuracy $p_j$ is reached when the number of input features is $j$, the corresponding optimal feature input is $\{x_{k_1}, x_{k_2}, \ldots, x_{k_j}\}$.
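The loop of steps 3.31 to 3.34 can be sketched as below. To keep the example self-contained, CatBoost is swapped for a plain least-squares fit (`fit_linear`), and the importance index $I$ is replaced by a simple proxy (absolute coefficient times feature standard deviation). Both substitutions are assumptions for illustration only; the elimination loop itself mirrors the steps above.

```python
import numpy as np

def fit_linear(X_train, y_train, X_test, y_test):
    """Stand-in for training CatBoost: returns (accuracy, importances).

    Accuracy is the negative test MSE (higher is better); importances are a
    proxy for the index I, not CatBoost's PVC or LFC.
    """
    A = np.column_stack([X_train, np.ones(len(X_train))])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    pred = np.column_stack([X_test, np.ones(len(X_test))]) @ coef
    accuracy = -float(np.mean((pred - y_test) ** 2))
    importances = np.abs(coef[:-1]) * X_train.std(axis=0)
    return accuracy, importances

def recursive_feature_elimination(X_train, y_train, X_test, y_test, fit=fit_linear):
    remaining = list(range(X_train.shape[1]))
    history = []  # (accuracy, feature subset) recorded for every subset size
    while True:
        acc, imp = fit(X_train[:, remaining], y_train, X_test[:, remaining], y_test)
        history.append((acc, list(remaining)))
        if len(remaining) == 1:
            break
        remaining.pop(int(np.argmin(imp)))  # drop the least important feature
    # Optimal subset: the one with the highest recorded accuracy.
    return max(history, key=lambda item: item[0])

# Synthetic data: only features 0 and 2 actually drive the target.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.standard_normal(200)
best_acc, best_subset = recursive_feature_elimination(X[:150], y[:150], X[150:], y[150:])
print(best_subset)
```

Passing a different `fit` callable, for example one that trains a gradient boosting model and returns its own importance scores, reuses the same loop unchanged.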
The air conditioner load prediction model based on the XGBoost algorithm in step 4 specifically comprises the following steps:
step 4.1: XGBoost is an ensemble learning algorithm that uses tree models as base learners; its principle is to improve the overall performance of the model by building and integrating the training results of several base learners. Building on the traditional gradient boosting decision tree (GBDT), XGBoost applies a second-order Taylor expansion to the loss function and adds a regularization term to the model, which speeds up convergence while effectively reducing model complexity. Given an air conditioner load data set with $n$ samples to be predicted and $M$ features, $D=\{(x_i,y_i): i=1,2,\ldots,n,\ x_i\in\mathbb{R}^M,\ y_i\in\mathbb{R}\}$, the ensemble prediction model with $K$ classification and regression decision trees (CART) is:
$$y_i^* = \sum_{k=1}^{K} f_k(x_i),\quad f_k\in F \quad (15)$$
where $y_i^*$ is the air conditioner load prediction result of the XGBoost model; $K$ is the number of decision trees; $f_k$ is the $k$-th CART decision tree, and each $f_k$ has its own independent tree structure and leaf nodes with different weights; $F$ is the space of decision trees, specifically:
$$F = \big\{f(x) = w_{q(x)}\big\},\quad q:\mathbb{R}^M\to\{1,\ldots,T\},\ w\in\mathbb{R}^T \quad (16)$$
where $q$ denotes the structure of each independent decision tree, under which every sample is mapped to its corresponding leaf node, and $w_{q(x)}$ further maps the information contained in that leaf node to a specific value;
step 4.2: define an objective function to measure the deviation between the model prediction and the actual air conditioner load, and train the XGBoost model with the goal of minimizing it. The objective function is defined as:
$$Obj = \sum_{i=1}^{n} l\big(y_i, y_i^*\big) + \sum_{k=1}^{K}\Omega(f_k) \quad (17)$$
where $l(\cdot)$ is the loss function error, for which the mean square error is used; the regularization term $\Omega(\cdot)$ is defined as:
$$\Omega(f) = \gamma T + \frac{1}{2}\lambda\sum_{j=1}^{T} w_j^2 \quad (18)$$
where $T$ is the number of leaf nodes, and the complexity of the tree structure can be set by changing the number of leaf nodes; $w_j$ is the weight of the $j$-th leaf node, and keeping the weights small effectively prevents overfitting; $\gamma$ and $\lambda$ are penalty coefficients, and the relative importance of the two penalty terms can be set by changing their values;
step 4.3: based on the forward stagewise algorithm, the objective function is minimized by optimizing the newly added CART decision tree $f_t$. At step $t$, after removing the constant terms and applying the second-order Taylor expansion, the objective function $Obj^{(t)}$ is:
$$Obj^{(t)} \approx \sum_{j=1}^{T}\Big[\Big(\sum_{i\in I_j} g_i\Big) w_j + \frac{1}{2}\Big(\sum_{i\in I_j} h_i + \lambda\Big) w_j^2\Big] + \gamma T \quad (19)$$
where $I_j$ is the set of indices of all samples mapped to the $j$-th leaf node by the CART decision tree; $g_i$ and $h_i$ are the first and second derivatives of the loss function with respect to the prediction of the previous step. Minimizing formula (19) with respect to $w_j$ gives the optimal leaf node weight for a given CART decision tree:
$$w_j^* = -\frac{\sum_{i\in I_j} g_i}{\sum_{i\in I_j} h_i + \lambda} \quad (20)$$
Substituting formula (20) into formula (19) gives the optimal objective function for that CART decision tree:
$$Obj^{*} = -\frac{1}{2}\sum_{j=1}^{T}\frac{\big(\sum_{i\in I_j} g_i\big)^2}{\sum_{i\in I_j} h_i + \lambda} + \gamma T \quad (21)$$
Combining formula (15), the XGBoost model is trained with formula (21) as the objective function. The final training result is the air conditioning load time-series data set produced by the air conditioning load prediction model.
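The closed-form leaf weight and the resulting objective value of step 4.3 can be checked numerically with a short sketch. It assumes the squared-error loss $l=(y-\hat{y})^2$, for which the first derivative is $2(\hat{y}-y)$ and the second derivative is 2, and it takes the assignment of samples to leaves as given. The function name and argument layout are illustrative.

```python
def optimal_leaf_weights(y_true, y_pred, leaf_of_sample, n_leaves, lam=1.0, gamma=0.0):
    """Optimal leaf weights and objective value for one fixed tree structure.

    y_pred holds the predictions of the previous boosting step;
    leaf_of_sample[i] is the leaf index that sample i falls into.
    """
    G = [0.0] * n_leaves  # per-leaf sum of first derivatives g_i
    H = [0.0] * n_leaves  # per-leaf sum of second derivatives h_i
    for y, p, leaf in zip(y_true, y_pred, leaf_of_sample):
        G[leaf] += 2.0 * (p - y)  # d/dp of (y - p)^2
        H[leaf] += 2.0            # second derivative of (y - p)^2
    weights = [-g / (h + lam) for g, h in zip(G, H)]
    objective = -0.5 * sum(g * g / (h + lam) for g, h in zip(G, H)) + gamma * n_leaves
    return weights, objective

# With lam = 0 each leaf weight equals the mean residual of its samples.
weights, objective = optimal_leaf_weights(
    y_true=[1.0, 3.0, 10.0], y_pred=[0.0, 0.0, 0.0],
    leaf_of_sample=[0, 0, 1], n_leaves=2, lam=0.0,
)
print(weights, objective)  # [2.0, 10.0] -108.0
```

Raising `lam` shrinks every leaf weight toward zero, which is the regularization effect described in step 4.2.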
The invention has the following beneficial effects. The residential air conditioner load prediction model considering frequency domain data characteristic decomposition has the following advantages in use: 1. To address the low prediction accuracy and poor model performance caused by the strong fluctuation and randomness of the original air conditioner load data, a complete ensemble empirical mode decomposition algorithm with adaptive noise is provided to decompose the data features of the air conditioner load data in the frequency domain. The original air conditioner load data are converted into components with different fluctuation periods, which effectively mitigates the strong fluctuation, strong randomness and poor predictability of the original data and improves the accuracy of load prediction; 2. To address the reconstruction and merging of the subcomponents after sequence decomposition, a time-series load permutation entropy algorithm is provided. Using time-series complexity as the measure, the decomposed subcomponents are merged and recombined, which saves computing resources during load prediction and significantly improves the efficiency and performance of the prediction algorithm; 3. To address the difficulty of handling the distribution characteristics of high-dimensional multi-source data, a data information optimization and extraction algorithm based on CatBoost feature selection is provided. Adding a prior distribution effectively reduces the influence of noise and low-frequency category data on the data distribution, so categorical features such as time and weather are handled better; the number of features and the dimensionality are reduced, the model generalizes better, and overfitting is reduced; 4. To address the poor robustness of predicting the nonlinear time-series characteristics of air conditioning load, an XGBoost-based air conditioning load prediction model is provided. Its internal boosted tree model handles missing values automatically, which strengthens the robustness of the model; in addition, XGBoost supports column sampling, which reduces overfitting and improves the computational efficiency of the algorithm, enabling accurate prediction of the air conditioning load. The invention thus decomposes the data features of air conditioner load data in the frequency domain and provides a time-series load permutation entropy algorithm, a data information optimization and extraction algorithm, and an air conditioner load prediction model.
Drawings
FIG. 1 is a general method roadmap for the present invention.
FIG. 2 is a flow chart of the CatBoost model feature selection according to the present invention.
FIG. 3 is a flow chart of the XGBoost model training process of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Example 1
As shown in fig. 1 to 3, a residential air conditioning load prediction model considering frequency domain data characteristic decomposition comprises the following steps:
step 1: use the complete ensemble empirical mode decomposition with adaptive noise algorithm, namely the CEEMDAN algorithm, to convert the original air conditioner load data into components with different fluctuation periods, which avoids the loss of prediction accuracy caused by directly predicting strongly fluctuating air conditioner load data;
step 2: introduce the time-series permutation entropy, namely the PE algorithm, into time-series air conditioner load prediction, and merge and reconstruct the modal subcomponents obtained from the frequency domain decomposition, which reduces the time complexity of the subsequent load prediction while preserving the prediction accuracy of the air conditioner load;
step 3: apply a data information optimization and extraction algorithm based on CatBoost feature selection; adding a prior distribution effectively reduces the influence of noise and low-frequency category data on the data distribution, the number of features and the dimensionality are reduced, the model generalizes better, and overfitting is reduced;
step 4: taking into account the influence of high-dimensional external features on the air conditioner load pattern, construct an air conditioner load prediction model based on the XGBoost algorithm; its internal boosted tree model handles missing values automatically, which strengthens the robustness of the model.
The residential air conditioner load prediction model considering frequency domain data characteristic decomposition has the following advantages in use: 1. To address the low prediction accuracy and poor model performance caused by the strong fluctuation and randomness of the original air conditioner load data, a complete ensemble empirical mode decomposition algorithm with adaptive noise is provided to decompose the data features of the air conditioner load data in the frequency domain. The original air conditioner load data are converted into components with different fluctuation periods, which effectively mitigates the strong fluctuation, strong randomness and poor predictability of the original data and improves the accuracy of load prediction; 2. To address the reconstruction and merging of the subcomponents after sequence decomposition, a time-series load permutation entropy algorithm is provided. Using time-series complexity as the measure, the decomposed subcomponents are merged and recombined, which saves computing resources during load prediction and significantly improves the efficiency and performance of the prediction algorithm; 3. To address the difficulty of handling the distribution characteristics of high-dimensional multi-source data, a data information optimization and extraction algorithm based on CatBoost feature selection is provided. Adding a prior distribution effectively reduces the influence of noise and low-frequency category data on the data distribution, so categorical features such as time and weather are handled better; the number of features and the dimensionality are reduced, the model generalizes better, and overfitting is reduced; 4. To address the poor robustness of predicting the nonlinear time-series characteristics of air conditioning load, an XGBoost-based air conditioning load prediction model is provided. Its internal boosted tree model handles missing values automatically, which strengthens the robustness of the model; in addition, XGBoost supports column sampling, which reduces overfitting and improves the computational efficiency of the algorithm, enabling accurate prediction of the air conditioning load. The invention thus decomposes the data features of air conditioner load data in the frequency domain and provides a time-series load permutation entropy algorithm, a data information optimization and extraction algorithm, and an air conditioner load prediction model.
Example 2
As shown in fig. 1 to 3, a residential air conditioning load prediction model considering frequency domain data characteristic decomposition comprises the following steps:
step 1: use the complete ensemble empirical mode decomposition with adaptive noise algorithm, namely the CEEMDAN algorithm, to convert the original air conditioner load data into components with different fluctuation periods;
step 2: introduce the time-series permutation entropy algorithm, namely the PE algorithm, into the field of time-series air conditioner load prediction, and merge and reconstruct the modal subcomponents obtained by the frequency domain decomposition;
step 3: to reduce overfitting and enhance the generalization capability of the model, apply a data information optimization extraction algorithm based on CatBoost feature selection;
step 4: taking into account the influence of high-dimensional external features on the air conditioner load pattern, construct an air conditioner load prediction model based on the XGBoost algorithm.
In this embodiment, as shown in fig. 1, the invention extracts residential air conditioning load data from residential electricity consumption data and decomposes the original air conditioning load sequence into intrinsic mode function (IMF) components and a residual using the CEEMDAN sequence decomposition algorithm. The subcomponents are then merged according to their time-series permutation entropy to balance the efficiency and performance of the prediction model. For each subcomponent, the optimal input feature subset is selected by a CatBoost-based recursive feature elimination algorithm and fed into an XGBoost model to train the air conditioning load prediction model. Finally, the prediction results of all subcomponents are summed to obtain an accurate and effective air conditioning load prediction.
The complete ensemble empirical mode decomposition with adaptive noise algorithm in step 1 is as follows. The traditional empirical mode decomposition algorithm, namely the EMD algorithm, is commonly applied in signal processing to analyze signals that are highly volatile, strongly random, non-stationary, and nonlinear, and it can extract regular, periodically fluctuating components, called intrinsic mode functions (IMFs), from any signal. An extracted IMF must satisfy two conditions simultaneously: (1) the number of extrema $n_1$ and the number of zero crossings $n_2$ of the IMF satisfy $|n_1 - n_2| \le 1$; (2) the upper envelope $L_{\max}(t)$ and the lower envelope $L_{\min}(t)$, formed by smoothly connecting the local maxima and the local minima respectively, are symmetric about the time axis. The CEEMDAN algorithm improves the traditional EMD algorithm by adding Gaussian white noise: several groups of adaptive Gaussian white noise are added and the results are averaged to obtain each IMF component. Its distinctive way of computing the residual makes the decomposition complete, alleviates the mode mixing inherent in EMD, and greatly reduces the reconstruction error, so the reconstructed signal is almost identical to the original signal. Define $L(t)$ as the original load sequence; $E_i(\cdot)$ as the operator that returns the $i$-th component of an EMD decomposition; $w_i(t)$ as a group of Gaussian white noise whose length equals that of the original load $L(t)$; $\varepsilon_i$ as the white noise amplitude coefficient of the $i$-th stage; and $I_k(t)$ as the $k$-th component of the CEEMDAN decomposition.
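The counting condition (1) on extrema and zero crossings can be checked numerically. The following is a minimal Python sketch, not part of the patent: the function names and the sampled test signals are illustrative assumptions, and extrema are counted only at interior samples using strict inequalities.

```python
import math


def count_extrema(x):
    """Count strict local maxima and minima at interior samples."""
    count = 0
    for t in range(1, len(x) - 1):
        is_max = x[t] > x[t - 1] and x[t] > x[t + 1]
        is_min = x[t] < x[t - 1] and x[t] < x[t + 1]
        if is_max or is_min:
            count += 1
    return count


def count_zero_crossings(x):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)


def satisfies_imf_count_condition(x):
    """Condition (1): |n1 - n2| <= 1."""
    return abs(count_extrema(x) - count_zero_crossings(x)) <= 1


# Illustrative signals (assumed): a zero-mean sine, and the same sine
# shifted upward so that it never crosses zero.
wave = [math.sin(2 * math.pi * (t + 0.25) / 20) for t in range(60)]
shifted = [v + 2.0 for v in wave]
```

The zero-mean sine passes the check, while the shifted copy has the same extrema but no zero crossings and therefore fails it. Condition (2) on envelope symmetry would need an envelope interpolation step and is not covered by this sketch.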
The specific flow of the CEEMDAN algorithm in step 1 comprises the following steps:
step 1.1: generate $M$ groups of Gaussian white noise $\{w_1(t), w_2(t), \ldots, w_M(t)\}$ and obtain the noise-superimposed load curves $\{L(t)+\varepsilon_0 w_1(t),\ L(t)+\varepsilon_0 w_2(t),\ \ldots,\ L(t)+\varepsilon_0 w_M(t)\}$; decompose each curve by EMD to obtain the first IMF components $\{I_{1,1}, I_{2,1}, \ldots, I_{M,1}\}$, and take their mean to obtain the first CEEMDAN component, i.e.: $$I_1(t) = \frac{1}{M}\sum_{i=1}^{M} I_{i,1}(t)$$
step 1.3: decompose the $M$ sequences $\{r_1(t)+\varepsilon_1 E_1(w_i(t)),\ i=1,2,\ldots,M\}$ by EMD, where $r_1(t) = L(t) - I_1(t)$ is the residual left after the first component is removed; each sequence stops decomposing as soon as its first IMF component is obtained, and the second component $I_2$ is obtained by averaging the $M$ IMF components, i.e.: $$I_2(t) = \frac{1}{M}\sum_{i=1}^{M} E_1\big(r_1(t)+\varepsilon_1 E_1(w_i(t))\big)$$
step 1.4: for the $k$-th stage, the residual and the next component are obtained by equations (4)-(5): $$r_k(t) = r_{k-1}(t) - I_k(t) \quad (4)$$ $$I_{k+1}(t) = \frac{1}{M}\sum_{i=1}^{M} E_1\big(r_k(t)+\varepsilon_k E_k(w_i(t))\big) \quad (5)$$
step 1.5: repeat step 1.4 until the number of extrema $n$ of the residual sequence $r_k(t)$ is less than a threshold; the CEEMDAN decomposition is then complete, and $L(t)$ has been decomposed into a series of modal components $I_k(t)$ and a residual $R(t)$, i.e.: $$L(t) = \sum_{k} I_k(t) + R(t)$$
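The ensemble-averaging loop of steps 1.1 to 1.5 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the patent's implementation: `detail` is a toy stand-in for "extract the first IMF by EMD" (the signal minus a 3-point moving average), a single amplitude `eps` is used for every stage, and the loop stops after a fixed `n_modes` instead of testing the number of extrema of the residual. All function and parameter names are hypothetical.

```python
import math
import random


def detail(x):
    """Toy stand-in for the first EMD mode: x minus its 3-point moving average."""
    n = len(x)
    out = []
    for t in range(n):
        window = x[max(0, t - 1):min(n, t + 2)]
        out.append(x[t] - sum(window) / len(window))
    return out


def emd_mode(x, k, first_imf):
    """E_k(x): the k-th mode, obtained by peeling off first modes repeatedly."""
    residual = list(x)
    mode = first_imf(residual)
    for _ in range(k - 1):
        residual = [r - m for r, m in zip(residual, mode)]
        mode = first_imf(residual)
    return mode


def ceemdan(load, n_modes, first_imf=detail, n_trials=20, eps=0.2, seed=0):
    """Return (modes, residual) following the stage-wise averaging scheme."""
    rng = random.Random(seed)
    n = len(load)
    noises = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n_trials)]
    modes = []
    residual = list(load)
    for k in range(n_modes):
        acc = [0.0] * n
        for w in noises:
            # Stage 0 adds raw noise; stage k adds the k-th mode of the noise.
            pert = w if k == 0 else emd_mode(w, k, first_imf)
            noisy = [r + eps * p for r, p in zip(residual, pert)]
            imf = first_imf(noisy)
            acc = [a + v for a, v in zip(acc, imf)]
        mode = [a / n_trials for a in acc]  # ensemble mean of first modes
        modes.append(mode)
        residual = [r - m for r, m in zip(residual, mode)]  # r_k = r_{k-1} - I_k
    return modes, residual


# Illustrative load (assumed): a slow daily cycle plus a fast ripple.
load = [math.sin(2 * math.pi * t / 24) + 0.3 * math.sin(2 * math.pi * t / 4)
        for t in range(96)]
modes, residual = ceemdan(load, n_modes=3)
```

Because each residual is defined by subtracting the averaged component from the previous residual, the components and the final residual sum back to the original load up to floating-point error, which is the completeness property the text describes. A real application would replace `detail` with a proper EMD sifting routine.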
The time-series permutation entropy algorithm in step 2 is as follows: the time-series permutation entropy is defined to measure the complexity of each time series and serves as the basis for merging and recombining the subcomponents. When the number of subcomponents is large, subcomponents whose permutation entropy values differ only slightly have similar load fluctuation patterns and can be merged and recombined, which saves computing resources and time in the subsequent prediction work.
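One way to merge subcomponents with similar permutation entropy is sketched below in Python. The patent does not state a merging threshold or grouping rule, so the tolerance `tol`, the rule "group a component with the current group if its entropy is within `tol` of the group's first member", and the function name are all assumptions made for illustration. The entropy values are taken as given inputs.

```python
def merge_by_entropy(components, entropies, tol=0.1):
    """Group components whose entropies are close and sum each group.

    components: list of equal-length sequences (the subcomponents).
    entropies:  one permutation entropy value per component.
    tol:        assumed closeness threshold (not specified in the source).
    Returns (merged_components, member_indices_per_group).
    """
    order = sorted(range(len(components)), key=lambda i: entropies[i])
    groups = []
    for i in order:
        if groups and entropies[i] - groups[-1]["ref"] <= tol:
            groups[-1]["members"].append(i)
        else:
            groups.append({"ref": entropies[i], "members": [i]})
    merged = []
    for g in groups:
        columns = zip(*(components[i] for i in g["members"]))
        merged.append([sum(vals) for vals in columns])
    return merged, [g["members"] for g in groups]


# Illustrative data (assumed): four short components and their entropies.
components = [[1, 2], [10, 20], [100, 200], [1000, 2000]]
entropies = [0.95, 0.90, 0.40, 0.10]
merged, members = merge_by_entropy(components, entropies, tol=0.1)
```

Here the two high-entropy components are summed into one series, so three models need to be trained instead of four, while the sum of all merged series still equals the sum of the original components.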
The specific flow of the time-series permutation entropy algorithm in step 2 comprises the following steps:
step 2.1: for a one-dimensional time-series load $x_{Load} = \{x_1, x_2, \ldots, x_N\}$, reconstruct its phase space into a two-dimensional matrix $X$, i.e.: $$X = \begin{bmatrix} x_1 & x_{1+\tau} & \cdots & x_{1+(L-1)\tau} \\ x_2 & x_{2+\tau} & \cdots & x_{2+(L-1)\tau} \\ \vdots & \vdots & & \vdots \\ x_{N-(L-1)\tau} & x_{N-(L-2)\tau} & \cdots & x_N \end{bmatrix}$$ where $L$ is the embedding dimension, which determines the number of samples in each row vector, and $\tau$ is the number of samples between adjacent elements of a row;
step 2.2: sort the elements of each reconstructed row vector $X_i$ of $X$ in descending order to obtain the set of matrix element coordinate indexes $\{(i,j_1),(i,j_2),\ldots,(i,j_L)\}$ such that $$x_{i+(j_1-1)\tau} \ge x_{i+(j_2-1)\tau} \ge \cdots \ge x_{i+(j_L-1)\tau}$$ when two element values are equal, the element with the larger column index is ranked higher;
step 2.3: for any row vector $X_i$, define the corresponding load fluctuation pattern $S_i = \{j_1, j_2, \ldots, j_L\}$; there are $L!$ possible fluctuation patterns in total. Count the probabilities $\{P_1, P_2, \ldots, P_C\}$ with which the fluctuation patterns occur in $X$, and define the permutation entropy $H(L)$ of the time-series load $x_{Load} = \{x_1, x_2, \ldots, x_N\}$ as: $$H(L) = -\sum_{c=1}^{C} P_c \ln P_c$$ The permutation entropy is normalized to the range 0 to 1 by formula (9): $$H = \frac{H(L)}{\ln(L!)} \quad (9)$$ The closer $H$ is to 1, the richer the fluctuation patterns; the closer $H$ is to 0, the more monotonous the fluctuation pattern.
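Steps 2.1 to 2.3 can be sketched in a few lines of Python. This is an illustrative sketch rather than the patent's code: the function name and the default values `dim=3` and `delay=1` (standing for $L$ and $\tau$) are assumptions, and the natural logarithm is assumed for the entropy.

```python
import math
from collections import Counter


def permutation_entropy(series, dim=3, delay=1):
    """Normalized permutation entropy in [0, 1].

    dim   plays the role of the embedding dimension L.
    delay plays the role of the sample interval tau.
    """
    n_rows = len(series) - (dim - 1) * delay
    if n_rows <= 0:
        raise ValueError("series too short for this dim and delay")
    patterns = Counter()
    for i in range(n_rows):
        # Step 2.1: one row of the phase-space matrix.
        row = [series[i + j * delay] for j in range(dim)]
        # Step 2.2: descending sort; ties go to the larger column index.
        order = tuple(sorted(range(dim), key=lambda j: (-row[j], -j)))
        patterns[order] += 1
    # Step 2.3: pattern probabilities, entropy, and normalization by ln(L!).
    probs = [count / n_rows for count in patterns.values()]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(math.factorial(dim))


# Illustrative series (assumed): a short mixed series and a monotone ramp.
mixed = [4, 7, 9, 10, 6, 11, 3]
ramp = list(range(50))
h_mixed = permutation_entropy(mixed)
h_ramp = permutation_entropy(ramp)
```

The monotone ramp produces a single fluctuation pattern and an entropy of 0, while the mixed series produces three distinct patterns and a normalized entropy of about 0.589.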
The specific flow of the data information optimization extraction algorithm based on CatBoost feature selection in step 3 comprises the following steps:
step 3.1: compared with the traditional gradient boosting decision tree algorithm, CatBoost handles categorical features such as time and weather better. The traditional Greedy TS (greedy target statistics) method uses the mean label value of each category as the criterion for node splitting, so conditional shift easily occurs when the training and test sample sets are distributed differently. CatBoost effectively reduces the influence of noise and low-frequency sample data on the distribution by adding a prior distribution, as shown in formula (10): $$\hat{x}_{\sigma_p,k} = \frac{\sum_{j=1}^{p-1}\left[x_{\sigma_j,k} = x_{\sigma_p,k}\right] Y_{\sigma_j} + aP}{\sum_{j=1}^{p-1}\left[x_{\sigma_j,k} = x_{\sigma_p,k}\right] + a} \quad (10)$$ where $\sigma = \{\sigma_1, \sigma_2, \ldots, \sigma_n\}$ is an ordering of the training samples; $[\cdot]=1$ when $x_{\sigma_j,k} = x_{\sigma_p,k}$ and $[\cdot]=0$ otherwise; $Y_{\sigma_j}$ is the category label value; $P$ is the prior; $a$ is the weight coefficient; $\hat{x}_{\sigma_p,k}$ is the average label value computed on the training set; and $x_{\sigma_j,k}$ is the categorical feature value in the training set;
step 3.2: CatBoost can evaluate feature importance during training, and several feature selection strategies can be built on it. PVC (prediction values change) is the average change in the CatBoost model's predicted value when a feature value changes by one unit; the more important the feature is to the model, the larger its PVC, as shown in formula (11): $$PVC = \sum_{trees,\ leaves}\left[(V_l - \bar{V})^2 W_l + (V_r - \bar{V})^2 W_r\right] \quad (11)$$ $$\bar{V} = \frac{V_l W_l + V_r W_r}{W_l + W_r}$$ where $W_l$, $V_l$, $W_r$, and $V_r$ are the weight and target value of the left leaf and the weight and target value of the right leaf, respectively. LFC (loss function change) reflects how much a feature accelerates model convergence by comparing the CatBoost loss function with and without the feature, as shown in formula (13): $$LFC = L(X) - L(X_i) \quad (13)$$ where $X$ is the input set with $N$ feature components $\{x_1, x_2, \ldots, x_N\}$; $X_i$ is the input set with $N-1$ feature components $\{x_1, x_2, \ldots, x_{i-1}, x_{i+1}, \ldots, x_N\}$; and $L(\cdot)$ is the value of the model's loss function for the given input features. The evaluation index $I$, computed as a weighted combination of PVC and LFC, combines the advantages of the two in different application scenarios and reflects feature importance comprehensively, as shown in formula (14): $$I = a \cdot PVC + b \cdot LFC \quad (14)$$ where $a$ and $b$ are weight coefficients; adjusting them strengthens the influence of the PVC or LFC index to suit different application scenarios;
step 3.3: complete the feature selection by recursive feature elimination, which searches for the optimal feature subset with a greedy strategy by repeatedly building a model and removing the least important feature.
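The prior-smoothed encoding of categorical features described in step 3.1 can be illustrated with a small Python sketch. It follows the general idea of an ordered target statistic, in which each sample is encoded using only the labels of samples that precede it in the ordering; it is not CatBoost's internal code. The function name, the default identity ordering (a random permutation would normally be used), and the example data are assumptions.

```python
def ordered_target_statistic(categories, labels, prior, a=1.0, order=None):
    """Encode a categorical column with a prior-smoothed running label mean.

    categories: categorical feature value per sample (e.g. weather type).
    labels:     numeric label per sample (e.g. load).
    prior:      the prior P, e.g. the mean label of the training set.
    a:          weight of the prior.
    order:      processing order of sample indices (assumed identity here).
    """
    n = len(categories)
    order = list(range(n)) if order is None else list(order)
    label_sum, count = {}, {}
    encoded = [0.0] * n
    for idx in order:
        c = categories[idx]
        # Only samples seen earlier in the ordering contribute.
        encoded[idx] = (label_sum.get(c, 0.0) + a * prior) / (count.get(c, 0) + a)
        label_sum[c] = label_sum.get(c, 0.0) + labels[idx]
        count[c] = count.get(c, 0) + 1
    return encoded


# Illustrative data (assumed): weather category and a load-like label.
weather = ["sunny", "rain", "sunny", "sunny"]
load = [1.0, 0.0, 3.0, 2.0]
prior = sum(load) / len(load)  # 1.5
encoded = ordered_target_statistic(weather, load, prior, a=1.0)
```

The first occurrence of each category falls back to the prior, which is how rare (low-frequency) categories are kept from producing extreme encodings; later occurrences blend the prior with the labels seen so far.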
The recursive feature elimination method in step 3.3 specifically comprises the following steps:
step 3.31: initialize parameters: take the input load data and the associated influencing features $X = \{x_1, x_2, \ldots, x_N\}$ as independent variables and the predicted data $Y = \{y_1, y_2, \ldots, y_M\}$ as the dependent variable;
step 3.32: generate the CatBoost model: the first stage builds a regression tree with a greedy algorithm, computing the MSE error for the different features $x_i$ in the feature set $X$ and selecting the feature $x_i$ with the minimum MSE error to construct the optimal tree model; the second stage is gradient boosting, in which new regression trees are built successively along the gradient descent direction of the current regression tree, and the regression trees are finally combined into the CatBoost gradient boosting regression tree model;
step 3.33: remove features: compute the importance metrics $\{I_1, I_2, \ldots, I_N\}$ of all features $\{x_1, x_2, \ldots, x_N\}$ by formula (14) and sort them in descending order of importance, $I_{k_1} \ge I_{k_2} \ge \cdots \ge I_{k_N}$, to obtain the sorted features $\{x_{k_1}, x_{k_2}, \ldots, x_{k_N}\}$; record the prediction accuracy $p_N$ of the model on the test set at this point together with the feature combination $\{x_{k_1}, x_{k_2}, \ldots, x_{k_N}\}$; remove the feature with the lowest importance, $x_{k_N}$; and check whether the number of remaining features $\{x_{k_1}, x_{k_2}, \ldots, x_{k_{N-1}}\}$ equals 1. If not, feed the remaining features into a new CatBoost model for training and repeat steps 3.32 to 3.33; if so, go to step 3.34;
step 3.34: select the optimal feature subset: sort the load prediction accuracies obtained for the different numbers of features, $p_N, p_{N-1}, \ldots, p_1$, in descending order; if the prediction accuracy is highest when the number of input features is $j$, the highest accuracy is $p_j$ and the corresponding optimal feature input is $\{x_{k_1}, x_{k_2}, \ldots, x_{k_j}\}$. The overall feature selection process of the CatBoost model is shown in fig. 2.
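The elimination loop of steps 3.31 to 3.34 can be sketched independently of any particular model. In the Python sketch below, `evaluate` is a hypothetical callback that trains a model on a feature subset and returns its test accuracy together with a (PVC, LFC) pair per feature; the toy callback, the feature names, and the default weights `a = b = 0.5` are assumptions for illustration only, and no CatBoost call is made.

```python
def combined_importance(pvc, lfc, a=0.5, b=0.5):
    """Formula (14): I = a * PVC + b * LFC."""
    return a * pvc + b * lfc


def recursive_feature_elimination(features, evaluate, a=0.5, b=0.5):
    """Drop the least important feature each round; keep the best subset.

    evaluate(subset) -> (accuracy, {feature: (pvc, lfc)}) is assumed to
    retrain a model on the given subset.
    """
    remaining = list(features)
    history = []
    while True:
        accuracy, stats = evaluate(remaining)
        ranked = sorted(
            remaining,
            key=lambda f: combined_importance(*stats[f], a, b),
            reverse=True,
        )
        history.append((accuracy, ranked))
        if len(ranked) == 1:
            break
        remaining = ranked[:-1]  # remove the lowest-ranked feature
    best_accuracy, best_subset = max(history, key=lambda item: item[0])
    return best_subset, best_accuracy, history


def toy_evaluate(subset):
    """Toy stand-in for training and scoring a model (values are made up)."""
    stats = {"temperature": (3.0, 1.0), "hour": (2.0, 1.0), "noise": (0.1, 0.0)}
    accuracy = {3: 0.80, 2: 0.90, 1: 0.70}[len(subset)]
    return accuracy, {f: stats[f] for f in subset}


best_subset, best_accuracy, history = recursive_feature_elimination(
    ["temperature", "hour", "noise"], toy_evaluate
)
```

In this toy run the uninformative feature is removed first, the two-feature subset scores highest, and it is returned as the optimal input even though the loop continues down to a single feature.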
The air conditioner load prediction model based on the XGBoost algorithm in step 4 is as follows:
step 4.1: XGBoost is an ensemble learning algorithm that uses tree models as base learners; its principle is to improve the overall performance of the model by building and combining the training results of multiple base learners. Building on the traditional gradient boosting decision tree (GBDT), XGBoost applies a second-order Taylor expansion to its loss function and adds a regularization term to the model, which effectively reduces model complexity while accelerating convergence. Given an air conditioner load data set with $n$ samples and $M$ features, $D = \{(x_i, y_i): i = 1, 2, \ldots, n,\ x_i \in \mathbb{R}^M,\ y_i \in \mathbb{R}\}$, the ensemble prediction model with $K$ classification and regression trees (CART) is: $$y_i^* = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in F \quad (15)$$ where $y_i^*$ is the air conditioner load prediction of the XGBoost model; $K$ is the number of decision trees; $f_k$ is the $k$-th CART decision tree, each $f_k$ having its own independent tree structure and leaf nodes with different weights; and $F$ is the set space of decision trees, defined as: $$F = \{f(x) = w_{q(x)}\}, \quad q: \mathbb{R}^M \to \{1, \ldots, T\},\ w \in \mathbb{R}^T$$ where $q$ denotes the structure of each independent decision tree, which maps every sample to its corresponding leaf node, and $w_{q(x_i)}$ further maps the information contained in that leaf node to a specific numerical value;
step 4.2: define an objective function to measure the deviation between the model prediction and the actual air conditioner load, and train the XGBoost model by minimizing it. The objective function is defined as: $$Obj = \sum_{i=1}^{n} l(y_i, y_i^*) + \sum_{k=1}^{K} \Omega(f_k)$$ where $l(\cdot)$ is the loss function, for which the mean square error is adopted; the regularization term $\Omega(\cdot)$ is defined as: $$\Omega(f) = \gamma T + \frac{1}{2}\lambda\sum_{j=1}^{T} w_j^2$$ where $T$ is the number of leaf nodes, which controls the complexity of the tree structure; $w_j$ is the weight of the $j$-th leaf node, and keeping the weights small effectively prevents overfitting; $\gamma$ and $\lambda$ are penalty coefficients whose values set the relative importance of the two penalty terms;
step 4.3: based on the forward stagewise algorithm, the objective function is minimized by optimizing the newly added CART decision tree $f_t$. After the constant term is removed and the second-order Taylor expansion is applied, the objective function $Obj^{(t)}$ at step $t$ is: $$Obj^{(t)} \approx \sum_{j=1}^{T}\left[\Big(\sum_{i \in I_j} g_i\Big) w_j + \frac{1}{2}\Big(\sum_{i \in I_j} h_i + \lambda\Big) w_j^2\right] + \gamma T \quad (19)$$ where $I_j$ is the set of indices of all samples mapped to the $j$-th leaf node by the CART decision tree, and $g_i$ and $h_i$ are the first and second derivatives of the loss function, respectively. Formula (19) is a quadratic function of $w_j$, so the optimal leaf node weight for a given CART decision tree is: $$w_j^* = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda} \quad (20)$$ Substituting formula (20) into formula (19) gives the optimal objective function for that CART decision tree: $$Obj^* = -\frac{1}{2}\sum_{j=1}^{T}\frac{\big(\sum_{i \in I_j} g_i\big)^2}{\sum_{i \in I_j} h_i + \lambda} + \gamma T \quad (21)$$ Combined with formula (15), the XGBoost model is trained with formula (21) as the objective function; the training process is shown in fig. 3. The final training result is the air conditioning load time-series data set produced by the air conditioning load prediction model.
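The closed-form leaf weight and objective of step 4.3 can be checked with a short Python sketch. It is a hand calculation for a fixed tree structure, not a call into the XGBoost library; the function names are hypothetical, and the loss is assumed to be the plain squared error $l = (y - y^*)^2$, so that $g_i = 2(y_i^* - y_i)$ and $h_i = 2$.

```python
def squared_error_grad_hess(y_true, y_pred):
    """First and second derivatives of (y - y_pred)^2 with respect to y_pred."""
    g = [2.0 * (p - y) for y, p in zip(y_true, y_pred)]
    h = [2.0 for _ in y_true]
    return g, h


def optimal_leaf_weights(g, h, leaf_of_sample, n_leaves, lam=1.0, gamma=0.0):
    """Optimal weight per leaf and the resulting objective value.

    leaf_of_sample[i] is the leaf index q(x_i) of sample i.
    Assumes H_j + lam > 0 for every leaf.
    """
    G = [0.0] * n_leaves
    H = [0.0] * n_leaves
    for gi, hi, j in zip(g, h, leaf_of_sample):
        G[j] += gi
        H[j] += hi
    weights = [-G[j] / (H[j] + lam) for j in range(n_leaves)]
    objective = -0.5 * sum(G[j] ** 2 / (H[j] + lam) for j in range(n_leaves))
    objective += gamma * n_leaves
    return weights, objective


# Illustrative data (assumed): three samples, current prediction 0,
# and a tree that sends the first two samples to leaf 0 and the third to leaf 1.
y_true = [1.0, 3.0, 10.0]
y_pred = [0.0, 0.0, 0.0]
g, h = squared_error_grad_hess(y_true, y_pred)
weights, objective = optimal_leaf_weights(g, h, [0, 0, 1], n_leaves=2, lam=0.0)
shrunk, _ = optimal_leaf_weights(g, h, [0, 0, 1], n_leaves=2, lam=4.0)
```

With $\lambda = 0$ each leaf weight equals the mean residual of its samples (2 and 10), and the objective value of -108 matches the drop in squared error from 110 to 2. Raising $\lambda$ shrinks the weights toward zero, which is the regularizing effect described in step 4.2.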
Claims (8)
1. A residential air conditioner load prediction model considering frequency domain data characteristic decomposition, characterized in that it comprises the following steps:
step 1: use the complete ensemble empirical mode decomposition with adaptive noise algorithm, namely the CEEMDAN algorithm, to convert the original air conditioner load data into components with different fluctuation periods;
step 2: introduce the time-series permutation entropy algorithm, namely the PE algorithm, into the field of time-series air conditioner load prediction, and merge and reconstruct the modal subcomponents obtained by the frequency domain decomposition;
step 3: to reduce overfitting and enhance the generalization capability of the model, apply a data information optimization extraction algorithm based on CatBoost feature selection;
step 4: taking into account the influence of high-dimensional external features on the air conditioner load pattern, construct an air conditioner load prediction model based on the XGBoost algorithm.
2. The residential air conditioning load prediction model considering frequency domain data characteristic decomposition as claimed in claim 1, wherein the complete ensemble empirical mode decomposition with adaptive noise algorithm in step 1 is as follows: the traditional EMD algorithm is improved by adding Gaussian white noise to give the CEEMDAN algorithm, in which several groups of adaptive Gaussian white noise are added and the results are averaged to obtain each IMF component; its distinctive way of computing the residual makes the decomposition complete, alleviates the mode mixing inherent in EMD, and greatly reduces the reconstruction error, so the reconstructed signal is almost identical to the original signal; $L(t)$ is defined as the original load sequence; $E_i(\cdot)$ as the operator that returns the $i$-th component of an EMD decomposition; $w_i(t)$ as a group of Gaussian white noise whose length equals that of the original load $L(t)$; $\varepsilon_i$ as the white noise amplitude coefficient of the $i$-th stage; and $I_k(t)$ as the $k$-th component of the CEEMDAN decomposition.
3. The residential air conditioning load prediction model considering frequency domain data characteristic decomposition as claimed in claim 2, wherein the specific flow of the CEEMDAN algorithm in step 1 comprises the following steps:
step 1.1: generate $M$ groups of Gaussian white noise $\{w_1(t), w_2(t), \ldots, w_M(t)\}$ and obtain the noise-superimposed load curves $\{L(t)+\varepsilon_0 w_1(t),\ L(t)+\varepsilon_0 w_2(t),\ \ldots,\ L(t)+\varepsilon_0 w_M(t)\}$; decompose each curve by EMD to obtain the first IMF components $\{I_{1,1}, I_{2,1}, \ldots, I_{M,1}\}$, and take their mean to obtain the first CEEMDAN component, i.e.: $$I_1(t) = \frac{1}{M}\sum_{i=1}^{M} I_{i,1}(t)$$
step 1.3: decompose the $M$ sequences $\{r_1(t)+\varepsilon_1 E_1(w_i(t)),\ i=1,2,\ldots,M\}$ by EMD, where $r_1(t) = L(t) - I_1(t)$ is the residual left after the first component is removed; each sequence stops decomposing as soon as its first IMF component is obtained, and the second component $I_2$ is obtained by averaging the $M$ IMF components, i.e.: $$I_2(t) = \frac{1}{M}\sum_{i=1}^{M} E_1\big(r_1(t)+\varepsilon_1 E_1(w_i(t))\big)$$
step 1.4: for the $k$-th stage, the residual and the next component are obtained by equations (4)-(5): $$r_k(t) = r_{k-1}(t) - I_k(t) \quad (4)$$ $$I_{k+1}(t) = \frac{1}{M}\sum_{i=1}^{M} E_1\big(r_k(t)+\varepsilon_k E_k(w_i(t))\big) \quad (5)$$
4. The residential air conditioning load prediction model considering frequency domain data characteristic decomposition as claimed in claim 1, wherein the time-series permutation entropy algorithm in step 2 is as follows: the time-series permutation entropy is defined to measure the complexity of each time series and serves as the basis for merging and recombining the subcomponents; when the number of subcomponents is large, subcomponents whose permutation entropy values differ only slightly have similar load fluctuation patterns and can be merged and recombined, which saves computing resources and time in the subsequent prediction work.
5. The residential air conditioning load prediction model considering frequency domain data characteristic decomposition as claimed in claim 4, wherein the specific flow of the time-series permutation entropy algorithm in step 2 comprises the following steps:
step 2.1: for a one-dimensional time-series load $x_{Load} = \{x_1, x_2, \ldots, x_N\}$, reconstruct its phase space into a two-dimensional matrix $X$, i.e.: $$X = \begin{bmatrix} x_1 & x_{1+\tau} & \cdots & x_{1+(L-1)\tau} \\ x_2 & x_{2+\tau} & \cdots & x_{2+(L-1)\tau} \\ \vdots & \vdots & & \vdots \\ x_{N-(L-1)\tau} & x_{N-(L-2)\tau} & \cdots & x_N \end{bmatrix}$$ where $L$ is the embedding dimension, which determines the number of samples in each row vector, and $\tau$ is the number of samples between adjacent elements of a row;
step 2.2: sort the elements of each reconstructed row vector $X_i$ of $X$ in descending order to obtain the set of matrix element coordinate indexes $\{(i,j_1),(i,j_2),\ldots,(i,j_L)\}$ such that $$x_{i+(j_1-1)\tau} \ge x_{i+(j_2-1)\tau} \ge \cdots \ge x_{i+(j_L-1)\tau}$$ when two element values are equal, the element with the larger column index is ranked higher;
step 2.3: for any row vector $X_i$, define the corresponding load fluctuation pattern $S_i = \{j_1, j_2, \ldots, j_L\}$; there are $L!$ possible fluctuation patterns in total; count the probabilities $\{P_1, P_2, \ldots, P_C\}$ with which the fluctuation patterns occur in $X$, and define the permutation entropy $H(L)$ of the time-series load $x_{Load} = \{x_1, x_2, \ldots, x_N\}$ as: $$H(L) = -\sum_{c=1}^{C} P_c \ln P_c$$ the permutation entropy is normalized to the range 0 to 1 by formula (9): $$H = \frac{H(L)}{\ln(L!)} \quad (9)$$ the closer $H$ is to 1, the richer the fluctuation patterns; the closer $H$ is to 0, the more monotonous the fluctuation pattern.
6. The residential air conditioning load prediction model considering frequency domain data characteristic decomposition as claimed in claim 1, wherein the specific flow of the data information optimization extraction algorithm based on CatBoost feature selection in step 3 comprises the following steps:
step 3.1: compared with the traditional gradient boosting decision tree algorithm, CatBoost handles categorical features such as time and weather better; the traditional Greedy TS (greedy target statistics) method uses the mean label value of each category as the criterion for node splitting, so conditional shift easily occurs when the training and test sample sets are distributed differently; CatBoost effectively reduces the influence of noise and low-frequency sample data on the distribution by adding a prior distribution, as shown in formula (10): $$\hat{x}_{\sigma_p,k} = \frac{\sum_{j=1}^{p-1}\left[x_{\sigma_j,k} = x_{\sigma_p,k}\right] Y_{\sigma_j} + aP}{\sum_{j=1}^{p-1}\left[x_{\sigma_j,k} = x_{\sigma_p,k}\right] + a} \quad (10)$$ where $\sigma = \{\sigma_1, \sigma_2, \ldots, \sigma_n\}$ is an ordering of the training samples; $[\cdot]=1$ when $x_{\sigma_j,k} = x_{\sigma_p,k}$ and $[\cdot]=0$ otherwise; $Y_{\sigma_j}$ is the category label value; $P$ is the prior; $a$ is the weight coefficient; $\hat{x}_{\sigma_p,k}$ is the average label value computed on the training set; and $x_{\sigma_j,k}$ is the categorical feature value in the training set;
step 3.2: CatBoost can evaluate feature importance during training, and several feature selection strategies can be built on it; PVC (prediction values change) is the average change in the CatBoost model's predicted value when a feature value changes by one unit, and the more important the feature is to the model, the larger its PVC, as shown in formula (11): $$PVC = \sum_{trees,\ leaves}\left[(V_l - \bar{V})^2 W_l + (V_r - \bar{V})^2 W_r\right] \quad (11)$$ $$\bar{V} = \frac{V_l W_l + V_r W_r}{W_l + W_r}$$ where $W_l$, $V_l$, $W_r$, and $V_r$ are the weight and target value of the left leaf and the weight and target value of the right leaf, respectively; LFC (loss function change) reflects how much a feature accelerates model convergence by comparing the CatBoost loss function with and without the feature, as shown in formula (13): $$LFC = L(X) - L(X_i) \quad (13)$$ where $X$ is the input set with $N$ feature components $\{x_1, x_2, \ldots, x_N\}$; $X_i$ is the input set with $N-1$ feature components $\{x_1, x_2, \ldots, x_{i-1}, x_{i+1}, \ldots, x_N\}$; and $L(\cdot)$ is the value of the model's loss function for the given input features; the evaluation index $I$, computed as a weighted combination of PVC and LFC, combines the advantages of the two in different application scenarios and reflects feature importance comprehensively, as shown in formula (14): $$I = a \cdot PVC + b \cdot LFC \quad (14)$$ where $a$ and $b$ are weight coefficients; adjusting them strengthens the influence of the PVC or LFC index to suit different application scenarios;
step 3.3: complete the feature selection by recursive feature elimination, which searches for the optimal feature subset with a greedy strategy by repeatedly building a model and removing the least important feature.
7. The residential air conditioning load prediction model considering frequency domain data characteristic decomposition as claimed in claim 6, wherein the recursive feature elimination method in step 3.3 specifically comprises the following steps:
step 3.31: initialize parameters: take the input load data and the associated influencing features $X = \{x_1, x_2, \ldots, x_N\}$ as independent variables and the predicted data $Y = \{y_1, y_2, \ldots, y_M\}$ as the dependent variable;
step 3.32: generate the CatBoost model: the first stage builds a regression tree with a greedy algorithm, computing the MSE error for the different features $x_i$ in the feature set $X$ and selecting the feature $x_i$ with the minimum MSE error to construct the optimal tree model; the second stage is gradient boosting, in which new regression trees are built successively along the gradient descent direction of the current regression tree, and the regression trees are finally combined into the CatBoost gradient boosting regression tree model;
step 3.33: remove features: compute the importance metrics $\{I_1, I_2, \ldots, I_N\}$ of all features $\{x_1, x_2, \ldots, x_N\}$ by formula (14) and sort them in descending order of importance, $I_{k_1} \ge I_{k_2} \ge \cdots \ge I_{k_N}$, to obtain the sorted features $\{x_{k_1}, x_{k_2}, \ldots, x_{k_N}\}$; record the prediction accuracy $p_N$ of the model on the test set at this point together with the feature combination $\{x_{k_1}, x_{k_2}, \ldots, x_{k_N}\}$; remove the feature with the lowest importance, $x_{k_N}$; and check whether the number of remaining features $\{x_{k_1}, x_{k_2}, \ldots, x_{k_{N-1}}\}$ equals 1; if not, feed the remaining features into a new CatBoost model for training and repeat steps 3.32 to 3.33; if so, go to step 3.34;
step 3.34: select the optimal feature subset: sort the load prediction accuracies obtained for the different numbers of features, $p_N, p_{N-1}, \ldots, p_1$, in descending order; if the prediction accuracy is highest when the number of input features is $j$, the highest accuracy is $p_j$ and the corresponding optimal feature input is $\{x_{k_1}, x_{k_2}, \ldots, x_{k_j}\}$.
8. The residential air conditioning load prediction model considering frequency domain data characteristic decomposition as claimed in claim 1, wherein the air conditioner load prediction model based on the XGBoost algorithm in step 4 is as follows:
step 4.1: XGBoost is an ensemble learning algorithm that uses tree models as base learners; its principle is to improve the overall performance of the model by building and combining the training results of multiple base learners; building on the traditional gradient boosting decision tree (GBDT), XGBoost applies a second-order Taylor expansion to its loss function and adds a regularization term to the model, which effectively reduces model complexity while accelerating convergence; given an air conditioner load data set with $n$ samples and $M$ features, $D = \{(x_i, y_i): i = 1, 2, \ldots, n,\ x_i \in \mathbb{R}^M,\ y_i \in \mathbb{R}\}$, the ensemble prediction model with $K$ classification and regression trees (CART) is: $$y_i^* = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in F \quad (15)$$ where $y_i^*$ is the air conditioner load prediction of the XGBoost model; $K$ is the number of decision trees; $f_k$ is the $k$-th CART decision tree, each $f_k$ having its own independent tree structure and leaf nodes with different weights; and $F$ is the set space of decision trees, defined as: $$F = \{f(x) = w_{q(x)}\}, \quad q: \mathbb{R}^M \to \{1, \ldots, T\},\ w \in \mathbb{R}^T$$ where $q$ denotes the structure of each independent decision tree, which maps every sample to its corresponding leaf node, and $w_{q(x_i)}$ further maps the information contained in that leaf node to a specific numerical value;
step 4.2: define an objective function to measure the deviation between the model prediction and the actual air conditioner load, and train the XGBoost model by minimizing it; the objective function is defined as: $$Obj = \sum_{i=1}^{n} l(y_i, y_i^*) + \sum_{k=1}^{K} \Omega(f_k)$$ where $l(\cdot)$ is the loss function, for which the mean square error is adopted; the regularization term $\Omega(\cdot)$ is defined as: $$\Omega(f) = \gamma T + \frac{1}{2}\lambda\sum_{j=1}^{T} w_j^2$$ where $T$ is the number of leaf nodes, which controls the complexity of the tree structure; $w_j$ is the weight of the $j$-th leaf node, and keeping the weights small effectively prevents overfitting; $\gamma$ and $\lambda$ are penalty coefficients whose values set the relative importance of the two penalty terms;
step 4.3: based on the forward stagewise algorithm, the objective function is minimized by optimizing the newly added CART decision tree $f_t$; after the constant term is removed and the second-order Taylor expansion is applied, the objective function $Obj^{(t)}$ at step $t$ is: $$Obj^{(t)} \approx \sum_{j=1}^{T}\left[\Big(\sum_{i \in I_j} g_i\Big) w_j + \frac{1}{2}\Big(\sum_{i \in I_j} h_i + \lambda\Big) w_j^2\right] + \gamma T \quad (19)$$ where $I_j$ is the set of indices of all samples mapped to the $j$-th leaf node by the CART decision tree, and $g_i$ and $h_i$ are the first and second derivatives of the loss function, respectively; formula (19) is a quadratic function of $w_j$, so the optimal leaf node weight for a given CART decision tree is: $$w_j^* = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda} \quad (20)$$ substituting formula (20) into formula (19) gives the optimal objective function for that CART decision tree: $$Obj^* = -\frac{1}{2}\sum_{j=1}^{T}\frac{\big(\sum_{i \in I_j} g_i\big)^2}{\sum_{i \in I_j} h_i + \lambda} + \gamma T \quad (21)$$ combined with formula (15), the XGBoost model is trained with formula (21) as the objective function, and the final training result is the air conditioning load time-series data set produced by the air conditioning load prediction model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211700889.5A CN115860260A (en) | 2022-12-28 | 2022-12-28 | Resident air conditioner load prediction model considering frequency domain data characteristic decomposition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211700889.5A CN115860260A (en) | 2022-12-28 | 2022-12-28 | Resident air conditioner load prediction model considering frequency domain data characteristic decomposition |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115860260A (en) | 2023-03-28 |
Family
ID=85655623
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211700889.5A Pending CN115860260A (en) | 2022-12-28 | 2022-12-28 | Resident air conditioner load prediction model considering frequency domain data characteristic decomposition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115860260A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116706902A (en) * | 2023-08-03 | 2023-09-05 | 国网湖北省电力有限公司营销服务中心(计量中心) | Domestic electricity optimizing method for regional house, electronic equipment and computer readable medium |
CN117894491A (en) * | 2024-03-15 | 2024-04-16 | 济南宝林信息技术有限公司 | Physiological monitoring data processing method for assessing mental activities |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116706902A (en) * | 2023-08-03 | 2023-09-05 | 国网湖北省电力有限公司营销服务中心(计量中心) | Domestic electricity optimizing method for regional house, electronic equipment and computer readable medium |
CN116706902B (en) * | 2023-08-03 | 2023-11-14 | 国网湖北省电力有限公司营销服务中心(计量中心) | Domestic electricity optimizing method for regional house, electronic equipment and computer readable medium |
CN117894491A (en) * | 2024-03-15 | 2024-04-16 | 济南宝林信息技术有限公司 | Physiological monitoring data processing method for assessing mental activities |
CN117894491B (en) * | 2024-03-15 | 2024-06-11 | 济南宝林信息技术有限公司 | Physiological monitoring data processing method for assessing mental activities |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111860982B (en) | | VMD-FCM-GRU-based wind power plant short-term wind power prediction method |
CN108805188B (en) | | Image classification method for generating countermeasure network based on feature recalibration |
CN115860260A (en) | | Resident air conditioner load prediction model considering frequency domain data characteristic decomposition |
CN111697621B (en) | | Short-term wind power prediction method based on EWT-PDBN combination |
CN109886464B (en) | | Low-information-loss short-term wind speed prediction method based on optimized singular value decomposition generated feature set |
CN109784473A (en) | | A kind of short-term wind power prediction method based on Dual Clocking feature learning |
CN112232244A (en) | | Fault diagnosis method for rolling bearing |
CN114399032B (en) | | Method and system for predicting metering error of electric energy meter |
CN110991721A (en) | | Short-term wind speed prediction method based on improved empirical mode decomposition and support vector machine |
CN114358389B (en) | | Short-term power load prediction method combining VMD decomposition and time convolution network |
CN110766060B (en) | | Time series similarity calculation method, system and medium based on deep learning |
CN114912077B (en) | | Sea wave forecasting method integrating random search and mixed decomposition error correction |
CN112434891A (en) | | Method for predicting solar irradiance time sequence based on WCNN-ALSTM |
CN109447333A (en) | | A kind of Time Series Forecasting Methods and device based on random length fuzzy information granule |
CN113411216A (en) | | Network flow prediction method based on discrete wavelet transform and FA-ELM |
CN116050621A (en) | | Multi-head self-attention offshore wind power ultra-short-time power prediction method integrating lifting mode |
CN111141879B (en) | | Deep learning air quality monitoring method, device and equipment |
CN116629431A (en) | | Photovoltaic power generation amount prediction method and device based on variation modal decomposition and ensemble learning |
CN117239722A (en) | | System wind load short-term prediction method considering multi-element load influence |
CN112464981A (en) | | Self-adaptive knowledge distillation method based on space attention mechanism |
CN117034055A (en) | | L-converter-based short-term photovoltaic power generation power prediction method |
CN117407660B (en) | | Regional sea wave forecasting method based on deep learning |
Luo et al. | | A novel nonlinear combination model based on support vector machine for stock market prediction |
CN117407704A (en) | | Renewable energy source generation power prediction method, computer equipment and storage medium thereof |
CN113496255B (en) | | Power distribution network mixed observation point distribution method based on deep learning and decision tree driving |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |