CN113869556A - Power consumption prediction method, device and equipment - Google Patents

Power consumption prediction method, device and equipment

Publication number
CN113869556A
Authority
CN
China
Prior art keywords
data
modal
frequency component
components
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111020639.2A
Other languages
Chinese (zh)
Inventor
王浩淼
顾海林
关艳
田浩杰
王志斌
高曦莹
胡楠
胡畔
冉冉
薄珏
白亮
胡非
曲睿婷
齐俊
夏雨
刘育博
高强
刘晓强
刘祉成
张戈
邢子墨
黄梦彤
教传明
张福良
姜博宇
钟瑞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Information and Telecommunication Branch of State Grid Liaoning Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
Information and Telecommunication Branch of State Grid Liaoning Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC and Information and Telecommunication Branch of State Grid Liaoning Electric Power Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06315 Needs-based resource requirements planning or analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Electricity, gas or water supply

Abstract

The invention provides a power consumption prediction method, device and equipment. A lag order is set, power consumption data and influence factor data within the lag order are acquired, and a training data set is formed in time-series order. Modal decomposition is performed on the training data set to obtain a plurality of intrinsic mode functions (IMFs) and a residual term, and the data are standardized. The standardized IMFs are divided into a high-frequency component and low-frequency components. A prediction model is established separately for the high-frequency component and for the low-frequency components to obtain a prediction result for each component; the component predictions are integrated, and the inverse of the data standardization is applied to obtain the power consumption prediction for the target time. In this way, the method exploits the suitability of the extreme gradient boosting tree (XGBoost) for high-frequency, complex components and of ridge regression for low-frequency, stable components, so that the data are diversified, the prediction process is efficient, and the predictions are more accurate.

Description

Power consumption prediction method, device and equipment
Technical Field
Embodiments of the present invention generally relate to the field of power consumption measurement, and more particularly, to a power consumption prediction method, apparatus, and device.
Background
At present, the economy and the power industry are developing rapidly, the domestic power market has entered a stage of deep reform, and power enterprises at all levels need to forecast the power demand market. Forecasting urban power consumption and load characteristics is basic work in the field of electrical engineering. Medium- and long-term power demand forecasts can reflect the influence of the general direction of future national policy, the global economic situation, the direction of global cooperation in allocating energy resources, and environmental change. Rational use and planning of power resources is very important: it allows the large-scale power demand of industrial enterprises to be met while guaranteeing the domestic power consumption of urban and rural residents, and supports stable economic development. Reasonable prediction of power consumption therefore brings both economic and social benefits.
Existing power consumption prediction in power systems has several problems. First, existing methods do not take into account characteristics of grid power data such as strong fluctuation and low dimensionality, and do not make full use of the temporal context of the data, so prediction accuracy is low. Second, power consumption is closely related to factors such as weather, temperature and region, and integrating several kinds of feature information when predicting power consumption benefits model accuracy. Some prediction methods are mainly suited to small data volumes and few influence factors, and research on variable screening for high-dimensional data and on high-precision prediction methods is lacking. Existing prediction models are also overly simple and fail to capture the nonlinear interactions between multiple factors and power consumption.
Disclosure of Invention
According to an embodiment of the present invention, a power usage prediction scheme is provided.
In a first aspect of the invention, a power usage prediction method is provided. The method comprises the following steps:
setting a lag order, acquiring power consumption data and influence factor data within the lag order, and forming a training data set in time-series order;
performing modal decomposition on the training data set to obtain a plurality of intrinsic mode functions (IMFs) and a residual term, and performing data standardization; dividing the standardized IMFs into a high-frequency component and low-frequency components; the high-frequency component is the first IMF obtained through the modal decomposition; the low-frequency components are the IMFs other than the first IMF obtained through the modal decomposition;
and establishing a prediction model separately for the high-frequency component and the low-frequency components to obtain a prediction result for each component, integrating the prediction results of the components, and performing the inverse transformation of the data standardization to obtain the power consumption prediction result for the target time.
Further, the performing modal decomposition on the training set data includes:
S201, adding white noise to the training data set to generate first time series data;
S202, computing an upper envelope and a lower envelope of the first time series data from its local maxima and local minima, and computing the mean of the upper and lower envelopes;
S203, extracting the difference between the first time series data and the mean as an intermediate signal, and judging whether the intermediate signal satisfies the constraint conditions of an intrinsic mode function (IMF); if so, taking the intermediate signal as an IMF; otherwise, returning to S202 with the intermediate signal as the input data;
S204, subtracting the IMF from the first time series data to obtain a residual, and judging whether the residual contains only one extremum; if so, the decomposition ends with a plurality of IMFs and one residual; otherwise, returning to S202 with the residual as the input data.
Further, the constraint conditions of an intrinsic mode function (IMF) include:
within the data segment, the number of extreme points and the number of zero-crossing points differ by at most 1, and
at any time, the mean of the upper envelope formed by the local maxima and the lower envelope formed by the local minima is zero.
Further, for the high-frequency component, establishing a high-frequency component prediction model using an extreme gradient boosting tree (XGBoost) to obtain a prediction result of the high-frequency component, including:
setting the high-frequency component as $D = \{(x_i, y_i)\}$, where $i = 1, 2, \ldots, n$ and $n$ is the number of samples, and establishing an extreme gradient boosting tree model as the high-frequency component prediction model, which is:
$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in F$
where $F$ is the set of all regression trees; $f(x)$ is a function in $F$;
the objective function of the high-frequency component prediction model is:
$f_{obj}(\theta) = L(\theta) + \omega(\theta)$
$L(\theta) = \sum_{i=1}^{n} l(y_i, \hat{y}_i)$
$\omega(\theta) = \sum_{k=1}^{K} \Omega(f_k)$
where $L(\theta)$ is the error term; $\omega(\theta)$ is the regularization term; $K$ is the number of regression trees; $k$ is the index of a regression tree.
Further, for the low-frequency components, establishing a low-frequency component prediction model using ridge regression to obtain a prediction result of the low-frequency components; wherein the loss function of the low-frequency component prediction model is:
$J(\beta) = \sum (y - X\beta)^2 + \sum \lambda \beta^2$
where $\beta$ is the regression coefficient; $\lambda$ is a constant coefficient; $X$ is the feature vector; $y$ is the label value;
letting the regression coefficient $\beta$ satisfy
$\frac{\partial J(\beta)}{\partial \beta} = 0$
gives $\beta = (X^T X + \lambda I)^{-1} X^T y$, where $I$ is the identity matrix; the regression coefficient $\beta$ is a function of the constant coefficient $\lambda$; the $\lambda$ value corresponding to the point at which $\beta$ becomes stable is obtained, and the low-frequency component prediction model is obtained from that $\lambda$ value.
Further, the data standardization includes: scaling the data after modal decomposition according to the transformation formula:
$x' = \frac{x - x_{min}}{x_{max} - x_{min}}$
where $x$ is the original feature; $x_{min}$ is the feature minimum; $x_{max}$ is the feature maximum; $x'$ is the transformed feature;
the inverse transformation of the data standardization rescales the prediction result of each component to the original scale, with the transformation formula:
$y' = y \cdot (y_{max} - y_{min}) + y_{min}$
where $y$ is the predicted label value; $y_{min}$ is the label minimum; $y_{max}$ is the label maximum; and $y'$ is the final predicted value after transformation.
In a second aspect of the present invention, a power usage prediction apparatus is provided. The device includes:
a data acquisition module, configured to set a lag order, acquire power consumption data and influence factor data within the lag order, and form a training data set in time-series order;
a data decomposition module, configured to perform modal decomposition on the training data set to obtain a plurality of intrinsic mode functions (IMFs) and a residual term, and to perform data standardization; to divide the standardized IMFs into a high-frequency component and low-frequency components; the high-frequency component is the first IMF obtained through the modal decomposition; the low-frequency components are the IMFs other than the first IMF obtained through the modal decomposition;
and a data prediction module, configured to establish a prediction model separately for the high-frequency component and the low-frequency components to obtain a prediction result for each component, integrate the prediction results of the components, and perform the inverse transformation of the data standardization to obtain the power consumption prediction result for the target time.
In a third aspect of the invention, an electronic device is provided. The electronic device includes: a memory having a computer program stored thereon and a processor implementing the method as described above when executing the program.
In a fourth aspect of the invention, a computer-readable storage medium is provided, on which a computer program is stored; the program, when executed by a processor, implements the method according to the first aspect of the invention.
It should be understood that the statements herein reciting aspects are not intended to limit the critical or essential features of any embodiment of the invention, nor are they intended to limit the scope of the invention. Other features of the present invention will become apparent from the following description.
The invention introduces multiple factors related to power consumption, such as economic and weather factors, into the prediction, which diversifies the data and captures the nonlinear interactions between these factors and power consumption, making the predictions more accurate. The high-frequency and low-frequency components are predicted separately by a combined model chosen according to the characteristics of each component and are then integrated, which exploits the suitability of the extreme gradient boosting tree (XGBoost) for high-frequency, complex components and of ridge regression for low-frequency, stable components, so that the prediction process is efficient and the predictions are accurate.
Drawings
The above and other features, advantages and aspects of various embodiments of the present invention will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. In the drawings, like or similar reference characters designate like or similar elements, and wherein:
FIG. 1 illustrates a flow diagram of a power usage prediction method according to an embodiment of the invention;
FIG. 2 illustrates a schematic flow diagram for modal decomposition of training set data according to an embodiment of the present invention;
FIG. 3 illustrates a block diagram of a power usage prediction apparatus according to an embodiment of the present invention;
FIG. 4 illustrates a block diagram of an exemplary electronic device capable of implementing embodiments of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
In addition, the term "and/or" herein is only one kind of association relationship describing an associated object, and means that there may be three kinds of relationships, for example, a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
In the invention, multiple factors related to power consumption, such as economic and weather factors, are introduced into the prediction, which diversifies the data and captures the nonlinear interactions between these factors and power consumption, making the predictions more accurate. The high-frequency and low-frequency components are predicted separately by a combined model chosen according to the characteristics of each component and are then integrated, which exploits the suitability of the extreme gradient boosting tree (XGBoost) for high-frequency, complex components and of ridge regression for low-frequency, stable components, so that the prediction process is efficient and accurate.
FIG. 1 shows a flow chart of a power usage prediction method of an embodiment of the invention.
The method comprises the following steps:
S101, setting a lag order, acquiring power consumption data and influence factor data within the lag order, and forming a training data set in time-series order.
As an embodiment of the present invention, the order is the unit of time at which the historical data are acquired, such as a day, a week or a month; the lag order is the number of orders used for the current power consumption prediction, such as the previous 30 days, the previous 4 weeks or the previous 3 months.
As an embodiment of the present invention, the influence factor data include economic dimension data and meteorological dimension data. The economic dimension data are, for example, the Purchasing Managers' Index (PMI) and the Producer Price Index (PPI). The PMI is an index compiled from surveys of purchasing managers in the corresponding order and is a weighted combination of the new orders index, the production index, the employment index, the supplier delivery index and the main raw material inventory index; it is one of the leading indicators for monitoring macroeconomic trends. Because manufacturing is large in scale and energy-intensive in its power use, fluctuations in production conditions strongly influence the trend of power consumption across society, so the PMI indicates, to a certain extent, changes in the growth of total power consumption. The PPI measures price trends in the production sector and indirectly reflects production demand and business operating conditions; the intensity of industrial production activity is directly related to the growth of industrial power consumption, which in turn determines the growth level of total power consumption. The meteorological dimension data are, for example, temperature, precipitation, relative humidity and other meteorological elements, which are closely related to residential power consumption.
The data within the lag order can be arranged by lag and in chronological order to form a time series containing power consumption data and influence factor data, which is used as the training data set for power consumption prediction.
As an embodiment of the present invention, the monthly power consumption is derived from a local website schedule; monthly power consumption data for the whole of Liaoning Province from January 2010 to June 2020 are selected. Over this period the highest monthly power consumption is 2515.66 ten thousand watt hours, the lowest is 1746.75 thousand watt hours, the mean is 2059.29 thousand watt hours, and the standard deviation is 1622.44 thousand watt hours, so monthly power consumption fluctuates considerably over the period. The monthly Purchasing Managers' Index (PMI) and Producer Price Index (PPI) of Liaoning Province from January 2010 to June 2020 are selected as the economic dimension of the influence factors of power consumption; the data source is the Wind (Wande) database. For the meteorological dimension, the monthly average temperature, precipitation and relative humidity of Liaoning Province from January 2010 to June 2020 are selected; the data source is the China Weather Net. For the time-series processing, the power consumption and influence factor data of the previous four months are used to predict the power consumption of the next month. In this embodiment, the order is one month and the lag order is 4. Of course, the lag orders of the power consumption data and the influence factor data may be the same or different; for example, the power consumption of the previous four months and the influence factor data of the previous three months can be used for prediction.
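The windowing described in this embodiment can be sketched as follows. This is a minimal illustration that assumes the data are held in plain Python lists; the function name `make_lagged_samples` and the toy numbers are hypothetical and are not taken from the embodiment.

```python
def make_lagged_samples(consumption, factors, lag=4):
    """Build (features, target) pairs from the previous `lag` periods.

    consumption: power consumption values, one per period (e.g. per month).
    factors: per-period lists of influence factors (e.g. [PMI, temperature]).
    """
    if len(consumption) != len(factors):
        raise ValueError("consumption and factors must cover the same periods")
    samples = []
    for t in range(lag, len(consumption)):
        features = []
        for k in range(t - lag, t):          # the `lag` periods before period t
            features.append(consumption[k])
            features.extend(factors[k])
        samples.append((features, consumption[t]))  # target: consumption at t
    return samples


# Toy data (made up for illustration): 6 periods, 2 influence factors each.
toy_consumption = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
toy_factors = [[50.0, 1.0], [51.0, 2.0], [52.0, 3.0],
               [53.0, 4.0], [54.0, 5.0], [55.0, 6.0]]
toy_samples = make_lagged_samples(toy_consumption, toy_factors, lag=4)
```

With a lag order of 4, each sample pairs the four preceding periods with the consumption of the following period. Using different lag orders for consumption and influence factors, as the embodiment allows, would need two separate window lengths.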
By introducing multiple factors such as economy and weather related to the electricity consumption, the electricity consumption is predicted, data are diversified, the nonlinear interaction relation between the multiple factors and the electricity consumption is mined, and the predicted data are more accurate.
S102, performing modal decomposition on the training data set to obtain a plurality of intrinsic mode functions (IMFs) and a residual term, and performing data standardization; dividing the standardized IMFs into a high-frequency component and low-frequency components; the high-frequency component is the first IMF obtained through the modal decomposition; the low-frequency components are the IMFs other than the first IMF obtained through the modal decomposition.
The performing modal decomposition on the training set data comprises:
S201, adding white noise to the training data set to generate first time series data.
In this embodiment, a group of data sets $y(t)$, i.e. the first time series data, is obtained by adding different white noise to the original signal, i.e. the training data set $x(t)$.
S202, computing the upper envelope $e_u(t)$ and the lower envelope $e_l(t)$ of the first time series data $y(t)$ from its local maxima and local minima, and computing the mean $m(t)$ of the upper and lower envelopes:
$m(t) = \frac{e_u(t) + e_l(t)}{2}$
where $m(t)$ is the mean of the upper and lower envelopes; $e_u(t)$ is the upper envelope of the first time series data $y(t)$; $e_l(t)$ is the lower envelope of the first time series data $y(t)$.
S203, extracting the difference between the first time series data $y(t)$ and the mean $m(t)$ as an intermediate signal $h(t)$, i.e. $h(t) = y(t) - m(t)$; then judging whether the intermediate signal $h(t)$ satisfies the constraint conditions of an intrinsic mode function (IMF); if so, the intermediate signal is an IMF; otherwise, returning to S202 and executing it again with the intermediate signal as the input data.
In this embodiment, an intrinsic mode function (IMF) must satisfy both of the following conditions:
Condition 1: within the data segment, the number of extreme points and the number of zero-crossing points differ by at most 1; that is, the difference may be 0 or 1;
Condition 2: at any time, the mean $m(t)$ of the upper envelope formed by the local maxima and the lower envelope formed by the local minima is zero, that is, the upper and lower envelopes are locally symmetric about the time axis.
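Condition 1 can be checked by counting, as in the sketch below. The function names are hypothetical, and the two test signals are made up for illustration. Condition 2 is not checked here, since in practice it is usually tested only approximately through a tolerance on the envelope mean.

```python
import math


def count_extrema(h):
    """Number of interior local maxima and minima (strict slope sign changes)."""
    return sum(
        1 for i in range(1, len(h) - 1)
        if (h[i] - h[i - 1]) * (h[i + 1] - h[i]) < 0
    )


def count_zero_crossings(h):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(h, h[1:]) if a * b < 0)


def satisfies_condition_1(h):
    """True if the extrema and zero-crossing counts differ by at most 1."""
    return abs(count_extrema(h) - count_zero_crossings(h)) <= 1


# A zero-mean oscillation passes; the same wave shifted above zero does not,
# because it still has extrema but no longer crosses zero.
wave = [math.sin(2 * math.pi * t / 20 + 0.3) for t in range(100)]
shifted = [v + 2.0 for v in wave]
```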
S204, subtracting the intrinsic mode function (IMF) from the first time series data $y(t)$ to obtain a residual, and judging whether the residual contains only one extremum; if so, the decomposition yields a plurality of IMFs and one residual; otherwise, returning to S202 with the residual as the input data.
In this embodiment, once the first IMF $c_1$ has been obtained, the corresponding residual $r_1$ is computed, i.e. $r_1 = y(t) - c_1$. It is then determined whether the residual $r_1$ contains only one extremum; if not, S202 is executed again with $r_1$ as the input data to obtain the subsequent residuals, until the residual $r_n$ contains only one extremum and the iteration stops. The residual $r_n$ is the residual term obtained by the modal decomposition.
In summary, performing modal decomposition on the training set data gives:
$y(t) = \sum_{j=1}^{n} c_j + r_n$
where $c_j$ are the IMFs obtained before the residual $r_n$ is reached.
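Steps S201 to S204 can be sketched as below. This is a simplified illustration under several assumptions that are not in the text: envelopes are piecewise-linear rather than spline-based, each IMF is extracted with a fixed number of sifting passes instead of testing the two IMF conditions, only one white-noise realization is used (an ensemble variant would average over many), and the names `decompose`, `n_sift`, `noise_std` and `max_imfs` are hypothetical.

```python
import math
import random


def extrema(y):
    """Indices of interior local maxima and minima."""
    maxima = [i for i in range(1, len(y) - 1) if y[i - 1] < y[i] >= y[i + 1]]
    minima = [i for i in range(1, len(y) - 1) if y[i - 1] > y[i] <= y[i + 1]]
    return maxima, minima


def envelope(y, idx):
    """Piecewise-linear envelope through y at idx, pinned to both end points."""
    knots = sorted(set([0] + idx + [len(y) - 1]))
    env = []
    for a, b in zip(knots, knots[1:]):
        for t in range(a, b):
            w = (t - a) / (b - a)
            env.append((1 - w) * y[a] + w * y[b])
    env.append(y[-1])
    return env


def sift(y, n_sift=10):
    """S202-S203: repeatedly subtract the envelope mean m(t) from the signal."""
    h = list(y)
    for _ in range(n_sift):
        maxima, minima = extrema(h)
        if not maxima or not minima:
            break
        upper, lower = envelope(h, maxima), envelope(h, minima)
        h = [v - (u + l) / 2 for v, u, l in zip(h, upper, lower)]
    return h


def decompose(x, noise_std=0.1, max_imfs=6, seed=0):
    rng = random.Random(seed)
    y = [v + rng.gauss(0, noise_std) for v in x]       # S201: add white noise
    imfs, residual = [], list(y)
    while len(imfs) < max_imfs:
        maxima, minima = extrema(residual)
        if len(maxima) + len(minima) <= 1:             # S204: stop criterion
            break
        c = sift(residual)
        imfs.append(c)
        residual = [r - ci for r, ci in zip(residual, c)]
    return y, imfs, residual


# Toy signal (made up): a seasonal oscillation plus a slow upward trend.
signal = [math.sin(2 * math.pi * t / 12) + 0.02 * t for t in range(120)]
noisy, imfs, residual = decompose(signal)
high_frequency, low_frequency = imfs[0], imfs[1:]
```

By construction the IMFs and the final residual sum back to the noisy series, which mirrors the decomposition identity above. The split on the last line follows the rule that the first IMF is the high-frequency component and the remaining IMFs are the low-frequency components.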
Further, data standardization is applied to the result of the modal decomposition.
The data standardization includes:
scaling the data after the modal decomposition so that they fall within a specific interval; for example, in this embodiment, the data are mapped onto the interval [0, 1] by normalization, with the transformation formula:
$x' = \frac{x - x_{min}}{x_{max} - x_{min}}$
where $x$ is the original feature; $x_{min}$ is the feature minimum; $x_{max}$ is the feature maximum; $x'$ is the transformed feature.
Then, the standardized intrinsic mode functions (IMFs) are divided into a high-frequency component and low-frequency components:
the first IMF $c_1$ obtained by the modal decomposition is taken as the high-frequency component;
the IMFs other than the first IMF $c_1$, i.e. $c_2$ to $c_j$, are taken as the low-frequency components.
Through this process, the result of the modal decomposition is divided into a high-frequency component and low-frequency components, so that a prediction model can be established for each kind of component, which makes the prediction more accurate and efficient.
S103, establishing a prediction model separately for the high-frequency component and the low-frequency components to obtain a prediction result for each component, integrating the prediction results of the components, and performing the inverse transformation of the data standardization to obtain the power consumption prediction result for the target time.
As an embodiment of the present invention, for the high-frequency component, a high-frequency component prediction model is established using an extreme gradient boosting tree (XGBoost) to obtain the prediction result of the high-frequency component. Specifically:
the high-frequency component is set as $D = \{(x_i, y_i)\}$, where $i = 1, 2, \ldots, n$ and $n$ is the number of samples, and an extreme gradient boosting tree model is established as the high-frequency component prediction model, which is:
$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in F$
where $F$ is the set of all regression trees; $f(x)$ is a function in $F$;
the objective function of the high-frequency component prediction model is:
$f_{obj}(\theta) = L(\theta) + \omega(\theta)$
$L(\theta) = \sum_{i=1}^{n} l(y_i, \hat{y}_i)$
$\omega(\theta) = \sum_{k=1}^{K} \Omega(f_k)$
where $L(\theta)$ is the error term (error function); $\omega(\theta)$ is the regularization term, which measures the complexity of the model; $K$ is the number of regression trees; $k$ is the index of a regression tree.
When learning from the training data, each round keeps the existing model unchanged and adds one new function $f$; the corresponding objective function is evaluated, and if adding the new function reduces the objective function, the function is added to the model. A Taylor expansion of the objective function converts the problem into minimizing a quadratic function, from which the optimal solution is obtained.
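The Taylor-expansion step can be written out as follows. This is the standard second-order derivation used for gradient boosted trees, offered as an illustration rather than as the exact formulation of this embodiment. The per-sample loss $l$, the superscript $(t)$ for the boosting round, the leaf sets $I_j$, the leaf weights $w_j$ and the penalty form with coefficients $\gamma$ and $\mu$ are assumptions introduced here.

```latex
% Objective when the t-th tree f_t is added to the fixed model \hat{y}^{(t-1)}
f_{obj}^{(t)} = \sum_{i=1}^{n} l\bigl(y_i,\ \hat{y}_i^{(t-1)} + f_t(x_i)\bigr) + \Omega(f_t) + \text{const}

% Second-order Taylor expansion around \hat{y}_i^{(t-1)}
f_{obj}^{(t)} \approx \sum_{i=1}^{n} \Bigl[ g_i f_t(x_i) + \tfrac{1}{2} h_i f_t(x_i)^2 \Bigr] + \Omega(f_t) + \text{const},
\qquad
g_i = \frac{\partial\, l(y_i, \hat{y}_i^{(t-1)})}{\partial\, \hat{y}_i^{(t-1)}},
\quad
h_i = \frac{\partial^2 l(y_i, \hat{y}_i^{(t-1)})}{\partial\, (\hat{y}_i^{(t-1)})^2}

% Assumed penalty: \Omega(f_t) = \gamma T + \tfrac{1}{2} \mu \sum_{j=1}^{T} w_j^2,
% with T leaves and I_j the set of samples falling in leaf j.
% The objective is then quadratic in each leaf weight w_j, with minimizer:
G_j = \sum_{i \in I_j} g_i, \qquad H_j = \sum_{i \in I_j} h_i, \qquad
w_j^{*} = -\frac{G_j}{H_j + \mu}
```

Because each leaf weight enters only through a quadratic, the minimization has a closed form per leaf, which is what makes evaluating a candidate tree cheap.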
As an embodiment of the present invention, for the low-frequency components, a low-frequency component prediction model is established using ridge regression to obtain the prediction result of the low-frequency components.
The loss function of the low-frequency component prediction model is:
$J(\beta) = \sum (y - X\beta)^2 + \sum \lambda \beta^2$
where $\beta$ is the regression coefficient; $\lambda$ is a constant coefficient that needs to be tuned; $X$ is the feature vector; $y$ is the label value;
the tuning process for the constant coefficient $\lambda$ is as follows:
letting the regression coefficient $\beta$ satisfy
$\frac{\partial J(\beta)}{\partial \beta} = 0$
gives $\beta = (X^T X + \lambda I)^{-1} X^T y$, where $I$ is the identity matrix. The regression coefficient $\beta$ is a function of the constant coefficient $\lambda$; for $\lambda \in [0, \infty)$, the curve of $\beta$ against $\lambda$ in a rectangular coordinate system is called the ridge trace. The $\lambda$ value sought is the one at which $\beta$ becomes stable; once this $\lambda$ value is obtained, the low-frequency component prediction model is obtained.
As model complexity increases, the fit on the training set improves, that is, the bias of the model decreases; at the same time, the variance of the model (the variance of the regression coefficients) increases. Ridge regression is a biased-estimation regression method designed for the analysis of collinear data. It is essentially a modified least-squares estimation: by giving up the unbiasedness of least squares, at the cost of losing some information and precision, it obtains regression coefficients that are more practical and reliable. The key to ridge regression is finding a reasonable λ value that balances the variance and the bias of the model.
Further, after the prediction results of the components are obtained, the prediction results of the high-frequency component and the low-frequency component are integrated, and the integrated data undergo the inverse of the data standardization to obtain the power consumption prediction result for the target time. The inverse transformation scales the prediction result of each component back to the original proportion, using the formula:
$$y'=y\,(y_{max}-y_{min})+y_{min}$$
where y is the predicted label value; $y_{min}$ is the minimum value of the label; $y_{max}$ is the maximum value of the label; and y′ is the final predicted value after the transformation. The original proportion is the data scale after modal decomposition and before data standardization.
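The inverse transformation can be checked with a short round trip. The sketch below assumes that the forward standardization is the usual min-max scaling (x − min)/(max − min), which is the counterpart of the inverse formula; the function names and the sample values are illustrative and not taken from the application.

```python
def min_max_normalize(values):
    """Scale values to [0, 1]; also return the (min, max) needed to undo the scaling.

    Assumes the values are not all identical (otherwise max - min is zero).
    """
    v_min, v_max = min(values), max(values)
    scaled = [(v - v_min) / (v_max - v_min) for v in values]
    return scaled, v_min, v_max


def inverse_min_max(y, y_min, y_max):
    """Inverse transformation: y' = y * (y_max - y_min) + y_min."""
    return y * (y_max - y_min) + y_min


labels = [120.0, 150.0, 180.0, 240.0]  # illustrative label values
scaled, y_min, y_max = min_max_normalize(labels)
restored = [inverse_min_max(s, y_min, y_max) for s in scaled]
print(scaled)
print(restored)
```

Note that the minimum and maximum must be the ones computed when the data were standardized, so they have to be stored alongside the trained model.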
In the above embodiment, the errors of the ensemble empirical mode decomposition prediction model are compared with those of the limit lifting tree (extreme gradient boosting tree) prediction model, the linear regression prediction model, the ridge regression prediction model, and the artificial neural network prediction model, as shown in Table 1.
Model | Root mean square error (RMSE) | Mean absolute percentage error (MAPE)
Ensemble empirical mode decomposition | 77.720 | 3.175%
Limit lifting tree | 112.520 | 4.655%
Linear regression | 122.688 | 5.090%
Ridge regression | 104.464 | 4.286%
Artificial neural network | 118.287 | 5.151%
Table 1
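The application does not spell out how the two error measures are computed. The sketch below uses their standard definitions, which is an assumption; the function names and the sample numbers are illustrative and are not the data behind Table 1.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: sqrt(mean((actual - predicted)^2))."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; assumes no actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


actual = [100.0, 200.0]      # illustrative observations
predicted = [110.0, 180.0]   # illustrative forecasts
print(rmse(actual, predicted), mape(actual, predicted))
```

RMSE is expressed in the units of the power consumption data, while MAPE is scale-free, which is why the two columns of the table are not directly comparable with each other.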
The embodiment of the invention provides an effective hybrid combined model. First, the power consumption time series is decomposed into a plurality of connotative modal components (IMFs) by ensemble empirical mode decomposition. The high-frequency and low-frequency components are then distinguished, and a prediction model is established for each using a different method, so that the characteristics of each decomposed component are fully exploited. In predicting each component, the invention also analyzes multiple factors related to power consumption (economic factors and meteorological factors). Because a combined model is used to predict the components of different frequencies, the shortcoming of relying on a single model is overcome. Power consumption is closely related to factors such as the macroeconomy and the weather, so integrating multiple kinds of feature information when predicting power consumption improves the accuracy of the model.
In the decomposition-integration framework for time series prediction, the decomposition of the time series directly affects the integrated prediction result. To make up for the fact that traditional decomposition-integration prediction models do not take physical factors and meteorological factors into account, multiple factors related to power consumption, such as economic and meteorological factors, are introduced and used to predict the decomposed power consumption. The power consumption is divided into high-frequency and low-frequency characteristic components, prediction models are established separately for the two, and the results are finally aggregated. This makes full use of the suitability of the limit lifting tree for high-frequency, complex components and of ridge regression for low-frequency, stable components, so that the prediction process is efficient and accurate.
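The decompose, predict, and aggregate flow can be outlined as follows. This is only a structural sketch under stated assumptions: the components are taken as already decomposed, the first component is treated as the high-frequency one and the rest as low-frequency (the split stated in claim 1), and the two predictors are trivial stand-ins for the limit lifting tree and ridge regression models. All names are hypothetical, and standardization and its inverse are omitted.

```python
from typing import Callable, List, Sequence, Tuple

Series = Sequence[float]
Predictor = Callable[[Series], float]  # maps one component's history to a forecast


def split_components(components: List[Series]) -> Tuple[Series, List[Series]]:
    """Treat the first component as high-frequency and the remaining ones as low-frequency."""
    if not components:
        raise ValueError("at least one component is required")
    return components[0], list(components[1:])


def forecast(components: List[Series], predict_high: Predictor, predict_low: Predictor) -> float:
    """Predict each component with the model for its group, then sum the predictions."""
    high, lows = split_components(components)
    return predict_high(high) + sum(predict_low(c) for c in lows)


def last_value(series: Series) -> float:
    """Stand-in for the high-frequency model (hypothetical)."""
    return series[-1]


def mean_value(series: Series) -> float:
    """Stand-in for the low-frequency model (hypothetical)."""
    return sum(series) / len(series)


components = [[1.0, 2.0, 3.0], [4.0, 4.0], [10.0, 20.0]]  # illustrative components
print(forecast(components, last_value, mean_value))
```

Passing the predictors in as callables keeps the aggregation step independent of which model is chosen for each frequency group.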
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are exemplary embodiments and that the acts and modules illustrated are not necessarily required to practice the invention.
The above is a description of method embodiments, and the embodiments of the present invention are further described below by way of apparatus embodiments.
As shown in fig. 3, the apparatus 300 includes:
the data acquisition module 310 is configured to set an advance order, acquire power consumption data and influence factor data within the advance order, and form a training data set according to a time sequence;
the data decomposition module 320 is configured to perform modal decomposition on the training data set to obtain a plurality of connotative modal components and a residual term, and perform data standardization; dividing a plurality of connotative modal components after data standardization into high-frequency components and low-frequency components;
and the data prediction module 330 is configured to respectively establish prediction models for the high-frequency component and the low-frequency component to obtain prediction results of the components, integrate the prediction results of the components, and perform data standardization inverse transformation to obtain a power consumption prediction result of the target time.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the described module may refer to the corresponding process in the foregoing method embodiment, and is not described herein again.
As shown in fig. 4, the device includes a Central Processing Unit (CPU) that can perform various appropriate actions and processes according to computer program instructions stored in a Read Only Memory (ROM) or computer program instructions loaded from a storage unit into a Random Access Memory (RAM). In the RAM, various programs and data required for the operation of the device can also be stored. The CPU, ROM, and RAM are connected to each other via a bus. An input/output (I/O) interface is also connected to the bus.
A plurality of components in the device are connected to the I/O interface, including: an input unit such as a keyboard, a mouse, etc.; an output unit such as various types of displays, speakers, and the like; storage units such as magnetic disks, optical disks, and the like; and a communication unit such as a network card, modem, wireless communication transceiver, etc. The communication unit allows the device to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The processing unit executes the respective methods and processes described above, for example, methods S101 to S103. For example, in some embodiments, methods S101-S103 may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as a storage unit. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device via ROM and/or the communication unit. When the computer program is loaded into RAM and executed by the CPU, one or more of the steps of methods S101-S103 described above may be performed. Alternatively, in other embodiments, the CPU may be configured to perform methods S101-S103 by any other suitable means (e.g., by way of firmware).
The functions described above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include a field programmable gate array (FPGA), an application specific integrated circuit (ASIC), an application specific standard product (ASSP), a system on a chip (SOC), a complex programmable logic device (CPLD), and the like.
Program code for implementing the methods of the present invention may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the invention. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation can also be implemented in multiple implementations separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims (10)

1. A power consumption prediction method is characterized by comprising the following steps:
setting an advance order, acquiring power consumption data and influence factor data in the advance order, and forming a training data set according to a time sequence;
performing modal decomposition on the training data set to obtain a plurality of connotative modal components and a residual error item, and performing data standardization; dividing a plurality of connotative modal components after data standardization into high-frequency components and low-frequency components; the high-frequency component is a first connotative modal component obtained through the modal decomposition; the low-frequency component is other components except the first connotative modal component obtained through modal decomposition;
and respectively establishing a prediction model for the high-frequency component and the low-frequency component to obtain a prediction result of each component, integrating the prediction results of each component, and performing data standardization inverse transformation to obtain a power consumption prediction result of the target time.
2. The method of claim 1, wherein the performing modal decomposition of training set data comprises:
S201, adding white noise into the training data set to generate first time series data;
S202, calculating an upper envelope and a lower envelope of the first time series data according to the local maximum value and the local minimum value of the first time series data, and calculating an average value of the upper envelope and the lower envelope;
S203, extracting the difference between the training data set and the average value as an intermediate signal, judging whether the intermediate signal meets the constraint condition of the connotation modal component, and if so, taking the intermediate signal as the connotation modal component; otherwise, returning to S202 by taking the intermediate signal as a data basis;
S204, subtracting the first time series data from the connotative modal component to obtain a residual error, judging whether the residual error only contains one extreme value, and if so, obtaining a plurality of connotative modal components and one residual error; otherwise, taking the residual error as a data base, and returning to S202.
3. The method of claim 2, wherein the constraints on the connotative modal components comprise:
within the data section, the difference between the number of extreme points and the number of zero-crossing points is not more than 1, and
at any time, the average of the upper envelope formed by the local maximum points and the lower envelope formed by the local minimum points is zero.
4. The method of claim 1, wherein for the high-frequency component, building a high-frequency component prediction model using a limit-raised tree to obtain a prediction result of the high-frequency component, comprises:
setting the high-frequency component as $D=\{(x_i, y_i)\}$ and establishing a limit lifting tree model as the high-frequency component prediction model, where i = 1, 2, …:
Figure FDA0003241807060000021
Wherein, F is the set of all corresponding regression trees; f (x) is a function in F;
the objective function of the high-frequency component prediction model is as follows:
$$f_{obj}(\theta)=L(\theta)+\omega(\theta)$$
Figure FDA0003241807060000022
Figure FDA0003241807060000023
wherein L(θ) is an error term; ω(θ) is a regularization term; K is the number of regression trees; and k is the index of the regression tree.
5. The method of claim 1, wherein for the low frequency component, a ridge regression is used to build a low frequency component prediction model, and a prediction result of the low frequency component is obtained; wherein, the loss function of the low-frequency component prediction model is as follows:
$$J(\beta)=\sum (y-X\beta)^{2}+\sum \lambda\beta^{2}$$
wherein β is a regression coefficient; λ is a constant coefficient; X is a feature vector; y is a label value;
let the regression coefficient beta satisfy
Figure FDA0003241807060000031
to obtain $\beta=(X^{T}X+\lambda I)^{-1}X^{T}y$, where I is an identity matrix; the regression coefficient β is a function of the constant coefficient λ, and the constant coefficient λ value corresponding to the point at which the regression coefficient β tends to be stable is obtained, so that the low-frequency component prediction model is obtained.
6. The method of claim 1, wherein the data normalization comprises: scaling and transforming the data after modal decomposition according to a transformation formula:
Figure FDA0003241807060000032
wherein x is an original feature; $x_{min}$ is the feature minimum; $x_{max}$ is the feature maximum; and x′ is the transformed feature.
7. The method of claim 1, wherein the data normalization inverse transform comprises:
amplifying the prediction result of each component to an original proportion, wherein a transformation formula is as follows:
$$y'=y\,(y_{max}-y_{min})+y_{min}$$
wherein y is a predicted label value; $y_{min}$ is the minimum value of the label; $y_{max}$ is the maximum value of the label; and y′ is the final predicted value after the transformation.
8. A power consumption amount prediction apparatus characterized by comprising:
the data acquisition module is used for setting an advance order, acquiring power consumption data and influence factor data in the advance order, and forming a training data set according to a time sequence;
the data decomposition module is used for carrying out modal decomposition on the training data set to obtain a plurality of connotative modal components and a residual error item and carrying out data standardization; dividing a plurality of connotative modal components after data standardization into high-frequency components and low-frequency components; the high-frequency component is a first connotative modal component obtained through the modal decomposition; the low-frequency component is other components except the first connotative modal component obtained through modal decomposition;
and the data prediction module is used for respectively establishing prediction models for the high-frequency component and the low-frequency component to obtain the prediction results of the components, integrating the prediction results of the components, and carrying out data standardization inverse transformation to obtain the power consumption prediction result of the target time.
9. An electronic device comprising a memory and a processor, the memory having stored thereon a computer program, wherein the processor, when executing the program, implements the method of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
CN202111020639.2A 2021-05-19 2021-09-01 Power consumption prediction method, device and equipment Pending CN113869556A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110548192 2021-05-19
CN202110548192X 2021-05-19

Publications (1)

Publication Number Publication Date
CN113869556A true CN113869556A (en) 2021-12-31

Family

ID=78989167

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111020639.2A Pending CN113869556A (en) 2021-05-19 2021-09-01 Power consumption prediction method, device and equipment

Country Status (1)

Country Link
CN (1) CN113869556A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115471017A (en) * 2022-11-15 2022-12-13 浙江大学 Regional microgrid interconnection optimization method and system based on mutual power assistance
CN116296243A (en) * 2023-03-03 2023-06-23 西南交通大学 Pneumatic identification method based on large-size nuclear dense blocks
CN116296243B (en) * 2023-03-03 2024-02-23 西南交通大学 Pneumatic identification method based on large-size nuclear dense blocks

Similar Documents

Publication Publication Date Title
US11398000B2 (en) Methods and systems for machine-learning for prediction of grid carbon emissions
CN108846530B (en) Short-term load prediction method based on clustering-regression model
US20190228329A1 (en) Prediction system and prediction method
CN110610280A (en) Short-term prediction method, model, device and system for power load
EP3550499A1 (en) Prediction system and prediction method
CN111680841B (en) Short-term load prediction method, system and terminal equipment based on principal component analysis
Khan et al. Genetic algorithm based optimized feature engineering and hybrid machine learning for effective energy consumption prediction
CN113869556A (en) Power consumption prediction method, device and equipment
Porteiro et al. Electricity demand forecasting in industrial and residential facilities using ensemble machine learning
CN108053048A (en) A kind of gradual photovoltaic plant ultra-short term power forecasting method of single step and system
CN114372360A (en) Method, terminal and storage medium for power load prediction
CN111191811A (en) Cluster load prediction method and device and storage medium
CN104573877A (en) Power distribution network equipment demand prediction and quantitative method and system
CN114595861A (en) MSTL (modeling, transformation, simulation and maintenance) and LSTM (least Square TM) model-based medium-and-long-term power load prediction method
Guo et al. Power demand forecasting and application based on SVR
Zhao et al. Short-term microgrid load probability density forecasting method based on k-means-deep learning quantile regression
CN106600029A (en) Macro-economy predictive quantization correction method based on electric power data
CN116091118A (en) Electricity price prediction method, device, equipment, medium and product
CN111160865A (en) Workflow management method and device
CN116470491A (en) Photovoltaic power probability prediction method and system based on copula function
CN116384622A (en) Carbon emission monitoring method and device based on electric power big data
CN111950752A (en) Photovoltaic power station generating capacity prediction method, device and system and storage medium thereof
El Sayed et al. A combined effective time series model based on clustering and whale optimization algorithm for forecasting smart meters electricity consumption
Zhao et al. Internet-of-thing based real-time electrical market monitoring system design
CN111967918A (en) System model for predicting electricity price based on support vector regression algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination