Disclosure of Invention
The invention aims to provide a method and a system for predicting the power load interval of a transformer area based on online learning, such that the actual value of the power load falls within the predicted interval while the interval is kept as narrow as possible, thereby meeting the practical demands of transformer area load interval prediction and improving the prediction accuracy of the model.
To achieve the above object, in a first aspect, the present invention provides a method for predicting the power load interval of a transformer area based on online learning. The method includes:
Step S1: in an offline environment, acquire recent historical data related to the prediction of the transformer area power load interval as the input of the model.
Step S2: train on the recent historical data input to the model using a double-layer LSTM time-series neural network combined with the load interval prediction loss function, to obtain a load interval prediction model.
Step S3: load the load interval prediction model in the transformer area environment.
Step S4: when day x ends, the smart meter collects the power load data of that day, and the data of the latest m days including that day are preprocessed, where x ≥ 1 and m is the number of days of training set data used when the load interval prediction model performs online learning.
Step S5: in the online learning process, use the data preprocessed in step S4 as the training set, perform online learning on the load interval prediction model loaded in step S3, and train and optimize it to obtain the latest load interval prediction model.
Step S6: preprocess the data of the latest n days including the current day, where n is the time series length and n ≤ m.
Step S7: load the latest load interval prediction model and input the data preprocessed in step S6 to obtain the model's prediction of the load interval after day x.
In an embodiment of the present invention, the method for predicting the power load interval of a transformer area based on online learning further includes step S8: when day x+1 ends, repeat steps S3 to S6 to obtain the model's load interval prediction after day x+1.
In one embodiment of the present invention, the recent historical data are in units of days and include transformer area power load data, holiday data, lunar calendar data, and 24 solar terms data; these data are preprocessed and combined as the input of the model.
In an embodiment of the present invention, step S2 includes: for M days of historical power load data of the transformer area, obtaining a vector representation of each day through step S1; constructing time series of length n, in which the vector representations of n consecutive days are combined into a 24 x n matrix, giving M-n+1 matrices in total; feeding the resulting M-n+1 time series as input into the double-layer LSTM neural network for training; and using the load interval prediction loss function as the training objective of the double-layer LSTM.
In a second aspect, an embodiment of the present invention further provides a system for predicting the power load interval of a transformer area based on online learning. The system includes an acquisition module, a training module, a loading module, a first preprocessing module, a learning module, a second preprocessing module, and a prediction module.
The acquisition module is configured to acquire, in an offline environment, recent historical data related to the prediction of the transformer area power load interval as the input of the model.
The training module is configured to train on the recent historical data input to the model using a double-layer LSTM time-series neural network combined with the load interval prediction loss function, to obtain a load interval prediction model.
The loading module is configured to load the load interval prediction model in the transformer area environment.
The first preprocessing module is configured to acquire the power load data of day x when that day ends and to preprocess the data of the latest m days including that day, where x ≥ 1 and m is the number of days of training set data used when the load interval prediction model performs online learning.
The learning module is configured to use, in the online learning process, the data processed by the first preprocessing module as the training set, perform online learning on the load interval prediction model loaded by the loading module, and train and optimize it to obtain the latest load interval prediction model.
The second preprocessing module is configured to preprocess the data of the latest n days including the current day, where n is the time series length and n ≤ m.
The prediction module is configured to load the latest load interval prediction model and input the data processed by the second preprocessing module to obtain the model's prediction of the load interval after day x.
In one embodiment of the present invention, the recent historical data are in units of days and include transformer area power load data, holiday data, lunar calendar data, and 24 solar terms data; these data are preprocessed and combined as the input of the model.
In an embodiment of the present invention, training on the recent historical data input to the model using a double-layer LSTM time-series neural network combined with the load interval prediction loss function to obtain a load interval prediction model includes: for M days of historical power load data of the transformer area, obtaining a vector representation of each day through step S1; constructing time series of length n, in which the vector representations of n consecutive days are combined into a 24 x n matrix, giving M-n+1 matrices in total; feeding the resulting M-n+1 time series as input into the double-layer LSTM neural network for training; and using the load interval prediction loss function as the training objective of the double-layer LSTM.
Compared with the prior art, the method and system for predicting the power load interval of a transformer area based on online learning use a specially designed loss function when predicting the load interval, so that the actual power load value falls within the interval as often as possible while the prediction interval is as narrow as possible, which meets the practical demands of transformer area load interval prediction. Meanwhile, because climate data cannot be obtained in the transformer area environment, lunar calendar data and 24 solar terms data are used in place of climate data as model input, which improves the prediction accuracy of the model. In addition, because the power load pattern of a transformer area changes frequently, the method is applied in the transformer area environment in an online learning mode, so that changes in the load pattern are learned in time and the accuracy of the interval prediction does not degrade after the model has been deployed for a long time.
Detailed Description
The following detailed description of the present invention is provided in conjunction with the accompanying drawings, but it should be understood that the scope of the present invention is not limited to the specific embodiments.
Throughout the specification and claims, unless explicitly stated otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element or component but not the exclusion of any other element or component.
Fig. 1 is a flowchart of a method for predicting the power load interval of a transformer area based on online learning according to an embodiment of the present invention. Fig. 2 is a logic diagram of the method for predicting the power load interval of a transformer area based on online learning according to an embodiment of the present invention.
As shown in Figs. 1 and 2, in a first aspect, a preferred embodiment of the present invention provides a method for predicting the power load interval of a transformer area based on online learning. The method comprises the following steps:
Step S1: in an offline environment, acquire recent historical data related to the prediction of the transformer area power load interval as the input of the model.
Step S2: train on the recent historical data input to the model using a double-layer LSTM time-series neural network combined with the load interval prediction loss function, to obtain a load interval prediction model.
Step S3: load the load interval prediction model in the transformer area environment.
Step S4: when day x ends, the smart meter collects the power load data of that day, and the data of the latest m days including that day are preprocessed, where x ≥ 1 and m is the number of days of training set data used when the load interval prediction model performs online learning.
Step S5: in the online learning process, use the data preprocessed in step S4 as the training set, perform online learning on the load interval prediction model loaded in step S3, and train and optimize it to obtain the latest load interval prediction model.
Step S6: preprocess the data of the latest n days including the current day, where n is the time series length and n ≤ m.
Step S7: load the latest load interval prediction model and input the data preprocessed in step S6 to obtain the model's prediction of the load interval after day x.
In an embodiment of the present invention, the method for predicting the power load interval of a transformer area based on online learning further includes step S8: when day x+1 ends, repeat steps S3 to S6 to obtain the model's load interval prediction after day x+1.
In one embodiment of the present invention, the recent historical data are in units of days and include transformer area power load data, holiday data, lunar calendar data, and 24 solar terms data; these data are preprocessed and combined as the input of the model.
In an embodiment of the present invention, step S2 includes: for M days of historical power load data of the transformer area, obtaining a vector representation of each day through step S1; constructing time series of length n, in which the vector representations of n consecutive days are combined into a 24 x n matrix, giving M-n+1 matrices in total; feeding the resulting M-n+1 time series as input into the double-layer LSTM neural network for training; and using the load interval prediction loss function as the training objective of the double-layer LSTM.
In a second aspect, an embodiment of the invention further provides a system for predicting the power load interval of a transformer area based on online learning. The system comprises an acquisition module, a training module, a loading module, a first preprocessing module, a learning module, a second preprocessing module, and a prediction module.
The acquisition module is configured to acquire, in an offline environment, recent historical data related to the prediction of the transformer area power load interval as the input of the model.
The training module is configured to train on the recent historical data input to the model using a double-layer LSTM time-series neural network combined with the load interval prediction loss function, to obtain a load interval prediction model.
The loading module is configured to load the load interval prediction model in the transformer area environment.
The first preprocessing module is configured to acquire the power load data of day x when that day ends and to preprocess the data of the latest m days including that day, where x ≥ 1 and m is the number of days of training set data used when the load interval prediction model performs online learning.
The learning module is configured to use, in the online learning process, the data processed by the first preprocessing module as the training set, perform online learning on the load interval prediction model loaded by the loading module, and train and optimize it to obtain the latest load interval prediction model.
The second preprocessing module is configured to preprocess the data of the latest n days including the current day, where n is the time series length and n ≤ m.
The prediction module is configured to load the latest load interval prediction model and input the data processed by the second preprocessing module to obtain the model's prediction of the load interval after day x.
In one embodiment of the present invention, the recent historical data are in units of days and include transformer area power load data, holiday data, lunar calendar data, and 24 solar terms data; these data are preprocessed and combined as the input of the model.
In an embodiment of the present invention, training on the recent historical data input to the model using a double-layer LSTM time-series neural network combined with the load interval prediction loss function to obtain a load interval prediction model includes: for M days of historical power load data of the transformer area, obtaining a vector representation of each day through step S1; constructing time series of length n, in which the vector representations of n consecutive days are combined into a 24 x n matrix, giving M-n+1 matrices in total; feeding the resulting M-n+1 time series as input into the double-layer LSTM neural network for training; and using the load interval prediction loss function as the training objective of the double-layer LSTM.
In practical application, the online-learning-based transformer area power load interval prediction method of the invention is implemented through the following steps:
Step 1: obtain recent historical data related to the prediction of the power load interval of a given transformer area. The data are in units of days and comprise transformer area power load data, holiday data, lunar calendar data, and 24 solar terms data. The data are preprocessed and then combined as the input of the model. (Lunar calendar data and 24 solar terms data are correlated with climate, which ensures that climate information can effectively be used in the transformer area environment.)
For one day's data, the preprocessing comprises three parts:
(1) Transformer area power load data: the power load value of the day is normalized, using the mean and variance of all historical power load data.
(2) Holiday data: days are divided into four categories (weekends, workdays, legal holidays, and holidays) and one-hot encoded; for example, a workday is represented as [0,1,0,0] and a holiday as [0,0,0,1].
(3) 24 solar terms data: a year is divided by 24 solar terms, corresponding to a vector of length 24 in which the first element represents the first solar term, the second element the second solar term, and so on. All elements are initially zero. The element values are computed from the distance in days between the given day and its two neighboring solar terms. For example, if a day lies between the second and third solar terms, a days after the second and b days before the third, then the element for the second solar term is b/(a+b), the element for the third solar term is a/(a+b), and the remaining elements are 0, so the 24 solar terms data are represented as [0, b/(a+b), a/(a+b), 0, ..., 0, 0].
The vectors from the three preprocessing parts are concatenated to obtain a vector of size 29 (1+4+24) that represents the information of the day.
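The three preprocessing parts above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of the invention: the function names, the category labels and their order (inferred from the two one-hot examples), the z-score form of the normalization, and the wrap-around from the 24th solar term back to the first are all assumptions.

```python
import math
from typing import List

# Assumed category order, inferred from the examples [0,1,0,0] and [0,0,0,1].
DAY_TYPES = ["weekend", "workday", "legal_holiday", "holiday"]


def normalize_load(load: float, mean: float, variance: float) -> float:
    """Standardize one day's load with the mean and variance of all history.

    Assumption: normalization is a z-score, (x - mean) / sqrt(variance).
    """
    return (load - mean) / math.sqrt(variance)


def one_hot_day_type(day_type: str) -> List[float]:
    """Encode the day category as a one-hot vector of length 4."""
    vec = [0.0] * len(DAY_TYPES)
    vec[DAY_TYPES.index(day_type)] = 1.0
    return vec


def solar_term_vector(prev_term: int, a: int, b: int) -> List[float]:
    """Encode the position of a day between two neighboring solar terms.

    prev_term: 0-based index of the solar term most recently passed.
    a: days since that solar term; b: days until the next one.
    """
    vec = [0.0] * 24
    vec[prev_term] = b / (a + b)
    # Assumption: after the 24th solar term the next one is the 1st.
    vec[(prev_term + 1) % 24] = a / (a + b)
    return vec


def day_vector(load: float, mean: float, variance: float,
               day_type: str, prev_term: int, a: int, b: int) -> List[float]:
    """Concatenate the three parts into one vector of size 1 + 4 + 24 = 29."""
    return ([normalize_load(load, mean, variance)]
            + one_hot_day_type(day_type)
            + solar_term_vector(prev_term, a, b))


# Hypothetical day: a workday, 5 days after the 2nd and 10 days before the 3rd solar term.
example = day_vector(120.0, 100.0, 400.0, "workday", prev_term=1, a=5, b=10)
```

The two non-zero solar-term weights always sum to 1, so the closer a day is to a solar term, the larger that term's weight.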
Step 2: train on the preprocessed data using a double-layer LSTM time-series neural network combined with the loss function designed by the invention for interval prediction, to obtain a load interval prediction model.
The specific process is as follows:
(1) For M days of historical power load data of the transformer area, a vector representation of each day is obtained through step 1.
(2) Time series of length n are constructed, i.e. the vector representations of n consecutive days are combined into a 24 x n matrix, giving M-n+1 such matrices in total.
(3) The resulting M-n+1 time series are fed into the model as input for training. Any time-series model may be used; the invention takes the double-layer LSTM as an example.
(4) In training the time-series model, the loss function defined by the invention is used as the objective.
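The window construction in item (2) can be sketched as follows. This is an illustrative sketch only: the function name `build_windows` and the dummy data are hypothetical, and each window is stored as n rows of day vectors (the transpose of a features-by-n matrix), whatever the per-day vector size produced by step 1.

```python
from typing import List, Sequence


def build_windows(days: Sequence[Sequence[float]], n: int) -> List[List[List[float]]]:
    """Slide a window of n consecutive days over M daily vectors.

    Returns M - n + 1 windows, each a list of n day vectors.
    """
    if n < 1 or n > len(days):
        raise ValueError("n must satisfy 1 <= n <= number of days")
    return [[list(vec) for vec in days[i:i + n]]
            for i in range(len(days) - n + 1)]


# Hypothetical data: M = 10 days, each with a 29-element vector.
M, n = 10, 7
history = [[float(d)] * 29 for d in range(M)]
windows = build_windows(history, n)
```

With M = 10 and n = 7 this yields 4 windows, matching the count M-n+1; consecutive windows overlap in n-1 days.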
The load interval prediction loss function formula is as follows:
y_u, y_l and y are vectors of the same length n, representing n predicted upper bounds, n predicted lower bounds and n true values, respectively.
For example: y = [1.0, 2.0, 3.0, 4.0, 5.0], y_u = [1.3, 2.4, 2.9, 4.1, 4.5], y_l = [0.8, 2.1, 2.4, 3.6, 4.0].
1. K_u = max(0, sign(y_u - y))
Each element of y_u that is greater than the corresponding element of y maps to 1; each element less than or equal to it maps to 0.
sign(y_u - y) = [1, 1, -1, 1, -1]
K_u = max(0, sign(y_u - y)) = [1, 1, 0, 1, 0];
2. K_l = max(0, sign(y - y_l))
Similarly:
sign(y - y_l) = [1, -1, 1, 1, 1]
K_l = max(0, sign(y - y_l)) = [1, 0, 1, 1, 1];
3. K_h = multiply(K_u, K_l)
K_u and K_l are multiplied element by element, which means that K_h(i) = 1 only when y_l(i) < y(i) < y_u(i) holds for a group of data; otherwise K_h(i) = 0.
K_h = multiply(K_u, K_l) = [1, 0, 0, 1, 0];
4. Loss_h = reduce_sum(multiply(y_u - y_l, K_h)) / reduce_sum(K_h)
Loss_h represents the width of the target interval between y_l and y_u; the narrower the interval, the smaller Loss_h. The denominator and numerator of the formula have the following meanings:
(1) reduce_sum(K_h) = reduce_sum([1, 0, 0, 1, 0]) = 2
To lower Loss_h, the denominator needs to be large, so the goal is to satisfy y_l(i) < y(i) < y_u(i) in every group of data, i.e., K_h = [1, 1, 1, 1, 1].
(2) reduce_sum(multiply(y_u - y_l, K_h)) = reduce_sum(multiply([0.5, 0.3, 0.5, 0.5, 0.5], [1, 0, 0, 1, 0])) = reduce_sum([0.5, 0, 0, 0.5, 0]) = 1
To lower Loss_h, the numerator needs to be small, so the goal is to make y_u(i) - y_l(i) as small as possible under the condition that y_l(i) < y(i) < y_u(i). The smaller y_u - y_l is, the smaller the numerator.
5. Loss_u and Loss_l are quantile loss functions;
6. The resulting loss function is a combination of the three, i.e.
Loss = λ_u*Loss_u + λ_l*Loss_l + λ_h*Loss_h.
Here y is the true value, y_u is the predicted upper bound, y_l is the predicted lower bound, r_u is the upper-bound quantile coefficient, r_l is the lower-bound quantile coefficient, with r_u > 0.5 > r_l; λ_u is the coefficient of the upper-bound loss function, λ_l is the coefficient of the lower-bound loss function, and λ_h is the coefficient of the interval loss function.
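The loss described above can be sketched in plain Python to check the worked example. This is a sketch under stated assumptions: the text does not give the exact form of Loss_u and Loss_l, so the standard quantile (pinball) loss is assumed; the default values of r_u, r_l and the λ coefficients are hypothetical; and returning 0 for Loss_h when no true value is covered is an added guard against division by zero, not part of the text.

```python
from typing import List, Sequence, Tuple


def sign(x: float) -> int:
    """Return 1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def coverage_masks(y: Sequence[float], y_u: Sequence[float],
                   y_l: Sequence[float]) -> Tuple[List[int], List[int], List[int]]:
    """Compute K_u, K_l and K_h exactly as in items 1 to 3."""
    k_u = [max(0, sign(u - t)) for u, t in zip(y_u, y)]
    k_l = [max(0, sign(t - l)) for t, l in zip(y, y_l)]
    k_h = [a * b for a, b in zip(k_u, k_l)]
    return k_u, k_l, k_h


def interval_loss(y: Sequence[float], y_u: Sequence[float],
                  y_l: Sequence[float]) -> float:
    """Loss_h: mean interval width over the points the interval covers."""
    _, _, k_h = coverage_masks(y, y_u, y_l)
    covered = sum(k_h)
    if covered == 0:
        return 0.0  # assumption: guard against division by zero
    return sum((u - l) * k for u, l, k in zip(y_u, y_l, k_h)) / covered


def pinball_loss(y: Sequence[float], y_pred: Sequence[float], r: float) -> float:
    """Standard quantile loss at quantile r (assumed form of Loss_u and Loss_l)."""
    return sum(max(r * (t - p), (r - 1) * (t - p))
               for t, p in zip(y, y_pred)) / len(y)


def total_loss(y: Sequence[float], y_u: Sequence[float], y_l: Sequence[float],
               r_u: float = 0.9, r_l: float = 0.1,  # hypothetical, r_u > 0.5 > r_l
               lam_u: float = 1.0, lam_l: float = 1.0,
               lam_h: float = 1.0) -> float:  # hypothetical coefficients
    """Loss = lam_u * Loss_u + lam_l * Loss_l + lam_h * Loss_h."""
    return (lam_u * pinball_loss(y, y_u, r_u)
            + lam_l * pinball_loss(y, y_l, r_l)
            + lam_h * interval_loss(y, y_u, y_l))


y = [1.0, 2.0, 3.0, 4.0, 5.0]
y_u = [1.3, 2.4, 2.9, 4.1, 4.5]
y_l = [0.8, 2.1, 2.4, 3.6, 4.0]
k_u, k_l, k_h = coverage_masks(y, y_u, y_l)
loss_h = interval_loss(y, y_u, y_l)
```

For the example vectors, the masks reproduce K_u, K_l and K_h from items 1 to 3, and `interval_loss` evaluates to approximately 0.5, i.e. the numerator 1 divided by the denominator 2. In actual training the same operations would be expressed with a differentiable tensor library rather than Python lists.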
Entering the transformer area environment
Step 3: load the load interval prediction model obtained in step 2 in the transformer area environment.
Step 4: when day x ends (x ≥ 1), the smart meter collects the power load data of that day, and the data of the latest m days including that day are preprocessed, where m is the number of days of training set data used when the model performs online learning.
Step 5: in the online learning process, use the data preprocessed in step 4 as the training set, perform online learning on the load interval prediction model loaded in step 3, and train and optimize it to obtain the latest load interval prediction model.
Step 6: preprocess the data of the latest n days including the current day, where n is the time series length, i.e. at least n days of data must be input for the model to make a prediction; in general n ≤ m.
Step 7: load the latest load interval prediction model and input the data preprocessed in step 6 to obtain the model's prediction of the load interval after day x.
Step 8: when day x+1 ends, repeat steps 3 to 6 to obtain the model's load interval prediction after day x+1.
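The daily cycle of steps 4 to 7 can be sketched as a rolling loop. This is only a structural sketch: `ToyIntervalModel` is a hypothetical stand-in with `fit` and `predict` methods, not the double-layer LSTM of the invention, the preprocessing of steps 4 and 6 is omitted, and scalar daily loads are used in place of full day vectors.

```python
from collections import deque
from typing import Deque, List, Sequence, Tuple


class ToyIntervalModel:
    """Hypothetical stand-in for the deployed load interval prediction model."""

    def __init__(self) -> None:
        self.half_width = 1.0

    def fit(self, train_days: Sequence[float]) -> None:
        # Stand-in for the online-learning update: track the largest
        # day-to-day change within the current training window.
        diffs = [abs(b - a) for a, b in zip(train_days, train_days[1:])]
        if diffs:
            self.half_width = max(diffs)

    def predict(self, recent_days: Sequence[float]) -> Tuple[float, float]:
        # Return a (lower, upper) interval centered on the recent mean.
        center = sum(recent_days) / len(recent_days)
        return center - self.half_width, center + self.half_width


def run_online(daily_loads: Sequence[float], model: ToyIntervalModel,
               m: int, n: int) -> List[Tuple[float, float]]:
    """Run the collect / update / predict cycle once per finished day."""
    if not 1 <= n <= m:
        raise ValueError("expected 1 <= n <= m")
    buffer: Deque[float] = deque(maxlen=m)  # keeps only the latest m days
    predictions = []
    for load in daily_loads:
        buffer.append(load)               # step 4: day x ends, new reading arrives
        train_set = list(buffer)          # latest m days (fewer at the start)
        model.fit(train_set)              # step 5: online learning update
        recent = train_set[-n:]           # step 6: latest n days
        predictions.append(model.predict(recent))  # step 7: interval after day x
    return predictions


# Hypothetical loads for five days, with m = 3 training days and n = 2 input days.
intervals = run_online([10, 12, 11, 15, 14], ToyIntervalModel(), m=3, n=2)
```

A bounded `deque` is one simple way to hold "the latest m days": appending the newest day discards the oldest automatically, so the training set always reflects recent load behavior.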
In summary, the method and system for predicting the power load interval of a transformer area based on online learning use a specially designed loss function when predicting the load interval, so that the actual power load value falls within the interval as often as possible while the prediction interval is as narrow as possible, which meets the practical demands of transformer area load interval prediction. Meanwhile, because climate data cannot be obtained in the transformer area environment, lunar calendar data and 24 solar terms data are used in place of climate data as model input, which improves the prediction accuracy of the model. In addition, because the power load pattern of a transformer area changes frequently, the method is applied in the transformer area environment in an online learning mode, so that changes in the load pattern are learned in time and the accuracy of the interval prediction does not degrade after the model has been deployed for a long time.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing descriptions of specific exemplary embodiments of the present invention have been presented for purposes of illustration and description. It is not intended to limit the invention to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiments were chosen and described in order to explain certain principles of the invention and its practical application to enable one skilled in the art to make and use various exemplary embodiments of the invention and various alternatives and modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims and their equivalents.