CN114819352A - Load prediction method based on MIC-TCN-LSTM, storage medium and computer system - Google Patents
Load prediction method based on MIC-TCN-LSTM, storage medium and computer system
- Publication number
- CN114819352A (application number CN202210460380.1A)
- Authority
- CN
- China
- Prior art keywords
- data
- lstm
- tcn
- output
- gate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
Abstract
The invention discloses a load prediction method based on MIC-TCN-LSTM, a storage medium and a computer system, wherein the method comprises the following steps: S1, preprocessing the original data and then normalizing it; S2, performing correlation analysis on the data by using the maximal information coefficient (MIC), and removing data that are weakly correlated or uncorrelated with the load; S3, dividing the data set formed by the data processed in step 2 into a training set and a test set by using a sliding window; S4, constructing a TCN-LSTM model and feeding the training set of step 3 into it for training; and S5, predicting on the test set with the model trained in step 4 to obtain and output the prediction result. The method accurately predicts the load in the power grid and effectively improves prediction accuracy; the method provided by the invention is simple and easy to implement.
Description
Technical Field
The invention relates to the field of load prediction methods, in particular to a load prediction method based on MIC-TCN-LSTM, a storage medium and a computer system.
Background
The maximal information coefficient (MIC) originates from a paper published in Science in 2011. MIC measures the degree of correlation between two variables X and Y and the strength of their linear or nonlinear relationship, and is commonly used for feature selection in machine learning. Compared with other correlation measures, MIC is more accurate and is an excellent method for computing data correlation. By its nature, MIC has universality, fairness and symmetry. 1) Universality: when the sample size is large enough to contain most of the information in the data, MIC can capture a wide variety of interesting associations, and is not limited to specific function types such as linear, exponential or periodic functions; it covers all functional relationships in a balanced way. Complex relationships between variables can be modeled not only by a single function but also by superimposed functions; for a measure with good universality, the scores given to different types of association should be close to each other and close to 1. 2) Fairness: when the sample size is large enough, different types of relationships contaminated with similar levels of noise receive similar coefficients.
A Temporal Convolutional Network (TCN) is a neural network architecture proposed in 2018 for analyzing time series data. The TCN combines one-dimensional convolution, causal convolution, dilated convolution and residual networks, and solves the problem of feature extraction from long time series. TCN is an efficient feature extraction and representation algorithm that can accurately model complex multi-feature, multi-scale nonlinear data. A TCN-based load prediction algorithm can accurately capture the complex relationship between the load and its related features and achieve high-precision prediction. Compared with algorithms based on traditional Recurrent Neural Networks (RNN), TCNs offer further advantages in time series prediction tasks: 1) TCNs are good at capturing time dependence, because convolution captures local information; 2) the size of the TCN receptive field can be adjusted flexibly to take in more input features.
The prediction accuracy of existing power grid load forecasting is not high, and a method with higher prediction accuracy is urgently needed.
Disclosure of Invention
In order to solve the existing problems, the invention provides a load prediction method, a storage medium and a computer system based on MIC-TCN-LSTM, and the specific scheme is as follows:
a load prediction method, a storage medium and a computer system based on MIC-TCN-LSTM, wherein the method comprises the following steps:
s1, preprocessing the original data and then carrying out normalization processing;
s2, performing correlation analysis on the data by using the maximal information coefficient (MIC), and removing data that are weakly correlated or uncorrelated with the load;
s3, dividing a data set formed by the data processed in the step 2 into a training set and a testing set by using a sliding window;
s4, constructing a TCN-LSTM model, and sending the training set in the step 3 into the TCN-LSTM model for training;
and S5, predicting the test set by using the model trained in the step 4 to finally obtain a prediction result, and outputting the prediction result.
Preferably, the method for preprocessing the raw data in step 1 comprises: processing the raw data into an ideal format, and filling missing entries with the average of the values on the previous and following days.
Preferably, normalization limits the data to be processed to the range between 0 and 1, and is computed as

x′ = (x − x_min) / (x_max − x_min)

where x′ is the normalized data, and x_min and x_max are the minimum and maximum values of the load data, respectively.
Preferably, the method for performing correlation analysis on the data in step 2 comprises: using the maximal information coefficient MIC, which measures the degree of correlation between two variables X and Y and the strength of their linear or nonlinear relationship, to perform feature selection: the correlation between the load data and the other data, including temperature, humidity, precipitation and date type, is computed, and unnecessary variables are discarded. The basic principle of MIC rests on the concept of mutual information, defined by the following equation:

I(x; y) = ∬ p(x, y) log2 [ p(x, y) / (p(x) p(y)) ] dx dy

where x and y are two linked random variables and p(x, y) is their joint probability density function. The two random variables are drawn as a scatter diagram, which is repeatedly partitioned into small grid cells; the probabilities p(x) and p(y) of falling into each cell are then counted, from which the joint probability distribution is estimated.

The correlation between the load variable and each of the other variables is obtained in this way, and the variables with weak correlation to the load are discarded.
Preferably, the TCN-LSTM model constructed in step 4 comprises an input layer, a TCN layer, an LSTM layer and an output layer, through which data are passed in sequence.
Preferably, the TCN layer comprises several residual networks, each including a dilated causal convolution layer, a weight normalization layer (WeightNorm) and a regularization layer (Dropout). One branch of the residual network performs a transformation operation F on the input X; the other branch performs a simple 1 × 1 convolution in parallel with the existing branch to keep the number of feature maps consistent. The output X(h) of the h-th residual module can be expressed as

X(h) = ReLU(F(X(h−1)) + X(h−1))

where ReLU denotes the activation function shown in the following formula:

ReLU(x) = max(0, x)
Preferably, the calculation process of the LSTM layer comprises:

SA1, the forget gate determines which information will be forgotten according to the input state, giving the output state f(t) of the forget gate:

f(t) = σ(W_fa α(t−1) + W_fx x(t) + b_f)

where x(t) is the input at the current time, α(t−1) is the hidden state at the previous time, W_fa and W_fx are the forget-gate weight coefficients, b_f is the forget-gate bias, and σ is the forget-gate activation function;

SA2, the update gate determines whether the cell state is updated. The update of the cell state combines the product of the update-gate output state i(t) and the tanh-activated candidate state c̃(t) with the product of the previous cell state c(t−1) and the forget gate f(t); the output expressions are

i(t) = σ(W_ia α(t−1) + W_ix x(t) + b_i)

c̃(t) = tanh(W_ca α(t−1) + W_cx x(t) + b_c)

c(t) = f(t) × c(t−1) + i(t) × c̃(t)

where W_ia, W_ix, W_ca, W_cx are the weight coefficients of the update gate; b_i, b_c are the biases of the update gate; σ and tanh are the activation functions of the update gate; and c(t) is the updated cell state. The output gate determines whether the current state is passed to the next time step:

o(t) = σ(W_oa α(t−1) + W_ox x(t) + b_o)

α(t) = o(t) × tanh(c(t))

where o(t) is the output-gate state of the current cell; W_oa and W_ox are the output-gate weight coefficients; b_o is the output-gate bias; tanh is the output-gate activation function; and α(t) is the hidden state of the neural network at the current time;

SA3, the hidden state finally output by the LSTM network retains the past output information, from which the output is calculated; the output expression is:

ŷ(t) = g(W_y α(t) + b_y)

where W_y is the weight of the LSTM network; b_y is the bias of the LSTM network; ŷ(t) is the predicted value calculated by the LSTM network; and g is the activation function.
The invention also discloses a computer readable storage medium, wherein a computer program is stored on the medium, and after the computer program runs, the load prediction method based on the MIC-TCN-LSTM is executed.
The invention also discloses a computer system, which comprises a processor and a storage medium, wherein the storage medium is stored with a computer program, and the processor reads the computer program from the storage medium and runs the computer program to execute the MIC-TCN-LSTM-based load prediction method.
The invention has the beneficial effects that:
the method and the device accurately predict the load in the power grid, and effectively improve the prediction accuracy. The method provided by the invention is simple and easy to realize.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a schematic view of a sliding window;
FIG. 3 is a schematic diagram of the structure of TCN.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A MIC-TCN-LSTM based load prediction method, storage medium and computer system, as shown in fig. 1, wherein the method comprises the steps of:
s1, preprocessing the original data and then carrying out normalization processing;
s2, performing correlation analysis on the data by using the maximal information coefficient (MIC), and removing data that are weakly correlated or uncorrelated with the load;
s3, dividing a data set formed by the data processed in the step 2 into a training set and a testing set by using a sliding window;
s4, constructing a TCN-LSTM model, and sending the training set in the step 3 into the TCN-LSTM model for training;
s5, predicting on the test set with the TCN-LSTM model trained in step 4, evaluating the prediction accuracy, and finally obtaining and outputting the prediction result.
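The sliding-window split of step S3 (FIG. 2) can be sketched as follows. This is a minimal illustration, assuming a univariate load series, one-step-ahead targets, and an 80/20 chronological split; the function names and parameters are illustrative, not taken from the source:

```python
import numpy as np

def sliding_windows(series, window, horizon=1):
    """Turn a 1-D series into (input window, target) sample pairs."""
    X, y = [], []
    for start in range(len(series) - window - horizon + 1):
        X.append(series[start:start + window])
        y.append(series[start + window + horizon - 1])
    return np.array(X), np.array(y)

def train_test_split_seq(X, y, train_ratio=0.8):
    """Chronological split (no shuffling), so the test set never leaks
    into the training windows."""
    cut = int(len(X) * train_ratio)
    return X[:cut], y[:cut], X[cut:], y[cut:]
```

The chronological (unshuffled) split is the usual choice for time-series data, since shuffling would let future values influence training.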
The method for preprocessing the raw data in step 1 comprises: processing the raw data into an ideal format, and filling missing entries with the average of the values on the previous and following days.
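The gap-filling rule above might be sketched as follows; hourly sampling (`period=24`) and NaN markers for missing entries are both assumptions for illustration, not stated in the source:

```python
import numpy as np

def fill_missing(load, period=24):
    """Fill each NaN entry with the mean of the same time slot on the
    previous and the following day (whichever of the two exists)."""
    filled = load.copy()
    for i in np.flatnonzero(np.isnan(filled)):
        neighbors = []
        if i - period >= 0 and not np.isnan(filled[i - period]):
            neighbors.append(filled[i - period])
        if i + period < len(filled) and not np.isnan(filled[i + period]):
            neighbors.append(filled[i + period])
        if neighbors:  # leave the gap if neither neighbor is available
            filled[i] = np.mean(neighbors)
    return filled
```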
Normalization limits the data to be processed to the range between 0 and 1; it facilitates subsequent data processing and accelerates convergence when the program runs. The normalization is computed as

x′ = (x − x_min) / (x_max − x_min)

where x′ is the normalized data, and x_min and x_max are the minimum and maximum values of the load data, respectively.
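The min–max scaling described above, together with its inverse (needed to read predictions back in load units), can be sketched as:

```python
import numpy as np

def min_max_normalize(x):
    """x' = (x - x_min) / (x_max - x_min), mapping values into [0, 1]."""
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min), x_min, x_max

def denormalize(x_norm, x_min, x_max):
    """Invert the scaling to recover values in the original load units."""
    return x_norm * (x_max - x_min) + x_min
```

In practice the minimum and maximum are taken from the training set only and reused for the test set, so that no test-set statistics leak into training.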
The method for performing correlation analysis on the data in step 2 comprises: using the maximal information coefficient MIC, which measures the degree of correlation between two variables X and Y and the strength of their linear or nonlinear relationship, to perform feature selection: the correlation between the load data and the other data, including temperature, humidity, precipitation and date type, is computed, and unnecessary variables are discarded. The basic principle of MIC rests on the concept of mutual information, defined by the following equation:

I(x; y) = ∬ p(x, y) log2 [ p(x, y) / (p(x) p(y)) ] dx dy

where x and y are two linked random variables and p(x, y) is their joint probability density function. The two random variables are drawn as a scatter diagram, which is repeatedly partitioned into small grid cells; the probabilities p(x) and p(y) of falling into each cell are then counted, from which the joint probability distribution is estimated. The larger the data volume, the better the MIC estimate.
the correlation between the load variable and the other variables is obtained according to the above method, and the variables having a small correlation with the load are discarded.
The TCN-LSTM model constructed in the step 4 comprises an input layer, a TCN layer, an LSTM layer and an output layer, wherein data are transmitted in sequence.
The TCN layer comprises several residual networks, each including a dilated causal convolution layer, a weight normalization layer (WeightNorm) and a regularization layer (Dropout). One branch of the residual network performs a transformation operation F on the input X; the other branch performs a simple 1 × 1 convolution in parallel with the existing branch to keep the number of feature maps consistent. The output X(h) of the h-th residual module can be expressed as

X(h) = ReLU(F(X(h−1)) + X(h−1))

where ReLU denotes the activation function shown in the following formula:

ReLU(x) = max(0, x)
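A minimal numpy sketch of the dilated causal convolution and the residual module described above (weight normalization is omitted for brevity, and a precomputed mask stands in for the Dropout layer; all shapes and names are illustrative):

```python
import numpy as np

def dilated_causal_conv(x, w, dilation):
    """1-D causal convolution: the output at time t sees only
    x[t], x[t-d], x[t-2d], ...  x: (channels, T), w: (out_ch, in_ch, k).
    Left zero-padding keeps the output length equal to T."""
    out_ch, in_ch, k = w.shape
    pad = (k - 1) * dilation
    xp = np.pad(x, ((0, 0), (pad, 0)))
    T = x.shape[1]
    out = np.zeros((out_ch, T))
    for tap in range(k):
        # tap j reads the input delayed by (k-1-j)*dilation steps
        seg = xp[:, tap * dilation: tap * dilation + T]
        out += np.einsum('oi,it->ot', w[:, :, tap], seg)
    return out

def residual_block(x, w1, w2, w_skip, dilation, drop_mask=None):
    """TCN residual module: two dilated causal convolutions with ReLU,
    plus a 1x1 convolution on the skip branch to match channel counts."""
    h = np.maximum(0.0, dilated_causal_conv(x, w1, dilation))
    if drop_mask is not None:                 # stand-in for Dropout at train time
        h = h * drop_mask
    h = dilated_causal_conv(h, w2, dilation)
    skip = np.einsum('oi,it->ot', w_skip, x)  # 1x1 transform of the input
    return np.maximum(0.0, h + skip)          # X(h) = ReLU(F(X(h-1)) + X(h-1))
```

Stacking such blocks with dilations 1, 2, 4, … grows the receptive field exponentially, which is the flexibility the background section attributes to TCNs.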
An LSTM network is added after the TCN network to further improve prediction accuracy.

The LSTM layer is an improvement on the recurrent neural network. The key to the LSTM layer is the cell state; information in the cell state is updated and deleted through a forget gate, an update gate and an output gate. The calculation process of the LSTM layer comprises the following steps:
SA1, the forget gate determines which information will be forgotten according to the input state, giving the output state f(t) of the forget gate:

f(t) = σ(W_fa α(t−1) + W_fx x(t) + b_f)

where x(t) is the input at the current time, α(t−1) is the hidden state at the previous time, W_fa and W_fx are the forget-gate weight coefficients, b_f is the forget-gate bias, and σ is the forget-gate activation function;

SA2, the update gate determines whether the cell state is updated. The update of the cell state combines the product of the update-gate output state i(t) and the tanh-activated candidate state c̃(t) with the product of the previous cell state c(t−1) and the forget gate f(t); the output expressions are

i(t) = σ(W_ia α(t−1) + W_ix x(t) + b_i)

c̃(t) = tanh(W_ca α(t−1) + W_cx x(t) + b_c)

c(t) = f(t) × c(t−1) + i(t) × c̃(t)

where W_ia, W_ix, W_ca, W_cx are the weight coefficients of the update gate; b_i, b_c are the biases of the update gate; σ and tanh are the activation functions of the update gate; and c(t) is the updated cell state. The output gate determines whether the current state is passed to the next time step:

o(t) = σ(W_oa α(t−1) + W_ox x(t) + b_o)

α(t) = o(t) × tanh(c(t))

where o(t) is the output-gate state of the current cell; W_oa and W_ox are the output-gate weight coefficients; b_o is the output-gate bias; tanh is the output-gate activation function; and α(t) is the hidden state of the neural network at the current time;

SA3, the hidden state finally output by the LSTM network retains the past output information, from which the output is calculated; the output expression is:

ŷ(t) = g(W_y α(t) + b_y)

where W_y is the weight of the LSTM network; b_y is the bias of the LSTM network; ŷ(t) is the predicted value calculated by the LSTM network; and g is the activation function.
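The gate equations of SA1–SA3 can be exercised directly; below is a single-step numpy sketch, with the parameter dictionary keys (`W_fa`, `b_f`, …) named after the symbols in the text. The dimensions and parameter container are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, a_prev, c_prev, p):
    """One LSTM step following the gate equations above.
    p holds the weight matrices W_* and biases b_* (hypothetical names)."""
    f_t = sigmoid(p['W_fa'] @ a_prev + p['W_fx'] @ x_t + p['b_f'])    # forget gate
    i_t = sigmoid(p['W_ia'] @ a_prev + p['W_ix'] @ x_t + p['b_i'])    # update gate
    c_hat = np.tanh(p['W_ca'] @ a_prev + p['W_cx'] @ x_t + p['b_c'])  # candidate state
    c_t = f_t * c_prev + i_t * c_hat                                  # new cell state
    o_t = sigmoid(p['W_oa'] @ a_prev + p['W_ox'] @ x_t + p['b_o'])    # output gate
    a_t = o_t * np.tanh(c_t)                                          # new hidden state
    return a_t, c_t
```

Iterating `lstm_step` over the TCN layer's output sequence and applying the final projection ŷ(t) = g(W_y α(t) + b_y) to the last hidden state yields the prediction.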
The invention also discloses a computer readable storage medium, wherein a computer program is stored on the medium, and after the computer program runs, the load prediction method based on the MIC-TCN-LSTM is executed.
The invention also discloses a computer system, which comprises a processor and a storage medium, wherein the storage medium is stored with a computer program, and the processor reads the computer program from the storage medium and runs the computer program to execute the MIC-TCN-LSTM-based load prediction method. The method and the device accurately predict the load in the power grid, and effectively improve the prediction accuracy. The method provided by the invention is simple and easy to realize.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The various illustrative logical blocks, modules, and circuits described in connection with the embodiments disclosed herein may be implemented or performed with a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
In one or more exemplary embodiments, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software as a computer program product, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage media may be any available media that can be accessed by a computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a web site, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk (disk) and disc (disc), as used herein, includes Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk and blu-ray disc where disks (disks) usually reproduce data magnetically, while discs (discs) reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the spirit or scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (9)
1. A load prediction method based on MIC-TCN-LSTM is characterized by comprising the following steps:
s1, preprocessing the original data and then carrying out normalization processing;
s2, performing correlation analysis on the data by using the maximal information coefficient (MIC), and removing data that are weakly correlated or uncorrelated with the load;
s3, dividing a data set formed by the data processed in the step 2 into a training set and a testing set by using a sliding window;
s4, constructing a TCN-LSTM model, and sending the training set in the step 3 into the TCN-LSTM model for training;
and S5, predicting the test set by using the model trained in the step 4 to finally obtain a prediction result, and outputting the prediction result.
2. The method according to claim 1, wherein the method for preprocessing the raw data in step 1 comprises: processing the raw data into an ideal format, and filling missing entries with the average of the values on the previous and following days.
4. The method of claim 1, wherein the step 2 of performing correlation analysis on the data comprises: using the maximal information coefficient MIC, which measures the degree of correlation between two variables X and Y and the strength of their linear or nonlinear relationship, to perform feature selection: the correlation between the load data and the other data, including temperature, humidity, precipitation and date type, is computed, and unnecessary variables are discarded; the basic principle of MIC rests on the concept of mutual information, defined by the following equation:

I(x; y) = ∬ p(x, y) log2 [ p(x, y) / (p(x) p(y)) ] dx dy

where x and y are two linked random variables and p(x, y) is their joint probability density function; the two random variables are drawn as a scatter diagram, which is repeatedly partitioned into small grid cells, and the probabilities p(x) and p(y) of falling into each cell are counted, from which the joint probability distribution is estimated.

The correlation between the load variable and the other variables is obtained, and the variables having weak correlation with the load are discarded.
5. The method of claim 1, wherein: the TCN-LSTM model constructed in the step 4 comprises an input layer, a TCN layer, an LSTM layer and an output layer, wherein data are transmitted in sequence.
6. The method of claim 5, wherein: the TCN layer comprises several residual networks, each including a dilated causal convolution layer, a weight normalization layer (WeightNorm) and a regularization layer (Dropout); one branch of the residual network performs a transformation operation F on the input X, and the other branch performs a simple 1 × 1 convolution in parallel with the existing branch to keep the number of feature maps consistent; the output X(h) of the h-th residual module can be expressed as

X(h) = ReLU(F(X(h−1)) + X(h−1))

where ReLU denotes the activation function shown in the following formula:

ReLU(x) = max(0, x)
7. The method of claim 6, wherein the calculation of the LSTM layer comprises:
SA1, forgetting gate determines which information will be forgotten according to input state, thereby obtaining output state f (t) of forgetting gate
f (t) =σ(W fa α (t-1) +W fx x (t) +b f )
Where x (t) is the input at the current time, α (t-1) is the cell state at the previous time, and W fa 、W fx Is the forgetting gate weight coefficient; b f Is a forgetting gate bias; σ is a forgetting gate activation function;
SA2, update gate to determine whether the cell state is updated; the updating of the cell state comprises updating the output state i (t) of the gate and activating the output by the tanh functionThe product of (a) and the product of the last-moment cell state c (t-1) and the forgetting gate f (t), and the output expression is
i (t) =σ(W ia α (t-1) +W ix x (t) +b i )
In the formula, W ia ,W ix ,W ca ,W cx Is to update the weight coefficients of the gate; bi, bc are the bias of the update gate; σ, tanh is the activation function of the update gate; c (t) is the updated cell state, the output gate determines whether to pass the current state to the next time,
o (t) =σ(W oa α (t-1) +W ox x (t) +b o )
α (t) =o (t) ×tanh(c (t) )
where o (t) is the hidden state of the current cell, W oa ,W ox Is the output gate cell state update weight coefficient; b o Is output gate cell state bias; tanh is the output gate activation function; alpha (t) is the hidden state of the neural network at the current moment,
SA3: the hidden state finally output by the LSTM network retains all past output information, from which the final output of the network is computed.
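The gate equations of steps SA1–SA3 can be sketched as a single NumPy time step. The parameter names (W_fa, W_fx, b_f, and so on) follow the claim; the dictionary layout and the dimensions in the example are illustrative assumptions, not part of the patent:

```python
import numpy as np

def sigmoid(z):
    # Logistic sigmoid, the gate activation sigma in the claim
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, a_prev, c_prev, p):
    """One LSTM time step; p maps parameter names to arrays."""
    # SA1: forget gate f^(t)
    f = sigmoid(p["W_fa"] @ a_prev + p["W_fx"] @ x + p["b_f"])
    # SA2: update gate i^(t), candidate state, new cell state c^(t)
    i = sigmoid(p["W_ia"] @ a_prev + p["W_ix"] @ x + p["b_i"])
    c_tilde = np.tanh(p["W_ca"] @ a_prev + p["W_cx"] @ x + p["b_c"])
    c = f * c_prev + i * c_tilde
    # Output gate o^(t) and current hidden state alpha^(t)
    o = sigmoid(p["W_oa"] @ a_prev + p["W_ox"] @ x + p["b_o"])
    a = o * np.tanh(c)
    return a, c
```

A full LSTM layer would iterate this step over the input sequence, carrying (a, c) forward from one time step to the next.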
8. A computer-readable storage medium, characterized in that: the medium has a computer program stored thereon which, when executed, performs the MIC-TCN-LSTM based load prediction method of any one of claims 1 to 7.
9. A computer system, characterized in that: it comprises a processor and a storage medium having a computer program stored thereon; the processor reads and executes the computer program from the storage medium to perform the MIC-TCN-LSTM based load prediction method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210460380.1A CN114819352A (en) | 2022-04-28 | 2022-04-28 | Load prediction method based on MIC-TCN-LSTM, storage medium and computer system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114819352A true CN114819352A (en) | 2022-07-29 |
Family
ID=82510240
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115907131A (en) * | 2022-11-16 | 2023-04-04 | 国网宁夏电力有限公司经济技术研究院 | Method and system for building electric heating load prediction model in northern area |
CN115907131B (en) * | 2022-11-16 | 2024-07-12 | 国网宁夏电力有限公司经济技术研究院 | Method and system for constructing electric heating load prediction model in northern area |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |