CN115186887A - Multithreading parallel computing power load prediction method based on LSTM


Info

Publication number: CN115186887A
Authority: CN (China)
Application number: CN202210768886.9A
Prior art keywords: data, thread, prediction, power load
Legal status: Pending
Original language: Chinese (zh)
Inventors: 秦虹, 宗明, 代杰杰, 赵熠旖, 朱洪成, 谢婧, 黄嵩, 陆黎
Assignee (current and original): State Grid Shanghai Electric Power Co Ltd
Application filed by State Grid Shanghai Electric Power Co Ltd
Priority: CN202210768886.9A
Publication: CN115186887A

Classifications

    • G06Q10/04 — Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06N3/02 — Neural networks
    • G06N3/08 — Learning methods
    • G06Q10/0639 — Performance analysis of employees; performance analysis of enterprise or organisation operations
    • G06Q50/06 — Energy or water supply
    • Y04S10/50 — Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications


Abstract

The invention discloses a multithreading parallel computing power load prediction method based on LSTM, comprising step 1, data processing; step 2, prediction; and step 3, training. According to the invention, a neural network is constructed at the beginning of each thread, and all distribution transformers in the thread use the same network, reducing the time spent constructing neural networks. After a thread has processed its assigned number of distribution transformers, the thread is closed and a new round of threads is started, eliminating the efficiency decay caused by an increasing number of processed distribution transformers. An array is created inside each thread, the thread's prediction data are stored in that array, and only when the thread's distribution transformer tasks are entirely finished are the prediction data written to the global array, minimizing thread-lock contention. To satisfy the requirements of accuracy and efficiency at the same time, the prediction program and the training program are separated, realizing computationally efficient LSTM neural network power load prediction.

Description

Multithreading parallel computing power load prediction method based on LSTM
Technical Field
The invention relates to an LSTM-based multithreading parallel computing power load prediction method for use in the field of power load prediction and power system operation.
Background
Power demand changes in real time, and the power system cannot store electric energy in large quantities, so system generation and load must be kept in dynamic balance. Improving power load prediction accuracy improves the timeliness of economic dispatch and the utilization rate of generation equipment. To meet actual production requirements, prediction results must be output promptly while remaining accurate. The LSTM model (Long Short-Term Memory artificial neural network) achieves high accuracy, but requires extensive retraining on each day's new data and is therefore computationally inefficient. If all distribution transformers were predicted with the LSTM model, prediction results for a large area could not be output within one day.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a multithreading parallel computing power load forecasting method based on LSTM (Long Short-Term Memory), which can improve the computing efficiency of an LSTM model for power load forecasting.
A technical scheme for achieving the above purpose is as follows: a multithreading parallel computing power load prediction method based on LSTM comprises the following steps:
step 1, data processing, specifically comprising:
step 1.1, reading data for the specified time from the database;
step 1.2, classifying the specified-time data into m data sets according to ID;
step 1.3, sorting the data in each data set by time;
step 1.4, preprocessing the sorted data;
and 2, predicting, wherein the specific steps comprise:
step 2.1, starting n threads;
step 2.2, building a neural network;
step 2.3, reading the ith data set;
step 2.4, reading corresponding hyper-parameters;
step 2.5, predicting the power load;
step 2.6, storing the prediction data into a global array;
step 2.7, judging whether i is less than or equal to the number of distribution transformers assigned to each thread; if so, returning to step 2.3 to read the (i + 1)th data set; if not, judging whether i is less than the total number of distribution transformers; if so, incrementing the count i by 1 and closing the n threads so that a new round of threads can be started; otherwise, closing all 33 threads, storing all the prediction data in the database and proceeding to step 3;
step 3, training, specifically comprising:
step 3.1, starting n threads;
step 3.2, building a neural network;
step 3.3, reading the ith data set;
step 3.4, reading corresponding hyper-parameters;
step 3.5, training the power load prediction model;
step 3.6, updating the hyper-parameters;
and 3.7, judging whether i is less than or equal to the number of distribution transformers assigned to each thread; if so, returning to step 3.3 to read the (i + 1)th data set; if not, judging whether i is less than the total number of distribution transformers; if so, incrementing the count i by 1 and closing the n threads so that a new round of threads can be started; otherwise, closing all 33 threads and ending the program.
Further, in step 1.4, preprocessing the sorted data specifically comprises supplementing missing data, deleting deviating data, and normalizing the data.
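As an illustration, the step-1.4 preprocessing could be sketched as below. The patent does not specify the exact techniques, so the neighbour interpolation for missing values, the 3-sigma outlier cut and the min-max normalization are all illustrative assumptions:

```python
from statistics import mean, stdev

def preprocess(series):
    """Sketch of step 1.4: supplement missing data, delete deviating
    data, normalize. `series` is a time-ordered list of load readings
    with missing samples represented as None. All thresholds here are
    assumptions, not taken from the patent text."""
    # 1. Supplement missing data: fill each None from its neighbours.
    filled = list(series)
    for i, v in enumerate(filled):
        if v is None:
            prev = next((x for x in reversed(filled[:i]) if x is not None), None)
            nxt = next((x for x in filled[i + 1:] if x is not None), None)
            filled[i] = mean([x for x in (prev, nxt) if x is not None])
    # 2. Delete deviating data: drop points beyond 3 standard deviations.
    mu, sigma = mean(filled), stdev(filled)
    kept = [x for x in filled if abs(x - mu) <= 3 * sigma]
    # 3. Normalize to [0, 1] with min-max scaling.
    lo, hi = min(kept), max(kept)
    span = (hi - lo) or 1.0  # guard against a constant series
    return [(x - lo) / span for x in kept]
```

In a real deployment the same routine would run once over the bulk-loaded data before any thread starts, matching the uniform preprocessing described later in the text.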
The multithreading parallel computing power load prediction method based on the LSTM improves computing efficiency by the following measures:
1. at the beginning of each thread, a neural network is constructed, and all distribution transformers in the thread use the same network, reducing the time spent constructing neural networks;
2. after a thread has processed its assigned number of distribution transformers, the thread is closed and a new round of threads is started, eliminating the efficiency decay caused by an increasing number of processed distribution transformers;
3. when local data are stored in the global array, the system automatically generates a thread lock that stops other threads from writing to the global array. Therefore, an array is created inside each thread, the thread's prediction data are stored in that array, and only when the thread's distribution transformer tasks are entirely finished are the prediction data written to the global array at once. Because the prediction and training of many distribution transformers run concurrently, the progress of the threads drifts apart; while one thread writes to the global array, the other threads are still predicting or finishing their work, so thread-lock contention is minimized;
4. to meet both accuracy and efficiency requirements, the prediction program and the training program are separated. When yesterday's data are input, the distribution transformer loads are predicted using the existing hyper-parameters; after all distribution transformer predictions are finished, the newly input data are used for training and updating the hyper-parameters.
Drawings
Fig. 1 is a schematic flow chart of the LSTM-based multithreading parallel computing power load prediction method of the present invention.
Detailed Description
In order to better understand the technical solution of the present invention, the following detailed description is given by specific examples:
the common LSTM load prediction method needs to be subjected to complete data set construction, network construction, fixed-time training and output during each operation. Through practical tests, data of a distribution transformer in one month are trained on a computer with a processor of i5 9400, the training times are 70, the number of three-layer networks is 8, and the time is about 15 minutes. When running on a large server (20-core processor), the time required is not less than 12 minutes, since the tenserflow platform is not multi-threaded. Such operating efficiency cannot meet daily forecasted demand as the number of loads increases.
The LSTM-based multithreading parallel computing power load forecasting method of the invention provides a solution for improving the efficiency of the LSTM (long short-term memory) neural network in large-scale power load forecasting scenarios. The solution comprises the following points:
1) Retaining and restoring the neural network hyper-parameters to reduce the number of training iterations
In practical short-term prediction applications, the algorithm must produce a prediction for a certain period every day. Between any two consecutive days, the data sets differ only by one newly added day, and this new data is typically a small percentage of the entire data set. There is therefore no need to start training from scratch every day. Except for the first day of deployment, from the second day onward training on the new data set starts from the previous training result, which greatly improves subsequent computation speed without affecting the algorithm's accuracy.
Since neural network training works by updating the network's hyper-parameters, the hyper-parameters are retained and saved after each training run, preserving the influence of earlier training data on the network; subsequent runs read these hyper-parameters, so repeated training can be avoided. Meanwhile, the subsequent training set is adjusted to increase the proportion of recent data, which also increases the weight of recent data's influence on the network hyper-parameters. These two measures reduce the number of training iterations in subsequent runs, reducing the occupation of computing resources and improving efficiency.
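The retain-and-restore cycle described above might be sketched as follows. The JSON file location, the dictionary representation of the saved hyper-parameters, and the repetition factor used to boost recent data are all illustrative assumptions:

```python
import json
import os
import tempfile

# Hypothetical location for the persisted state between daily runs.
PARAM_FILE = os.path.join(tempfile.gettempdir(), "lstm_params.json")

def save_params(params):
    """Persist the trained hyper-parameters after each daily training run."""
    with open(PARAM_FILE, "w") as f:
        json.dump(params, f)

def load_params(default):
    """Restore yesterday's hyper-parameters so training resumes from the
    previous result instead of starting from scratch."""
    if os.path.exists(PARAM_FILE):
        with open(PARAM_FILE) as f:
            return json.load(f)
    return default  # first day of deployment: no saved state yet

def weighted_training_set(history, recent, recent_weight=3):
    """Increase the proportion of recent data by repeating it — one
    simple way to raise its influence on the network; the factor 3 is
    an illustrative assumption."""
    return history + recent * recent_weight
```

On the first day `load_params` falls back to the default (fresh) state; on every later day the saved state is restored and training continues on the recency-weighted set.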
2) Multithreaded computation and solutions to the associated problems
When LSTM is used for load prediction, the prediction and training run time is long. In a large-scale load prediction scenario, the total time spent on prediction and training can exceed the horizon of days being predicted, so the actual load data would arrive before the prediction results are produced, and the prediction program would lose its meaning.
In a real environment, the system's CPU occupancy while the load prediction program runs is low, around 3%. System resource utilization is low for two main reasons:
1. the system allocates few CPU cores to the program;
2. parts of the neural network's construction, training and prediction make insufficient use of the CPU.
Therefore, multithreading is used to raise system resource utilization and improve load forecasting efficiency. The load prediction program is treated as one thread, and multiple threads are started simultaneously. A gradual increase in system CPU utilization is observed, but after measurement, the prediction time per distribution transformer load is not effectively improved. The following problems mainly exist:
1. the database does not support simultaneous access by multiple threads; when one thread operates on the database, the other threads are stopped by the thread lock. Because every thread must read and write the database, any increase in one thread's running time also increases the other threads' database read/write waiting time;
2. because filtering in the database takes a long time, when many threads run, database reading time far exceeds the neural network's training and prediction time;
3. as the number of threads increases, the efficiency of a single thread decreases.
To solve these problems, indexes are first added to the database's time and distribution transformer name columns to improve read efficiency. Provided memory is sufficient, before the load prediction program runs, the data of all distribution transformers within the set time window is read into memory at once, avoiding repeated database reads. Meanwhile, each prediction's data is stored into a global array and written to the database once after prediction finishes. This eliminates the problem of threads stalling one another on database reads and writes, improving efficiency. The read data is also preprocessed uniformly.
With the basic multithreading structure built, the above problems are solved, but operating efficiency still cannot meet requirements; the following new problems mainly remain:
1. each thread must reconstruct the neural network for every new distribution transformer it processes, and this construction accounts for a large share of total time;
2. in large-scale application, thread efficiency gradually decreases as the number of processed distribution transformers grows;
3. storing the predicted data into the global array takes too large a share of time.
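A minimal sketch of the indexed one-time bulk read and the single bulk write, using SQLite for illustration; the table and column names (`load_data`, `predictions`, `name`, `ts`, `value`) are assumptions, not taken from the patent:

```python
import sqlite3

def load_all_series(db_path, start, end):
    """Read every transformer's data for the time window in ONE query,
    so worker threads never touch the database afterwards."""
    con = sqlite3.connect(db_path)
    # Index on (name, ts) corresponds to indexing the transformer name
    # and time columns to speed up the windowed scan.
    con.execute("CREATE INDEX IF NOT EXISTS idx_load ON load_data (name, ts)")
    rows = con.execute(
        "SELECT name, ts, value FROM load_data "
        "WHERE ts BETWEEN ? AND ? ORDER BY name, ts", (start, end)).fetchall()
    con.close()
    data = {}
    for name, ts, value in rows:
        data.setdefault(name, []).append((ts, value))
    return data

def write_predictions_once(db_path, predictions):
    """Single bulk write after all prediction work has finished."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS predictions "
                "(name TEXT, ts TEXT, value REAL)")
    con.executemany("INSERT INTO predictions VALUES (?, ?, ?)", predictions)
    con.commit()
    con.close()
```

Between the bulk read and the bulk write, all intermediate results live only in memory, which is what removes the database's thread lock from the hot path.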
To solve the above problems, the following solutions may be adopted:
1. at the beginning of each thread, a neural network is constructed, and all distribution transformers in the thread use the same network, reducing the time spent constructing neural networks;
2. after a thread has processed its assigned number of distribution transformers, the thread is closed and a new round of threads is started, eliminating the efficiency decay caused by an increasing number of processed distribution transformers;
3. when local data are stored in the global array, the system automatically generates a thread lock that stops other threads from writing to the global array. Therefore, an array is created inside each thread, the thread's prediction data are stored in that array, and only when the thread's distribution transformer tasks are entirely finished are the prediction data written to the global array at once. Because the prediction and training of many distribution transformers run concurrently, the progress of the threads drifts apart; while one thread writes to the global array, the other threads are still predicting or finishing their work, so thread-lock contention is minimized;
4. to meet both accuracy and efficiency requirements, the prediction program and the training program are separated. When yesterday's data are input, the distribution transformer loads are predicted using the existing hyper-parameters; after all distribution transformer predictions are finished, the newly input data are used for training and updating the hyper-parameters.
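The thread-local buffering of point 3 can be sketched as follows; `predict_one` is a stand-in for the per-transformer LSTM prediction, and the batch contents are illustrative:

```python
import threading

GLOBAL_RESULTS = []            # the shared global prediction array
GLOBAL_LOCK = threading.Lock()

def predict_one(transformer_id):
    """Placeholder for the per-transformer LSTM prediction (assumption)."""
    return (transformer_id, 42.0)

def worker(batch):
    """Accumulate predictions in a thread-LOCAL list and touch the
    shared array only once at the end, so the lock is held for a single
    short critical section instead of once per transformer."""
    local = [predict_one(t) for t in batch]   # no lock held during prediction
    with GLOBAL_LOCK:                          # single short critical section
        GLOBAL_RESULTS.extend(local)

threads = [threading.Thread(target=worker, args=(b,))
           for b in ([1, 2], [3, 4], [5, 6])]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because each thread's prediction phase takes a different amount of time, the single writes naturally spread out, which is exactly the drift the text relies on to minimize contention.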
Referring to Fig. 1, the LSTM-based multithreading parallel computing power load prediction method of the present invention comprises the following steps:
step 1, data processing, specifically comprising:
step 1.1, reading data for the specified time from the database;
step 1.2, classifying the specified-time data into m data sets according to ID;
step 1.3, sorting the data in each data set by time;
step 1.4, preprocessing the sorted data, which specifically comprises supplementing missing data, deleting deviating data and normalizing the data;
and 2, predicting, wherein the specific steps comprise:
step 2.1, starting n threads;
step 2.2, building a neural network;
step 2.3, reading the ith data set;
step 2.4, reading corresponding hyper-parameters;
step 2.5, predicting the power load;
step 2.6, storing the prediction data into a global array;
step 2.7, judging whether i is less than or equal to the number of distribution transformers assigned to each thread; if so, returning to step 2.3 to read the (i + 1)th data set; if not, judging whether i is less than the total number of distribution transformers; if so, incrementing the count i by 1 and closing the n threads so that a new round of threads can be started; otherwise, closing all 33 threads, storing all the prediction data in the database and proceeding to step 3;
step 3, training, specifically comprising:
step 3.1, starting n threads;
step 3.2, building a neural network;
step 3.3, reading the ith data set;
step 3.4, reading corresponding hyper-parameters;
step 3.5, training the power load prediction model;
step 3.6, updating the hyper-parameters;
and 3.7, judging whether i is less than or equal to the number of distribution transformers assigned to each thread; if so, returning to step 3.3 to read the (i + 1)th data set; if not, judging whether i is less than the total number of distribution transformers; if so, incrementing the count i by 1 and closing the n threads so that a new round of threads can be started; otherwise, closing all 33 threads and ending the program.
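The round-based thread management of steps 2.1–2.7 (and its mirror in steps 3.1–3.7) might be sketched as below. The batch sizes, the `object()` placeholder for the per-thread network, and `predict_fn` are illustrative assumptions standing in for the LSTM parts:

```python
import threading

def run_round(batches, predict_fn):
    """One round of worker threads: each thread builds its network once,
    processes its assigned distribution transformers, buffers results
    locally, then writes them to the shared list and exits."""
    results, lock = [], threading.Lock()

    def worker(batch):
        network = object()  # built once per thread (placeholder for the LSTM)
        local = [predict_fn(network, t) for t in batch]
        with lock:          # single write per thread, as in step 2.6
            results.extend(local)

    threads = [threading.Thread(target=worker, args=(b,)) for b in batches]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

def run_all(transformers, n_threads=4, per_thread=2,
            predict_fn=lambda net, t: (t, 0.0)):
    """Split the transformers into rounds of n_threads * per_thread and
    restart the thread pool between rounds, mirroring the close-and-
    reopen behaviour of step 2.7 that avoids per-thread efficiency
    decay. All sizes here are illustrative."""
    out, round_size = [], n_threads * per_thread
    for start in range(0, len(transformers), round_size):
        chunk = transformers[start:start + round_size]
        batches = [chunk[j:j + per_thread]
                   for j in range(0, len(chunk), per_thread)]
        out.extend(run_round(batches, predict_fn))
    return out
```

The training phase of step 3 would reuse the same driver with a training function in place of `predict_fn` and a hyper-parameter update instead of the database write.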
It should be understood by those skilled in the art that the above embodiments are only for illustrating the present invention and are not to be used as a limitation of the present invention, and that changes and modifications to the above described embodiments are within the scope of the claims of the present invention as long as they are within the spirit and scope of the present invention.

Claims (2)

1. A multithreading parallel computing power load prediction method based on LSTM is characterized by comprising the following steps:
step 1, data processing, specifically comprising:
step 1.1, reading data for the specified time from the database;
step 1.2, classifying the specified-time data into m data sets according to ID;
step 1.3, sorting the data in each data set by time;
step 1.4, preprocessing the sorted data;
and 2, predicting, wherein the specific steps comprise:
step 2.1, starting n threads;
step 2.2, building a neural network;
step 2.3, reading the ith data set;
step 2.4, reading corresponding hyper-parameters;
step 2.5, predicting the power load;
step 2.6, storing the prediction data into a global array;
step 2.7, judging whether i is less than or equal to the number of distribution transformers assigned to each thread; if so, returning to step 2.3 to read the (i + 1)th data set; if not, judging whether i is less than the total number of distribution transformers; if so, incrementing the count i by 1 and closing the n threads so that a new round of threads can be started; otherwise, closing all 33 threads, storing all the prediction data in the database and proceeding to step 3;
step 3, training, specifically comprising:
step 3.1, starting n threads;
step 3.2, building a neural network;
step 3.3, reading the ith data set;
step 3.4, reading corresponding hyper-parameters;
step 3.5, training the power load prediction model;
step 3.6, updating the hyper-parameters;
and 3.7, judging whether i is less than or equal to the number of distribution transformers assigned to each thread; if so, returning to step 3.3 to read the (i + 1)th data set; if not, judging whether i is less than the total number of distribution transformers; if so, incrementing the count i by 1 and closing the n threads so that a new round of threads can be started; otherwise, closing all 33 threads and ending the program.
2. The LSTM-based multi-threaded parallel computing power load forecasting method of claim 1, wherein in step 1.4, the preprocessing of the sorted data specifically comprises supplementing missing data, deleting deviating data, and normalizing the data.
Publication history

Application CN202210768886.9A (Multithreading parallel computing power load prediction method based on LSTM), filed 2022-06-30, priority date 2022-06-30; published as CN115186887A on 2022-10-14; status: Pending.
Family ID: 83516265
Country: CN


Legal Events

    • PB01 — Publication
    • SE01 — Entry into force of request for substantive examination