CN116681173A - Data-intensive power load prediction parallel optimization method based on RNN model - Google Patents

Data-intensive power load prediction parallel optimization method based on RNN model

Info

Publication number
CN116681173A
CN116681173A · Application CN202310661973.9A
Authority
CN
China
Prior art keywords
data
model
training
rnn model
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310661973.9A
Other languages
Chinese (zh)
Inventor
龙玉江
甘润东
李洵
卫薇
王杰峰
王策
钟掖
龙娜
卢仁猛
孙骏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Power Grid Co Ltd
Original Assignee
Guizhou Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Power Grid Co Ltd filed Critical Guizhou Power Grid Co Ltd
Priority to CN202310661973.9A priority Critical patent/CN116681173A/en
Publication of CN116681173A publication Critical patent/CN116681173A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/10Pre-processing; Data cleansing
    • G06F18/15Statistical pre-processing, e.g. techniques for normalisation or restoring missing data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • General Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Marketing (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Primary Health Care (AREA)
  • Water Supply & Treatment (AREA)
  • Public Health (AREA)
  • Quality & Reliability (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data-intensive power load prediction parallel optimization method based on an RNN model, which comprises the following steps: step 1, carrying out null value processing and abnormal value detection and processing on the original data; step 2, inputting the time-series electric load data processed in step 1 into the RNN model for training, where the input at each time step includes data from the past several time steps. The invention uses the RNN model to make full use of historical data for prediction, with higher prediction accuracy and stability. By learning historical load data, the RNN can capture the time-series characteristics and long-term dependencies of the data, so that future load changes can be predicted better.

Description

Data-intensive power load prediction parallel optimization method based on RNN model
Technical Field
The invention relates to a data-intensive power load prediction parallel optimization method based on an RNN model, and belongs to the technical field of power load prediction optimization.
Background
Electrical load prediction refers to predicting the electrical load over a period of time from historical load data and related information through mathematical models and algorithms. An electrical load is the consumers' real-time electricity demand on the power system. The purpose of load prediction is to enable the power system to better plan and schedule power resources to meet future power demands. Power load forecasting has wide application in the power industry, including power resource scheduling, market trading, and power supply-demand balancing. Traditional electric load prediction methods are mainly based on time-series analysis, regression analysis, artificial neural networks and similar approaches.
The deep learning model in existing artificial neural networks is a machine learning method based on artificial neural networks; through multi-layer nonlinear transformations it performs abstraction and representation learning on data, so large-scale, high-dimensional data can be processed effectively. In power load prediction, deep learning models are widely used, for example load prediction with a recurrent neural network (RNN), a long short-term memory network (LSTM) or a convolutional neural network (CNN). Using deep learning models for power load prediction and parallel optimization mainly brings advantages in prediction accuracy, computation speed, model structure optimization and energy-consumption efficiency.
With the rapid development of power systems and the rapid increase in data volume, conventional methods have had difficulty meeting the requirements of power load prediction in training accuracy and training speed. Therefore, power researchers have begun to explore new methods such as deep learning and parallel computing for application in electric load prediction, aiming at improving prediction accuracy and training speed. These new techniques have powerful data processing and analysis capabilities and can improve the efficiency and accuracy of electrical load prediction by optimizing model structures and parallelizing computations. Their application will bring new opportunities and challenges to the development of power systems and is expected to enable a more intelligent, efficient, reliable and sustainable power supply.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a data-intensive power load prediction parallel optimization method based on an RNN model that solves the problems existing in the prior art.
The technical scheme adopted by the invention is as follows: an RNN model-based data-intensive power load prediction parallel optimization method comprises the following steps:
step 1, carrying out null value processing and abnormal value detection and processing on original data;
step 2, inputting the time-series electric load data processed in step 1 into the RNN model for training, where the input at each time step includes data from the past several time steps.
Further, the steps of detection and processing in the step 1 are specifically as follows:
step 101: the raw data comprises the Global Energy Forecasting Competition dataset, which contains historical data of real electrical loads, wind power generation and solar power generation from different countries;
step 102: processing the missing values in the source data set in the step 101;
step 103: detecting and processing abnormal values after the null values of step 102 have been processed, and eliminating abnormal data;
step 104: dividing the original data into a training set, a verification set and a test set according to time;
step 105: converting the data according to the problem requirements, such as converting the temperature from degrees Celsius to degrees Fahrenheit and converting the timestamp to a datetime format; resampling and interpolating the time-series data to fill missing values or normalize the data sampling frequency;
step 106: performing feature extraction on the data subjected to resampling and interpolation processing in step 105 through feature engineering for model training;
step 107: extracting time features and statistical features from the resampled and interpolated data of step 105 to represent the change pattern of the electric load; the time features include hour, day of week, season, holiday, etc., and the statistical features include the maximum, minimum, mean and standard deviation of the historical load data;
step 108: carrying out normalization processing on the data obtained in the step 107 to unify the value ranges of all the features to be [0,1], avoiding excessive influence of certain features on model training, and realizing normalization by adopting a linear transformation method;
step 109: parallelizing the data processing: because the data scale is very large, the data is divided into a plurality of small batches for processing, which improves model training efficiency;
step 110: the processed data is stored, so that the subsequent model training and prediction use are facilitated.
Further, the missing value processing method in step 102 includes: the filling process is performed with the value of the previous time.
Further, the abnormal data culling method in step 103 uses the 3 sigma rule.
Further, in the step 104, the first 80% of data is used as a training set, the middle 10% is used as a verification set, and the last 10% is used as a test set.
Further, the normalization formula in step 108 is x_norm = (x - x_min) / (x_max - x_min), where x_max is the maximum value of the sample data and x_min is the minimum value of the sample data.
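As an illustrative aid (not part of the original disclosure), the null-value filling, 3σ outlier removal, chronological splitting and min-max normalization of steps 102-108 could be sketched in Python roughly as follows; the DataFrame layout and the column name load are assumptions made for the example:
    import pandas as pd

    def preprocess(df: pd.DataFrame) -> pd.DataFrame:
        # Step 102: fill missing values with the value of the previous time step
        df["load"] = df["load"].ffill()
        # Step 103: eliminate outliers with the 3-sigma rule
        mean, std = df["load"].mean(), df["load"].std()
        df = df[(df["load"] - mean).abs() <= 3 * std]
        # Step 108: min-max normalization into [0, 1]
        x_min, x_max = df["load"].min(), df["load"].max()
        df["load_norm"] = (df["load"] - x_min) / (x_max - x_min)
        return df

    def split_by_time(df: pd.DataFrame):
        # Step 104: first 80% training, middle 10% validation, last 10% test
        n = len(df)
        return (df.iloc[: int(0.8 * n)],
                df.iloc[int(0.8 * n): int(0.9 * n)],
                df.iloc[int(0.9 * n):])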
Further, the RNN training method in the step 2 is as follows:
step 201: carrying out sliding window processing on the preprocessed data according to a set time window and a set hysteresis period to form the required characteristics of the model; inputting data of a plurality of past time points into an RNN model as characteristics by adopting a sliding window method so as to predict future loads;
step 202: dividing the feature matrix input into the RNN model according to the proportion of a 70% training set, a 15% verification set and a 15% test set;
step 203: respectively carrying out normalization processing on the training set and the testing set according to the maximum and minimum values;
step 204: inputting the normalized training set into an RNN model for training;
step 205: training the recurrent neural network (RNN) with the BPTT algorithm; BPTT is a backpropagation algorithm that unrolls the RNN into a feed-forward neural network and treats each time step as a layer of that network. For each time step, the gradient of that step's loss function with respect to the network parameters is computed by backpropagation, and the gradients of all time steps are accumulated to obtain the gradient over the whole sequence. BPTT can handle sequences of arbitrary length, but it must backpropagate over the entire sequence, so computation and storage costs can be high for long sequences, and special handling is needed to avoid gradient vanishing or explosion;
step 206: dividing the training data set into a plurality of subsets, wherein each subset is processed by an independent computing unit, each computing unit uses the same RNN model, model parameters are updated on each time step, and finally, the results processed by the computing units are aggregated to obtain a final result;
step 207: migrating the RNN model to the GPU and using the GPU's parallel computing capability for training and inference, which requires the following steps: first install the GPU driver and CUDA and choose a GPU-supported framework and library; deploy the RNN model on the GPU and designate the GPU as the computing device; then forward-propagate the input data through the model to obtain the inference result; check whether GPU memory is sufficient, and apply GPU optimization techniques to improve performance if needed; finally, consult the framework documentation and optimization guidelines for more information. GPU acceleration can significantly increase computation speed and accelerate model training and prediction;
step 208: and inputting the normalized test set into a trained RNN model for testing.
Further, the RNN training method further includes step 209: performing performance analysis on the trained model, evaluating with root mean square error (RMSE) and mean absolute error (MAE), and comparing results on the historical power load data through ablation experiments with different parameters.
Further, in the model training stage, a back propagation algorithm is adopted to perform model training.
The invention has the beneficial effects that: compared with the prior art, the invention has the following effects:
1) The invention uses the RNN model to fully utilize the historical data to predict, and has higher prediction accuracy and stability. Through learning historical load data, the RNN can capture time sequence characteristics and long-term dependence of the data, so that future load changes can be predicted better.
2) The invention carries out high-efficiency training and optimization on the RNN model through a back propagation algorithm and a parallel computing technology. The back propagation algorithm can update the weight and bias of the model according to the prediction error, thereby continuously improving the prediction accuracy of the model. And the GPU parallel computing technology can rapidly execute parallel computing tasks, so that the model training time is greatly shortened, and the training efficiency is improved.
3) The invention evaluates the model prediction results with two indices, root mean square error (RMSE) and mean absolute error (MAE), and presents and compares the predictions with visualization techniques, helping power system decision makers better understand the predictions and formulate corresponding countermeasures. This further improves the reliability and practicability of the model and provides strong support for the stable operation and sustainable development of the power system.
Drawings
FIG. 1 is a schematic diagram of a basic flow of data preprocessing according to the present invention;
FIG. 2 is a schematic diagram of the basic flow of the model prediction of the present invention.
Detailed Description
The invention will be further described with reference to the accompanying drawings and specific examples.
Example 1: As shown in FIGS. 1-2, the data-intensive power load prediction parallel optimization method based on the RNN model comprises two parts: first, carrying out null value processing and abnormal value detection and processing on the original data; second, inputting the preprocessed dataset into the RNN model for training, prediction, parallel optimization and evaluation.
1. After the original data is read into the program, data preprocessing is executed according to the basic flow shown in FIG. 1, with the following main steps:
step 101: reading the Global Energy Forecasting Competition dataset, which contains historical data of real electric loads, wind power generation and solar power generation from different countries, into the program;
step 102: processing the missing values in the source dataset read into the program by filling each with the value of the previous time step;
step 103: detecting and processing abnormal values after the null values have been processed, and eliminating abnormal data with the 3σ rule;
step 104: the raw data is divided into a training set, a verification set and a test set according to time. Typically, the first 80% of the data is used as a training set, the next 10% is used as a validation set, and the remaining 10% is used as a test set;
step 105: converting the data according to the problem requirements, converting the temperature from degrees celsius to degrees fahrenheit, converting the timestamp to datetime format, and the like; resampling and interpolating the time series data to fill up missing values or to normalize the data sampling frequency;
step 106: performing feature extraction on the data subjected to resampling and interpolation processing in step 105 through feature engineering for model training;
step 107: extracting time features and statistical features from the resampled and interpolated data of step 105 to represent the change pattern of the electric load; the time features include hour, day of week, season, holiday, etc., and the statistical features include the maximum, minimum, mean and standard deviation of the historical load data;
step 108: normalizing the data obtained in step 107 so that the value ranges of all features are unified to [0,1], preventing certain features from having an excessive influence on model training; normalization is implemented by a linear transformation that maps the original data into [0,1], with the formula x_norm = (x - x_min) / (x_max - x_min), where x_max is the maximum value of the sample data and x_min is the minimum value of the sample data;
step 109: parallelizing the data processing: because the data scale is very large, the data is divided into a plurality of small batches for processing, which improves model training efficiency;
step 110: the processed data is stored, so that the subsequent model training and prediction use are facilitated;
2. After the data has been cleaned and preprocessed, the time-series power load data, in which the input at each time step includes the load of several past time steps, is input into the RNN model for training according to the basic flow shown in FIG. 2; the following are its main steps (an illustrative code sketch follows after step 209):
step 201: and carrying out sliding window processing on the preprocessed data according to a certain time window and a certain hysteresis period to form the required characteristics of the model. Inputting data of a plurality of past time points into an RNN model as characteristics by adopting a sliding window method so as to predict future loads;
step 202: dividing the feature matrix input into the RNN model according to the proportion of a 70% training set, a 15% verification set and a 15% test set;
step 203: respectively carrying out normalization processing on the training set and the testing set according to the maximum and minimum values;
step 204: inputting the normalized training set into an RNN model for training;
step 205: in the present invention, the recurrent neural network (RNN) is trained with the BPTT algorithm. BPTT is a backpropagation algorithm that unrolls the RNN into a feed-forward neural network and treats each time step as a layer of that network. For each time step, the gradient of that step's loss function with respect to the network parameters is computed by backpropagation, and the gradients of all time steps are accumulated to obtain the gradient over the whole sequence. BPTT can handle sequences of arbitrary length, but it must backpropagate over the entire sequence, so computation and storage costs can be high for long sequences, and special handling is needed to avoid gradient vanishing or explosion;
step 206: the training data set is divided into a plurality of subsets, each subset being processed by a separate computing unit, each computing unit using the same RNN model and updating model parameters at each time step. Finally, the results processed by the computing units are aggregated to obtain a final result;
step 207: migrating the RNN model to the GPU and using the GPU's parallel computing capability for training and inference, which requires the following steps: first install the GPU driver and CUDA and choose a GPU-supported framework and library; deploy the RNN model on the GPU and designate the GPU as the computing device; then forward-propagate the input data through the model to obtain the inference result; check whether GPU memory is sufficient, and apply GPU optimization techniques to improve performance if needed; finally, consult the framework documentation and optimization guidelines for more information. GPU acceleration can significantly increase computation speed and accelerate model training and prediction;
step 208: inputting the normalized test set into a trained RNN model for testing;
step 209: performing performance analysis on the trained model: comparing results on the historical electrical load data through ablation experiments with different parameters.
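As a hedged illustration of steps 201-205 (not a definitive implementation; the window length, hidden size, learning rate and the use of PyTorch are assumptions made for the example), the sliding-window construction and BPTT training could look like this:
    import numpy as np
    import torch
    import torch.nn as nn

    def make_windows(series: np.ndarray, window: int = 24):
        # Step 201: each sample is the past `window` load values; the target is the next value
        X = np.stack([series[i:i + window] for i in range(len(series) - window)])
        y = series[window:]
        return (torch.tensor(X, dtype=torch.float32).unsqueeze(-1),
                torch.tensor(y, dtype=torch.float32))

    class LoadRNN(nn.Module):
        def __init__(self, hidden: int = 64):
            super().__init__()
            self.rnn = nn.RNN(input_size=1, hidden_size=hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)

        def forward(self, x):
            out, _ = self.rnn(x)            # (batch, window, hidden)
            return self.head(out[:, -1])    # predict from the last hidden state

    def train(model, X, y, epochs: int = 10, lr: float = 1e-3):
        # Steps 204-205: gradient descent with BPTT (autograd unrolls the RNN over the window)
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        loss_fn = nn.MSELoss()
        for _ in range(epochs):
            opt.zero_grad()
            loss = loss_fn(model(X).squeeze(-1), y)
            loss.backward()                 # backpropagation through time
            opt.step()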
The dataset adopted by the invention: the Global Energy Forecasting Competition (GEFC) dataset, sponsored by the International Energy Forecasting Association (IEFA), aims to improve the prediction accuracy of electrical loads, wind power generation and solar power generation. The dataset contains historical data of real electric loads, wind power generation and solar power generation from different countries; the data granularity can be hourly, daily, weekly or monthly, and the time span is relatively long.
The data set is characterized by diversity and practicability, can be used for prediction research of different types of electric loads and renewable energy sources, and can also be used for testing and comparing the effects of different prediction methods.
The invention adopts the optimized training method that:
1) Data cleaning and preprocessing: data cleaning and preprocessing are critical steps in data-intensive electrical load prediction and parallel optimization. First, the data needs to be cleaned to remove possible outliers and unreasonable data. For the Global Energy Forecasting Competition dataset, factors affecting the electrical load, such as temperature, weather and season, need to be extracted with feature engineering techniques; these factors have an important impact on the electrical load, and extracting and analyzing them helps predict future load conditions better. Meanwhile, since missing values may exist in the data, they must be filled to ensure the integrity and accuracy of the data. Finally, normalization is needed so that the model can better capture the distribution and regularities of the data, improving prediction accuracy.
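A minimal sketch of such feature engineering, assuming a pandas DataFrame with a DatetimeIndex and a hypothetical load column (the rolling window of 24 steps is likewise an assumption):
    import pandas as pd

    def add_features(df: pd.DataFrame) -> pd.DataFrame:
        # Time features: hour, day of week, month, weekend flag
        df["hour"] = df.index.hour
        df["dayofweek"] = df.index.dayofweek
        df["month"] = df.index.month
        df["is_weekend"] = (df.index.dayofweek >= 5).astype(int)
        # Statistical features of the historical load over a rolling window
        roll = df["load"].rolling(24)
        df["load_max"], df["load_min"] = roll.max(), roll.min()
        df["load_mean"], df["load_std"] = roll.mean(), roll.std()
        return df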
2) Model training and parallel optimization: in electrical load prediction and parallel optimization, a Recurrent Neural Network (RNN) model is employed for modeling. A recurrent neural network is a neural network with memory capability that can effectively capture timing information in time series data as the data is processed. Thus, in electrical load prediction, the recurrent neural network is able to process historical load data well while correlating it with future loads, thereby enabling load prediction.
In the model training stage, the backpropagation algorithm is used for model training to improve the accuracy and predictive capability of the model. The backpropagation algorithm updates the model parameters by computing the loss function, bringing the model's predictions closer to the real results. In the recurrent neural network model, backpropagation must also account for the particularities of time-series data, which requires special handling; adopting the BPTT (Backpropagation Through Time) algorithm for model training solves this problem well.
Meanwhile, to accelerate model training and improve efficiency, the model can be optimized with parallel computing techniques: the RNN model is decomposed into several sub-models, each responsible for processing a portion of the time steps; different sub-models can run on different computing units, and each computing unit is responsible for updating the parameters of its sub-model. Finally, the results processed by the sub-models are aggregated to obtain the final result. GPU parallel computing is used to accelerate model training; the GPU's high parallel computing capability and large-scale data processing capacity can effectively improve training speed and efficiency.
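As an illustration only (the patent does not fix a framework; PyTorch and the tensor shapes below are assumptions), moving the model and data to the GPU and enabling multi-GPU data parallelism might look like:
    import torch
    import torch.nn as nn

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    rnn = nn.RNN(input_size=1, hidden_size=64, batch_first=True).to(device)
    head = nn.Linear(64, 1).to(device)

    x = torch.randn(32, 24, 1, device=device)   # dummy mini-batch: 32 windows of 24 time steps
    out, _ = rnn(x)                              # forward pass runs on the GPU when available
    pred = head(out[:, -1])

    # With several GPUs, torch.nn.DataParallel (or DistributedDataParallel) splits each
    # mini-batch across devices running the same parameters and aggregates the results,
    # which mirrors the sub-model aggregation described above.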
In addition, model performance is optimized by adjusting the model structure and parameters to obtain better prediction results and parallel optimization effects. In the recurrent neural network model, performance is tuned by adjusting parameters such as the number of network layers, the hidden layer size, the activation function and the learning rate, while techniques such as Dropout and Batch Normalization are used to avoid overfitting.
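For instance, a hypothetical variant with a deeper network and Dropout/Batch Normalization regularization could be declared as follows (layer count, hidden size and dropout rate are illustrative values, not part of the claimed method):
    import torch.nn as nn

    class TunableLoadRNN(nn.Module):
        def __init__(self, hidden: int = 128, num_layers: int = 2, dropout: float = 0.2):
            super().__init__()
            # Dropout is applied between stacked RNN layers; BatchNorm on the last hidden state
            self.rnn = nn.RNN(input_size=1, hidden_size=hidden, num_layers=num_layers,
                              dropout=dropout, batch_first=True)
            self.norm = nn.BatchNorm1d(hidden)
            self.head = nn.Linear(hidden, 1)

        def forward(self, x):
            out, _ = self.rnn(x)
            return self.head(self.norm(out[:, -1]))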
3) Evaluation of prediction results: the prediction results of the model are evaluated to judge its predictive capability and accuracy. Two evaluation indices are used: root mean square error (RMSE) and mean absolute error (MAE). The root mean square error is the square root of the mean of the squared differences between predicted and actual values, and the mean absolute error is the mean of the absolute differences between predicted and actual values. These metrics help researchers better understand the predictive power and accuracy of the model.
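The two indices follow directly from their definitions; a small sketch:
    import numpy as np

    def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        # Square root of the mean squared difference between predictions and actual values
        return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

    def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        # Mean of the absolute differences between predictions and actual values
        return float(np.mean(np.abs(y_true - y_pred)))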
In addition to using the evaluation index, the prediction results are also presented and compared using a visualization technique. The actual and predicted values are plotted on the same graph to more intuitively compare the differences and trends between them. In addition, a time sequence analysis method is adopted to analyze and visually display the predicted result so as to help a power system decision maker to better understand the predicted result and make corresponding countermeasures. In general, through evaluation and visualization of the predicted outcome, power system decision makers can be helped to better understand the predicted outcome and make corresponding decisions.
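A plotting sketch of the comparison described above, assuming matplotlib and aligned arrays of timestamps, actual values and predictions:
    import matplotlib.pyplot as plt

    def plot_forecast(timestamps, y_true, y_pred):
        # Draw actual and predicted load on the same axes for visual comparison
        plt.figure(figsize=(10, 4))
        plt.plot(timestamps, y_true, label="actual load")
        plt.plot(timestamps, y_pred, label="predicted load", linestyle="--")
        plt.xlabel("time")
        plt.ylabel("load")
        plt.legend()
        plt.tight_layout()
        plt.show()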
The invention has the following advantages:
1. data cleaning and pretreatment: and cleaning and processing the electric load data to improve the quality and usability of the data.
2. Model selection and optimization: and selecting an RNN model suitable for electric load prediction, and further improving the prediction effect by adjusting the model structure and parameters.
3. Training and parallel computing: the training and prediction processes are accelerated through parallel computation, and meanwhile, the computation speed can be further improved by optimizing the model structure and parameter setting.
4. Model generalization ability: because the power load data has time variability and uncertainty, the generalization capability of the model is improved through a reasonable training and verifying method, and the problems of over fitting or under fitting and the like are avoided.
5. Real-time and reliability: the power load prediction needs to be highly real-time and reliable, and therefore needs to be implemented using fast and reliable algorithms and systems.
To sum up: the invention makes full use of the historical data to predict, and can capture the time sequence characteristics and long-term dependency relationship, thereby improving the prediction accuracy and stability. The invention uses a back propagation algorithm and a parallel computing technology to train and optimize the model with high efficiency, thereby accelerating the training speed of the model and improving the training efficiency. According to the invention, the model prediction result is evaluated through the two indexes, and the prediction result is displayed and compared by using a visual technology, so that a decision maker is helped to better understand the prediction result and formulate corresponding countermeasures, and the reliability and the practicability of the model are further improved. Aiming at the characteristics of power load prediction and parallel optimization, the invention adjusts the model structure and parameters, and further improves the model performance.
The foregoing is merely illustrative of the present invention, and the scope of the present invention is not limited thereto, and any person skilled in the art can easily think about variations or substitutions within the scope of the present invention, and therefore, the scope of the present invention shall be defined by the scope of the appended claims.

Claims (9)

1. A data-intensive power load prediction parallel optimization method based on an RNN model is characterized by comprising the following steps of: the method comprises the following steps:
step 1, carrying out null value processing and abnormal value detection and processing on original data;
step 2, inputting the time-series electric load data processed in step 1 into the RNN model for training, where the input at each time step includes data from the past several time steps.
2. The RNN model-based data-intensive power load prediction parallel optimization method according to claim 1, wherein: the detection and processing steps in the step 1 are specifically as follows:
step 101: the raw data comprises the Global Energy Forecasting Competition dataset, which contains historical data of real electrical loads, wind power generation and solar power generation from different countries;
step 102: processing the missing values in the source data set in the step 101;
step 103: detecting and processing abnormal values after the null values of step 102 have been processed, and eliminating abnormal data;
step 104: dividing the original data into a training set, a verification set and a test set according to time;
step 105: converting the data according to the problem requirements, converting the temperature from degrees celsius to degrees fahrenheit, and converting the timestamp to a datetime format; resampling and interpolating the time series data;
step 106: performing feature extraction on the data subjected to resampling and interpolation processing in step 105 through feature engineering for model training;
step 107: extracting time features and statistical features from the resampled and interpolated data of step 105 to represent the change pattern of the power load, wherein the time features include hour, day of week, season, holiday, etc., and the statistical features include the maximum, minimum, mean and standard deviation of the historical load data;
step 108: performing normalization processing on the feature data extracted in step 107, linearly transforming the original data into the range [0,1];
step 109: carrying out data parallelization processing;
step 110: and storing the processed data.
3. The RNN model-based data-intensive power load prediction parallel optimization method according to claim 2, wherein: the missing value processing method in step 102 is as follows: the filling process is performed with the value of the previous time.
4. The RNN model-based data-intensive power load prediction parallel optimization method according to claim 2, wherein: the outlier rejection method in step 103 uses the 3 sigma rule.
5. The RNN model-based data-intensive power load prediction parallel optimization method according to claim 2, wherein: the first 80% of the data is used as the training set, the middle 10% is used as the validation set, and the last 10% is used as the test set in step 104.
6. The RNN model-based data-intensive power load prediction parallel optimization method according to claim 2, wherein: the normalization formula in step 108 is x_norm = (x - x_min) / (x_max - x_min), where x_max is the maximum value of the sample data and x_min is the minimum value of the sample data.
7. The RNN model-based data-intensive power load prediction parallel optimization method according to claim 2, wherein: the RNN training method in step 2 comprises the following steps:
step 201: carrying out sliding window processing on the preprocessed data according to a set time window and a set hysteresis period to form the required characteristics of the model; inputting data of a plurality of past time points into an RNN model as characteristics by adopting a sliding window method so as to predict future loads;
step 202: dividing the feature matrix input into the RNN model according to the proportion of a 70% training set, a 15% verification set and a 15% test set;
step 203: respectively carrying out normalization processing on the training set and the testing set according to the maximum and minimum values;
step 204: inputting the normalized training set into an RNN model for training;
step 205: training a Recurrent Neural Network (RNN) using a BPTT algorithm;
step 206: dividing the training data set into a plurality of subsets, wherein each subset is processed by an independent computing unit, each computing unit uses the same RNN model, model parameters are updated on each time step, and finally, the results processed by the computing units are aggregated to obtain a final result;
step 207: migrating the RNN model to the GPU, and training and inferring using the parallel computing capability of the GPU; to run RNN inference on the GPU, the following steps are needed: first install the GPU driver and CUDA and select a GPU-supported framework and library, then place the RNN model on the GPU and set the GPU as the computing device, then forward-propagate the input data through the model to obtain the inference result; if GPU memory is insufficient, use GPU optimization techniques to improve performance, and finally consult the framework documentation and optimization guidelines to learn more;
step 208: and inputting the normalized test set into a trained RNN model for testing.
8. The RNN model-based data-intensive power load prediction parallel optimization method of claim 7, wherein: further comprising step 209: performing performance analysis on the trained model: evaluating with root mean square error (RMSE) and mean absolute error (MAE), and comparing results on historical electrical load data through ablation experiments with different parameters.
9. The RNN model-based data-intensive power load prediction parallel optimization method of claim 7, wherein: in the model training stage, a back propagation algorithm is adopted for model training.
CN202310661973.9A 2023-06-06 2023-06-06 Data-intensive power load prediction parallel optimization method based on RNN model Pending CN116681173A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310661973.9A CN116681173A (en) 2023-06-06 2023-06-06 Data-intensive power load prediction parallel optimization method based on RNN model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310661973.9A CN116681173A (en) 2023-06-06 2023-06-06 Data-intensive power load prediction parallel optimization method based on RNN model

Publications (1)

Publication Number Publication Date
CN116681173A true CN116681173A (en) 2023-09-01

Family

ID=87778680

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310661973.9A Pending CN116681173A (en) 2023-06-06 2023-06-06 Data-intensive power load prediction parallel optimization method based on RNN model

Country Status (1)

Country Link
CN (1) CN116681173A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118035683A (en) * 2024-02-21 2024-05-14 广州龙数科技有限公司 Time sequence data analysis method, system and equipment


Similar Documents

Publication Publication Date Title
CN113962364B (en) Multi-factor power load prediction method based on deep learning
CN107578124B (en) Short-term power load prediction method based on multilayer improved GRU neural network
Lin et al. An improved moth-flame optimization algorithm for support vector machine prediction of photovoltaic power generation
CN109948845A (en) A kind of distribution network load shot and long term Memory Neural Networks prediction technique
CN110751318A (en) IPSO-LSTM-based ultra-short-term power load prediction method
Bendali et al. Deep learning using genetic algorithm optimization for short term solar irradiance forecasting
CN113107626B (en) Load prediction method of combined cycle generator set based on multivariable LSTM
Cao et al. Multi-step wind power forecasting model using LSTM networks, similar time series and LightGBM
CN114218872B (en) DBN-LSTM semi-supervised joint model-based residual service life prediction method
CN108596242A (en) Power grid meteorology load forecasting method based on wavelet neural network and support vector machines
Liu et al. Heating load forecasting for combined heat and power plants via strand-based LSTM
CN112163689A (en) Short-term load quantile probability prediction method based on depth Attention-LSTM
Sodsong et al. Short-term solar PV forecasting using gated recurrent unit with a cascade model
CN113158572A (en) Short-term load prediction method and device
CN115860177A (en) Photovoltaic power generation power prediction method based on combined machine learning model and application thereof
CN114021848A (en) Generating capacity demand prediction method based on LSTM deep learning
CN116681173A (en) Data-intensive power load prediction parallel optimization method based on RNN model
CN115409369A (en) Comprehensive energy system reliability evaluation method based on mechanism and data hybrid driving
CN115238854A (en) Short-term load prediction method based on TCN-LSTM-AM
CN113762591B (en) Short-term electric quantity prediction method and system based on GRU and multi-core SVM countermeasure learning
CN111697560B (en) Method and system for predicting load of power system based on LSTM
CN117132132A (en) Photovoltaic power generation power prediction method based on meteorological data
CN115481788B (en) Phase change energy storage system load prediction method and system
CN113095534A (en) Wind power prediction method combining ARIMA and improved Elman neural network
CN115759343A (en) E-LSTM-based user electric quantity prediction method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination