CN111461400B - Kmeans and T-LSTM-based load data completion method


Info

Publication number
CN111461400B
CN111461400B (application number CN202010128406.3A)
Authority
CN
China
Prior art keywords
data
load
day
load data
complemented
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010128406.3A
Other languages
Chinese (zh)
Other versions
CN111461400A (en)
Inventor
冯珺
陈蕾
童力
黄红兵
黄海潮
陈彤
黄俊
余慧华
韩翊
陈建铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Zhejiang Electric Power Co Ltd
Zhejiang Huayun Information Technology Co Ltd
Information and Telecommunication Branch of State Grid Zhejiang Electric Power Co Ltd
Original Assignee
State Grid Zhejiang Electric Power Co Ltd
Zhejiang Huayun Information Technology Co Ltd
Information and Telecommunication Branch of State Grid Zhejiang Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Zhejiang Electric Power Co Ltd, Zhejiang Huayun Information Technology Co Ltd, Information and Telecommunication Branch of State Grid Zhejiang Electric Power Co Ltd filed Critical State Grid Zhejiang Electric Power Co Ltd
Priority to CN202010128406.3A priority Critical patent/CN111461400B/en
Publication of CN111461400A publication Critical patent/CN111461400A/en
Application granted granted Critical
Publication of CN111461400B publication Critical patent/CN111461400B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/18Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/10Pre-processing; Data cleansing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Economics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Strategic Management (AREA)
  • Evolutionary Biology (AREA)
  • Human Resources & Organizations (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Mathematical Optimization (AREA)
  • Marketing (AREA)
  • Mathematical Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Quality & Reliability (AREA)
  • Databases & Information Systems (AREA)

Abstract

The invention discloses a load data completion method based on Kmeans and T-LSTM, and relates to data completion methods. Existing data completion methods show large deviations from the true data and often fail to achieve the expected effect. The invention comprises the following steps: constructing a data model; training the data of the K load intervals separately to obtain K corresponding data models; taking, at regular intervals, the load data of the day containing the data to be completed; calculating the average value of that day's load data; selecting the corresponding data model according to the average value; and inputting the load data to be completed into the corresponding data model to calculate the completed load data. With this technical scheme, load data with similar characteristics are grouped into one class and interference from data with different characteristics is excluded, so the true load value of the missing data can be reflected accurately. The method achieves accurate data completion with small error and fast convergence.

Description

Kmeans and T-LSTM-based load data completion method
Technical Field
The invention relates to data completion methods, and in particular to a load data completion method based on Kmeans and T-LSTM.
Background
Against the background of the rapid development of information technology and the diversification of data acquisition channels, the volume of data held by organizations in every industry is growing rapidly; the power load data of the State Grid, for example, already amount to an extremely large store and are still growing quickly. Experience shows that these data contain much usable content, and if the information underlying them can be analysed more effectively and completely, extracting their latent value for upper-layer applications is of great interest.
However, most theoretical innovation, development and engineering implementation in the current data mining field assume ideal, complete data sets, whereas the load data collected by real terminals are missing and incomplete for various reasons such as terminal damage or loss of communication. Incomplete load data can distort or invalidate the results of data mining, or even lead to erroneous conclusions. Completing the missing data is therefore a particularly important link in the data mining process that cannot be neglected.
Existing data completion methods include linear completion, interpolation completion and the like. The idea of the linear completion algorithm is to take the average of the data at the time points immediately before and after the missing point as the missing value; the method is simple, but the deviation from the true value is large and the expected effect is often not achieved. Moreover, many completion algorithms do not classify the historical load data, so the model is affected by abrupt changes in the load data and the error becomes too large. In addition, an LSTM (Long Short-Term Memory) network based on the time series completes data well when the sequence is continuous and the time intervals are regular, but in practice the missing points occur at random, so completion with a plain LSTM network cannot meet the requirement.
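For reference, the linear completion baseline mentioned above can be sketched in a few lines of Python; for a single missing point it reduces to the average of the readings immediately before and after it. The function name and data are illustrative assumptions, not part of the patent.

```python
import numpy as np

def linear_fill(series):
    """Fill NaN gaps by interpolating linearly between the nearest valid readings."""
    s = np.asarray(series, dtype=float)
    idx = np.arange(s.size)
    valid = ~np.isnan(s)
    # For an isolated gap this is exactly the mean of the previous and next values.
    s[~valid] = np.interp(idx[~valid], idx[valid], s[valid])
    return s

# Example: the 3rd reading is missing and is replaced by (12.0 + 14.0) / 2 = 13.0.
print(linear_fill([10.0, 12.0, np.nan, 14.0, 15.0]))
```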
Disclosure of Invention
The object of the invention is to solve the above technical problems and to improve on the prior art by providing a load data completion method based on Kmeans and T-LSTM, so that data can be completed accurately. To this end, the invention adopts the following technical scheme.
A load data completion method based on Kmeans and T-LSTM comprises the following steps:
1) Constructing a data model:
101) acquiring load data in batches;
102) randomly digging out consecutive points in the load data as the load data to be completed;
103) performing Kmeans clustering on the load data;
104) obtaining the optimal K-class partition through Kmeans clustering and dividing the total sample into K categories accordingly, each category corresponding to a different load interval, thereby obtaining K classified load intervals;
105) calculating the load average value and normalizing the load data;
106) determining the load interval according to the load average value, and inputting the normalized load data into the T-LSTM neural network of the corresponding load interval for training, thereby obtaining the data model of that load interval; the data of the K load intervals are trained separately to obtain K corresponding data models;
2) Taking the load data of the day containing the data to be completed at regular intervals;
3) Calculating the average value of that day's load data;
4) Selecting the corresponding data model according to the average value;
5) Inputting the load data to be completed into the corresponding data model and calculating the completed load data.
As a preferred technical means, when the data model is constructed:
in step 101), the acquired load data comprise the load data of a given unit on a given day, on the previous day and on the seventh day before;
in step 102), consecutive points are randomly dug out of the given day's load data as the load data to be completed;
in step 105), the average load value of the given day is calculated, and the load data of the given day, the previous day and the seventh day before are normalized.
As a preferred technical means: in step 2), in addition to the load data of the day containing the data to be completed, the load data of the previous day and of the seventh day before are also obtained at regular intervals;
in step 5), in addition to the data to be completed, the normalized load data of the previous day and of the seventh day before are also input into the corresponding data model; the data model completes the data according to the load data of the current day, the previous day and the seventh day before.
As a preferred technical means: in step 104), the value of K is obtained by the elbow method when performing Kmeans clustering.
As a preferred technical means: when constructing the data model in step 1), a verification step is further included at the end: the data containing the missing values are normalized and input into the corresponding data model, supplemented with the historical information of the corresponding moments, including the historical data of the previous day and of seven days before, to finally obtain a complete sequence; the complete sequence is then compared with the real data to obtain the error, and when the error has converged, training ends and the final data model is obtained and stored.
The beneficial effects are as follows: in this technical scheme, the collected public transformer load data are clustered with the Kmeans method, so that load data with similar characteristics are grouped together and interference from data with different characteristics is excluded. Data of the same category are then input into the T-LSTM neural network. Because the T-LSTM design takes the pattern of the missing load data into account (some missing points are consecutive and some are not) and distinguishes these cases through Δt, the neural network learns the interval information and the true load value of the missing data can be reflected more accurately. The method achieves accurate data completion with small error and fast convergence.
Drawings
Fig. 1 is a flow chart of the present invention.
Fig. 2 is a graph of the sum of squares of cluster errors versus k of the present invention.
Fig. 3 is a diagram of the LSTM network structure of the present invention.
FIG. 4 is a diagram of the structure of the T-LSTM of the present invention.
Fig. 5 is a data model training diagram of the present invention.
FIG. 6 is a test flow chart of the present invention.
Detailed Description
The technical scheme of the invention is further described in detail below with reference to the attached drawings.
As shown in fig. 1, the invention comprises the following steps:
1) Constructing a data model:
101) obtaining, in batches, the load data of a given unit on a given day and on the 1st and 7th days before it;
102) randomly digging out consecutive points in the load data as the load data to be completed;
103) performing Kmeans clustering on the load data;
104) obtaining the optimal K-class partition through Kmeans clustering and dividing the total sample into K categories accordingly, each category corresponding to a different load interval, thereby obtaining K classified load intervals;
105) calculating the average load value of the given day, and normalizing the load data of the given day, the previous day and the seventh day before;
106) determining the load interval according to the load average value, and inputting the normalized load data into the T-LSTM neural network of the corresponding load interval for training, thereby obtaining the data model of that load interval; the data of the K load intervals are trained separately to obtain K corresponding data models;
2) Taking, at regular intervals, the load data of the day containing the data to be completed together with the load data of the previous day and of the seventh day before;
3) Calculating the average value of that day's load data;
4) Selecting the corresponding data model according to the average value;
5) Inputting the load data to be completed, together with the normalized load data of the previous day and of the seventh day before, into the corresponding data model, and calculating the completed load data (a code sketch of this flow is given below).
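As a loose illustration of steps 2) to 5), the sketch below selects the load interval from the day's average, normalizes the three days of data and hands them to the model trained for that interval. Everything here (interval bounds, the model interface, the normalization scheme) is an assumption for illustration, not the patent's implementation.

```python
import numpy as np

def select_load_interval(day_mean, interval_bounds):
    """Return the index of the load interval containing the daily mean.
    interval_bounds: sorted upper bounds, one per cluster (assumed)."""
    for k, upper in enumerate(interval_bounds):
        if day_mean <= upper:
            return k
    return len(interval_bounds) - 1

def complete_day(day_load, prev_day, prev_week, interval_bounds, models):
    """day_load: 96 readings of the day to complete, with np.nan at missing points."""
    day_mean = np.nanmean(day_load)                      # step 3): mean of the day's available data
    k = select_load_interval(day_mean, interval_bounds)  # step 4): pick the matching interval/model
    scale = day_mean if day_mean else 1.0                # normalisation by the daily mean (assumed scheme)
    x = np.stack([day_load, prev_day, prev_week]) / scale
    completed = models[k](x)                             # step 5): hypothetical callable returning the filled sequence
    return completed[0] * scale                          # undo the normalisation, keep the completed day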
The following further describes some of the steps:
Kmeans clustering: the value of K is obtained with the elbow method; the clustering effect is best at the elbow, where the curvature is largest.
The technical scheme uses the elbow method to determine the number of clusters k. The core idea of the elbow method is that while k is smaller than the actual number of clusters, increasing k greatly improves the aggregation degree of each cluster, so the sum of squared clustering errors over all samples drops sharply; once k reaches the actual number of clusters, the return from increasing k further diminishes rapidly, so the decrease of the sum of squared errors slows quickly and then flattens as k keeps growing. The plot of the sum of squared clustering errors against k therefore has an elbow shape, and the k at the elbow (where the curvature is highest) corresponds to the actual number of clusters in the data; this property is used to determine k.
Because different public transformers have different power supply characteristics, their daily load variations have their own features and the absolute load values also differ greatly. Cluster analysis is therefore used to classify the data and to eliminate interference between samples with different power supply characteristics. The total sample is divided into several categories by Kmeans clustering, and each category serves as the training sample of its own data completion network. The specific steps are as follows: the 96 load values of one day of 4,000 public transformers in the Jinhua region, together with their mean value, are taken as the features of each sample and input into the Kmeans clustering model, and the relationship between the sum of squared clustering errors (the sum of squared differences between the sample load values and the cluster-centre load values) and k is plotted as shown in Fig. 2. Since the curve drops relatively quickly before k = 3 and only gradually from 3 onwards, the number of Kmeans clusters can be taken as 3 (the point of highest curvature).
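A minimal sketch of this procedure using scikit-learn is given below: each sample is a day's 96 load values plus their mean, and the within-cluster sum of squared errors (the model's inertia) is plotted against k to locate the elbow. The random input data are placeholders, not the Jinhua data set.

```python
import numpy as np
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

daily_profiles = np.random.rand(4000, 96)          # placeholder for real daily load curves
features = np.hstack([daily_profiles,
                      daily_profiles.mean(axis=1, keepdims=True)])  # 96 values + daily mean

sse = []
ks = range(1, 11)
for k in ks:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(features)
    sse.append(km.inertia_)                        # within-cluster sum of squared errors

plt.plot(list(ks), sse, marker="o")
plt.xlabel("k (number of clusters)")
plt.ylabel("sum of squared clustering errors")
plt.show()                                         # pick k at the elbow of this curve
```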
T-LSTM (a long short-term memory network variant): the T-LSTM neural network handles the completion of missing data well, because it takes into account the uncertainty of missing load data, where several consecutive points may be missing.
LSTM was originally proposed by Hochreiter et al. and improved by Graves; it is a modified recurrent neural network designed to address the gradient explosion and long-term dependency problems of the plain RNN, as shown in Fig. 3. The main work of LSTM is to modify the internal structure of the RNN and to control how long information is remembered by adding several gates; for example, the forget gate filters part of the information so that the rest is remembered longer.
The formulas are as follows:
g_t = tanh(W_g x_t + U_g h_{t-1} + b_g)
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)
f_t = σ(W_f x_t + U_f h_{t-1} + b_f)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)
c_t = f_t · c_{t-1} + i_t · g_t
h_t = o_t · tanh(c_t)
where h_t, c_t ∈ R^H, H is the hidden layer size, σ(·) is the sigmoid function, and i, f, o, g denote the input gate, forget gate, output gate and candidate cell state, respectively.
{W_f, U_f, b_f}, {W_i, U_i, b_i}, {W_o, U_o, b_o}, {W_c, U_c, b_c} are the network parameters of the respective parts. More specifically, the input gate i regulates how much of the new input enters the cell, the forget gate f regulates how much of the history is forgotten, and the output gate o determines the weights of the different parts when computing the output.
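A direct one-step transcription of these equations in numpy is sketched below to make the gate roles concrete; the parameter dictionary and shapes are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step. p holds W*/U*/b* with shapes (H, D), (H, H) and (H,)."""
    g_t = np.tanh(p["Wg"] @ x_t + p["Ug"] @ h_prev + p["bg"])   # candidate cell state
    i_t = sigmoid(p["Wi"] @ x_t + p["Ui"] @ h_prev + p["bi"])   # input gate
    f_t = sigmoid(p["Wf"] @ x_t + p["Uf"] @ h_prev + p["bf"])   # forget gate
    o_t = sigmoid(p["Wo"] @ x_t + p["Uo"] @ h_prev + p["bo"])   # output gate
    c_t = f_t * c_prev + i_t * g_t                              # new cell state
    h_t = o_t * np.tanh(c_t)                                    # new hidden state
    return h_t, c_t
```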
However, for data with missing values the input is discontinuous and the time intervals are irregular, so an LSTM network cannot process it well. The technical scheme therefore adopts a T-LSTM network that takes the time interval into account, as shown in Fig. 4: Δt is added to the input layer while the other parameters are unchanged, so that the network learns the interval information.
The portions of the T-LSTM that are improved relative to the LSTM are as follows:
g(δ_t) = 1/log(e + δ_t)
h_t = o_t · tanh(c_t)
(The remaining T-LSTM update formulas are reproduced as images in the original publication.)
where δ_t is the time interval Δt of the current input. The definitions of the input gate, output gate and forget gate are identical to those of the LSTM; only the update of the cell state differs. Compared with the LSTM, the T-LSTM considers not only the specific value of the current input but also the interval between inputs, which solves the problem of inconsistent intervals in a time series with missing values. Each T-LSTM cell takes the cell state c_{t-1} and hidden state h_{t-1} of the previous moment, the current input value x_t and the time interval δ_t, computes the cell state c_t and hidden state h_t of this cell, and passes them on to the next T-LSTM cell.
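Since the patent reproduces its full T-LSTM update formulas as images, the sketch below follows the commonly published T-LSTM formulation, in which the previous cell state is split into long-term and short-term parts and the short-term part is discounted by g(δ_t) = 1/log(e + δ_t) before the usual LSTM update. The exact equations are an assumption; only g(δ_t) and the unchanged gates are taken from the text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def t_lstm_step(x_t, h_prev, c_prev, dt, p):
    """One T-LSTM step with elapsed time dt since the previous input (assumed formulation)."""
    # Decompose the previous memory and decay only its short-term component.
    c_short = np.tanh(p["Wd"] @ c_prev + p["bd"])        # short-term part of the old memory
    c_short_hat = c_short * (1.0 / np.log(np.e + dt))    # discounted by g(dt) = 1/log(e + dt)
    c_long = c_prev - c_short                            # long-term part, kept unchanged
    c_star = c_long + c_short_hat                        # adjusted previous cell state
    # The gates are identical to the standard LSTM.
    g_t = np.tanh(p["Wg"] @ x_t + p["Ug"] @ h_prev + p["bg"])
    i_t = sigmoid(p["Wi"] @ x_t + p["Ui"] @ h_prev + p["bi"])
    f_t = sigmoid(p["Wf"] @ x_t + p["Uf"] @ h_prev + p["bf"])
    o_t = sigmoid(p["Wo"] @ x_t + p["Uo"] @ h_prev + p["bo"])
    c_t = f_t * c_star + i_t * g_t                       # cell-state update uses c_star
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t
```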
Calculating the average value of the load data to be completed: this is done to determine which class of load interval the data to be completed belongs to.
Training a data model:
As shown in fig. 5, to improve accuracy, a verification step is included at the end of constructing the data model in step 1): the data containing the missing values are normalized and input into the corresponding data model, supplemented with the historical information of the corresponding moments, including the historical data of the previous day and of seven days before, to finally obtain a complete sequence; the complete sequence is then compared with the real data to obtain the error, and when the error has converged, training ends and the final data model is obtained and stored. The complete model training process is as follows: data are extracted as the training data set and clustered with Kmeans, giving n classes of load data and load intervals after processing; the data with missing values are normalized and encoded with the T-LSTM to obtain a temporal context; the temporal context is then input into a decoder built from LSTM units, assisted by the historical information of the corresponding moments, including the historical data of the previous day and of seven days before, to finally obtain a complete decoded sequence; the decoded sequence is compared with the real data to obtain the error, and after the error converges, training ends and the K models are obtained and stored.
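The training procedure can be summarised by the simplified PyTorch sketch below: an encoder produces the temporal context, a decoder conditioned on that context and on the historical values reconstructs the full day, and training runs until the error converges. A plain nn.LSTM stands in for the T-LSTM encoder purely for brevity, and the module names, data loader and hyper-parameters are assumptions, not part of the patent.

```python
import torch
import torch.nn as nn

class Completer(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        # In the patent the encoder is a T-LSTM; a plain LSTM is used here as a stand-in.
        self.encoder = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        # The decoder also sees the previous-day and previous-week values at each step.
        self.decoder = nn.LSTM(input_size=3, hidden_size=hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, masked_day, prev_day, prev_week):
        # masked_day: (batch, 96) with missing points set to 0 before training.
        _, context = self.encoder(masked_day.unsqueeze(-1))      # temporal context
        hist = torch.stack([masked_day, prev_day, prev_week], dim=-1)
        dec, _ = self.decoder(hist, context)
        return self.out(dec).squeeze(-1)                         # completed 96-point day

def train_one_model(model, loader, epochs=50):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):                                      # stop once the error converges
        for masked_day, prev_day, prev_week, true_day in loader:
            loss = loss_fn(model(masked_day, prev_day, prev_week), true_day)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model                                                 # one such model per load interval
```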
The data model training is described below, taking data from the Jinhua region as an example:
1. Public transformer load data from the Jinhua region, 5,000 records covering November 2018 to May 2019 (8 months in total), were prepared.
2. The training data set is processed: consecutive missing points are dug out as the data to be completed, and the average load of each day to be completed is calculated.
3. The processed training data are input into Kmeans for clustering to obtain K classes.
4. The data of each of the K classes, together with the load data of the 1st and 7th days before, are normalized and input into the T-LSTM network for encoding to obtain the sample context.
5. The obtained sample context is input into the LSTM decoder and the output is compared with the real data to obtain the error.
6. If the error has not converged, training continues.
7. After the error converges, training ends and the K models are obtained and stored.
With the K models obtained, the data completion flow is described below, again taking data from the Jinhua region as an example:
the data set was derived from 221 days of data from 11 months of the last year of Jinhua, a total of 174 users, 96 load points per day. We manually excavate about 1% of the points' data (approximating the true data loss rate) and are all missing for 5 points in succession, which is closer to the true case loss.
The method comprises the following specific steps:
1. 221 days of data for the Jinhua region, starting from November 2018, were prepared, covering 174 public transformer users with 96 load data points per day.
2. The data to be completed and the load data of the previous day and of the 7th day before were collected in batches.
3. Five consecutive points to be completed are manually dug out to serve as verification, and the average value of the data of each day to be completed is calculated.
4. The load interval to which the daily load average belongs is determined.
5. The data are normalized.
6. The normalized data with missing values are input into the trained model together with the historical information of the corresponding moments, including the historical data of the previous day and of the seventh day before, and the completed load data are finally obtained.
The completed load data are compared with the original data to obtain the mean absolute error and the mean absolute percentage error on the test data, shown in Table 1.
The left column gives the results of the proposed method and the right column those of the linear model (which takes the missing value as the average of the previous and next points), where mae is the mean absolute error and mape is the mean absolute percentage error. The proposed method outperforms the linear model and achieves a percentage error of about 10% even where the load values are relatively large.
Table one: test results
Figure BDA0002395117780000101
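For completeness, the two reported metrics can be computed as in the short sketch below, where y_true and y_pred stand for the true and completed values at the manually removed points; both arrays here are assumptions, not data from the patent.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes y_true has no zeros)."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

y_true = np.array([120.0, 118.0, 121.0, 119.0, 122.0])   # illustrative values only
y_pred = np.array([118.0, 120.0, 119.0, 121.0, 120.0])
print(mae(y_true, y_pred), mape(y_true, y_pred))
```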
The load data completion method based on Kmeans and T-LSTM shown in Figs. 1 to 6 is a specific embodiment of the invention and already reflects its essential features and improvements; it may be modified in shape, structure and other respects according to practical requirements under the teaching of the invention, and all such modifications fall within the scope of protection of the invention.

Claims (5)

1. A load data completion method based on Kmeans and T-LSTM, characterized by comprising the following steps:
1) Constructing a data model:
101) acquiring load data in batches;
102) randomly digging out consecutive points in the load data as the load data to be completed;
103) performing Kmeans clustering on the load data;
104) obtaining the optimal K-class partition through Kmeans clustering and dividing the total sample into K categories accordingly, each category corresponding to a different load interval, thereby obtaining K classified load intervals;
105) calculating the load average value and normalizing the load data;
106) determining the load interval according to the load average value, and inputting the normalized load data into the T-LSTM neural network of the corresponding load interval for training, thereby obtaining the data model of that load interval; the data of the K load intervals are trained separately to obtain K corresponding data models;
2) Taking the load data of the day containing the data to be completed at regular intervals;
3) Calculating the average value of that day's load data;
4) Selecting the corresponding data model according to the average value;
5) Inputting the load data to be completed into the corresponding data model and calculating the completed load data.
2. The Kmeans and T-LSTM based load data completion method of claim 1, wherein, when the data model is constructed:
in step 101), the acquired load data comprise the load data of a given unit on a given day, on the previous day and on the seventh day before;
in step 102), consecutive points are randomly dug out of the given day's load data as the load data to be completed;
in step 105), the average load value of the given day is calculated, and the load data of the given day, the previous day and the seventh day before are normalized.
3. The Kmeans and T-LSTM based load data completion method of claim 2, wherein: in step 2), in addition to the load data of the day containing the data to be completed, the load data of the previous day and of the seventh day before are also obtained at regular intervals;
in step 5), in addition to the data to be completed, the normalized load data of the previous day and of the seventh day before are also input into the corresponding data model; the data model completes the data according to the load data of the current day, the previous day and the seventh day before.
4. The Kmeans and T-LSTM based load data completion method of claim 3, wherein: in step 104), the value of K is obtained by the elbow method when performing Kmeans clustering.
5. The Kmeans and T-LSTM based load data completion method of claim 2, wherein: when constructing the data model in step 1), a verification step is further included at the end: the data containing the missing values are normalized and input into the corresponding data model, supplemented with the historical information of the corresponding moments, including the historical data of the previous day and of seven days before, to finally obtain a complete sequence; the complete sequence is then compared with the real data to obtain the error, and when the error has converged, training ends and the final data model is obtained and stored.
CN202010128406.3A 2020-02-28 2020-02-28 Kmeans and T-LSTM-based load data completion method Active CN111461400B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010128406.3A CN111461400B (en) 2020-02-28 2020-02-28 Kmeans and T-LSTM-based load data completion method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010128406.3A CN111461400B (en) 2020-02-28 2020-02-28 Kmeans and T-LSTM-based load data completion method

Publications (2)

Publication Number Publication Date
CN111461400A CN111461400A (en) 2020-07-28
CN111461400B true CN111461400B (en) 2023-06-23

Family

ID=71682448

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010128406.3A Active CN111461400B (en) 2020-02-28 2020-02-28 Kmeans and T-LSTM-based load data completion method

Country Status (1)

Country Link
CN (1) CN111461400B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107833153A (en) * 2017-12-06 2018-03-23 广州供电局有限公司 A kind of network load missing data complementing method based on k means clusters
CN109598381A (en) * 2018-12-05 2019-04-09 武汉理工大学 A kind of Short-time Traffic Flow Forecasting Methods based on state frequency Memory Neural Networks
CN109754113A (en) * 2018-11-29 2019-05-14 南京邮电大学 Load forecasting method based on dynamic time warping Yu length time memory
CN109934375A (en) * 2018-11-27 2019-06-25 电子科技大学中山学院 Power load prediction method
CN110245801A (en) * 2019-06-19 2019-09-17 中国电力科学研究院有限公司 A kind of Methods of electric load forecasting and system based on combination mining model
CN110334726A (en) * 2019-04-24 2019-10-15 华北电力大学 A kind of identification of the electric load abnormal data based on Density Clustering and LSTM and restorative procedure
CN110674999A (en) * 2019-10-08 2020-01-10 国网河南省电力公司电力科学研究院 Cell load prediction method based on improved clustering and long-short term memory deep learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190143517A1 (en) * 2017-11-14 2019-05-16 Arizona Board Of Regents On Behalf Of Arizona State University Systems and methods for collision-free trajectory planning in human-robot interaction through hand movement prediction from vision


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
T-LSTM: A Long Short-Term Memory Neural Network Enhanced by Temporal Information for Traffic Flow Prediction; Luntian Mou; IEEE Access; 98053-98061 *
Location prediction model based on ST-LSTM network (基于ST-LSTM网络的位置预测模型); 许芳芳; Computer Engineering (计算机工程); 1-7 *

Also Published As

Publication number Publication date
CN111461400A (en) 2020-07-28

Similar Documents

Publication Publication Date Title
CN111738512B (en) Short-term power load prediction method based on CNN-IPSO-GRU hybrid model
CN107315884B (en) Building energy consumption modeling method based on linear regression
CN106022521B (en) Short-term load prediction method of distributed BP neural network based on Hadoop architecture
WO2018045642A1 (en) A bus bar load forecasting method
CN105488528B (en) Neural network image classification method based on improving expert inquiry method
CN111563706A (en) Multivariable logistics freight volume prediction method based on LSTM network
CN111814956B (en) Multi-task learning air quality prediction method based on multi-dimensional secondary feature extraction
CN112488395A (en) Power distribution network line loss prediction method and system
CN106022954B (en) Multiple BP neural network load prediction method based on grey correlation degree
CN107590565A (en) A kind of method and device for building building energy consumption forecast model
CN110674999A (en) Cell load prediction method based on improved clustering and long-short term memory deep learning
CN111639783A (en) Line loss prediction method and system based on LSTM neural network
CN114065653A (en) Construction method of power load prediction model and power load prediction method
CN114528949A (en) Parameter optimization-based electric energy metering abnormal data identification and compensation method
CN112241836B (en) Virtual load leading parameter identification method based on incremental learning
CN111008726A (en) Class image conversion method in power load prediction
CN109214444B (en) Game anti-addiction determination system and method based on twin neural network and GMM
CN113537469A (en) Urban water demand prediction method based on LSTM network and Attention mechanism
CN110987436A (en) Bearing fault diagnosis method based on excitation mechanism
CN113868938A (en) Short-term load probability density prediction method, device and system based on quantile regression
CN112508286A (en) Short-term load prediction method based on Kmeans-BilSTM-DMD model
CN113095484A (en) Stock price prediction method based on LSTM neural network
CN113139570A (en) Dam safety monitoring data completion method based on optimal hybrid valuation
CN111353603A (en) Deep learning model individual prediction interpretation method
CN113627594B (en) One-dimensional time sequence data augmentation method based on WGAN

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant