CN112712194A - Electric quantity prediction method and device for power consumption cost intelligent optimization analysis - Google Patents
Electric quantity prediction method and device for power consumption cost intelligent optimization analysis
- Publication number
- CN112712194A CN202011487494.2A
- Authority
- CN
- China
- Prior art keywords
- user data
- user
- data
- prediction
- processing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Abstract
The invention discloses an electric quantity prediction method and device for intelligent optimization analysis of electricity consumption cost. The method comprises the following steps: cleaning the collected user data to obtain cleaned user data; performing modeling sample screening on the cleaned user data to obtain screened user data; clustering the screened user data, which serve as clustering centers, according to preset division types to obtain clustered user data; performing feature construction based on the clustered user data to obtain user data features; selecting a prediction model from preset prediction models based on the number of users in the user data and the user data features, to obtain a selected prediction model; and inputting the user data features into the selected prediction model to perform electricity consumption prediction and obtain an electricity consumption prediction result. Embodiments of the invention can achieve higher accuracy, capture the electricity consumption patterns of large users, and predict user electric quantity accurately and in real time.
Description
Technical Field
The invention relates to the technical field of power grid data mining, in particular to an electric quantity prediction method and device for power consumption cost intelligent optimization analysis.
Background
With the rise of big data models in recent years, power data has received increasing attention and electric quantity prediction methods have developed considerably. Earlier prediction methods fall mainly into the following: the electric power elasticity coefficient method, the electric quantity production benefit method, the regression analysis method and the grey prediction method.
The above solutions require little computation, but their accuracy falls short of current machine learning models.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing an electric quantity prediction method and device for intelligent optimization analysis of electricity consumption cost, which can achieve higher accuracy, capture the electricity consumption patterns of large users, and predict user electric quantity accurately and in real time.
In order to solve the above technical problem, an embodiment of the present invention provides an electric quantity prediction method for power consumption cost intelligent optimization analysis, where the method includes:
carrying out data cleaning processing on the collected user data of the large-scale industrial power utilization users to obtain cleaned user data;
carrying out modeling sample screening processing on the cleaned user data to obtain screened user data;
clustering the screened user data serving as a clustering center according to a preset division type to obtain clustered user data;
performing feature construction processing based on the clustered user data to obtain user data features;
selecting a prediction model from preset prediction models based on the number of users in the user data and the characteristics of the user data, and obtaining a selected prediction model;
and inputting the user data characteristics into the selection prediction model to perform power consumption prediction processing, and obtaining a power consumption prediction result.
Optionally, the data cleaning process includes effective user data screening, user data outlier processing, user data missing value padding, and user data white noise checking.
Optionally, the data cleaning processing is performed on the collected user data of the large-scale industrial power consumption user, so as to obtain the cleaned user data, and the method includes:
setting negative values and zeros in the user data to NaN;
calculating the mean and the median of the user's monthly electricity consumption in the user data and comparing them; if the difference between the mean and the median is larger than a preset difference, checking the user's monthly electricity consumption values one by one and setting to NaN each value whose difference from the median is larger than the preset difference;
finding the quartiles of the user's monthly electricity consumption from the historical data in the user data, where q1 is the 1/4 quantile, q3 is the 3/4 quantile, and the interquartile range is IQR = q3 - q1; marking points beyond the upper or lower limit as outliers, with q3 + 1.5 × IQR as the upper limit and max(q1 - 1.5 × IQR, 200) as the lower limit; replacing outliers greater than q3 + 1.5 × IQR with q3 + 1.5 × IQR; when the lower limit is q1 - 1.5 × IQR, replacing outliers below it with q1 - 1.5 × IQR; when the lower limit is 200, marking outliers below it as NaN;
filling the values set to NaN by combining K-nearest neighbours with a moving average to obtain filled user data;
and screening the filled user data with a white noise test to obtain cleaned user data.
Optionally, filling the values set to NaN by combining K-nearest neighbours with a moving average to obtain filled user data includes:
filling each single isolated missing value among the values set to NaN with a moving average of window 3;
and filling runs of consecutive missing values among the values set to NaN using K-nearest neighbours, to obtain the filled user data.
Optionally, the screening the model building sample of the cleaned user data to obtain the screened user data includes:
carrying out modeling sample screening processing on the cleaned user data based on a historical power consumption comparison sample method or a post sample detection method to obtain screened user data;
optionally, the screening the modeling sample of the cleaned user data based on a historical power consumption comparison sample method to obtain the screened user data includes:
calculating the average value of the power consumption of the user in the latest n months by taking the current time as a reference for the user data of a single user in the user data;
calculating the ratio between the monthly electricity consumption of the user and the average electricity consumption of the user;
determining a first index position of the monthly electricity consumption whose ratio is smaller than 0.2 and a second index position of the monthly electricity consumption whose ratio is larger than 2;
evaluating the first index position and the second index position to determine a new sample: taking the larger of the two index positions as ind_max, comparing it with n-14, and taking the smaller of ind_max and n-14 as the start index ind_start for data interception, so that all monthly electric quantities of the user data from index ind_start to the last position are intercepted as the screened user data;
the step of carrying out modeling sample screening processing on the cleaned user data based on a post sample detection method to obtain screened user data comprises the following steps:
calculating the relative errors between the historical monthly electric quantity and the predicted monthly electric quantity for the latest 2 months in the cleaned user data, recorded as E1 and E2;
comparing E1 and E2 with a given threshold; if both exceed the given threshold and E1 and E2 have the same sign (both greater than 0 or both less than 0), the cleaned user data need to be screened; otherwise, the cleaned user data are used as the screened user data;
and, when screening is needed, directly taking the moving average with a window of 3 over the cleaned user data as the predicted value for the user.
Optionally, the performing feature construction processing based on the clustered user data to obtain user data features includes:
standardizing the electricity utilization data of each user in the clustered user data to obtain electricity utilization data of standardized users;
and performing feature construction processing based on the user data of the standardized user to obtain user data features.
Optionally, the performing feature construction processing based on the user data of the standardized user to obtain the user data features includes:
constructing a monthly power consumption state matrix of the p-dimensional user based on the user data of the standardized user;
and adding date information into the monthly electricity consumption state matrix of the p-dimensional user, constructing a user state space matrix, and obtaining user data characteristics.
Optionally, the preset prediction models include time-series models and non-time-series models; the time-series models include a multi-user LSTM model; the non-time-series models include a multi-user LightGBM model, a single-user support vector regression model and a single-user XGBoost model.
In addition, an embodiment of the present invention further provides an electricity quantity prediction apparatus for electricity consumption cost intelligent optimization analysis, where the apparatus includes:
a data cleaning module, used for carrying out data cleaning processing on the collected user data of large-scale industrial power users to obtain cleaned user data;
a data screening module, used for carrying out modeling sample screening processing on the cleaned user data to obtain screened user data;
a data clustering module, used for clustering the screened user data serving as a clustering center according to a preset division type to obtain clustered user data;
a feature construction module, used for performing feature construction processing based on the clustered user data to obtain user data features;
a model selection module, used for selecting a prediction model from preset prediction models based on the number of users in the user data and the user data features, to obtain a selected prediction model;
a prediction processing module, used for inputting the user data features into the selected prediction model to perform electricity consumption prediction processing and obtain an electricity consumption prediction result.
In the embodiment of the invention, higher accuracy can be achieved, the electricity consumption patterns of large users can be captured, and the electric quantity of the user can be predicted accurately and in real time; peak-valley and monthly electric quantities are predicted together, which improves efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic flow chart of a power prediction method for power consumption cost intelligent optimization analysis according to an embodiment of the present invention;
fig. 2 is a schematic structural composition diagram of an electricity quantity prediction apparatus for intelligent optimization analysis of electricity consumption cost in the embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Examples
Referring to fig. 1, fig. 1 is a schematic flow chart of a power consumption prediction method for power consumption cost intelligent optimization analysis according to an embodiment of the present invention.
As shown in fig. 1, a power prediction method for power consumption cost intelligent optimization analysis includes:
S11: carrying out data cleaning processing on the collected user data of the large-scale industrial power utilization users to obtain cleaned user data;
In the specific implementation process of the invention, the data cleaning processing comprises effective user data screening, user data outlier processing, user data missing value filling and user data white noise testing.
Further, the data cleaning processing performed on the collected user data of the large-scale industrial power utilization users to obtain the cleaned user data comprises the following steps: setting negative values and zeros in the user data to NaN; calculating the mean and the median of the user's monthly electricity consumption and comparing them; if the difference between the mean and the median is larger than a preset difference, checking the user's monthly electricity consumption values one by one and setting to NaN each value whose difference from the median is larger than the preset difference; finding the quartiles of the user's monthly electricity consumption from the historical data in the user data, where q1 is the 1/4 quantile, q3 is the 3/4 quantile, and the interquartile range is IQR = q3 - q1; marking points beyond the upper or lower limit as outliers, with q3 + 1.5 × IQR as the upper limit and max(q1 - 1.5 × IQR, 200) as the lower limit; replacing outliers greater than q3 + 1.5 × IQR with q3 + 1.5 × IQR; when the lower limit is q1 - 1.5 × IQR, replacing outliers below it with q1 - 1.5 × IQR; when the lower limit is 200, marking outliers below it as NaN; filling the values set to NaN by combining K-nearest neighbours with a moving average to obtain filled user data; and screening the filled user data with a white noise test to obtain cleaned user data.
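The quartile rule above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: `clean_outliers` is a hypothetical helper name, `None` stands in for NaN, the "inclusive" quartile method is an arbitrary choice (the text does not say how quartiles are computed), and the mean-versus-median check is omitted because its preset difference is not given.

```python
from statistics import quantiles


def clean_outliers(values, floor=200.0):
    """Apply the NaN-marking and IQR rules to one user's monthly series.

    None stands in for NaN. The quartile method is an assumption.
    """
    # Negative values and zeros become missing.
    series = [v if v is not None and v > 0 else None for v in values]
    observed = [v for v in series if v is not None]

    q1, _, q3 = quantiles(observed, n=4, method="inclusive")
    iqr = q3 - q1
    upper = q3 + 1.5 * iqr
    iqr_lower = q1 - 1.5 * iqr
    lower = max(iqr_lower, floor)

    cleaned = []
    for v in series:
        if v is None:
            cleaned.append(None)
        elif v > upper:
            # High outliers are clipped to the upper limit.
            cleaned.append(upper)
        elif v < lower:
            # If the IQR bound is the active lower limit, clip to it;
            # if the fixed floor of 200 is active, mark the value missing.
            cleaned.append(iqr_lower if iqr_lower >= floor else None)
        else:
            cleaned.append(v)
    return cleaned
```

Values marked `None` here would then go on to the missing-value filling step.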
Further, filling the values set to NaN by combining K-nearest neighbours with a moving average to obtain filled user data includes: filling each single isolated missing value among the values set to NaN with a moving average of window 3; and filling runs of consecutive missing values among the values set to NaN using K-nearest neighbours, to obtain the filled user data.
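A minimal sketch of the two-stage filling follows, with several assumptions that the text does not settle: the window of 3 is taken as centred on the gap, the K nearest neighbours are other users' series ranked by root-mean-square distance over months observed in both, and the filled value is the mean of the neighbours' values for that month. All function names are hypothetical and `None` stands in for NaN.

```python
import math


def is_isolated(series, i):
    """A gap is isolated when both neighbours (where they exist) are observed."""
    left_ok = i == 0 or series[i - 1] is not None
    right_ok = i == len(series) - 1 or series[i + 1] is not None
    return left_ok and right_ok


def fill_isolated(series):
    """Fill isolated gaps with the mean of the centred 3-month window."""
    out = list(series)
    for i, v in enumerate(series):
        if v is None and is_isolated(series, i):
            window = [x for x in series[max(0, i - 1): i + 2] if x is not None]
            if window:
                out[i] = sum(window) / len(window)
    return out


def distance(a, b):
    """RMS distance over months observed in both series (inf if no overlap)."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(pairs))


def fill_runs_knn(target, others, k=2):
    """Fill the remaining gaps from the k most similar other users."""
    out = list(target)
    ranked = sorted(others, key=lambda s: distance(target, s))
    for i, v in enumerate(target):
        if v is None:
            donors = [s[i] for s in ranked if s[i] is not None][:k]
            if donors:
                out[i] = sum(donors) / len(donors)
    return out
```

Running `fill_isolated` first and `fill_runs_knn` second mirrors the order in the text: single gaps are smoothed locally, and only runs of consecutive gaps fall through to the neighbour-based fill.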
Specifically, the data cleaning stage is divided into 4 parts: 1. effective user screening; 2. data outlier processing; 3. data missing value filling; 4. white noise testing. The specific process is as follows: first, the single-month electricity data are collated per user to obtain each user's monthly electricity consumption from January 2018 to December 2019. Second, the collated data are screened, retaining users whose proportion of missing values is less than 30%.
Then, abnormal data in the large-industry monthly electric quantity are processed with a triple-variance method and missing values are filled. The specific steps are as follows: (1) negative values and zeros are processed first, and all of them are set to NaN; (2) the mean and the median of the user's monthly electric quantity are calculated and compared; if the difference is too large, the user's monthly electric quantities are checked one by one, and values that differ too much from the median are set to NaN; (3) the quartiles of the monthly electric quantity are computed from the historical data of the data set, where q1 is the 1/4 quantile, q3 is the 3/4 quantile, and the interquartile range is IQR = q3 - q1; points beyond the upper or lower limit are marked as outliers, with q3 + 1.5 × IQR as the upper limit and max(q1 - 1.5 × IQR, 200) as the lower limit; outliers greater than q3 + 1.5 × IQR are replaced with q3 + 1.5 × IQR; when the lower limit is q1 - 1.5 × IQR, outliers below it are replaced with q1 - 1.5 × IQR; when the lower limit is 200, outliers below it are marked as empty (NaN); (4) missing values left after outlier processing are filled by combining K-nearest neighbours with a moving average, specifically: a. each single isolated missing value is filled with a moving average of window 3; b. runs of consecutive missing values are filled using K-nearest neighbours; (5) the user data are screened with a white noise test at a significance level of 0.1: if the test's p-value is greater than 0.1, the null hypothesis cannot be rejected, which indicates that the user data form a purely random sequence and need to be removed.
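The white noise screening in step (5) could look like the sketch below. The text does not name the test, so the Ljung-Box Q statistic is assumed here, and the number of lags (4) is likewise an illustrative choice. To stay within the standard library, the chi-square tail probability uses its closed form for an even number of degrees of freedom, so `lags` must be even. Function names are hypothetical, and a constant series (zero variance) is not handled.

```python
import math


def autocorr(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return num / denom


def chi2_sf_even(q, df):
    """P(X > q) for a chi-square variable with an even number of degrees of freedom."""
    if df % 2:
        raise ValueError("closed form requires an even number of lags")
    half = q / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


def ljung_box_pvalue(x, lags=4):
    """p-value of the Ljung-Box Q statistic over the first `lags` lags."""
    n = len(x)
    q = n * (n + 2) * sum(autocorr(x, k) ** 2 / (n - k) for k in range(1, lags + 1))
    return chi2_sf_even(q, lags)


def is_white_noise(x, lags=4, alpha=0.1):
    """True when the white-noise null hypothesis cannot be rejected at level alpha."""
    return ljung_box_pvalue(x, lags) > alpha
```

Under this reading, users for whom `is_white_noise` returns `True` would be dropped, since their series carry no structure a model could learn.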
S12: carrying out modeling sample screening processing on the cleaned user data to obtain screened user data;
In a specific implementation process of the present invention, the modeling sample screening of the cleaned user data to obtain the screened user data includes: carrying out modeling sample screening processing on the cleaned user data based on a historical power consumption comparison sample method or a post sample detection method to obtain screened user data.
Further, the step of performing modeling sample screening processing on the cleaned user data based on a historical power consumption comparison sample method to obtain screened user data includes: for the user data of a single user, calculating the average electricity consumption of the user over the latest n months with the current time as reference; calculating the ratio between the user's monthly electricity consumption and the user's average electricity consumption; determining a first index position of the monthly electricity consumption whose ratio is smaller than 0.2 and a second index position of the monthly electricity consumption whose ratio is larger than 2; evaluating the first index position and the second index position to determine a new sample: taking the larger of the two index positions as ind_max, comparing it with n-14, and taking the smaller of ind_max and n-14 as the start index ind_start for data interception, so that all monthly electric quantities of the user data from index ind_start to the last position are intercepted as the screened user data. The step of carrying out modeling sample screening processing on the cleaned user data based on a post sample detection method to obtain screened user data comprises the following steps: calculating the relative errors between the historical monthly electric quantity and the predicted monthly electric quantity for the latest 2 months in the cleaned user data, recorded as E1 and E2; comparing E1 and E2 with a given threshold; if both exceed the given threshold and E1 and E2 have the same sign (both greater than 0 or both less than 0), the cleaned user data need to be screened; otherwise, the cleaned user data are used as the screened user data; and, when screening is needed, directly taking the moving average with a window of 3 over the cleaned user data as the predicted value for the user.
Specifically, the electricity consumption of a large industrial user is found to be closely related to the development stage of the enterprise at the time. Some users show a large increase within two years, possibly because the enterprise is in a growth period; some users keep a relatively stable pattern, which is how an enterprise behaves once it has reached a stable period; and some enterprises are in a decline period, so their 2019 data differ considerably from their 2018 data. In these cases, using all historical samples as modeling samples would distort the prediction of future electric quantity changes. Therefore, a sample suitable for training the current user is screened from the cleaned data and used for modeling. There are two main methods: 1. the historical electric quantity comparison sample method; 2. the post sample detection method. They are described as follows:
1. The historical electric quantity comparison sample method is an a priori sample screening method applied before model training. It mainly addresses the case where the electric quantity in some users' consumption data is abnormally large or small, so that the abnormal data must be cleaned. The specific steps are as follows:
(1) For a single user, calculate the average electric quantity of the latest n months with the current time as reference, i.e. $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where $x_i$ is the electric quantity of the i-th of the latest n months and n is the smaller of the sample size and 12. (2) Calculate the ratio R between each month's electric quantity and the average monthly electric quantity $\bar{x}$. (3) Determine the monthly electric quantity index position ind01 where R is less than 0.2 and the index position ind02 where R is greater than 2. (4) Evaluate the index positions to determine a new sample: take the largest index position among ind01 and ind02 as ind_max and compare it with n-14 (the feature here is the 12 preceding values of the time series, so n-14 represents the start index that leaves the minimum training data for the user model); take the smaller of the two as the start index ind_start for data interception, so that the monthly electric quantities of the user data from index ind_start to the last position are intercepted as the user's training sample.
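The four steps above can be sketched as follows. Several points are assumptions rather than statements from the text: the n in "n-14" is read as the total number of months available for the user, a negative start index is clamped to 0, and a user with no abnormal months keeps the whole series. `screen_by_history` is a hypothetical name.

```python
def screen_by_history(series, low=0.2, high=2.0, min_train=14):
    """Return (ind_start, trimmed series) for one user's monthly values."""
    total = len(series)
    n = min(total, 12)                      # latest n months, at most 12
    recent_mean = sum(series[-n:]) / n      # average of the latest n months
    ratios = [v / recent_mean for v in series]

    # Index positions whose ratio R is abnormally small or large.
    abnormal = [i for i, r in enumerate(ratios) if r < low or r > high]
    if not abnormal:
        return 0, list(series)

    ind_max = max(abnormal)
    # Never start so late that fewer than `min_train` months remain.
    ind_start = max(0, min(ind_max, total - min_train))
    return ind_start, list(series[ind_start:])
```

With 24 months of data, for example, the start index can never exceed 10, so at least 14 months are always kept for training.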
2. The post sample detection method is a sample screening or result prediction method based on how the user's historical prediction error changes. It mainly addresses the case where a user's electricity consumption suddenly increases or decreases, so that the future consumption level differs greatly from the past and the samples need to be screened (considering both the practical situation and training efficiency, the method directly takes the moving average with a window of 3 over the selected samples as the predicted value). The specific steps are as follows: (1) calculate the relative errors between the historical monthly electric quantity and the predicted monthly electric quantity for the latest 2 months, recorded as E1 and E2; (2) compare E1 and E2 with a given threshold (0.45); if both exceed the threshold and E1 and E2 have the same sign (both greater than 0 or both less than 0), the sample needs to be screened, so go to (3); otherwise, use the original sample for model training; (3) directly take the moving average with a window of 3 over the user sample as the predicted value for the user.
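A minimal sketch of the post sample check, assuming the relative error is defined as (actual - predicted) / actual, that the threshold is compared against the magnitude of each error, and that the fallback forecast is the mean of the last three observed months. None of these definitions is spelled out in the text, and the function names are hypothetical.

```python
def relative_error(actual, predicted):
    """Signed relative error; the exact definition is an assumption."""
    return (actual - predicted) / actual


def needs_fallback(actual_last2, predicted_last2, threshold=0.45):
    """True when both recent errors are large and point in the same direction."""
    e1, e2 = (relative_error(a, p) for a, p in zip(actual_last2, predicted_last2))
    both_large = abs(e1) > threshold and abs(e2) > threshold
    same_sign = (e1 > 0 and e2 > 0) or (e1 < 0 and e2 < 0)
    return both_large and same_sign


def moving_average_forecast(history, window=3):
    """Fallback prediction: mean of the most recent `window` months."""
    recent = history[-window:]
    return sum(recent) / len(recent)
```

Requiring the same sign on both errors separates a persistent shift in consumption level, which the model keeps missing in one direction, from ordinary noisy errors that alternate in sign.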
S13: clustering the screened user data serving as a clustering center according to a preset division type to obtain clustered user data;
In the specific implementation process of the invention, the data of all users are used. When modeling one type of user, users of other types interfere and degrade the prediction effect of the model. Therefore, general clustering is first used to divide the users into types (experiments show that division into 7 types works better), and 7 different multi-user models are trained to predict the different types of users, which gives a better prediction effect.
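The text does not name the clustering algorithm, so the sketch below uses plain k-means purely as an illustration of dividing users into a fixed number of types. The deterministic initialisation (evenly spaced input points) and the function names are assumptions; in the setting described above, each point would be one user's monthly profile and `k` would be 7.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Plain k-means; returns (labels, centers)."""
    # Deterministic start: k points evenly spaced through the input.
    step = len(points) / k
    centers = [list(points[int(i * step)]) for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stable, converged
        labels = new_labels
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

One multi-user model per label would then be trained on the users sharing that label.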
S14: performing feature construction processing based on the clustered user data to obtain user data features;
In a specific implementation process of the present invention, the performing feature construction processing based on the clustered user data to obtain user data features includes: standardizing the electricity utilization data of each user in the clustered user data to obtain electricity utilization data of standardized users; and performing feature construction processing based on the user data of the standardized user to obtain user data features.
Further, the performing feature construction processing based on the user data of the standardized user to obtain user data features includes: constructing a monthly power consumption state matrix of the p-dimensional user based on the user data of the standardized user; and adding date information into the monthly electricity consumption state matrix of the p-dimensional user, constructing a user state space matrix, and obtaining user data characteristics.
In the feature construction step, the monthly electric quantity data and date information in each user's electricity consumption data in the clustered user data are used, and each user's electricity consumption data is standardized according to the following formula:
$x^{*} = \dfrac{x - \bar{x}}{\mathrm{std}}$
where $x^{*}$ represents the standardized monthly electric quantity data, $\bar{x}$ represents the average value of the user's monthly electric quantity, and std represents the standard deviation of the user's monthly electric quantity. In addition, the average value, variance, minimum value and maximum value of the user's monthly electric quantity are calculated and used, after standardization, as 4 feature dimensions of the user.
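The standardization and the 4 summary features can be illustrated with a short Python sketch. The function names are hypothetical, and using the population standard deviation and variance is an assumption, since the text does not say whether the population or the sample form is meant.

```python
from statistics import mean, pstdev, pvariance


def standardize(monthly):
    """Subtract the user's mean and divide by the user's standard deviation."""
    mu = mean(monthly)
    sd = pstdev(monthly)  # assumption: population standard deviation
    if sd == 0:
        # A constant series carries no variation; map it to zeros.
        return [0.0 for _ in monthly]
    return [(x - mu) / sd for x in monthly]


def summary_features(monthly):
    """Mean, variance, minimum and maximum of one user's monthly quantities."""
    return [mean(monthly), pvariance(monthly), min(monthly), max(monthly)]
```

Standardizing per user puts large and small consumers on a comparable scale before they are pooled into a multi-user model.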
Because monthly electric quantity data for large industrial users is limited, the user data features are obtained by processing each user as follows:
constructing a p-dimensional monthly electric quantity state matrix, that is, predicting the current month's electric quantity from the monthly electric quantities of the previous p months, wherein the monthly electric quantity state space matrix is as follows:
where p is 12, i.e. the monthly electric quantities of the preceding year are used to predict the electric quantity of the current month, and n is the number of months in the user data, which is 21.
Date information is then added to the user state space matrix to obtain the user data features, namely:
where $month_{n-p+1}$ is the month of the predicted month and $year_{n-p+1}$ is the year of the predicted month (this feature is effective when a large amount of data is available).
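Since the matrices themselves are not reproduced in this text, the following Python sketch shows one plausible layout of the state space matrix with date information. It assumes each row holds p consecutive monthly values followed by the month and year of the month being predicted, with that month's value as the target. The function name and this row layout are assumptions.

```python
def build_state_matrix(monthly, months, years, p=12):
    """Build lagged feature rows and targets for one user.

    monthly : (standardized) monthly electric quantities, oldest first
    months  : calendar month of each entry in `monthly`
    years   : calendar year of each entry in `monthly`
    Each row is [x_i, ..., x_{i+p-1}, month_t, year_t], where t = i + p
    is the index of the month being predicted (assumed layout).
    """
    rows, targets = [], []
    for i in range(len(monthly) - p):
        t = i + p
        rows.append(list(monthly[i:t]) + [months[t], years[t]])
        targets.append(monthly[t])
    return rows, targets
```

With n = 21 months and p = 12, this layout produces 9 training rows per user, which shows how little data each large industrial user contributes on its own.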
Meanwhile, year-on-year (parity ratio) and month-on-month (ring ratio) information is added to the constructed features. Because a user's peak-valley electricity level is correlated with the preceding historical electric quantities, the ratio between the same month of the previous year and the month before it, i.e. the parity ratio, is added to the constructed features (for example, to predict the peak-valley electric quantity of October 2019, the ratio of the peak-valley electric quantity of October 2018 to that of September 2018 is added to the features); similarly, the difference between the month before the predicted month and the month before that, i.e. the ring ratio, is added to the constructed features (for example, to predict the peak-valley electric quantity of October 2019, the difference between September 2019 and August 2019 is added to the features).
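The two extra features can be computed from a user's monthly series as in the sketch below. The function name and the index arithmetic are assumptions; the sketch follows the reading that the parity feature is a ratio and the ring feature is a difference, as the worked example for October 2019 suggests.

```python
def parity_and_ring_features(monthly, t):
    """Return (parity, ring) features for predicting index t.

    monthly : monthly electric quantities, oldest first
    t       : index of the month to predict (may equal len(monthly))
    parity  : same month last year divided by the month before it
    ring    : last month minus the month before last
    """
    if t < 13:
        raise ValueError("at least 13 months of history are needed")
    parity = monthly[t - 12] / monthly[t - 13]
    ring = monthly[t - 1] - monthly[t - 2]
    return parity, ring
```

If index 0 is January 2018, then t = 21 is October 2019, `monthly[9]` and `monthly[8]` are October and September 2018, and `monthly[20]` and `monthly[19]` are September and August 2019.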
S15: selecting a prediction model from preset prediction models based on the number of users in the user data and the characteristics of the user data, and obtaining a selected prediction model;
In the specific implementation of the invention, the preset prediction models comprise time-series models and non-time-series models; the time-series models comprise a multi-user LSTM model; the non-time-series models comprise a multi-user LightGBM model, a single-user support vector regression model and a single-user XGBoost model.
Specifically, a prediction model can be selected from the preset prediction models according to the number of users in the user data and the user data features, to obtain the selected prediction model. The multi-user LSTM model mainly takes as input the time-series sample features of the preceding 12 months; the multi-user LightGBM model mainly takes as input the time-series samples of the preceding 12 months plus the predicted month and predicted year features; the single-user LightGBM model, the single-user support vector regression model and the single-user XGBoost model mainly take as input the peak, flat and valley electric quantity data of 13 months + predicted month + predicted year + mean of all the user's data + variance of all data + maximum of all data + minimum of all data + year-on-year (parity) ratio for the user's predicted month + month-on-month (ring) ratio of the two months preceding the predicted month + mean of the 13 months preceding the predicted month + mean of the half year preceding the predicted month + mean of the 3 months preceding the predicted month.
Time-series model: the historical monthly electric quantity is treated as a time series, an LSTM neural network is used for fitting and training, and the monthly electric quantity is predicted.
Non-time-series models: during testing, the non-time-series models fall into two categories: 1. training on multi-user data; 2. training on a single user. For multi-user data, the time-series data samples of each user's preceding 12 months are selected, together with the month and year of the corresponding predicted month. For single-user training, three different models are tested, each with its own strengths in certain respects; the input feature dimensions are the time-series data samples of each user's preceding 13 months, the month and year of the corresponding predicted month, the mean, variance, maximum and minimum of the data, the year-on-year (parity) ratio of the predicted month, the month-on-month (ring) ratio of the two months preceding the predicted month, the mean of the 13 months preceding the predicted month, the mean of the preceding half year and the mean of the preceding 3 months.
S16: and inputting the user data characteristics into the selection prediction model to perform power consumption prediction processing, and obtaining a power consumption prediction result.
In the specific implementation of the invention, after the selected prediction model is obtained, the user data features are input into the selected prediction model for electricity consumption prediction processing, which yields the electricity consumption prediction result.
In the embodiment of the invention, higher accuracy can be achieved, the electricity usage patterns of large users can be captured, and users' electric quantity can be predicted accurately in real time; peak-valley and monthly electric quantities are predicted jointly, which improves efficiency.
Examples
Referring to fig. 2, fig. 2 is a schematic structural composition diagram of an electric quantity prediction apparatus for power consumption cost intelligent optimization analysis according to an embodiment of the present invention.
As shown in fig. 2, an electricity quantity prediction apparatus for intelligent optimization analysis of electricity consumption cost includes:
The data cleaning module 21 is used for performing data cleaning processing on the collected user data of large industrial electricity users to obtain cleaned user data;
in the specific implementation process of the invention, the data cleaning processing comprises effective user data screening, user data abnormal value processing, user data missing value filling and user data white noise detection.
Further, the performing data cleaning processing on the collected user data of the large industrial electricity users to obtain the cleaned user data includes: setting negative values and zeros in the user data to NaN; calculating the mean and the median of the user's monthly electricity consumption in the user data and comparing the difference between them; if the difference is greater than a preset difference, screening the user's monthly electricity consumption values one by one and setting to NaN any value whose difference from the median is greater than the preset difference; finding the quartiles of the user's monthly electricity consumption from the historical data in the user data, where q1 is the 1/4 quantile, q3 is the 3/4 quantile, and the interquartile range is IQR = q3 - q1; taking q3 + 1.5 IQR as the upper limit and max(q1 - 1.5 IQR, 200) as the lower limit, and marking points beyond the upper or lower limit as outliers; replacing outliers greater than q3 + 1.5 IQR with q3 + 1.5 IQR; when the lower limit is q1 - 1.5 IQR, replacing outliers below it with q1 - 1.5 IQR; when the lower limit is 200, marking outliers below it as NaN; filling the values set to NaN by a combination of K nearest neighbors and moving average to obtain filled user data; and screening the filled user data with a white noise test to obtain the cleaned user data.
Further, the filling the values set to NaN by a combination of K nearest neighbors and moving average to obtain the filled user data includes: filling each single isolated missing value with a moving average with a window of 3; and filling runs of consecutive missing values using K nearest neighbors, to obtain the filled user data.
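The split between isolated and consecutive missing values can be sketched as follows. This is only a sketch under stated assumptions: the window of 3 is read as a window centered on the gap, so an isolated gap is filled with the mean of its immediate neighbors, and the K nearest neighbor step is not implemented because the text does not specify K or the distance measure. The function names are hypothetical.

```python
import math


def is_missing(v):
    """True for None or NaN."""
    return v is None or (isinstance(v, float) and math.isnan(v))


def fill_isolated_gaps(monthly):
    """Fill isolated gaps; return (filled, pending).

    `pending` lists the indices belonging to runs of consecutive missing
    values, which are left untouched for a separate K nearest neighbor step.
    """
    n = len(monthly)
    filled = list(monthly)
    pending = []
    for i, v in enumerate(monthly):
        if not is_missing(v):
            continue
        left_gap = i > 0 and is_missing(monthly[i - 1])
        right_gap = i < n - 1 and is_missing(monthly[i + 1])
        neighbours = [monthly[j] for j in (i - 1, i + 1) if 0 <= j < n]
        if left_gap or right_gap or not neighbours:
            pending.append(i)  # consecutive run (or no neighbours at all)
            continue
        # Assumed centered window of 3: mean of the observed neighbours.
        filled[i] = sum(neighbours) / len(neighbours)
    return filled, pending
```

Treating short and long gaps differently keeps the cheap local average for single dropouts and reserves the neighbor-based fill for stretches where no local context is available.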
Specifically, the data cleaning step is mainly divided into 4 parts: 1. effective user screening; 2. abnormal value processing; 3. missing value filling; 4. white noise detection. The specific process is as follows: first, the monthly electric quantity data of the individual months are collated by user to obtain each user's monthly electricity consumption data for the period from January 2018 to December 2019. Second, screening is performed on the collated data (only users whose proportion of missing values is less than 30% are retained).
Then, abnormal data in the large industrial monthly electric quantity is processed using the triple variance method and missing values are filled, with the following specific steps: (1) negative values and zeros are processed first, all being set to NaN; (2) the mean and the median of the user's monthly electric quantity are calculated and the difference between them is compared; if the difference is too large, the user's monthly electric quantities are screened one by one, and values whose difference from the median is too large are set to NaN; (3) the quartiles of the monthly electric quantity are computed from the historical data of the data set, where q1 is the 1/4 quantile, q3 is the 3/4 quantile, and the interquartile range is IQR = q3 - q1; q3 + 1.5 IQR is taken as the upper limit and max(q1 - 1.5 IQR, 200) as the lower limit, and points beyond the upper or lower limit are marked as outliers; outliers greater than q3 + 1.5 IQR are replaced with q3 + 1.5 IQR; when the lower limit is q1 - 1.5 IQR, outliers below it are replaced with q1 - 1.5 IQR; when the lower limit is 200, outliers below it are marked as empty (NaN); (4) after abnormal value processing, missing values are filled by a combination of K nearest neighbors and moving average, specifically: 1. a single isolated missing value is filled with a moving average with a window of 3; 2. data with consecutive missing values is filled using K nearest neighbors; (5) the user data is screened with a white noise test at a significance level of 0.1; if the resulting P value is greater than 0.1, the null hypothesis that the series is white noise cannot be rejected, so the user data is regarded as a purely random sequence and is removed.
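Steps (1) and (3) of the abnormal value processing can be illustrated with the following Python sketch. The function name is hypothetical, and the quantile interpolation (`method="inclusive"` of `statistics.quantiles`) is an assumption, since the text does not state how the quartiles are computed. Step (2) is omitted because the preset difference is not given.

```python
import math
from statistics import quantiles

NAN = float("nan")


def clean_outliers(monthly, floor=200.0, k=1.5):
    """Apply the non-positive rule and the IQR rule to one user's series."""
    # Step (1): negative values and zeros become missing.
    values = [NAN if (v is None or v <= 0) else float(v) for v in monthly]
    observed = [v for v in values if not math.isnan(v)]
    if len(observed) < 2:
        return values  # too few points to estimate quartiles
    # Step (3): quartiles, IQR and the two limits.
    q1, _, q3 = quantiles(observed, n=4, method="inclusive")
    iqr = q3 - q1
    upper = q3 + k * iqr
    tukey_lower = q1 - k * iqr
    lower = max(tukey_lower, floor)
    floor_active = tukey_lower < floor  # lower limit is the fixed 200
    cleaned = []
    for v in values:
        if math.isnan(v):
            cleaned.append(v)
        elif v > upper:
            cleaned.append(upper)   # cap at the upper limit
        elif v < lower:
            # Cap at q1 - 1.5 IQR, or mark missing when the floor applies.
            cleaned.append(NAN if floor_active else lower)
        else:
            cleaned.append(v)
    return cleaned
```

The fixed floor of 200 means that very small readings are treated as missing data to be filled later, while values that are merely low relative to the user's own spread are only clipped.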
The data screening module 22 is used for performing modeling sample screening processing on the cleaned user data to obtain screened user data;
in a specific implementation process of the present invention, the screening of the modeling sample on the cleaned user data to obtain the screened user data includes: and carrying out modeling sample screening processing on the cleaned user data based on a historical power consumption comparison sample method or a post sample detection method to obtain screened user data.
Further, the performing modeling sample screening processing on the cleaned user data based on the historical power consumption comparison sample method to obtain screened user data includes: for the user data of a single user, calculating the average electricity consumption of the latest n months, taking the current time as reference; calculating the ratio between each monthly electricity consumption of the user and that average; finding first index positions, at which the ratio is less than 0.2, and second index positions, at which the ratio is greater than 2; evaluating the first and second index positions to determine a new sample: the largest index among the first and second index positions is recorded as ind_max and compared with n-14, and the smaller of the two is taken as the start index ind_start for data interception, so that all monthly electric quantities of the user data from index ind_start to the last entry are intercepted as the screened user data. The performing modeling sample screening processing on the cleaned user data based on the post sample detection method to obtain screened user data includes: calculating the relative errors between the historical monthly electric quantity and the predicted monthly electric quantity for the latest 2 months in the cleaned user data, recorded as E1 and E2; comparing E1 and E2 with a given threshold; if both E1 and E2 exceed the given threshold and have the same sign (both greater than 0 or both less than 0), the cleaned user data needs to be screened; otherwise, the cleaned user data is used as the screened user data; and directly taking the moving average with a window size of 3 over the cleaned user data as the user's predicted value.
Specifically, the electricity usage of a large industrial user is often closely related to the stage of development the user is in at the time. Some users show a large increase within two years, which may indicate that the enterprise is in a growth period; some users keep a relatively stable pattern, which reflects an enterprise that has reached a stable period; and some enterprises are in a period of decline, with the 2019 data changing considerably compared with the 2018 data. In this case, if all historical samples are used as modeling samples, the prediction of future electric quantity changes will be distorted. Therefore, samples suitable for training the current user are screened out of the cleaned data and used for modeling, by two main methods: 1. the historical electric quantity comparison sample method; 2. the post sample detection method. They are introduced as follows:
1. The historical electric quantity comparison sample method is a prior sample screening method applied before model training. It mainly addresses the case where some of the electric quantities in a user's electricity consumption data are abnormally large or small and this abnormal data needs to be cleaned out. The specific steps are as follows:
(1) for a single user, calculate the average electric quantity of the latest n months, taking the current time as reference, denoted $\bar{x}$ (i.e. $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ over the latest n months, where n is the smaller of the sample size and 12); (2) calculate the ratio R between each month's electric quantity and the average monthly electric quantity $\bar{x}$; (3) find the monthly electric quantity index positions ind01 at which R is less than 0.2 and the index positions ind02 at which R is greater than 2; (4) evaluate the index positions found to determine a new sample: the largest index among ind01 and ind02 is recorded as ind_max and compared with n-14 (the time-series feature length is chosen here as 12, so n-14 represents the starting index of the minimum training data for training the user model); the smaller of the two is taken as the start index ind_start for data interception, so that the monthly electric quantities of the user data from index ind_start to the last entry are intercepted as the sample on which the user is to be trained.
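The four steps can be sketched in Python as below. Several points are assumptions rather than statements of the original: the "n" in "n-14" is read as the total length of the user's series (not the averaging window), the start index is clamped at 0, a user with no abnormal month is returned unchanged, and the function and parameter names are hypothetical. Following the literal wording, the month at ind_max itself stays in the retained sample.

```python
from statistics import mean


def screen_by_history(monthly, low=0.2, high=2.0, min_keep=14):
    """Trim the early part of a series that precedes an abnormal month."""
    total = len(monthly)
    # Step (1): average of the latest n months, n = min(sample size, 12).
    n = min(total, 12)
    avg = mean(monthly[-n:])
    # Step (2): ratio R of each month to that average.
    ratios = [x / avg for x in monthly]
    # Step (3): indices with R < 0.2 (ind01) or R > 2 (ind02).
    abnormal = [i for i, r in enumerate(ratios) if r < low or r > high]
    if not abnormal:
        return list(monthly)  # assumption: nothing to trim
    # Step (4): the smaller of ind_max and (length - 14) is the start index.
    ind_max = max(abnormal)
    ind_start = max(0, min(ind_max, total - min_keep))
    return list(monthly[ind_start:])
```

Capping the start index at length - 14 guarantees that at least 14 months remain, enough for a 12-month feature window plus targets, even when the abnormal month is very recent.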
2. The post sample detection method is a sample screening or result prediction method based on changes in the user's historical prediction error. It mainly addresses the case where a user's electricity consumption rises or falls sharply, so that the user's future consumption level differs greatly from the past and such samples need to be screened out (weighing the practical situation against training efficiency, the method directly takes a moving average with a window size of 3 over the selected samples as the predicted value). The specific steps are as follows: (1) calculate the relative errors between the historical monthly electric quantity and the predicted monthly electric quantity for the latest 2 months, recorded as E1 and E2; (2) compare E1 and E2 with a given threshold (0.45); if both exceed the threshold and E1 and E2 have the same sign (both greater than 0 or both less than 0), the sample needs to be screened, go to (3); otherwise, use the original sample for model training; (3) directly take the moving average with a window size of 3 over the user sample as the user's predicted value.
The data clustering module 23 is used for clustering the screened user data, serving as a clustering center, according to preset division types to obtain clustered user data;
In the specific implementation of the invention, the data used covers all users. When modeling one type of user, users of other types interfere with one another and degrade the model's prediction performance. Therefore, general clustering is first adopted here to divide the users into types (experiments showed that a division into 7 types works well), and 7 separate multi-user models are trained to predict the different user types, which yields better prediction performance.
The feature construction module 24 is used for performing feature construction processing based on the clustered user data to obtain user data features;
in a specific implementation process of the present invention, the performing feature construction processing on the user data based on the cluster to obtain user data features includes: standardizing the electricity utilization data of each user in the clustered user data to obtain electricity utilization data of standardized users; and performing feature construction processing based on the user data of the standardized user to obtain user data features.
Further, the performing feature construction processing based on the user data of the standardized user to obtain user data features includes: constructing a monthly power consumption state matrix of the p-dimensional user based on the user data of the standardized user; and adding date information into the monthly electricity consumption state matrix of the p-dimensional user, constructing a user state space matrix, and obtaining user data characteristics.
In the feature construction step, the monthly electric quantity data and date information in each user's electricity consumption data in the clustered user data are used, and each user's electricity consumption data is standardized according to the following formula:
$x^{*} = \dfrac{x - \bar{x}}{\mathrm{std}}$
where $x^{*}$ represents the standardized monthly electric quantity data, $\bar{x}$ represents the average value of the user's monthly electric quantity, and std represents the standard deviation of the user's monthly electric quantity. In addition, the average value, variance, minimum value and maximum value of the user's monthly electric quantity are calculated and used, after standardization, as 4 feature dimensions of the user.
Because monthly electric quantity data for large industrial users is limited, the user data features are obtained by processing each user as follows:
constructing a p-dimensional monthly electric quantity state matrix, that is, predicting the current month's electric quantity from the monthly electric quantities of the previous p months, wherein the monthly electric quantity state space matrix is as follows:
where p is 12, i.e. the monthly electric quantities of the preceding year are used to predict the electric quantity of the current month, and n is the number of months in the user data, which is 21.
Date information is then added to the user state space matrix to obtain the user data features, namely:
where $month_{n-p+1}$ is the month of the predicted month and $year_{n-p+1}$ is the year of the predicted month (this feature is effective when a large amount of data is available).
Meanwhile, year-on-year (parity ratio) and month-on-month (ring ratio) information is added to the constructed features. Because a user's peak-valley electricity level is correlated with the preceding historical electric quantities, the ratio between the same month of the previous year and the month before it, i.e. the parity ratio, is added to the constructed features (for example, to predict the peak-valley electric quantity of October 2019, the ratio of the peak-valley electric quantity of October 2018 to that of September 2018 is added to the features); similarly, the difference between the month before the predicted month and the month before that, i.e. the ring ratio, is added to the constructed features (for example, to predict the peak-valley electric quantity of October 2019, the difference between September 2019 and August 2019 is added to the features).
The model selection module 25 is used for selecting a prediction model from preset prediction models based on the number of users in the user data and the user data features to obtain a selected prediction model;
In the specific implementation of the invention, the preset prediction models comprise time-series models and non-time-series models; the time-series models comprise a multi-user LSTM model; the non-time-series models comprise a multi-user LightGBM model, a single-user support vector regression model and a single-user XGBoost model.
Specifically, a prediction model can be selected from the preset prediction models according to the number of users in the user data and the user data features, to obtain the selected prediction model. The multi-user LSTM model mainly takes as input the time-series sample features of the preceding 12 months; the multi-user LightGBM model mainly takes as input the time-series samples of the preceding 12 months plus the predicted month and predicted year features; the single-user LightGBM model, the single-user support vector regression model and the single-user XGBoost model mainly take as input the peak, flat and valley electric quantity data of 13 months + predicted month + predicted year + mean of all the user's data + variance of all data + maximum of all data + minimum of all data + year-on-year (parity) ratio for the user's predicted month + month-on-month (ring) ratio of the two months preceding the predicted month + mean of the 13 months preceding the predicted month + mean of the half year preceding the predicted month + mean of the 3 months preceding the predicted month.
Time-series model: the historical monthly electric quantity is treated as a time series, an LSTM neural network is used for fitting and training, and the monthly electric quantity is predicted.
Non-time-series models: during testing, the non-time-series models fall into two categories: 1. training on multi-user data; 2. training on a single user. For multi-user data, the time-series data samples of each user's preceding 12 months are selected, together with the month and year of the corresponding predicted month. For single-user training, three different models are tested, each with its own strengths in certain respects; the input feature dimensions are the time-series data samples of each user's preceding 13 months, the month and year of the corresponding predicted month, the mean, variance, maximum and minimum of the data, the year-on-year (parity) ratio of the predicted month, the month-on-month (ring) ratio of the two months preceding the predicted month, the mean of the 13 months preceding the predicted month, the mean of the preceding half year and the mean of the preceding 3 months.
The prediction processing module 26 is used for inputting the user data features into the selected prediction model to perform electricity consumption prediction processing and obtain an electricity consumption prediction result.
In the specific implementation of the invention, after the selected prediction model is obtained, the user data features are input into the selected prediction model for electricity consumption prediction processing, which yields the electricity consumption prediction result.
In the embodiment of the invention, higher accuracy can be achieved, the electricity usage patterns of large users can be captured, and users' electric quantity can be predicted accurately in real time; peak-valley and monthly electric quantities are predicted jointly, which improves efficiency.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing the associated hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic disks, optical disks, and the like.
In addition, the electric quantity prediction method and apparatus for intelligent optimization analysis of electricity consumption cost provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is only intended to help in understanding the method and core idea of the present invention. Meanwhile, for a person skilled in the art, the specific embodiments and the scope of application may vary in accordance with the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.
Claims (10)
1. An electricity quantity prediction method for intelligent optimization analysis of electricity consumption cost is characterized by comprising the following steps:
carrying out data cleaning processing on the collected user data of the large-scale industrial power utilization users to obtain cleaned user data;
carrying out modeling sample screening processing on the cleaned user data to obtain screened user data;
clustering the screened user data serving as a clustering center according to a preset division type to obtain clustered user data;
performing feature construction processing based on the clustered user data to obtain user data features;
selecting a prediction model from preset prediction models based on the number of users in the user data and the characteristics of the user data, and obtaining a selected prediction model;
and inputting the user data characteristics into the selection prediction model to perform power consumption prediction processing, and obtaining a power consumption prediction result.
2. The electric quantity prediction method according to claim 1, wherein the data cleaning processing comprises effective user data screening, user data abnormal value processing, user data missing value filling, and user data white noise detection.
3. The electric quantity prediction method according to claim 1, wherein the step of performing data cleaning processing on the collected user data of the large-scale industrial power consumption user to obtain the cleaned user data comprises:
setting negative values and zeros in the user data to NaN;
calculating the mean and the median of the user's monthly electricity consumption in the user data and comparing the difference between them; if the difference is greater than a preset difference, screening the user's monthly electricity consumption values one by one and setting to NaN any value whose difference from the median is greater than the preset difference;
finding the quartiles of the user's monthly electricity consumption according to historical data in the user data, wherein q1 is the 1/4 quantile, q3 is the 3/4 quantile, and the interquartile range IQR is q3 - q1; taking q3 + 1.5 IQR as the upper limit and max(q1 - 1.5 IQR, 200) as the lower limit, and marking points exceeding the upper and lower limits as outliers; replacing outliers greater than q3 + 1.5 IQR with q3 + 1.5 IQR; when the lower limit is q1 - 1.5 IQR, replacing outliers below the lower limit with q1 - 1.5 IQR; when the lower limit is 200, marking outliers below the lower limit as NaN;
filling the values set to NaN by a combination of K nearest neighbors and moving average to obtain filled user data;
and carrying out data cleaning and screening on the filled user data by using white noise inspection to obtain cleaned user data.
4. The electric quantity prediction method according to claim 3, wherein the filling the values set to NaN by a combination of K nearest neighbors and moving average to obtain filled user data comprises:
filling each single isolated missing value among the values set to NaN with a moving average with a window of 3 to obtain filled user data;
and filling runs of consecutive missing values among the values set to NaN using K nearest neighbors to obtain the filled user data.
5. The electric quantity prediction method according to claim 1, wherein the step of performing modeling sample screening processing on the cleaned user data to obtain screened user data comprises:
and carrying out modeling sample screening processing on the cleaned user data based on a historical power consumption comparison sample method or a post sample detection method to obtain screened user data.
6. The electric quantity prediction method according to claim 5, wherein performing modeling sample screening on the cleaned user data based on the historical power consumption comparison sample method to obtain screened user data comprises:
for the user data of a single user in the user data, calculating the user's average electricity consumption over the most recent n months, taking the current time as the reference;
calculating the ratio of each monthly electricity consumption of the user to that average;
determining a first index position of the monthly electricity consumption values whose ratio is less than 0.2 and a second index position of the monthly electricity consumption values whose ratio is greater than 2;
evaluating the first index position and the second index position obtained above to determine a new template: selecting the largest index position among the first and second index positions, recorded as ind_max; taking the smaller of ind_max and n-14 as the start index ind_start for truncation; and retaining all monthly electricity consumption of the user data from index ind_start to the last entry as the screened user data;
and wherein performing modeling sample screening on the cleaned user data based on the post-sample detection method to obtain screened user data comprises:
calculating the relative errors between the historical and the predicted monthly electricity consumption for the most recent 2 months in the cleaned user data, recorded as E1 and E2;
comparing E1 and E2 with a given threshold; if E1 and E2 both exceed the given threshold and are both greater than 0 or both less than 0, determining that the cleaned user data needs to be screened; otherwise, using the cleaned user data as the screened user data;
and directly applying a moving average with a window of 3 to the cleaned user data to obtain the predicted value for the user.
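The two screening methods of this claim can be sketched in Python as follows. This is an illustrative reading only, with several assumptions: the series length is used where the claim writes n-14 (the claim uses n for both quantities), the start index is taken as the smaller of ind_max and that value, the threshold of 0.2 for E1 and E2 is a placeholder because the claim gives no number, relative error is computed as (predicted - actual) / actual, and all function and parameter names are hypothetical.

```python
def truncate_by_ratio(monthly, n_recent=12, low=0.2, high=2.0, min_keep=14):
    """Historical comparison: drop history before the latest abnormal month."""
    recent = monthly[-n_recent:]
    mean_recent = sum(recent) / len(recent)
    if mean_recent == 0:
        return list(monthly)
    # Index positions whose ratio to the recent mean is below 0.2 or above 2.
    flagged = [i for i, v in enumerate(monthly)
               if v / mean_recent < low or v / mean_recent > high]
    if not flagged:
        return list(monthly)
    ind_max = max(flagged)
    # Assumed reading of "n-14": keep at least `min_keep` months of data.
    ind_start = max(0, min(ind_max, len(monthly) - min_keep))
    return list(monthly[ind_start:])

def needs_screening(actual_last2, predicted_last2, threshold=0.2):
    """Post-sample detection on the two most recent months."""
    e1, e2 = ((p - a) / a for a, p in zip(actual_last2, predicted_last2))
    both_large = abs(e1) > threshold and abs(e2) > threshold
    same_sign = (e1 > 0 and e2 > 0) or (e1 < 0 and e2 < 0)
    return both_large and same_sign

def moving_average_forecast(monthly, window=3):
    """Mean of the last `window` months, used directly as the prediction."""
    tail = monthly[-window:]
    return sum(tail) / len(tail)
```

The claim does not state exactly when the window-3 moving average replaces the model prediction; one plausible use is as a fallback for users that `needs_screening` flags.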
7. The electric quantity prediction method according to claim 1, wherein performing feature construction based on the clustered user data to obtain user data features comprises:
standardizing the electricity consumption data of each user in the clustered user data to obtain standardized user data;
and performing feature construction based on the standardized user data to obtain the user data features.
8. The electric quantity prediction method according to claim 7, wherein performing feature construction based on the standardized user data to obtain user data features comprises:
constructing a p-dimensional monthly electricity consumption state matrix of the user based on the standardized user data;
and adding date information to the p-dimensional monthly electricity consumption state matrix to construct a user state space matrix and obtain the user data features.
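A common way to build such a matrix is a lag embedding, sketched below in Python under stated assumptions: each row holds the previous p standardized monthly values, the date information is reduced to the calendar month of the value being predicted, and the name `build_state_matrix` and its parameters are hypothetical. The patent may define the matrix differently.

```python
def build_state_matrix(standardized, start_month, p=3):
    """Build lagged feature rows with a month column, plus their targets.

    standardized: one user's standardized monthly series, oldest first.
    start_month:  calendar month (1-12) of the first entry in the series.
    """
    rows, targets = [], []
    for t in range(p, len(standardized)):
        month = (start_month - 1 + t) % 12 + 1     # calendar month of entry t
        rows.append(list(standardized[t - p:t]) + [month])
        targets.append(standardized[t])
    return rows, targets
```

For a series starting in November with p = 3, the first row holds the November, December, and January values plus the month number 2, and its target is the February value.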
9. The electric quantity prediction method according to claim 1, wherein the preset prediction models include time-series models and non-time-series models; the time-series models comprise a multi-user LSTM model; and the non-time-series models comprise a multi-user LightGBM model, a single-user support vector regression model, and a single-user XGBoost model.
10. An electric quantity prediction device for intelligent optimization analysis of electricity consumption cost, comprising:
a data cleaning module, configured to clean the collected user data of large industrial power users to obtain cleaned user data;
a data screening module, configured to perform modeling sample screening on the cleaned user data to obtain screened user data;
a data clustering module, configured to cluster the screened user data, used as cluster centers, according to preset partition categories to obtain clustered user data;
a feature construction module, configured to perform feature construction based on the clustered user data to obtain user data features;
a model selection module, configured to select a prediction model from the preset prediction models based on the number of users in the user data and the user data features to obtain a selected prediction model;
and a prediction processing module, configured to input the user data features into the selected prediction model for electricity consumption prediction to obtain an electricity consumption prediction result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011487494.2A CN112712194A (en) | 2020-12-16 | 2020-12-16 | Electric quantity prediction method and device for power consumption cost intelligent optimization analysis |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112712194A (en) | 2021-04-27 |
Family
ID=75543896
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011487494.2A Pending CN112712194A (en) | 2020-12-16 | 2020-12-16 | Electric quantity prediction method and device for power consumption cost intelligent optimization analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112712194A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113762600A (en) * | 2021-08-12 | 2021-12-07 | 北京市燃气集团有限责任公司 | LightGBM-based monthly gas consumption prediction method and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019019255A1 (en) * | 2017-07-25 | 2019-01-31 | 平安科技(深圳)有限公司 | Apparatus and method for establishing prediction model, program for establishing prediction model, and computer-readable storage medium |
CN109784542A (en) * | 2018-12-19 | 2019-05-21 | 广东电网有限责任公司 | A kind of moon electricity demand forecasting method of seasonal support vector regression model |
CN111178611A (en) * | 2019-12-23 | 2020-05-19 | 广西电网有限责任公司 | Method for predicting daily electric quantity |
CN111814407A (en) * | 2020-07-28 | 2020-10-23 | 安徽沃特水务科技有限公司 | Flood forecasting method based on big data and deep learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108520357B (en) | Method and device for judging line loss abnormality reason and server | |
WO2019165673A1 (en) | Reimbursement form risk prediction method, apparatus, terminal device, and storage medium | |
CN110826648A (en) | Method for realizing fault detection by utilizing time sequence clustering algorithm | |
CN111796957B (en) | Transaction abnormal root cause analysis method and system based on application log | |
CN113449919B (en) | Power consumption prediction method and system based on feature and trend perception | |
CN113378899A (en) | Abnormal account identification method, device, equipment and storage medium | |
CN112101765A (en) | Abnormal data processing method and system for operation index data of power distribution network | |
CN117078048A (en) | Digital twinning-based intelligent city resource management method and system | |
CN111738348A (en) | Power data anomaly detection method and device | |
CN112712194A (en) | Electric quantity prediction method and device for power consumption cost intelligent optimization analysis | |
CN110674100A (en) | User demand prediction method and framework based on full-channel operation data | |
CN112579847A (en) | Method and device for processing production data, storage medium and electronic equipment | |
CN107590747A (en) | Power grid asset turnover rate computational methods based on the analysis of comprehensive energy big data | |
CN105227410A (en) | Based on the method and system that the server load of adaptive neural network detects | |
CN111654853B (en) | Data analysis method based on user information | |
CN115409115A (en) | Time sequence clustering abnormal terminal identification method based on user log | |
CN114281808A (en) | Traffic big data cleaning method, device, equipment and readable storage medium | |
CN112737834A (en) | Cloud hard disk fault prediction method, device, equipment and storage medium | |
CN113139673A (en) | Method, device, terminal and storage medium for predicting air quality | |
CN111625525A (en) | Environmental data repairing/filling method and system | |
CN111724048A (en) | Characteristic extraction method for finished product library scheduling system performance data based on characteristic engineering | |
CN111475319A (en) | Hard disk screening method and device based on machine learning | |
CN106301880A (en) | One determines that cyberrelationship degree of stability, Internet service recommend method and apparatus | |
KR102392131B1 (en) | Food-web network analysis-based ecosystem prediction evaluation system and operation method thereof | |
CN116155755B (en) | Link symbol prediction method based on linear optimization closed sub-graph coding |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||