CN113361750A - Electricity sales amount prediction method based on business expansion large data - Google Patents

Publication number
CN113361750A
Authority
CN
China
Prior art keywords
data set
data
prediction
result
electricity sales
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110532108.5A
Other languages
Chinese (zh)
Inventor
张瑞
彭宗旭
朱正友
张维青
徐佳
刘啸野
曹勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huaibei Power Supply Co of State Grid Anhui Electric Power Co Ltd
Original Assignee
Huaibei Power Supply Co of State Grid Anhui Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huaibei Power Supply Co of State Grid Anhui Electric Power Co Ltd
Priority to CN202110532108.5A
Publication of CN113361750A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0203 Market surveys; Market polls
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Development Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Human Resources & Organizations (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Finance (AREA)
  • Game Theory and Decision Science (AREA)
  • Tourism & Hospitality (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The application discloses an electricity sales prediction method based on business expansion big data, which addresses the inaccurate prediction results and narrow application range of electricity sales prediction in the prior art. The method comprises the following steps: constructing a business expansion installation data set and an electricity sales data set; clustering customers with the K-Prototype algorithm according to the business expansion installation data set and the electricity sales data set to determine a plurality of customer groups; decomposing the electricity sales of each customer group to determine a decomposition result; fitting the decomposition result to determine a fitting result; and summing and reconstructing the fitting results, obtaining an optimal prediction curve in combination with the AHP algorithm, and determining an optimal prediction model. The method takes full account of how business expansion data influence electricity sales and analyzes the correlation between capacity and electricity sales in the business expansion installation big data, thereby improving the accuracy of the electricity sales prediction.

Description

Electricity sales amount prediction method based on business expansion large data
Technical Field
The application relates to the technical field of electricity sales prediction, and in particular to an electricity sales prediction method based on business expansion installation big data.
Background
With the development of society, electricity consumption in many industries is rising steeply. Current electricity sales prediction relies mainly on expert experience, on classical methods that estimate future consumption from the relationships between simple variables, and on analysis of historical electricity data.
Predictions made with these conventional methods are not accurate. Time series methods that analyze historical electricity data are simple and require little data, but they do not consider other factors related to electricity sales.
In short, existing prediction methods have poor precision and consider too few factors, so their electricity sales predictions are inaccurate.
Disclosure of Invention
By providing an electricity sales prediction method based on business expansion big data, the embodiments of the application address the inaccurate prediction results and narrow application range of the prior art and achieve accurate prediction of electricity sales.
In a first aspect, an embodiment of the present invention provides an electricity sales prediction method based on business expansion big data, including the following steps:
constructing a business expansion installation data set, an electricity sales data set and an economic data set;
clustering customers with the K-Prototype algorithm according to the business expansion installation data set and the electricity sales data set to determine a plurality of customer groups;
decomposing the electricity sales of each customer group to determine a decomposition result;
fitting the decomposition result to determine a fitting result;
and summing and reconstructing the fitting results, obtaining an optimal prediction curve in combination with the AHP algorithm, and determining an optimal prediction model.
With reference to the first aspect, in a possible implementation manner, before the customers are clustered with the K-Prototype algorithm and the plurality of customer groups is determined, data processing is performed on the electricity sales data set, where the data processing includes:
identifying outliers in the electricity sales data set with the box plot method, and correcting the outliers with historical mean data;
and filling missing values with an expert filling method combined with Lagrange interpolation.
With reference to the first aspect, in a possible implementation manner, clustering the customers with the K-Prototype algorithm includes: clustering according to supply voltage, business expansion type, net capacity, industry classification, load trend and load property.
With reference to the first aspect, in a possible implementation manner, decomposing the electricity sales of each customer group includes: decomposing the electricity sales of each customer group into a trend item, a seasonal item and a random item by X13 seasonal decomposition.
With reference to the first aspect, in a possible implementation manner, fitting the decomposition result includes:
predicting the trend item with a support vector regression (SVR) algorithm, an echo state network (ESN) algorithm and a grey model (GM), in combination with the business expansion electricity sales data set and the economic data set, to obtain a trend item result;
predicting the seasonal item with an L1/2 sparse iterative algorithm to obtain a seasonal item result;
and predicting the random item with a linear regression algorithm and a Kalman filtering algorithm, in combination with weather and holiday data, to obtain a random item result.
With reference to the first aspect, in a possible implementation manner, obtaining the optimal prediction curve in combination with the AHP algorithm and determining the optimal prediction model includes: determining the optimal prediction model by jointly considering the prediction error, the trend reliability and the prediction similarity.
In a second aspect, an embodiment of the present invention provides an apparatus for predicting electricity sales based on business expansion installation big data, where the apparatus includes the following units:
a data acquisition unit, configured to construct a business expansion installation data set and an electricity sales data set;
a data clustering unit, configured to cluster customers with the K-Prototype algorithm according to the business expansion installation data set and the electricity sales data set to determine a plurality of customer groups;
a decomposition prediction unit, configured to decompose the electricity sales of each customer group and determine a decomposition result;
a grouping fitting unit, configured to fit the decomposition result and determine a fitting decomposition result;
a model determination unit, configured to sum and reconstruct the fitting decomposition result, obtain an optimal prediction curve in combination with the AHP algorithm, and determine an optimal prediction model.
With reference to the second aspect, in a possible implementation manner, the data clustering unit further includes a data processing unit, which performs data processing on the electricity sales data set before the customers are clustered with the K-Prototype algorithm and the plurality of customer groups is determined, where the data processing includes:
identifying outliers in the electricity sales data set with the box plot method, and correcting the outliers with historical mean data;
and filling missing values with an expert filling method combined with Lagrange interpolation.
With reference to the second aspect, in a possible implementation manner, the data clustering unit performs clustering according to supply voltage, business expansion type, net capacity, industry classification, load trend and load property.
With reference to the second aspect, in a possible implementation manner, the decomposition prediction unit is specifically configured to: decompose the electricity sales of each customer group into a trend item, a seasonal item and a random item by X13 seasonal decomposition.
With reference to the second aspect, in a possible implementation manner, the grouping fitting unit is specifically configured to:
predict the trend item with a support vector regression (SVR) algorithm, an echo state network (ESN) algorithm and a grey model (GM), in combination with the business expansion electricity sales data set and the economic data set, to obtain a trend item result;
predict the seasonal item with an L1/2 sparse iterative algorithm to obtain a seasonal item result;
and predict the random item with a linear regression algorithm and a Kalman filtering algorithm, in combination with weather and holiday data, to obtain a random item result.
With reference to the second aspect, in a possible implementation manner, the model determination unit is specifically configured to: determine the optimal prediction model by jointly considering the prediction error, the trend reliability and the prediction similarity.
In a third aspect, an embodiment of the present invention provides a device for predicting electricity sales amount based on business expansion installation big data, including a memory and a processor;
the memory is to store computer-executable instructions;
the processor is configured to execute the computer-executable instructions to implement the method of the first aspect as well as any one of the possible implementations of the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, where executable instructions are stored, and when the executable instructions are executed by a computer, the method described in the first aspect and any possible implementation manner of the first aspect can be implemented.
One or more technical solutions provided in the embodiments of the present invention have at least the following technical effects or advantages:
The embodiments of the invention adopt a method, an apparatus and a storage medium for predicting electricity sales based on business expansion big data. In the method, a business expansion installation data set and an electricity sales data set are first constructed; these form the basis of the method, and all data analysis builds on them. According to the two data sets, customers are clustered with the K-Prototype algorithm and a clustering result is determined. The cost of this clustering method grows linearly with the number of samples, so the algorithm is efficient and scales well on large data sets; its principle is simple and it is easy to implement. The electricity sales of each cluster are then decomposed into a trend item, a seasonal item and a random item, and each item is predicted separately, which makes the candidate predictions easier to compare so that an optimal prediction model can be selected. The decomposition result is fitted to determine a fitting result, the fitting results are summed and reconstructed, an optimal prediction curve is obtained in combination with the AHP algorithm, an optimal prediction model is determined, and a predicted electricity value is obtained.
Traditional electricity sales prediction usually establishes a prediction model, feeds the historical total electricity consumption of the local area into the model to predict electricity sales, and only then analyzes the electricity data of customer groups. That approach ignores the influence of the power development trend (that is, the business expansion data) on model accuracy, which leads to poor precision, incomplete consideration of factors and inaccurate predictions. The present method instead takes full account of how business expansion data influence electricity sales and analyzes the correlation between capacity and electricity sales in the business expansion data, thereby improving the accuracy of the electricity sales prediction.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments of the present invention or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a flowchart illustrating steps of a business expansion big data-based electricity sales amount prediction method according to an embodiment of the present application;
Fig. 2 is a flowchart of the grouping steps of a business expansion big data-based electricity sales amount prediction method according to an embodiment of the present application;
Fig. 3 is a schematic diagram of an electricity sales prediction apparatus based on business expansion installation big data according to an embodiment of the present application;
Fig. 4 is a schematic diagram of an electricity sales amount prediction apparatus based on business expansion installation big data according to an embodiment of the present application.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The electricity sales prediction method based on business expansion big data addresses the inaccurate prediction results and narrow application range of the prior art and achieves accurate prediction of electricity sales.
The electricity sales prediction method based on business expansion big data is shown in fig. 1 and includes steps S101 to S105.
Step S101: constructing a business expansion installation data set, an electricity sales data set and an economic data set.
Step S102: clustering customers with the K-Prototype algorithm according to the business expansion installation data set and the electricity sales data set to determine a plurality of customer groups.
Step S103: decomposing the electricity sales of each customer group and determining a decomposition result.
Step S104: fitting the decomposition result to determine a fitting result.
Step S105: summing and reconstructing the fitting results, obtaining an optimal prediction curve in combination with the AHP algorithm, and determining an optimal prediction model.
In these steps, a business expansion installation data set and an electricity sales data set are first constructed; these form the basis of the method, and all data analysis builds on them. According to the two data sets, customers are clustered with the K-Prototype algorithm and a clustering result is determined. The cost of this clustering method grows linearly with the number of samples, so the algorithm is efficient and scales well on large data sets; its principle is simple and it is easy to implement. The electricity sales of each cluster are then decomposed into a trend item, a seasonal item and a random item, and each item is predicted separately, which makes the candidate predictions easier to compare so that an optimal prediction model can be selected. The decomposition result is fitted to determine a fitting result, the fitting results are summed and restored, an optimal prediction curve is obtained in combination with the AHP algorithm, an optimal prediction model is determined, and a predicted electricity value is obtained. Prediction approaches that ignore the influence of the power development trend (that is, the business expansion data) on model accuracy suffer from poor precision, incomplete consideration of factors and inaccurate predictions. The present method takes full account of how business expansion data influence electricity sales and analyzes the correlation between capacity and electricity sales in the business expansion data, thereby improving the accuracy of the electricity sales prediction.
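The model selection in step S105 can be illustrated with a small AHP-style weighting. This is only a sketch under stated assumptions: the pairwise comparison matrix and the per-model scores below are hypothetical values, the function names are invented for illustration, and the row geometric mean is used as a common approximation of the AHP priority vector. The three criteria (prediction error, trend reliability, prediction similarity) are the ones named in the text.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights by the row geometric mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def pick_best_model(scores, weights):
    """Return the model with the highest weighted score, plus all totals."""
    totals = {
        name: sum(w * s for w, s in zip(weights, values))
        for name, values in scores.items()
    }
    return max(totals, key=totals.get), totals

# Hypothetical pairwise comparisons between the three criteria, in the order
# (prediction error, trend reliability, prediction similarity).
pairwise = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 2.0],
    [1.0 / 5.0, 1.0 / 2.0, 1.0],
]
weights = ahp_weights(pairwise)

# Hypothetical normalized scores per candidate model (higher is better).
scores = {
    "SVR": [0.9, 0.8, 0.7],
    "ESN": [0.6, 0.9, 0.9],
    "GM": [0.5, 0.5, 0.5],
}
best, totals = pick_best_model(scores, weights)
```

In practice the pairwise matrix would come from expert judgment and would normally be checked for consistency before the weights are used.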
Before the customers are clustered with the K-Prototype algorithm and the customer groups are determined, data processing is performed on the electricity sales data set. The data processing includes the following.
Outliers in the electricity sales data set are identified with the box plot method and corrected with historical mean data. Outlier identification proceeds as follows. The historical electricity sales data are recorded as
Figure BDA0003068247580000071
The target point is defined as
Figure BDA0003068247580000072
The formula applied to the historical electricity sales data is as follows:
Figure BDA0003068247580000073
This yields $\{U(i) \mid i = 1, 2, \ldots, n-1\}$. For the target point
Figure BDA0003068247580000074
the corresponding target-related points are $U(i)$ and $U(i-1)$. The box plot operation is performed successively on all points other than the target-related points:
Figure BDA0003068247580000079
where $P_{up}$ denotes the upper boundary, $P_{down}$ the lower boundary, B the median, A the upper quartile, and C the lower quartile.
Whether the target point
Figure BDA0003068247580000075
is abnormal is judged as follows: if neither $U(i)$ nor $U(i-1)$ lies within the boundaries, the target point
Figure BDA0003068247580000076
is an outlier. Otherwise
Figure BDA0003068247580000077
is not an outlier.
After the outliers are identified, they are replaced with the historical mean, giving the outlier-processed monthly historical electricity sales data
Figure BDA0003068247580000078
The remaining missing data are filled with an expert filling method combined with Lagrange interpolation.
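The two cleaning operations can be sketched as follows. This is a simplified illustration, not the patent's exact procedure: it applies the box plot test directly to the sales values instead of to the derived sequence U(i), it assumes the common 1.5 × IQR whisker rule for the boundaries, and it takes the "historical mean" to be the mean of the non-outlying values. All function names are hypothetical.

```python
import statistics

def box_plot_bounds(values, whisker=1.5):
    """Lower and upper box plot boundaries (assumed 1.5 * IQR rule)."""
    lower_q, _median, upper_q = statistics.quantiles(values, n=4, method="inclusive")
    iqr = upper_q - lower_q
    return lower_q - whisker * iqr, upper_q + whisker * iqr

def replace_outliers_with_mean(values):
    """Replace values outside the box plot boundaries with the mean of the rest."""
    low, high = box_plot_bounds(values)
    normal = [v for v in values if low <= v <= high]
    mean = sum(normal) / len(normal)
    return [v if low <= v <= high else mean for v in values]

def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange polynomial through (xs, ys) at position x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# Hypothetical monthly sales with one obvious spike at position 5.
monthly_sales = [100.0, 102.0, 98.0, 101.0, 99.0, 500.0, 100.0, 103.0]
cleaned = replace_outliers_with_mean(monthly_sales)

# Fill a missing month 3 from the known months 1, 2 and 4.
filled = lagrange_interpolate([1, 2, 4], [1.0, 4.0, 16.0], 3)
```

Lagrange interpolation over many points can oscillate, so using only a few neighbouring months around each gap is a reasonable design choice.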
The application provides a reference example. Weather data, holiday data and economic data from January 2018 to September 2020 are collected or scraped from the internet, and the historical electricity sales data and basic business expansion data of business expansion users over the same period are collected from the electricity consumption collection system and the marketing system. Statistics show that the average monthly consumption of high-voltage business expansion users is more than 300 times that of low-voltage business expansion users, and that their monthly electricity sales account for more than 90% of total monthly consumption, so the prediction analysis is carried out on the high-voltage business expansion users. The 30-month electricity sales data sets of the different high-voltage business expansion customers are cleaned, which includes handling missing values and outliers.
The historical electricity sales data are then converted and normalized into a format suitable for model construction. The collected data are stored with the months laid out along each row (wide format); to make prediction easier, the melt method is used to reshape them into column (long) form.
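A minimal sketch of this reshaping step, assuming the melt method refers to pandas `DataFrame.melt`. The column names (`customer_id`, `month`, `sales_kwh`) and the values are hypothetical.

```python
import pandas as pd

# Hypothetical wide table: one row per customer, one column per month.
wide = pd.DataFrame({
    "customer_id": ["A", "B"],
    "2018-01": [120.0, 80.0],
    "2018-02": [130.0, 85.0],
})

# Reshape to long form: one row per (customer, month) observation.
long_df = wide.melt(id_vars="customer_id", var_name="month", value_name="sales_kwh")
long_df["month"] = pd.to_datetime(long_df["month"], format="%Y-%m")
long_df = long_df.sort_values(["customer_id", "month"]).reset_index(drop=True)
```

The long form gives each customer a single time-ordered series, which is the shape most decomposition and regression routines expect.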
After the historical electricity sales data are processed, users are divided into five categories according to a trend analysis of the 96-point daily load data: day peak type, night peak type, double peak type, night electricity utilization type and stable type. The day peak type means that the user's consumption peaks in the daytime; the night peak type means that the user's consumption peaks at night; the double peak type means that the user has consumption peaks both in the daytime and at night; the night electricity utilization type refers to a user who uses electricity at night and not in the daytime; and the stable type means that the user's consumption is relatively steady throughout the day.
The data are processed as follows. For each user, the average load at each of the 96 daily collection points is calculated for each month and used as the load data P1 … P96 of the collection points. The maximum loads Max1 and Max2 and the average loads Mean1 and Mean2 are then taken for the daytime (P28 to P76) and the nighttime (P1 to P27 and P77 to P96) respectively, and M is taken as the mean of points P1 to P96. The load trend type of the user is judged from the relationship between Max1, Max2 and the upper quartile Q of P1 to P96, M, Mean1 and Mean2:
Max1 ≥ Q and Max2 ≥ Q: double peak type;
Max1 ≥ Q and Max2 < M: day peak type;
Max1 < M and Max2 ≥ Q: night peak type;
Max1 > M: night electricity utilization type;
otherwise: stable type.
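The classification rules can be sketched as a small function. The sketch covers only the double peak, day peak and night peak conditions and treats everything else as stable; the night electricity utilization rule is left out because its condition as printed ("Max1 > M") appears incomplete. The function name and the sample profiles are hypothetical, and the upper quartile is computed with the inclusive (linear interpolation) method as an assumption.

```python
import statistics

def classify_load_trend(p):
    """Classify a 96-point average daily load profile P1..P96."""
    if len(p) != 96:
        raise ValueError("expected 96 load points")
    day = p[27:76]             # P28..P76
    night = p[:27] + p[76:]    # P1..P27 and P77..P96
    max1, max2 = max(day), max(night)
    q = statistics.quantiles(p, n=4, method="inclusive")[2]  # upper quartile Q
    m = sum(p) / len(p)                                      # overall mean M
    if max1 >= q and max2 >= q:
        return "double_peak"
    if max1 >= q and max2 < m:
        return "day_peak"
    if max1 < m and max2 >= q:
        return "night_peak"
    return "stable"

# Hypothetical profiles for illustration.
day_profile = [1.0] * 27 + [5.0] * 49 + [1.0] * 20
night_profile = [5.0] * 27 + [1.0] * 49 + [5.0] * 20
steady_profile = [4.0] * 27 + [5.0] * 49 + [4.0] * 19 + [4.8]
```

Note that the conditions are evaluated in the order listed, so a profile that satisfies the double peak condition is never tested against the later ones.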
In step S102, clustering the customers with the K-Prototype algorithm includes clustering according to supply voltage, business expansion type, net capacity, industry classification, load trend and load property. Before grouping, the business expansion type, industry classification, load trend and load property are converted into numerical form according to established rules, and clustering is then performed.
After the data processing, clustering is performed with the K-Prototype algorithm, an effective algorithm for clustering data with mixed numerical and categorical attributes, in which a parameter controls the relative weight of the numerical and categorical attributes. The basic flow of the K-Prototype algorithm is shown in fig. 2 and includes steps S201 to S204.
Step S201: k data objects are randomly selected from data set X as the initial cluster centers.
Step S202: for each data object in data set X, the distance to each cluster center is calculated according to the formula $d(X_i, Q_l)$, and the object is assigned to the nearest cluster. After each assignment, the center of the corresponding cluster is updated.
Step S203: once all objects in the data set have been assigned to clusters, the distance from each data object to the current cluster centers is recalculated. If the nearest cluster center of a data object belongs to another cluster, the object is reassigned to that cluster, and the centers of the two clusters affected by the change are updated.
Step S204: step S203 is repeated until, after a new round of computation, no data object changes cluster.
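Steps S201 to S204 can be sketched in plain Python. This is a simplified illustration under stated assumptions: the mixed distance is taken to be squared Euclidean distance on numerical attributes plus a weight `gamma` times the number of mismatched categorical attributes, which is the usual K-Prototype form but is not spelled out in the text; centers are updated once per pass rather than after every single assignment; and the `init_indices` argument, the function names and the sample data are hypothetical.

```python
import random
from collections import Counter

def mixed_distance(x_num, x_cat, c_num, c_cat, gamma):
    """Squared Euclidean distance plus gamma times categorical mismatches."""
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, c_num))
    categorical = sum(a != b for a, b in zip(x_cat, c_cat))
    return numeric + gamma * categorical

def k_prototypes(num_rows, cat_rows, k, gamma=1.0, init_indices=None,
                 max_iter=100, seed=0):
    n = len(num_rows)
    if init_indices is None:
        # S201: pick k objects at random as the initial cluster centers.
        init_indices = random.Random(seed).sample(range(n), k)
    centers = [(list(num_rows[i]), list(cat_rows[i])) for i in init_indices]
    labels = [None] * n
    for _ in range(max_iter):
        # S202/S203: assign every object to its nearest center.
        new_labels = [
            min(range(k), key=lambda l: mixed_distance(
                num_rows[i], cat_rows[i], centers[l][0], centers[l][1], gamma))
            for i in range(n)
        ]
        # S204: stop when no object changed cluster.
        if new_labels == labels:
            break
        labels = new_labels
        for l in range(k):
            members = [i for i in range(n) if labels[i] == l]
            if not members:
                continue  # keep the old center for an empty cluster
            c_num = [sum(num_rows[i][d] for i in members) / len(members)
                     for d in range(len(num_rows[0]))]
            c_cat = [Counter(cat_rows[i][d] for i in members).most_common(1)[0][0]
                     for d in range(len(cat_rows[0]))]
            centers[l] = (c_num, c_cat)
    return labels, centers

# Hypothetical customers: (scaled voltage, scaled net capacity) and industry.
num_rows = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [8.0, 8.0], [8.1, 7.9], [7.9, 8.2]]
cat_rows = [["industrial"]] * 3 + [["commercial"]] * 3
labels, centers = k_prototypes(num_rows, cat_rows, k=2, init_indices=[0, 3])
```

Numerical attributes should be scaled before clustering; otherwise a single large-valued attribute such as capacity dominates the distance regardless of `gamma`.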
No actual label column exists when the customer groups are divided, so the task is unsupervised learning. The K-Prototype clustering algorithm lets the number of groups be set manually and the division criteria be selected, and it partitions the various user behavior variables conveniently and quickly. Grouping the customers accurately in this way makes the electricity sales prediction in the following steps more accurate.
In step S103, decomposing the electricity sales of each customer group includes: decomposing the electricity sales of each customer group into a trend item, a seasonal item and a random item by X13 seasonal decomposition. Before decomposition, the electricity sales of the customer group are first smoothed with a logarithmic transformation.
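The idea of splitting a log-transformed series into trend, seasonal and random items can be illustrated with a classical additive decomposition. This is deliberately not X13: X13 seasonal adjustment is a much more elaborate procedure that normally relies on an external program, so the sketch below swaps in a centered moving average for the trend and per-month averages for the seasonal item. It assumes an even period (12 months), at least two full years of data, and strictly positive sales; the function name and sample series are hypothetical.

```python
import math

def classical_decompose(series, period=12):
    """Log-transform, then split into trend, seasonal and random items."""
    logged = [math.log(v) for v in series]
    n = len(logged)
    half = period // 2
    # Trend: centered moving average with half weights on both end points.
    trend = [None] * n
    for t in range(half, n - half):
        window = logged[t - half:t + half + 1]
        trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
    # Seasonal: average detrended value for each position in the cycle.
    raw = []
    for m in range(period):
        vals = [logged[t] - trend[t] for t in range(m, n, period)
                if trend[t] is not None]
        raw.append(sum(vals) / len(vals))
    offset = sum(raw) / period
    seasonal = [raw[t % period] - offset for t in range(n)]
    # Random: what is left after removing trend and seasonal items.
    random_item = [logged[t] - trend[t] - seasonal[t] if trend[t] is not None
                   else None for t in range(n)]
    return logged, trend, seasonal, random_item

# Hypothetical 36-month series with growth and a yearly cycle.
sales = [100.0 * (1.01 ** t) * (1.0 + 0.1 * math.sin(2 * math.pi * t / 12))
         for t in range(36)]
logged, trend, seasonal, random_item = classical_decompose(sales)
```

Because the decomposition is additive on the log scale, the three predicted items are summed and then exponentiated to return to the original units.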
Fitting the decomposition result includes: predicting the trend item with a support vector regression (SVR) algorithm, an echo state network (ESN) algorithm and a grey model (GM), in combination with the business expansion electricity sales data set and the economic data set, to obtain a trend item result; predicting the seasonal item with an L1/2 sparse iterative algorithm to obtain a seasonal item result; and predicting the random item with a linear regression algorithm and a Kalman filtering algorithm, in combination with weather and holiday data, to obtain a random item result.
After obtaining the decomposition result, in step S104, the decomposition result is fitted, and the specific fitting process is as follows:
The trend term of each industry is predicted as follows:
Step 1: The data sequence of the electricity sales trend term is $\{Q_t(i) \mid i \in 1, 2, \ldots, n\}$.
Step 2: The economic influence factors corresponding to the industry, i.e. data in the economic data set, are obtained from the factor table and denoted $\{E(i) \mid i \in 1, 2, \ldots, n\}$. If an industry has more than one influence factor, then $E(i) = \{E_1(i), E_2(i), \ldots, E_m(i)\}$, $i \in 1, 2, \ldots, n$, $m \in 1, 2, \ldots, k$, where $m$ is the number of influence factors. For example, if the trend term prediction for customer group 1 uses the disposable income of all residents, then $m = 1$; if the trend term prediction for customer group 2 uses no economic indicator, then $m = 0$; and so on.
Step 3: A trend term prediction model is established. Let $j = 1$ denote SVR, $j = 2$ denote ESN, and $j = 3$ denote GM. The trend term prediction model is:
$Q_t(i) = f_j(E(i), Q_t(i-1), Q_t(i-2), \ldots, Q_t(i-12)), \quad j = 1, 2, 3 \qquad (1)$
Step 4: Three prediction results for the trend term are obtained. Substituting the predicted values of the factors into formula (1) gives the trend term prediction results
Figure BDA0003068247580000101
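Of the three trend models, the grey model is the easiest to show compactly. The sketch below assumes the common GM(1,1) form, which the text does not specify, and it ignores the economic factors $E(i)$, so it illustrates only the univariate case. The function name `gm11_forecast` is hypothetical.

```python
import math

def gm11_forecast(series, steps):
    """Fit a GM(1,1) grey model and forecast `steps` values past the series."""
    n = len(series)
    if n < 4:
        raise ValueError("GM(1,1) is usually fitted on at least four points")
    # Accumulated generating sequence (1-AGO).
    acc, running = [], 0.0
    for v in series:
        running += v
        acc.append(running)
    # Background values: means of consecutive accumulated values.
    z = [0.5 * (acc[k] + acc[k - 1]) for k in range(1, n)]
    y = series[1:]
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zv * yv for zv, yv in zip(z, y))
    # Least squares for x0(k) = -a * z(k) + b.
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    a = -slope                      # development coefficient
    b = (sy - slope * sz) / m       # grey input
    if abs(a) < 1e-12:
        raise ValueError("development coefficient is close to zero")

    def acc_hat(k):
        # Fitted accumulated value at 0-based index k.
        return (series[0] - b / a) * math.exp(-a * k) + b / a

    # Differences of the accumulated fit give the forecasts of the original series.
    return [acc_hat(k) - acc_hat(k - 1) for k in range(n, n + steps)]
```

On a series that grows roughly geometrically, such as `[100 * 1.05 ** i for i in range(8)]`, the forecasts continue the same growth pattern. SVR and ESN would normally come from a library or a dedicated implementation and are not sketched here.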
The seasonal term of electricity sales is highly regular: values for the same month vary little from year to year, and the trend is stable. The seasonal term is therefore predicted with L1/2 sparse iteration. The prediction steps are as follows:
Step 1: The data sequence of the electricity sales seasonal term is $\{Q_s(i) \mid i \in 1, 2, \ldots, n\}$.
Step 2: A seasonal term prediction model is established with L1/2 sparse iteration:
$Q_s(i) = f_{L_{1/2}}(Q_s(i-1), Q_s(i-2), \ldots, Q_s(i-12)) \qquad (2)$
Step 3: The seasonal term prediction results are obtained
Figure BDA0003068247580000102
Where l denotes the prediction step size.
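The text does not spell out the L1/2 sparse iterative algorithm. One plausible reading is an autoregression on the previous 12 values fitted by iterative half thresholding, the usual solver for L1/2-regularized least squares. The sketch below follows that assumption; the names `half_threshold`, `fit_sparse_ar`, and `predict_next`, and the values of `lam` and `iters`, are illustrative only.

```python
import math

def half_threshold(z, lam):
    """Half-thresholding operator used by L1/2-regularized iterations."""
    cutoff = (54.0 ** (1.0 / 3.0) / 4.0) * lam ** (2.0 / 3.0)
    if abs(z) <= cutoff:
        return 0.0  # small coefficients are set exactly to zero
    phi = math.acos((lam / 8.0) * (abs(z) / 3.0) ** -1.5)
    return (2.0 / 3.0) * z * (1.0 + math.cos(2.0 * math.pi / 3.0 - 2.0 * phi / 3.0))

def fit_sparse_ar(series, lags=12, lam=0.01, iters=300):
    """Fit Q(i) ~ sum_k w[k] * Q(i-1-k) by iterative half thresholding."""
    rows = [[series[i - k] for k in range(1, lags + 1)]
            for i in range(lags, len(series))]
    targets = series[lags:]
    # Step size no larger than 1 / ||A||_2^2, bounded via the Frobenius norm.
    step = 1.0 / sum(v * v for row in rows for v in row)
    w = [0.0] * lags
    for _ in range(iters):
        resid = [t - sum(a * b for a, b in zip(row, w))
                 for row, t in zip(rows, targets)]
        grad = [sum(row[k] * r for row, r in zip(rows, resid)) for k in range(lags)]
        # Gradient step on the squared error, then sparsifying threshold.
        w = [half_threshold(w[k] + step * grad[k], lam * step) for k in range(lags)]
    return w

def predict_next(series, w):
    """One-step-ahead prediction from the most recent len(w) values."""
    return sum(w[k] * series[-1 - k] for k in range(len(w)))
```

For a multi-step forecast over $l$ periods, `predict_next` can be applied repeatedly, appending each prediction to the series before the next call.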
The random term is predicted with a linear regression algorithm and a Kalman filtering algorithm. The specific steps are as follows:
Step 1: The data sequence of the electricity sales random term is $\{Q_r(i) \mid i \in 1, 2, \ldots, n\}$.
Step 2: The number of holiday days and the average air temperature are denoted $\{H(i) \mid i \in 1, 2, \ldots, n\}$ and $\{T(i) \mid i \in 1, 2, \ldots, n\}$, respectively.
Step 3: The random term in the electricity sales curves of each customer group and of Huaibei city is predicted with a linear regression prediction model:
$Q_r(i) = f(H(i), T(i), Q_r(i-1), Q_r(i-2), \ldots, Q_r(i-12)) \qquad (3)$
Step 4: Predicted values of the factors are obtained. The number of holiday days can be looked up directly on the national government website, so it is a known value, $\{H(i) \mid i \in 1, 2, \ldots\}$. The average temperature can be predicted with L1/2 sparse iteration, and the resulting predicted average temperature is
Figure BDA0003068247580000103
Step 5: The predicted values of the factors are substituted into formula (3), and Kalman filtering correction is applied to the predicted value to obtain the random term prediction results
Figure BDA0003068247580000104
where $l$ denotes the prediction step size.
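How the Kalman correction is wired to the regression output is not detailed in the text. As a hedged illustration, the sketch below runs a scalar Kalman filter with a random-walk state model over a sequence of values, for example the regression predictions. The function name `kalman_smooth` and both variance parameters are assumptions chosen for the example.

```python
def kalman_smooth(values, process_var=1e-3, measurement_var=1e-1):
    """Scalar Kalman filter with a random-walk state model.

    `values` are treated as noisy measurements of a slowly changing level.
    """
    estimate, error_var = values[0], 1.0
    filtered = [estimate]
    for z in values[1:]:
        # Predict: the level stays the same, its uncertainty grows.
        error_var += process_var
        # Update: blend the prediction with the new measurement.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= (1.0 - gain)
        filtered.append(estimate)
    return filtered
```

A larger `measurement_var` makes the filter trust each new value less and so damps the random term more strongly. Suitable values would have to be tuned on historical data.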
In step S105, obtaining the optimal prediction curve and determining the optimal prediction model in combination with the AHP algorithm includes: determining the optimal prediction model by jointly considering prediction error, trend reliability, and prediction similarity.
Through the above process, prediction results for the trend term, the seasonal term, and the random term over the prediction step size $l$ are obtained, with three prediction results for the trend term. Adding the components together then reconstructs three prediction results for electricity sales.
Figure BDA0003068247580000111
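The additive reconstruction can be sketched as follows. The back-transform with `exp` is an assumption: the text says the series is log-smoothed before decomposition but does not state how the result is mapped back. The function name `reconstruct` and the model labels are illustrative.

```python
import math

def reconstruct(trend_by_model, seasonal, random_part, logged=True):
    """Add trend, seasonal and random predictions for each trend model.

    trend_by_model maps a model label (e.g. "SVR", "ESN", "GM") to its
    trend predictions. If `logged` is True the sum is exponentiated,
    assuming the components were predicted on the log scale.
    """
    results = {}
    for name, trend in trend_by_model.items():
        if not (len(trend) == len(seasonal) == len(random_part)):
            raise ValueError("all components must cover the same horizon")
        total = [t + s + r for t, s, r in zip(trend, seasonal, random_part)]
        results[name] = [math.exp(v) for v in total] if logged else total
    return results
```

Only the trend term differs between the three curves, since the seasonal and random terms each have a single prediction.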
Prediction and reconstruction yield three prediction results for electricity sales. The purpose of selection is to choose the prediction curve with the best predictive performance by jointly considering prediction error, prediction trend, and the shape of the prediction curve.
The selection uses the AHP comprehensive evaluation algorithm. Following the hierarchical structure of AHP, the problem of selecting the best electricity sales prediction result is divided into a target layer, a criterion layer, and a scheme layer:
Target layer: one target, namely selecting the prediction curve with the best predictive performance.
Criterion layer: three evaluation indexes, namely the model training error, the prediction trend similarity, and the prediction trend credibility. The indexes have the following meanings:
1. The model training error reflects how well the trend fitted by the prediction model matches the historical electricity sales data, namely:
Figure BDA0003068247580000112
where $Q(j)$ is the historical value at time $j$, $F(j)$ is the fitted value at time $j$, and $N$ is the number of fitted values.
2. The prediction trend similarity reflects how closely the shape of the model's current prediction result resembles the historical electricity sales curve, namely:
Figure BDA0003068247580000113
where $x_j$ is the true historical value at time $j$, $y_j$ is the model's predicted value at time $j$, and
Figure BDA0003068247580000114
and
Figure BDA0003068247580000115
are the mean values of $x_j$ and $y_j$, respectively.
3. The prediction trend credibility measures how well the model's current prediction result conforms to the trend of the historical curve, namely:
Figure BDA0003068247580000121
where $r$ is the ratio of the predicted cumulative annual electricity sales to the previous year's historical cumulative electricity sales, and
Figure BDA0003068247580000122
is the interval bounded by the minimum and maximum of the historical annual cumulative ratios.
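The exact formulas for the three indexes are given only as images, so the sketch below uses stand-ins chosen to match the variable definitions. These are assumptions: mean absolute percentage error for the training error, the Pearson correlation coefficient for trend similarity (suggested by the use of the means of $x_j$ and $y_j$), and a plain interval check for credibility. All function names are hypothetical.

```python
import math

def training_error(history, fitted):
    """Mean absolute percentage error between history Q(j) and fit F(j)."""
    return sum(abs(q - f) / abs(q) for q, f in zip(history, fitted)) / len(history)

def trend_similarity(x, y):
    """Pearson correlation between historical values x and predictions y."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def trend_ratio_check(predicted_annual, previous_annual, historical_ratios):
    """Return r and whether it lies inside the historical [min, max] interval."""
    r = predicted_annual / previous_annual
    low, high = min(historical_ratios), max(historical_ratios)
    return r, low <= r <= high
```

For example, a predicted annual total of 105 against a previous-year total of 100 gives $r = 1.05$, which lies inside a historical ratio interval of $[1.02, 1.08]$.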
Scheme layer: three schemes, namely the electricity sales prediction results under the SVR (support vector regression), grey model, and echo state network algorithms. A judgment matrix of the target layer to the criterion layer is then constructed, and the weight vector is solved.
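Solving the weight vector from a judgment matrix can be sketched with the standard AHP eigenvector method, here approximated by power iteration and followed by Saaty's consistency ratio. The judgment matrix in the usage note is invented for illustration and does not come from the text. The function name `ahp_weights` is hypothetical.

```python
# Saaty's random consistency index for small matrix orders.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}

def ahp_weights(matrix, iters=100):
    """Return (weights, consistency_ratio) for a pairwise judgment matrix."""
    n = len(matrix)
    if n not in RANDOM_INDEX:
        raise ValueError("this sketch supports 3 to 5 criteria")
    w = [1.0 / n] * n
    for _ in range(iters):
        # Power iteration: multiply by the matrix, then renormalize to sum 1.
        nxt = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        w = [v / total for v in nxt]
    # Estimate the principal eigenvalue from A w = lambda w.
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lam_max = sum(aw[i] / w[i] for i in range(n)) / n
    ci = (lam_max - n) / (n - 1)
    return w, ci / RANDOM_INDEX[n]
```

With the hypothetical matrix `[[1, 2, 4], [0.5, 1, 2], [0.25, 0.5, 1]]` for training error, trend similarity, and trend credibility, the weights come out as 4/7, 2/7, and 1/7, and the consistency ratio is zero because the matrix is perfectly consistent. A ratio below 0.1 is the conventional acceptance threshold.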
The embodiment of the invention also provides a device for predicting the electricity sales amount based on business expansion big data, which comprises the following units as shown in fig. 3: a data acquisition unit 301, a data clustering unit 302, a decomposition prediction unit 303, a grouping fitting unit 304, and a model determination unit 305.
The data acquisition unit 301 is used for constructing a business expansion installation data set, an electricity sales data set, and an economic data set. The data clustering unit 302 is used for clustering customers with the K-Prototype algorithm according to the business expansion installation data set and the electricity sales data set to determine a plurality of customer groups. The decomposition prediction unit 303 is used for decomposing the electricity sales of each customer group to determine a decomposition result. The grouping fitting unit 304 is used for fitting the decomposition result to determine a fitting result. The model determination unit 305 is used for additively reconstructing the fitting results, obtaining an optimal prediction curve in combination with the AHP algorithm, and determining an optimal prediction model.
The data clustering unit 302 further includes a data processing unit. Before customers are clustered with the K-Prototype algorithm to determine a plurality of customer groups, the data processing unit processes the data in the electricity sales data set. The data processing includes: identifying abnormal values in the electricity sales data set with the box plot method and correcting them with historical mean data; and filling missing values with an expert filling method combined with Lagrange interpolation.
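The two cleaning steps can be sketched in a few lines. The box plot rule below uses the conventional 1.5 x IQR fences, and "historical mean" is taken to be the mean of the non-outlier values; both are assumptions, as is every function name. The expert filling method is a manual step and is not modeled.

```python
import statistics

def boxplot_outliers(values, k=1.5):
    """Indexes of values outside the box plot fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]

def replace_outliers_with_mean(values):
    """Replace flagged outliers with the mean of the remaining values."""
    bad = set(boxplot_outliers(values))
    good = [v for i, v in enumerate(values) if i not in bad]
    mean = sum(good) / len(good)
    return [mean if i in bad else v for i, v in enumerate(values)]

def lagrange_interpolate(xs, ys, x):
    """Evaluate at x the Lagrange polynomial through the points (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total
```

For a missing month, `lagrange_interpolate` would typically be called with only a handful of neighboring months as `xs` and `ys`, since high-degree Lagrange polynomials oscillate.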
In the data clustering unit 302, clustering is performed according to supply voltage, business expansion type, net capacity, industry classification, load trend, and load property.
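K-Prototype handles the mix of numeric and categorical attributes listed above by combining two distance measures. The sketch below shows the usual dissimilarity (squared Euclidean distance on numeric attributes plus a weighted count of categorical mismatches) and a nearest-prototype assignment. Which attributes are treated as numeric, the weight `gamma`, and the function names are assumptions for illustration.

```python
def kprototype_distance(num_a, cat_a, num_b, cat_b, gamma=1.0):
    """Mixed dissimilarity: squared Euclidean part plus gamma * mismatches."""
    numeric = sum((a - b) ** 2 for a, b in zip(num_a, num_b))
    categorical = sum(1 for a, b in zip(cat_a, cat_b) if a != b)
    return numeric + gamma * categorical

def nearest_prototype(customer, prototypes, gamma=1.0):
    """Index of the prototype closest to a (numeric, categorical) customer."""
    num, cat = customer
    distances = [
        kprototype_distance(num, cat, p_num, p_cat, gamma)
        for p_num, p_cat in prototypes
    ]
    return distances.index(min(distances))
```

A customer might be encoded as `((0.4, 0.7), ("10kV", "new installation", "industrial"))`, with numeric attributes such as net capacity scaled to a common range first so that no single attribute dominates the numeric part.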
The decomposition prediction unit 303 is specifically configured to: each customer group is decomposed into a trend term, a seasonal term, and a random term using X13 seasonal decomposition.
The grouping fitting unit 304 is specifically configured to: predict the trend term with the support vector regression (SVR) algorithm, the echo state network (ESN) algorithm, and the grey model (GM), in combination with the business expansion installation and electricity sales data sets, to obtain a trend term result; predict the seasonal term with the L1/2 sparse iterative algorithm to obtain a seasonal term result; and predict the random term with a linear regression algorithm and a Kalman filtering algorithm, in combination with weather and holiday data, to obtain a random term result.
The model determination unit 305 is specifically configured to determine the optimal prediction model by jointly considering prediction error, trend reliability, and prediction similarity.
The apparatuses or units described in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. For convenience of description, the above apparatus is described as being divided into units by function. When the present application is implemented, the functions of the units may be implemented in one or more pieces of software and/or hardware. A unit that implements a given function may also be implemented by a plurality of sub-units or a combination of sub-units.
The embodiment of the invention provides a business expansion big data-based electricity sales amount prediction device, which comprises a memory 401 and a processor 402, as shown in fig. 4; the memory 401 and the processor 402 are connected by a system bus 403. The memory 401 is used to store computer executable instructions; the processor 402 is configured to execute computer-executable instructions to implement the method for predicting electricity sales based on business expansion installation big data according to the embodiment of the present invention.
The embodiment of the invention provides a computer-readable storage medium, wherein executable instructions are stored in the computer-readable storage medium, and when the computer executes the executable instructions, the method for predicting the electricity sales amount based on business expansion large data provided by the embodiment of the invention can be realized.
The storage medium includes, but is not limited to, a random access memory (RAM), a read-only memory (ROM), a cache, a hard disk drive (HDD), or a memory card. The memory may be used to store computer program instructions.
The methods, apparatuses, or units described herein may be implemented by a controller in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, or the form of logic gates, switches, application-specific integrated circuits (ASICs), programmable logic controllers, or embedded microcontrollers. Examples of such microcontrollers include, but are not limited to, ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, besides implementing the controller purely as computer-readable program code, the same functionality can be achieved by logically programming the method steps so that the controller takes the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included in it for performing the various functions may be regarded as structures within the hardware component. The means for performing the functions may even be regarded both as software modules implementing the method and as structures within the hardware component.
Some elements of the apparatus described herein may be described in the general context of computer-executable instructions, such as program elements, being executed by a computer. Generally, program elements include routines, programs, objects, components, data structures, classes, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program elements may be located in both local and remote computer storage media including memory storage devices.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus the necessary hardware. Based on this understanding, the technical solutions of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product, or may be embodied in the implementation of data migration. The computer software product may be stored in a storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and includes instructions for causing a computer device (which may be a personal computer, a mobile terminal, a server, a network device, or the like) to perform the methods described in the embodiments, or in parts of the embodiments, of the present application.
The embodiments in the present specification are described in a progressive manner, and the same or similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. All or portions of the present application are operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, mobile communication terminals, multiprocessor systems, microprocessor-based systems, programmable electronic devices, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the present application; although the present application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; such modifications or substitutions do not depart from the spirit and scope of the present disclosure.

Claims (9)

1. A business expansion big data-based electricity sales amount prediction method is characterized by comprising the following steps:
constructing a business expansion installation data set, an electricity sales data set, an economic data set, a weather data set, and a holiday data set;
clustering customers with a K-Prototype algorithm according to the business expansion installation data set and the electricity sales data set to determine a plurality of customer groups;
decomposing the electricity sales of each customer group to determine a decomposition result;
fitting the decomposition result to determine a fitting result;
and additively reconstructing the fitting result, and determining an optimal prediction model in combination with an AHP algorithm to obtain an optimal prediction curve.
2. The method of claim 1, wherein before the customers are clustered with the K-Prototype algorithm to determine a plurality of customer groups, the data in the electricity sales data set is processed, the data processing comprising:
identifying abnormal values in the electricity sales data set with a box plot method, and correcting the abnormal values with historical mean data;
and filling missing values with an expert filling method combined with Lagrange interpolation.
3. The method of claim 1, wherein clustering the customers with the K-Prototype algorithm comprises: clustering according to supply voltage, business expansion type, net capacity, industry classification, load trend, and load property.
4. The method of claim 1, wherein said decomposing the electricity sales of each customer group comprises: decomposing each customer group into a trend term, a seasonal term, and a random term by X13 seasonal decomposition.
5. The method of claim 4, wherein said fitting the decomposition result comprises:
predicting the trend term with a support vector regression (SVR) algorithm, an echo state network (ESN) algorithm, and a grey model (GM), in combination with the business expansion installation and electricity sales data sets and the economic data set, to obtain a trend term result;
predicting the seasonal term with an L1/2 sparse iterative algorithm to obtain a seasonal term result;
and predicting the random term with a linear regression algorithm and a Kalman filtering algorithm, in combination with weather and holiday data, to obtain a random term result.
6. The method of claim 1, wherein determining an optimal prediction model in combination with an AHP algorithm to obtain an optimal prediction curve comprises: determining the optimal prediction model by jointly considering prediction error, trend reliability, and prediction similarity.
7. An electricity sales amount prediction apparatus based on business expansion big data, comprising:
a data acquisition unit: used for constructing a business expansion installation data set and an electricity sales data set;
a data clustering unit: used for clustering customers with a K-Prototype algorithm according to the business expansion installation data set and the electricity sales data set to determine a plurality of customer groups;
a decomposition prediction unit: used for decomposing the electricity sales of each customer group to determine a decomposition result;
a grouping fitting unit: used for fitting the decomposition result to determine a fitting result;
a model determination unit: used for additively reconstructing the fitting result, obtaining an optimal prediction curve in combination with an AHP algorithm, and determining an optimal prediction model.
8. An electricity sales amount prediction device based on business expansion big data, characterized by comprising a memory and a processor;
the memory is to store computer-executable instructions;
the processor is configured to execute the computer-executable instructions to implement the method of any of claims 1-6.
9. A computer-readable storage medium having stored thereon executable instructions that, when executed by a computer, are capable of implementing the method of any one of claims 1-6.
CN202110532108.5A 2021-05-17 2021-05-17 Electricity sales amount prediction method based on business expansion large data Pending CN113361750A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110532108.5A CN113361750A (en) 2021-05-17 2021-05-17 Electricity sales amount prediction method based on business expansion large data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110532108.5A CN113361750A (en) 2021-05-17 2021-05-17 Electricity sales amount prediction method based on business expansion large data

Publications (1)

Publication Number Publication Date
CN113361750A true CN113361750A (en) 2021-09-07

Family

ID=77526462

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110532108.5A Pending CN113361750A (en) 2021-05-17 2021-05-17 Electricity sales amount prediction method based on business expansion large data

Country Status (1)

Country Link
CN (1) CN113361750A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114064794A (en) * 2021-12-01 2022-02-18 国网辽宁省电力有限公司葫芦岛供电公司 Business expansion file mining and analyzing method based on big data technology

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102999791A (en) * 2012-11-23 2013-03-27 广东电网公司电力科学研究院 Power load forecasting method based on customer segmentation in power industry
CN104537434A (en) * 2014-12-18 2015-04-22 国网冀北电力有限公司 Electricity utilization grow curve extraction system and method based on stable period of business expansion reporting
CN104537433A (en) * 2014-12-18 2015-04-22 国网冀北电力有限公司 Sold electricity quantity prediction method based on inventory capacities and business expansion characteristics
CN105023066A (en) * 2015-07-31 2015-11-04 山东大学 Business expansion analytical prediction system and method based on seasonal adjustment
CN106447108A (en) * 2016-09-28 2017-02-22 国网山东省电力公司电力科学研究院 Power utilization demand analysis prediction method taking business-expansion installation data into consideration
CN106485356A (en) * 2016-10-12 2017-03-08 国家电网公司 A kind of power predicating method based on Business Process System big data
CN107146014A (en) * 2017-05-02 2017-09-08 北京中电普华信息技术有限公司 A kind of industry, which expands, has a net increase of impact analysis method and device of the capacity to electricity sales amount
CN107220851A (en) * 2017-05-25 2017-09-29 北京中电普华信息技术有限公司 Electricity sales amount Forecasting Methodology and device based on X13 seasonal adjustments and Cox regression
CN107220764A (en) * 2017-05-25 2017-09-29 北京中电普华信息技术有限公司 A kind of electricity sales amount Forecasting Methodology compensated based on preamble analysis and factor and device
CN110263995A (en) * 2019-06-18 2019-09-20 广西电网有限责任公司电力科学研究院 Consider the distribution transforming heavy-overload prediction technique of load growth rate and user power utilization characteristic
CN111353529A (en) * 2020-02-23 2020-06-30 北京工业大学 Mixed attribute data set clustering method for automatically determining clustering center
CN111476677A (en) * 2020-04-03 2020-07-31 国网湖南省电力有限公司 Big data-based electricity consumption type electricity sales quantity analysis and prediction method and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
任禹丞;徐超;赵磊;贾静;彭路;周子馨;: "基于自适应特征权重聚类算法的用电问题分析", 计算机系统应用, no. 01 *
敖培著: "环境特性电网规划关键问题研究", 31 July 2015, 上海大学出版社, pages: 15 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination