CN110348604A - Linear regression electricity consumption prediction method and system based on electricity consumption characteristic clustering - Google Patents
- Publication number
- CN110348604A CN201910507085.5A
- Authority
- CN
- China
- Prior art keywords
- user
- data
- electricity consumption
- linear regression
- cluster
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/06—Electricity, gas or water supply
Abstract
The present disclosure proposes a linear regression electricity consumption prediction method and system based on electricity consumption characteristic clustering. On the basis of accumulated multi-type, massive customer electricity consumption information, subspace clustering is performed according to evaluation indices of user electricity consumption characteristics to obtain multiple categories, which in turn form multiple user electricity consumption patterns. Users are divided into groups according to their electricity consumption patterns, the strongly associated factors of each user group are identified using a mutual information matrix, and electricity consumption is then predicted using a multiple linear regression algorithm. Multiple linear regression models are established on the data of each user group for prediction, which makes the prediction results more accurate.
Description
Technical field
The present disclosure relates to the technical field of power supply and distribution in electric power systems, and in particular to a linear regression electricity consumption prediction method and system based on electricity consumption characteristic clustering.
Background technique
The statements in this section merely provide background information related to the present disclosure and do not necessarily constitute prior art.
With the rapid development of the national economy and the energy industry, power consumers' demand for electric energy keeps growing. For power supply enterprises, predicting users' electricity consumption is therefore particularly important. Such predictions not only help power companies better understand and serve their users, but also support planning for grid development, in particular the scheduling of electricity supply, and help in formulating related policies such as the construction planning and layout of the power system. As time passes and the economy continues to develop, China's dependence on electric power can be expected to keep increasing.
Users' electricity consumption behavior differs, even among users in the same industry, and these differences become more pronounced over time. Most existing electricity consumption prediction methods perform pattern recognition by industry category and therefore cannot fully mine user information. A user's electricity consumption characteristics are related not only to factors of its own industry but also to other socio-economic factors; the characteristics of users in different regions may follow trends similar to those of different industries, and user electricity consumption characteristics are diverse, which poses a challenge to existing prediction methods. With the development of science and technology, and in particular the continuous progress of intelligent technologies, smart grid technologies keep emerging and grid construction has improved considerably. Existing electric power big data is sufficient to support the quantification and prediction of user electricity consumption characteristics. Establishing targeted prediction models can not only improve the accuracy of electricity consumption prediction, but also help enterprises understand users and their group effects.
Summary of the invention
To solve the above problems, the present disclosure proposes a linear regression electricity consumption prediction method and system based on electricity consumption characteristic clustering. On the basis of accumulated multi-type, massive customer electricity consumption information, subspace clustering is performed according to evaluation indices of user electricity consumption characteristics to obtain multiple categories, which in turn form multiple user electricity consumption patterns. Users are divided into groups according to their electricity consumption patterns, the strongly associated factors of each user group are identified using a mutual information matrix, and electricity consumption is then predicted using a multiple linear regression algorithm. Multiple linear regression models are established on the data of each user group for prediction, which makes the prediction results more accurate.
To achieve the above object, the present disclosure adopts the following technical solution:
One or more embodiments provide a linear regression electricity consumption prediction method based on electricity consumption characteristic clustering, comprising the following steps:
clustering the electricity customers' data separately in multiple dimensions to obtain multiple clustering results;
combining the clustering results arbitrarily to obtain different user electricity consumption patterns, and classifying the electricity customers by pattern to obtain different user groups;
for each user group, determining, using mutual information theory, the strongly associated factors that influence the group's electricity consumption behavior;
establishing a corresponding multiple linear regression model for each user group from the group's electricity consumption data and the strongly associated factors that influence it, and building the electricity consumption prediction model from the established multiple linear regression models;
collecting the strongly associated factor data of each user group, inputting it into the corresponding linear regression model, and predicting the electricity consumption of each user group.
One or more embodiments provide a linear regression electricity consumption prediction system based on electricity consumption characteristic clustering, comprising:
a clustering module, configured to cluster the electricity customers' data separately in multiple dimensions using the AP (affinity propagation) clustering algorithm to obtain multiple clustering results;
a user classification module, configured to combine the clustering results arbitrarily to obtain different user electricity consumption patterns and to classify the electricity customers by pattern to obtain different user groups;
a strongly associated factor determination module, configured to determine, for each user group and using mutual information theory, the strongly associated factors that influence the group's electricity consumption behavior;
a prediction model construction module, configured to establish a corresponding multiple linear regression model for each user group from the group's electricity consumption data and the strongly associated factors that influence it, and to build the electricity consumption prediction model from the established multiple linear regression models;
a prediction module, configured to collect the strongly associated factor data of each user group, input it into the corresponding multiple linear regression model, and predict the electricity consumption of each user group.
An electronic device, comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor, wherein the steps of the above method are performed when the computer instructions are run by the processor.
A computer-readable storage medium for storing computer instructions, wherein the steps of the above method are performed when the computer instructions are executed by a processor.
The disclosure performs subspace clustering according to evaluation indices of user electricity consumption characteristics to obtain multiple categories, which form multiple user electricity consumption patterns. Users are divided into groups according to their patterns, the strongly associated factors of each user group are identified using a mutual information matrix, and electricity consumption is then predicted using a multiple linear regression algorithm. Multiple linear regression models are established on the data of each user group for prediction, which makes the prediction results more accurate.
Compared with the prior art, the disclosure has the following beneficial effects:
(1) The disclosure proposes a user electricity consumption prediction method based on the AP clustering algorithm and multiple linear regression. It not only clusters users automatically according to their electricity consumption characteristics but also effectively identifies the factors associated with the electricity consumption of different user groups. Clustering yields multiple user electricity consumption patterns and hence different user groups; computing mutual information yields the strongly associated factors of each user group; on this basis a multiple linear regression algorithm is used for fitting to obtain the prediction results. The disclosure refines the classification of users and thereby improves the prediction.
(2) The AP clustering algorithm used in the disclosure does not require the number of clusters to be set manually. Clustering is performed in multiple dimensions, for example four, which yields more user groups and a finer classification. The finer the classification, the more accurate the prediction results, which provides an accurate data basis for the power company's transmission and distribution scheduling.
(3) The disclosure uses the mutual information method to determine the factors most relevant to the electricity consumption of different user groups and applies them in the electricity consumption prediction model, which makes the prediction method well-founded.
Description of the drawings
The accompanying drawings, which form part of the present disclosure, are provided for further understanding of the disclosure. The illustrative embodiments of the disclosure and their descriptions are used to explain the disclosure and do not limit it.
Fig. 1 is a flow chart of the electricity consumption prediction method of Embodiment 1 of the present disclosure;
Fig. 2 is a flow chart of the modeling of the electricity consumption prediction model of Embodiment 1 of the present disclosure;
Fig. 3(a)-Fig. 3(c) show the clustering results of users' annual electricity consumption in the example of Embodiment 1;
Fig. 4(a)-Fig. 4(d) show the clustering results of users' monthly electricity consumption in the example of Embodiment 1;
Fig. 5(a)-Fig. 5(d) show the clusters of composite daily load curves in the example of Embodiment 1;
Fig. 6 shows the prediction error for different numbers of regression models in the example of Embodiment 1;
Fig. 7 compares the prediction results of the disclosed method and the comparison algorithms.
Specific embodiments:
The disclosure is further described below with reference to the drawings and embodiments.
It should be noted that the following detailed description is exemplary and intended to provide further explanation of the disclosure. Unless otherwise indicated, all technical and scientific terms used herein have the same meaning as commonly understood by a person of ordinary skill in the art to which the disclosure belongs.
It should be noted that the terminology used herein is for describing specific embodiments only and is not intended to limit the exemplary embodiments of the disclosure. As used herein, the singular forms are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should further be understood that the terms "comprising" and/or "including", when used in this specification, indicate the presence of features, steps, operations, devices, components and/or combinations thereof. Where no conflict arises, the embodiments of the disclosure and the features in the embodiments may be combined with each other. The embodiments are described in detail below with reference to the drawings.
In the technical solution disclosed in one or more embodiments, as shown in Fig. 1, a linear regression electricity consumption prediction method based on electricity consumption characteristic clustering includes the following steps:
Step 1: cluster the electricity customers' data in multiple dimensions to obtain multiple clustering results;
Step 2: combine the clustering results arbitrarily to obtain different user electricity consumption patterns, and classify the electricity customers by pattern to obtain different user groups;
Step 3: for each user group, determine, using mutual information theory, the strongly associated factors that influence the group's electricity consumption behavior;
Step 4: establish a corresponding linear regression model for each user group from the group's electricity consumption data and the strongly associated factors that influence it, and build the electricity consumption prediction model from the established linear regression models;
Step 5: collect the strongly associated factor data of each user group, input it into the corresponding linear regression model, and predict the electricity consumption of each user group.
In step 1, the electricity customers' data is clustered separately in multiple dimensions using the AP clustering algorithm to obtain multiple clustering results. The specific steps are:
Step 11: collect electricity consumption data and construct the user electricity consumption characteristic data set $V_D$.
The electricity consumption data may be the customer data accumulated by the relevant power enterprise and may include time-series and non-time-series evaluation index data. The time-series evaluation indices include users' annual electricity consumption data, monthly electricity consumption data and daily load data; the non-time-series evaluation indices include load density, daily average load rate, seasonal unbalance factor, maximum load utilization hours and similar indices. The constructed user electricity consumption characteristic data set $V_D$ may be as follows:
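$$V_i = \{\alpha_{t1}, \alpha_{t2}, \ldots, \alpha_{tu};\ \beta_{t1}, \beta_{t2}, \ldots, \beta_{tv};\ \delta_1, \delta_2, \ldots, \delta_k;\ \gamma_1, \gamma_2, \ldots, \gamma_w\} \in V_D \qquad (1)$$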
In the formula above, i = 1, 2, …, m indexes the users. $\alpha_{t1}, \alpha_{t2}, \ldots, \alpha_{tu}$ and $\beta_{t1}, \beta_{t2}, \ldots, \beta_{tv}$ are time-series feature vectors representing the user's annual and monthly electricity consumption data, respectively. $\delta_1, \delta_2, \ldots, \delta_k$ represent the average daily load data over a period of time and may, for example, comprise a 48-point daily average load curve ($\delta_1, \ldots, \delta_{48}$). $\gamma_1, \gamma_2, \ldots, \gamma_w$ are the non-time-series features and may include load density $\gamma_1$, daily average load rate $\gamma_2$, seasonal unbalance factor $\gamma_3$, maximum load utilization hours $\gamma_4$ and other load indices.
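As an illustration of formula (1), one user record can be held as four blocks, one per later subspace. This is a minimal sketch: the class name `UserRecord`, its field names and all numbers are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UserRecord:
    """One element V_i of the data set V_D (field names are illustrative)."""
    annual: List[float]      # alpha: annual consumption series
    monthly: List[float]     # beta: monthly consumption series
    daily_load: List[float]  # delta: average daily load curve, e.g. 48 points
    load_index: List[float]  # gamma: non-time-series load indices

    def subspaces(self) -> List[List[float]]:
        """Return the four blocks in the order alpha, beta, delta, gamma."""
        return [self.annual, self.monthly, self.daily_load, self.load_index]


# Made-up values for a single user.
user = UserRecord(
    annual=[120.0, 130.0, 128.0, 140.0],
    monthly=[10.0] * 12,
    daily_load=[0.5] * 48,
    load_index=[0.8, 0.6, 0.9, 4200.0],
)
dims = [len(block) for block in user.subspaces()]
```

Keeping the blocks separate makes it straightforward to hand each one to its own clustering run.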
Step 12: divide the data set $V_D$ into several different subspaces and cluster the data of each subspace with the AP clustering algorithm to obtain a clustering result for each subspace.
The number of subspaces is the number of dimensions. In this embodiment clustering may be performed in 4 dimensions: the data set $V_D$ is divided into 4 different subspaces $L_1$ to $L_4$, and AP clustering is performed on the four dimensions of annual electricity consumption series, monthly electricity consumption series, daily load data and load characteristic data, yielding r, s, k and t clusters respectively. The degree of membership of each sample data point in each cluster of each subspace is then computed and can be expressed by formula (2):
[Formula (2) and its accompanying constraints are not reproduced in the text.]
The degrees of membership satisfy $u_{\alpha,j}, u_{\beta,j}, u_{\delta,j}, u_{\gamma,j} \in [0, 1]$, where $u_{\alpha,j}$, $u_{\beta,j}$, $u_{\delta,j}$ and $u_{\gamma,j}$ denote the degrees of membership of user u in the different clusters of the four subspace clustering results, which contain r, s, k and t clusters respectively.
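For readers unfamiliar with AP (affinity propagation) clustering, the sketch below runs the standard responsibility/availability message passing on a toy one-dimensional subspace. It is an illustration under assumptions, not the disclosure's implementation: the function name `affinity_propagation`, the negative squared Euclidean similarity, the damping value and the hand-picked preference of -10 are all choices made here, and the membership degrees of formula (2) are not computed.

```python
def affinity_propagation(points, preference=None, damping=0.5, iterations=200):
    """Return (exemplar indices, exemplar index assigned to each point)."""
    n = len(points)
    # Similarity: negative squared Euclidean distance (an assumption).
    s = [[-sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
         for p in points]
    if preference is None:
        off_diag = sorted(s[i][k] for i in range(n) for k in range(n) if i != k)
        preference = off_diag[len(off_diag) // 2]  # median similarity
    for i in range(n):
        s[i][i] = preference
    r = [[0.0] * n for _ in range(n)]  # responsibilities
    a = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        for i in range(n):
            scores = [a[i][k] + s[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(scores[j] for j in range(n) if j != k)
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - best_other)
        for k in range(n):
            pos = [max(0.0, r[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # sum over all i' != k
            for i in range(n):
                new = total if i == k else min(0.0, r[k][k] + total - pos[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if r[k][k] + a[k][k] > 0]
    if not exemplars:  # fallback: keep the single best candidate
        exemplars = [max(range(n), key=lambda k: r[k][k] + a[k][k])]
    labels = []
    for i in range(n):
        if i in exemplars:
            labels.append(i)
        else:
            labels.append(max(exemplars, key=lambda k: s[i][k]))
    return exemplars, labels


# Toy subspace: two well-separated groups of one-dimensional points.
toy = [[0.0], [1.0], [2.0], [100.0], [101.0], [102.0]]
exemplars, labels = affinity_propagation(toy, preference=-10.0)
```

The number of clusters is not passed in; it emerges from the preference value, which is the property the disclosure relies on. In practice a tested library implementation would normally be preferred over a hand-written loop like this one.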
In step 2, the clustering results are combined arbitrarily to obtain different user electricity consumption patterns, and the electricity customers are classified by pattern to obtain different user groups.
The arbitrary combination may be a permutation and combination of the clustering results: one cluster is taken from the clusters obtained in each subspace (that is, each dimension) and the selected clusters are combined, which defines one user electricity consumption pattern. The total number of patterns is the product of the numbers of clusters in the individual clustering results. In this embodiment, clustering in 4 dimensions yields r, s, k and t clusters, so the total number of user electricity consumption patterns is N = r × s × k × t.
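The combination step amounts to a Cartesian product of the per-subspace cluster sets. The sketch below uses the cluster counts of the worked example later in the text (3, 4, 4 and 3); the label strings are shorthand chosen here for readability.

```python
from itertools import product

# Cluster labels per subspace (shorthand names, counts r=3, s=4, k=4, t=3).
annual = ["stable", "fluctuating", "growth"]
monthly = ["stable", "single-peak", "double-peak", "multi-peak"]
daily = ["municipal", "tertiary", "heavy-industry", "light-industry"]
load_index = ["class-1", "class-2", "class-3"]

# One pattern = one cluster picked from each subspace.
patterns = list(product(annual, monthly, daily, load_index))
n_patterns = len(patterns)  # r * s * k * t
```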
The electricity customers may be classified by pattern into user groups as follows: from the electricity consumption data of each sample point (that is, each electricity customer), the customer's degree of membership in each group is computed and the customer is assigned to the user group with the maximum degree of membership. Users are divided into different user groups according to formula (3):
[Formula (3) is not reproduced in the text.]
In formula (3), $u_{\alpha,max}$, $u_{\beta,max}$, $u_{\delta,max}$ and $u_{\gamma,max}$ denote the maximum degrees of membership of user u over the clusters of the four subspaces.
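One plausible reading of this assignment rule, assuming a user's group is identified by the cluster with the largest membership degree in each subspace, is sketched below. The helper name `assign_group` and the membership values are hypothetical.

```python
def assign_group(memberships):
    """memberships: one list of membership degrees per subspace.

    Returns the index of the maximum-membership cluster in each subspace;
    the resulting tuple identifies the user's electricity consumption pattern.
    """
    return tuple(max(range(len(m)), key=m.__getitem__) for m in memberships)


# Made-up membership degrees of one user in the four subspaces.
u = [
    [0.1, 0.7, 0.2],          # annual consumption clusters
    [0.25, 0.25, 0.4, 0.1],   # monthly consumption clusters
    [0.6, 0.2, 0.1, 0.1],     # daily load clusters
    [0.3, 0.3, 0.4],          # load index clusters
]
group = assign_group(u)
```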
In step 3, for each user group, mutual information theory is used to determine the strongly associated factors that influence the group's electricity consumption behavior: the users' electricity consumption data and the potential associated factors are subjected to association analysis to determine the strongly associated factors relevant to electricity consumption behavior. The specific steps may be as follows:
Step 31: compute the mutual information between a user's electricity consumption data X and a potential associated factor Y to obtain the degree of association between the two. The mutual information is computed by formula (4):
[Formula (4) is not reproduced in the text.]
where M denotes the total number of values of X and Y; $N_i$ denotes the number of intervals of X; $M_i$ denotes the number of values of X that fall in the i-th interval; $N_j$ denotes the number of intervals of Y; $P(y_u)$ denotes the probability that Y falls in the u-th interval; and $M_{uv}$ denotes the number of values of X that fall in the v-th interval when Y falls in the u-th interval.
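The symbol definitions above describe an interval-counting (histogram) estimate. Since the formula itself is missing, the sketch below uses the textbook histogram estimator of mutual information with equal-width intervals; the function names and the number of intervals are assumptions made here, not the disclosure's exact formula.

```python
import math
from collections import Counter


def bin_index(value, lo, hi, bins):
    """Map a value to one of `bins` equal-width intervals on [lo, hi]."""
    if hi == lo:
        return 0
    return min(int((value - lo) / (hi - lo) * bins), bins - 1)


def mutual_information(x, y, bins=4):
    """Histogram estimate of I(X; Y) in nats for two equal-length series."""
    n = len(x)
    x_lo, x_hi, y_lo, y_hi = min(x), max(x), min(y), max(y)
    xb = [bin_index(v, x_lo, x_hi, bins) for v in x]
    yb = [bin_index(v, y_lo, y_hi, bins) for v in y]
    count_x, count_y = Counter(xb), Counter(yb)
    count_xy = Counter(zip(xb, yb))
    mi = 0.0
    for (i, j), c in count_xy.items():
        # p(x,y) * log( p(x,y) / (p(x) p(y)) ), written with raw counts.
        mi += (c / n) * math.log((c * n) / (count_x[i] * count_y[j]))
    return mi


consumption = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
dependent_factor = [2.0 * v for v in consumption]   # fully determined by X
constant_factor = [5.0] * len(consumption)          # carries no information

mi_dependent = mutual_information(consumption, dependent_factor)
mi_constant = mutual_information(consumption, constant_factor)
```

A factor that moves with consumption gets a positive score, while a constant factor scores zero, which is why step 34 can discard factors whose average mutual information is not greater than zero.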
The associated factor Y is a potential factor influencing user electricity consumption behavior. Users are broadly divided into residential and non-residential users. Non-residential users cover sectors such as industry, catering and transportation, and are divided into non-manufacturing and manufacturing users. For non-manufacturing users, factors such as the region's gross industrial output value, fixed-asset investment, industry prosperity index, ex-factory price index of major products and product inventory are chosen as potential associated factors; for manufacturing users, factors such as product output, raw material price index and product inventory are chosen as associated factors; at the regional level, factors such as overall GDP, the GDP of the primary, secondary and tertiary industries, and fixed investment are considered.
Step 32: from the mutual information values computed between each user's electricity consumption and each potential associated factor within a user group, construct the mutual information matrix, which may be as follows:
[Formula (5), the mutual information matrix, is not reproduced in the text.]
where $\{X_1, X_2, \ldots, X_p\}$ denotes the data set formed by the electricity consumption data series of p users and $\{Y_1, Y_2, \ldots, Y_l\}$ denotes the data set formed by the potential associated factors.
Step 33: for each associated factor, compute the average mutual information between that factor and the users' electricity consumption data, which gives the average mutual information between each user group and each associated factor. The average mutual information between factor $Y_j$ in the matrix above and $X_1, X_2, \ldots, X_p$ can be computed by formula (6):
[Formula (6) is not reproduced in the text.]
where $\{X_1, X_2, \ldots, X_p\}$ denotes the data set formed by the electricity consumption data series of p users and $\{Y_1, Y_2, \ldots, Y_l\}$ denotes the data set formed by the potential associated factors.
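Steps 32 and 33 can be sketched as a matrix of pairwise mutual information values followed by a per-factor mean. The layout (rows are users, columns are factors) and the plain arithmetic mean are assumptions made here because the matrix and formula (6) are not legible; the numbers are invented.

```python
def average_mutual_information(mi_matrix):
    """mi_matrix[i][j] holds I(X_i; Y_j); return the mean of each column."""
    p = len(mi_matrix)          # number of users in the group
    n_factors = len(mi_matrix[0])
    return [sum(row[j] for row in mi_matrix) / p for j in range(n_factors)]


# Two users, three candidate factors (made-up mutual information values).
mi_matrix = [
    [0.5, 0.0, 0.25],
    [0.25, 0.0, 0.75],
]
avg_mi = average_mutual_information(mi_matrix)
```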
Step 34: select the associated factors whose average mutual information is greater than zero, sort them by mutual information value to obtain a ranked list, and choose the factors with the larger values as the strongly associated factors. If the list is sorted in descending order of mutual information, the top-ranked factors are chosen; how many to choose is configurable.
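The selection rule in step 34 is a filter followed by a descending sort and a cut-off. A minimal sketch follows, with a hypothetical helper name `select_strong_factors`, invented factor names and an invented cut-off of two.

```python
def select_strong_factors(avg_mi, names, top_k):
    """Keep factors with positive average MI and return the top_k by value."""
    ranked = sorted(
        ((value, name) for value, name in zip(avg_mi, names) if value > 0),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [name for _, name in ranked[:top_k]]


names = ["regional_gdp", "product_inventory", "product_output", "price_index"]
avg_mi = [0.375, 0.0, 0.5, 0.1]
strong = select_strong_factors(avg_mi, names, top_k=2)
```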
The strongly associated factors, together with the user electricity consumption data, form the training sample sets $S_k$ (k = 1, …, n), which are used with the different user groups $G_k$ (k = 1, …, n) to predict electricity consumption.
In step 4, from the group's electricity consumption data and the strongly associated factors that influence it, a multiple linear regression algorithm is used to establish corresponding linear regression models for each user group. A linear regression model is established for each user group, so that multiple linear regression models are established in total.
In step 4, the regression models are built with the multiple linear regression algorithm and may borrow from the random forest algorithm: for each user group $G_k$, multiple training samples are drawn at random from the original training sample set $S_k$, a model is built on each sample, each model is then tested on a test set, and the electricity consumption prediction model is finally obtained. As shown in Fig. 2, the steps may be:
Step 41: from the data of the strongly associated factors and the electricity consumption data of each user group, build the data sample set $S_k$ of each user group.
Step 42: randomly draw w training samples ($S_{k1}, S_{k2}, \ldots, S_{kw}$) from $S_k$, for example by Bootstrap sampling. Taking the strongly associated factor data in each drawn training sample as input and the electricity consumption data of the corresponding user group as output, train w multiple linear regression models. Establishing multiple linear regression models for the same group gives higher prediction accuracy.
Step 43: the data in $S_k$ that were not drawn are used as the test set and input into the w multiple linear regression models for testing. Because sampling is done with replacement, some samples do not appear in a given training subset; these are called "out-of-bag" data and are used as the test set. Models whose test error lies within a specified range are qualified models.
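Steps 42 and 43 rely on sampling with replacement, which leaves some rows out of every training subset. The sketch below draws index sets only; the helper name `bootstrap_split`, the sample size of 20, the five repetitions and the fixed seed are illustrative choices.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in chosen]
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed so the sketch is repeatable
splits = [bootstrap_split(20, rng) for _ in range(5)]
```

Each `in_bag` list would train one regression model, and the matching `out_of_bag` rows would serve as that model's test set.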
Step 44: linearly combine all linear regression prediction models to obtain the overall electricity consumption prediction model.
The original training data set $S_k$ consists of two kinds of data: user electricity consumption data and the corresponding M kinds of associated factor data. The model input is the M kinds of associated factor data and the output is the time series of user electricity consumption. The w multiple linear regression models are established and tested on the test set; with the associated factor data $X_k$ related to the electricity consumption data $Y_k$ as input, the prediction result is computed by formula (7):
[Formula (7) is not reproduced in the text.]
In step 44, $H_k$ is the electricity consumption prediction model for $G_k$ and $f_{ki}$ is a single linear regression prediction model; linearly combining the $f_{ki}$ gives the overall electricity consumption prediction model.
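A compact way to picture $H_k$ is an average over several least-squares fits, each trained on a different subset of rows. The sketch below assumes equal weights in the linear combination, which the text does not confirm; all function names are hypothetical, the data is synthetic, and the two training subsets are fixed by hand instead of being drawn by Bootstrap sampling so that the result is repeatable.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for c in range(col, n + 1):
                m[row][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear(xs, ys):
    """Least-squares fit of y = b0 + b1*x1 + ... via the normal equations."""
    rows = [[1.0] + list(x) for x in xs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef, x):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


def ensemble_predict(models, x):
    """Equal-weight combination of the single models (an assumption)."""
    return sum(predict(coef, x) for coef in models) / len(models)


# Synthetic group data: consumption = 1 + 2*factor1 + 3*factor2.
factors = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (1.0, 2.0)]
consumption = [1.0 + 2.0 * f1 + 3.0 * f2 for f1, f2 in factors]

# Two hand-picked training subsets standing in for Bootstrap draws.
subsets = [[0, 1, 2, 3], [1, 2, 4, 5]]
models = [
    fit_linear([factors[i] for i in idx], [consumption[i] for i in idx])
    for idx in subsets
]
forecast = ensemble_predict(models, (3.0, 3.0))
```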
Step 5: collect the strongly associated factor data of each user group, input it into the corresponding linear regression model, and predict the electricity consumption of each user group.
The data of a user's strongly associated factors is collected and input into the linear regression model established for the corresponding user group, which outputs the predicted electricity consumption of the user.
A specific example follows:
The electricity consumption, daily load curves and other data of 7832 typical users in a region of an eastern coastal province for 2014 to 2017 were selected. The corresponding load characteristic indices were computed, including maximum load utilization hours $\gamma_1$, load density $\gamma_2$, peak-valley electricity ratio $\gamma_3$, seasonal unbalance factor $\gamma_4$ and daily average load rate $\gamma_5$, and combined with the users' electricity consumption time-series data to construct the user electricity consumption characteristic data set $V_D$.
Users are broadly divided into residential and non-residential users, and non-residential users cover sectors such as industry, catering and transportation. For non-manufacturing users, 40 factors such as the region's gross industrial output value, fixed-asset investment, industry prosperity index, ex-factory price index of major products and product inventory were chosen as potential associated factors; for manufacturing users, 78 factors such as product output, raw material price index and product inventory were chosen as associated factors; at the regional level, 20 factors such as overall GDP, the GDP of the primary, secondary and tertiary industries, and fixed investment were considered. In total, 138 factors serve as associated factors of user electricity consumption. The potential associated factor data set $Y_D$ was constructed from these factor data, and the user electricity consumption data and $Y_D$ were normalized; see Table 1 for details.
Table 1: Normalization of electricity consumption data and associated factor data
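The text does not say which normalization is used for Table 1. One common choice, assumed here purely for illustration, is min-max scaling of each series to the range [0, 1]; the helper name and the sample values are made up.

```python
def min_max_normalize(values):
    """Scale a series linearly so its minimum maps to 0 and its maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant series: no spread to scale
    return [(v - lo) / (hi - lo) for v in values]


monthly_consumption = [100.0, 150.0, 200.0]
normalized = min_max_normalize(monthly_consumption)
```

Putting consumption and factor series on a common scale keeps factors with large raw magnitudes from dominating the interval counts and the regression fit.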
$V_D$ includes the users' annual electricity consumption time series, the monthly electricity consumption of each user, daily load data (the daily load data for July and August of 2014 to 2017 were selected) and load characteristic data. AP clustering was performed separately on these four kinds of data, with the following results:
Fig. 3(a)-Fig. 3(c) show the clustering results of users' annual electricity consumption, which fall into three classes: stable, fluctuating and growth. Fig. 3(a) is the fluctuating type (specifically, fluctuating growth), Fig. 3(b) is the stable type and Fig. 3(c) is the growth type. The stable type comprises three kinds of electricity customers: slowly growing, slowly declining and flat. The fluctuating type comprises customers strongly affected by the business environment, and the rapid-growth type comprises users in a good state of development.
Fig. 4(a)-Fig. 4(d) show the clustering results of users' monthly electricity consumption, which fall into four main classes: stable, single-peak, double-peak and multi-peak. Fig. 4(a) is the stable type, Fig. 4(b) the double-peak type, Fig. 4(c) the single-peak type and Fig. 4(d) the multi-peak type. The stable type mainly comprises users with strong continuity of electricity use, such as manufacturing; the single-peak type comprises customers whose annual consumption peak falls between May and October; the double-peak type comprises customers heavily affected by seasonal factors; and the multi-peak type comprises customers affected by multiple factors such as holidays and production cycles.
Fig. 5(a)-Fig. 5(d) show the composite daily load curves of different industries. According to relevant research data, Fig. 5(a) represents the load curve of municipal and residential use, Fig. 5(b) the composite daily load curve of the tertiary industry, and Fig. 5(c) and Fig. 5(d) the composite daily load curves of heavy industry and light industry respectively.
Table 2 shows the clustering results for the different load characteristic indices. Class one has a small seasonal unbalance factor, indicating that these users are sensitive to seasonal factors; it mainly comprises some light-industry users. Class two has a small average load rate and generally corresponds to low-energy-consumption users such as public services. Class three has large load utilization hours and load density and generally corresponds to high-energy-consumption users such as heavy industry.
Table 2: Load characteristic index clustering results
By clustering on the subspaces, user power consumption patterns can be divided into 3 × 4 × 4 × 3 = 144 kinds in total, so all users can be assigned to 144 groups. In practice the users fall mainly into 50 groups, which contain 96% of users. The reason is that the industry groups obtained from daily load curves overlap with the user groups obtained from monthly consumption clustering, which reduces the number of patterns that actually occur. For example, for the heavy-industry class obtained by clustering daily load curves, the monthly load curve is of the stable type; very few samples fall into the double-peak and multi-peak classes, so they can be ignored. On this basis, the average mutual information of the association factors is calculated for each user group, and the 15 top-ranked association factors are selected. A sample set is built with the monthly data of these 15 factors as input and monthly electricity consumption as output. The Bootstrap method is used to draw m training sample subsets from the data set, a multiple linear regression model is built on each subset, and the remaining data serve as a test set for measuring the prediction error. Fig. 6 shows the prediction error, aggregated over the different power consumption patterns, for different numbers of regression models. The error is measured by MAPE, and it decreases as the number of regression models increases.
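The Bootstrap-plus-regression step above can be illustrated with a minimal sketch. It is not the patent's implementation: the synthetic data, the helper names (`solve`, `fit_linear`, `bootstrap_ensemble`, `ensemble_predict`, `mape`), the tiny ridge term that keeps the normal equations solvable, the equal-weight averaging of the m models, and the fixed held-out test split are all assumptions made for illustration.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_linear(xs, ys, ridge=1e-8):
    """Least-squares fit with intercept via the normal equations.
    The small ridge term is an assumption added to avoid singular systems."""
    rows = [[1.0] + list(x) for x in xs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)

def predict(coef, x):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))

def bootstrap_ensemble(xs, ys, m, rng):
    """Fit m regression models, each on a resample drawn with replacement."""
    n = len(xs)
    models = []
    for _ in range(m):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_linear([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def ensemble_predict(models, x):
    """Equal-weight average of the models (an assumed form of combination)."""
    return sum(predict(c, x) for c in models) / len(models)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - p) / abs(a)
                       for a, p in zip(actual, predicted)) / len(actual)

# Synthetic example: two hypothetical factors and a monthly consumption value.
rng = random.Random(0)
xs = [(rng.uniform(0, 10), rng.uniform(0, 5)) for _ in range(50)]
ys = [50 + 3 * a - 2 * b + rng.gauss(0, 0.5) for a, b in xs]
train_x, test_x = xs[:40], xs[40:]
train_y, test_y = ys[:40], ys[40:]
models = bootstrap_ensemble(train_x, train_y, m=10, rng=rng)
preds = [ensemble_predict(models, x) for x in test_x]
print(f"MAPE on held-out samples: {mape(test_y, preds):.2f}%")
```

Averaging is only one possible linear combination of the models; other weightings would fit the description equally well.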
Monthly electricity consumption forecasts are obtained from the regression models. To evaluate the forecasting algorithm proposed in this embodiment, it is compared with a support vector machine (SVM) prediction model and the random forest (RF) algorithm on the same training data. The results are shown in Table 3, which lists the mean absolute error between predicted and true values over 6 months for randomly selected user groups. The algorithm proposed in this embodiment clearly has lower absolute percentage error and mean absolute error than the compared algorithms, which indicates that it achieves higher accuracy.
Table 3: Comparison of prediction results
Group | SVM (%) | RF (%) | Proposed algorithm (%) |
---|---|---|---|
1 | 2.38 | 2.26 | 2.17 |
2 | 3.59 | 2.49 | 2.32 |
3 | 5.52 | 2.97 | 2.68 |
4 | 4.32 | 1.82 | 1.16 |
5 | 2.77 | 2.32 | 1.63 |
6 | 4.16 | 3.67 | 3.59 |
MAPE (%) | 3.79 | 2.58 | 2.26 |
To further verify the effectiveness of the proposed algorithm, prediction models were built for the 50 user groups; the comparison is shown in Fig. 7. Overall, the proposed algorithm outperforms the compared algorithms. The reason is that clustering on subspaces gives a finer division of user groups, so the association factors chosen for each group are more accurate, which improves the prediction.
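The combination of subspace clusters into power consumption patterns, such as the 3 × 4 × 4 × 3 = 144 patterns mentioned earlier, can be sketched as follows. The subspace names, the sample labels, and the coverage-based choice of "main" groups are hypothetical; the patent does not state how the 50 main groups were selected.

```python
from itertools import product

# Hypothetical cluster counts per subspace, matching 3 x 4 x 4 x 3 in the text.
CLUSTERS_PER_SUBSPACE = {"yearly": 3, "monthly": 4, "daily_load": 4, "load_index": 3}

def all_patterns(cluster_counts):
    """Every combination that picks one cluster from each subspace."""
    return list(product(*(range(k) for k in cluster_counts.values())))

def group_users(labels):
    """labels: user id -> tuple of cluster labels, one per subspace."""
    groups = {}
    for user, pattern in labels.items():
        groups.setdefault(pattern, []).append(user)
    return groups

def main_groups(groups, coverage):
    """Largest groups, taken in order until `coverage` share of users is reached."""
    total = sum(len(users) for users in groups.values())
    kept, covered = [], 0
    for pattern, users in sorted(groups.items(), key=lambda kv: -len(kv[1])):
        if covered >= coverage * total:
            break
        kept.append(pattern)
        covered += len(users)
    return kept, covered / total

# Toy labels for five users; real labels would come from the four clusterings.
labels = {
    "u1": (0, 0, 0, 0),
    "u2": (0, 0, 0, 0),
    "u3": (1, 2, 3, 0),
    "u4": (0, 0, 0, 0),
    "u5": (2, 3, 1, 2),
}
patterns = all_patterns(CLUSTERS_PER_SUBSPACE)
groups = group_users(labels)
kept, share = main_groups(groups, coverage=0.75)
print(len(patterns), len(groups), len(kept), share)
```

Only patterns that actually occur produce a group, which is why the number of occupied groups can be far smaller than the number of theoretical combinations.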
Embodiment 2
This embodiment provides a linear regression electricity consumption prediction system based on electricity consumption characteristic clustering, comprising:
Clustering module: configured to cluster electricity customer data in multiple dimensions using the AP (affinity propagation) automatic clustering algorithm, obtaining multiple clustering results;
User classification module: configured to combine the clustering results arbitrarily to obtain different user power consumption patterns, and to classify electricity customers by power consumption pattern, obtaining different user groups;
Strong association factor determination module: configured to determine, for each user group, the strong association factors that influence the group's electricity consumption behavior, using mutual information theory;
Electricity consumption prediction model construction module: configured to establish, for each user group, a corresponding multiple linear regression model from the group's electricity consumption data and the strong association factors that influence the group's consumption, and to establish the electricity consumption prediction model from the multiple linear regression models;
Prediction module: configured to collect the strong association factor data of each user group, input them into the corresponding multiple linear regression model, and predict the electricity consumption data of each user group.
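As an illustration of what the strong association factor determination module might compute, the sketch below estimates mutual information from equal-width bins, averages it over the users of one group, drops factors whose average is not greater than zero, and keeps the top-ranked ones. The binning scheme, the data layout, and the function names are assumptions; the patent does not specify how the mutual information is estimated.

```python
import math
from collections import Counter

def discretize(values, bins):
    """Equal-width binning; returns one bin index per value."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(xs, ys, bins=4):
    """Plug-in estimate of I(X;Y) in nats from binned samples."""
    bx, by = discretize(xs, bins), discretize(ys, bins)
    n = len(bx)
    pxy = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def rank_factors(usage_by_user, factors, top_k):
    """Average MI per factor over the group's users, keep > 0, sort descending."""
    avg = {}
    for name, series in factors.items():
        scores = [mutual_information(series, usage)
                  for usage in usage_by_user.values()]
        avg[name] = sum(scores) / len(scores)
    ranked = sorted((kv for kv in avg.items() if kv[1] > 0),
                    key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]

# Hypothetical monthly data for one user group with two candidate factors.
temperature = [2, 4, 9, 15, 21, 26, 30, 29, 24, 17, 9, 3]
factors = {"temperature": temperature, "constant": [1] * 12}
usage_by_user = {
    "a": [100 + 10 * t for t in temperature],
    "b": [300 - 5 * t for t in temperature],
}
strong = rank_factors(usage_by_user, factors, top_k=15)
print(strong)
```

A factor that never varies carries no information about consumption, so its estimated mutual information is zero and it is filtered out.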
Embodiment 3
This embodiment provides an electronic device comprising a memory, a processor, and computer instructions stored in the memory and run on the processor; when the computer instructions are run by the processor, the steps of the method of Embodiment 1 are completed.
Embodiment 4
This embodiment provides a computer-readable storage medium for storing computer instructions; when the computer instructions are executed by a processor, the steps of the method of Embodiment 1 are completed.
The foregoing describes only preferred embodiments of the present disclosure and is not intended to limit the disclosure. For those skilled in the art, the disclosure may have various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principles of the disclosure shall fall within its scope of protection.
Although specific embodiments of the disclosure have been described above with reference to the accompanying drawings, they do not limit the scope of protection of the disclosure. Those skilled in the art should understand that various modifications or variations that can be made on the basis of the technical solution of the disclosure without creative effort remain within its scope of protection.
Claims (10)
1. A linear regression electricity consumption prediction method based on electricity consumption characteristic clustering, characterized in that it comprises the following steps:
clustering electricity customer data separately in multiple dimensions to obtain multiple clustering results;
combining the clustering results arbitrarily to obtain different user power consumption patterns, and classifying electricity customers by power consumption pattern to obtain different user groups;
for the different user groups, determining the strong association factors that influence the electricity consumption behavior of each group, using mutual information theory;
establishing, for each user group, a corresponding multiple linear regression model from the group's electricity consumption data and the strong association factors that influence the group's consumption, and establishing the electricity consumption prediction model from the multiple linear regression models;
collecting the strong association factor data of each user group, inputting them into the corresponding linear regression model, and predicting the electricity consumption data of each user group.
2. The linear regression electricity consumption prediction method based on electricity consumption characteristic clustering according to claim 1, characterized in that clustering the electricity customer data in multiple dimensions using the AP automatic clustering algorithm to obtain multiple clustering results specifically means performing AP automatic clustering in four dimensions, namely the yearly consumption sequence, the monthly consumption sequence, the daily load data, and the load characteristic data, to obtain four clustering results.
3. The linear regression electricity consumption prediction method based on electricity consumption characteristic clustering according to claim 1, characterized in that clustering the electricity customer data in multiple dimensions using the AP automatic clustering algorithm to obtain multiple clustering results specifically comprises the following steps:
collecting electricity consumption data and constructing the user electricity consumption characteristic data set V_D;
dividing the data set V_D into multiple different subspaces, and clustering the data of each subspace with the AP automatic clustering algorithm to obtain a subspace clustering result for each subspace.
4. The linear regression electricity consumption prediction method based on electricity consumption characteristic clustering according to claim 1, characterized in that the arbitrary combination is a permutation and combination of the clustering results: one cluster is taken from the clusters obtained in each subspace or each dimension, and these clusters are combined to determine one user power consumption pattern; the total number of user power consumption patterns is the product of the numbers of clusters in the individual clustering results.
5. The linear regression electricity consumption prediction method based on electricity consumption characteristic clustering according to claim 1, characterized in that the step of determining, for the different user groups, the strong association factors that influence the electricity consumption behavior of each group using mutual information theory specifically comprises:
calculating the mutual information between each user's electricity consumption data and the potential association factors;
building a mutual information matrix from the calculated mutual information between each user's electricity consumption in the user group and the potential association factors;
for each association factor, calculating the average mutual information between that factor and the users' electricity consumption data, obtaining the average mutual information between each user group and each association factor;
selecting the association factors whose average mutual information is greater than zero, sorting the selected factors by mutual information value to obtain a sorted list of association factors, and choosing the factors with the larger mutual information values as the strong association factors.
6. The linear regression electricity consumption prediction method based on electricity consumption characteristic clustering according to claim 1, characterized in that the step of establishing, for each user group, a corresponding multiple linear regression model from the group's electricity consumption data and the strong association factors that influence the group's consumption specifically comprises:
establishing a data sample set for each user group from the data of the strong association factors and the electricity consumption data of the group;
randomly drawing w training samples from the data sample set of each user group, and training w multiple linear regression models on the drawn training samples, with the data of the strong association factors as input and the electricity consumption data of the corresponding user group as output.
7. The linear regression electricity consumption prediction method based on electricity consumption characteristic clustering according to claim 1, characterized in that establishing the electricity consumption prediction model from the multiple linear regression models specifically means linearly combining all the multiple linear regression prediction models to obtain the electricity consumption prediction model for all users.
8. A linear regression electricity consumption prediction system based on electricity consumption characteristic clustering, characterized in that it comprises:
a clustering module, configured to cluster electricity customer data in multiple dimensions using the AP automatic clustering algorithm, obtaining multiple clustering results;
a user classification module, configured to combine the clustering results arbitrarily to obtain different user power consumption patterns, and to classify electricity customers by power consumption pattern, obtaining different user groups;
a strong association factor determination module, configured to determine, for each user group, the strong association factors that influence the group's electricity consumption behavior, using mutual information theory;
an electricity consumption prediction model construction module, configured to establish, for each user group, a corresponding multiple linear regression model from the group's electricity consumption data and the strong association factors that influence the group's consumption, and to establish the electricity consumption prediction model from the multiple linear regression models;
a prediction module, configured to collect the strong association factor data of each user group, input them into the corresponding multiple linear regression model, and predict the electricity consumption data of each user group.
9. An electronic device, characterized in that it comprises a memory, a processor, and computer instructions stored in the memory and run on the processor; when the computer instructions are run by the processor, the steps of the method of any one of claims 1-7 are completed.
10. A computer-readable storage medium, characterized in that it is used for storing computer instructions; when the computer instructions are executed by a processor, the steps of the method of any one of claims 1-7 are completed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910507085.5A CN110348604A (en) | 2019-06-12 | 2019-06-12 | A kind of linear regression power predicating method and system based on electricity consumption Specialty aggregation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110348604A true CN110348604A (en) | 2019-10-18 |
Family
ID=68181858
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910507085.5A Pending CN110348604A (en) | 2019-06-12 | 2019-06-12 | A kind of linear regression power predicating method and system based on electricity consumption Specialty aggregation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110348604A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112380489A (en) * | 2020-11-03 | 2021-02-19 | 武汉光庭信息技术股份有限公司 | Data processing time calculation method, data processing platform evaluation method and system |
CN113052385A (en) * | 2021-03-29 | 2021-06-29 | 国网河北省电力有限公司经济技术研究院 | Method, device, equipment and storage medium for predicting power consumption in steel industry |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104933630A (en) * | 2015-05-21 | 2015-09-23 | 国家电网公司 | Load characteristic analysis method and system |
CN105512768A (en) * | 2015-12-14 | 2016-04-20 | 上海交通大学 | User electricity consumption relevant factor identification and electricity consumption quantity prediction method under environment of big data |
US20170228661A1 (en) * | 2014-04-17 | 2017-08-10 | Sas Institute Inc. | Systems and methods for machine learning using classifying, clustering, and grouping time series data |
CN108171369A (en) * | 2017-12-21 | 2018-06-15 | 国家电网公司 | Short term combination forecasting method based on customer electricity differentiation characteristic |
Non-Patent Citations (2)
Title |
---|
杨娟 等: "《面板数据聚类的复合方法与应用》", 31 August 2016, 对外经济贸易大学出版社 * |
计明军 等: "《预测与决策方法》", 31 August 2018, 大连海事大学出版社 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112380489A (en) * | 2020-11-03 | 2021-02-19 | 武汉光庭信息技术股份有限公司 | Data processing time calculation method, data processing platform evaluation method and system |
CN112380489B (en) * | 2020-11-03 | 2024-04-16 | 武汉光庭信息技术股份有限公司 | Data processing time calculation method, data processing platform evaluation method and system |
CN113052385A (en) * | 2021-03-29 | 2021-06-29 | 国网河北省电力有限公司经济技术研究院 | Method, device, equipment and storage medium for predicting power consumption in steel industry |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20191018 |