CN110348604A - Linear regression electricity consumption prediction method and system based on electricity consumption characteristic clustering


Publication number
CN110348604A
Authority
CN
China
Prior art keywords
user
data
electricity consumption
linear regression
cluster
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910507085.5A
Other languages
Chinese (zh)
Inventor
周翔宇
周建全
苗淑平
董文秀
张宏伟
刘越
孟瑶
李静
孙海彬
宋益睿
秦贞依
唐言宾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Jining Power Supply Co of State Grid Shandong Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
Jining Power Supply Co of State Grid Shandong Electric Power Co Ltd
Application filed by State Grid Corp of China SGCC and Jining Power Supply Co of State Grid Shandong Electric Power Co Ltd


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Electricity, gas or water supply

Abstract

The present disclosure proposes a linear regression electricity consumption prediction method and system based on clustering of electricity consumption characteristics. Building on accumulated, multi-type, massive customer electricity consumption information, subspace clustering is performed on the users' electricity consumption characteristic evaluation indices to obtain multiple classes, which are combined into multiple user electricity consumption modes. Users are divided into groups according to their consumption mode, and a mutual information matrix is used to identify the strongly associated factors of each user group. Electricity consumption is then predicted with a multiple linear regression algorithm: several linear regression models are built on each user group's data and used for prediction, which makes the prediction results more accurate.

Description

Linear regression electricity consumption prediction method and system based on electricity consumption characteristic clustering
Technical field
The present disclosure relates to the technical field of power supply and distribution in electric power systems, and in particular to a linear regression electricity consumption prediction method and system based on clustering of electricity consumption characteristics.
Background
The statements in this section merely provide background information related to the present disclosure and do not necessarily constitute prior art.
With the rapid development of the national economy and the energy industry, power consumers' demand for electric energy keeps growing. For power supply enterprises, predicting user electricity consumption is particularly important. Such predictions help power companies understand and serve users better and draw up plans for grid development, in particular for scheduling electricity supply against consumption. They also support policy making, for example the construction planning and layout of the electric power system. As time passes and the economy continues to develop, China's dependence on electric power can be expected to keep rising.
Users' electricity consumption behavior differs, even among users in the same industry, and these differences become more pronounced over time. Most existing electricity consumption prediction methods perform pattern recognition by industry type and therefore cannot mine user information well. A user's electricity consumption characteristics are related not only to factors of the user's own industry but also to other socio-economic factors. Users in different regions and in different industries may show similar trends in their consumption characteristics, and the characteristics themselves are diverse, which poses a challenge to existing prediction methods. With the development of science and technology, and in particular the continuous progress of intelligent technologies, smart grid technologies keep emerging and grid construction has improved greatly. Existing electric power big data is sufficient to support the quantification of user electricity consumption characteristics and the related prediction. Building targeted prediction models not only improves the accuracy of electricity consumption prediction but also helps enterprises understand users and their group effects.
Summary of the invention
To solve the above problems, the present disclosure proposes a linear regression electricity consumption prediction method and system based on clustering of electricity consumption characteristics. Building on accumulated, multi-type, massive customer electricity consumption information, subspace clustering is performed on the users' electricity consumption characteristic evaluation indices to obtain multiple classes, which are combined into multiple user electricity consumption modes. Users are divided into groups according to their consumption mode, a mutual information matrix is used to identify the strongly associated factors of each user group, and electricity consumption is then predicted with a multiple linear regression algorithm. Several linear regression models are built on each user group's data and used for prediction, which makes the prediction results more accurate.
To achieve the above object, the present disclosure adopts the following technical solutions:
One or more embodiments provide a linear regression electricity consumption prediction method based on clustering of electricity consumption characteristics, comprising the following steps:
Clustering the electricity customer data separately in multiple dimensions to obtain multiple clustering results;
Combining the clustering results in every possible way to obtain different user electricity consumption modes, and classifying the electricity customers by consumption mode to obtain different user groups;
For each user group, determining the strongly associated factors that influence the group's electricity consumption behavior by using mutual information theory;
Building a corresponding multiple linear regression model for each user group from the group's electricity consumption data and the strongly associated factors that influence the group's consumption, and establishing the electricity consumption prediction model from the multiple linear regression models built;
Acquiring the strongly associated factor data of each user group, inputting them into the corresponding linear regression model, and predicting the electricity consumption data of each user group.
One or more embodiments provide a linear regression electricity consumption prediction system based on clustering of electricity consumption characteristics, comprising:
Clustering module: configured to cluster the electricity customer data separately in multiple dimensions using the AP (affinity propagation) automatic clustering algorithm, obtaining multiple clustering results;
User classification module: configured to combine the clustering results in every possible way to obtain different user electricity consumption modes, and to classify the electricity customers by consumption mode to obtain different user groups;
Strongly associated factor determination module: configured to determine, for each user group, the strongly associated factors that influence the group's electricity consumption behavior by using mutual information theory;
Prediction model construction module: configured to build a corresponding multiple linear regression model for each user group from the group's electricity consumption data and the strongly associated factors that influence the group's consumption, and to establish the electricity consumption prediction model from the multiple linear regression models built;
Prediction module: configured to acquire the strongly associated factor data of each user group, input them into the corresponding multiple linear regression model, and predict the electricity consumption data of each user group.
An electronic device, comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor, wherein the steps of the above method are carried out when the computer instructions are run by the processor.
A computer-readable storage medium for storing computer instructions, wherein the steps of the above method are carried out when the computer instructions are executed by a processor.
The present disclosure performs subspace clustering on the users' electricity consumption characteristic evaluation indices to obtain multiple classes and thereby multiple user electricity consumption modes, divides users into groups according to their consumption mode, uses a mutual information matrix to identify the strongly associated factors of each user group, and then predicts electricity consumption with a multiple linear regression algorithm. Several linear regression models are built on each user group's data and used for prediction, which makes the prediction results more accurate.
Compared with the prior art, the present disclosure has the following beneficial effects:
(1) The user electricity consumption prediction method proposed by the present disclosure, based on the AP clustering algorithm and multiple linear regression, not only clusters users automatically by their electricity consumption characteristics but also effectively identifies the factors associated with the consumption of different user groups. Clustering yields multiple user electricity consumption modes and hence different user groups. Computing the mutual information yields the strongly associated factors of each group, on which a multiple linear regression algorithm is fitted to obtain the prediction. The disclosure refines the classification of users and improves the prediction.
(2) The AP automatic clustering algorithm used by the present disclosure does not require the number of clusters to be set manually. Clustering is performed in multiple dimensions, for example four, so more user groups are obtained and the classification is finer. The finer the classification, the more accurate the prediction, which gives power companies an accurate data basis for transmission and distribution scheduling.
(3) The present disclosure uses the mutual information method to determine the factors most relevant to the electricity consumption of each user group and applies them in the prediction model, which makes the prediction method well founded.
Brief description of the drawings
The accompanying drawings, which form a part of this disclosure, are provided for further understanding of the disclosure. The illustrative embodiments and their descriptions explain the disclosure and do not limit it.
Fig. 1 is a flow chart of the electricity consumption prediction method of Embodiment 1 of the present disclosure;
Fig. 2 is a flow chart of building the electricity consumption prediction model in Embodiment 1 of the present disclosure;
Fig. 3 (a)-Fig. 3 (c) show the clustering results of users' annual electricity consumption in the example of Embodiment 1;
Fig. 4 (a)-Fig. 4 (d) show the clustering results of users' monthly electricity consumption in the example of Embodiment 1;
Fig. 5 (a)-Fig. 5 (d) show the clustering of the comprehensive daily load curves in the example of Embodiment 1;
Fig. 6 shows the prediction error for different numbers of regression models in the example of Embodiment 1;
Fig. 7 compares the prediction performance of the disclosed method with that of the comparison algorithms.
Detailed description of the embodiments:
The present disclosure is further described below with reference to the accompanying drawings and embodiments.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless otherwise indicated, all technical and scientific terms used herein have the same meaning as commonly understood by a person of ordinary skill in the technical field of the disclosure.
It should be noted that the terminology used herein is only for describing specific embodiments and is not intended to limit the exemplary embodiments of the disclosure. As used herein, unless the context clearly indicates otherwise, the singular forms are intended to include the plural forms as well. It should also be understood that the terms "comprising" and/or "including", when used in this specification, indicate the presence of features, steps, operations, devices, components and/or combinations thereof. Where no conflict arises, the embodiments of the disclosure and the features in the embodiments may be combined with each other. The embodiments are described in detail below with reference to the drawings.
In the technical solution disclosed in one or more embodiments, as shown in Fig. 1, a linear regression electricity consumption prediction method based on clustering of electricity consumption characteristics comprises the following steps:
Step 1: cluster the electricity customer data in multiple dimensions to obtain multiple clustering results;
Step 2: combine the clustering results in every possible way to obtain different user electricity consumption modes, and classify the electricity customers by consumption mode to obtain different user groups;
Step 3: for each user group, determine the strongly associated factors that influence the group's electricity consumption behavior by using mutual information theory;
Step 4: build a corresponding linear regression model for each user group from the group's electricity consumption data and the strongly associated factors that influence the group's consumption, and establish the electricity consumption prediction model from the linear regression models built;
Step 5: acquire the strongly associated factor data of each user group, input them into the corresponding linear regression model, and predict the electricity consumption data of each user group.
In step 1, the AP (affinity propagation) automatic clustering algorithm is applied to the electricity customer data separately in multiple dimensions to obtain multiple clustering results. The specific steps are:
Step 11: acquire the electricity consumption data and construct the user electricity consumption characteristic data set $V_D$.
The electricity consumption data may be the electricity customer data accumulated by the relevant power enterprise and may include both time-series and non-time-series evaluation index data. The time-series evaluation indices include the users' annual electricity consumption data, monthly electricity consumption data and daily load data. The non-time-series evaluation indices include load density, daily average load rate, seasonal imbalance coefficient, maximum load utilization hours and similar indices. The user electricity consumption characteristic data set $V_D$ may be constructed as follows:
$$V_i = \{\alpha_{t_1}, \alpha_{t_2}, \ldots, \alpha_{t_u};\ \beta_{t_1}, \beta_{t_2}, \ldots, \beta_{t_v};\ \delta_1, \delta_2, \ldots, \delta_k;\ \gamma_1, \gamma_2, \ldots, \gamma_w\} \in V_D \tag{1}$$
In formula (1), $i = 1, 2, \ldots, m$ indexes the users. $\alpha_{t_1}, \alpha_{t_2}, \ldots, \alpha_{t_u}$ and $\beta_{t_1}, \beta_{t_2}, \ldots, \beta_{t_v}$ are time-series feature vectors representing the user's annual and monthly electricity consumption data, respectively. $\delta_1, \delta_2, \delta_3, \ldots, \delta_k$ represent the average daily load data over a period of time and may, for example, consist of 48 points $(\delta_1, \ldots, \delta_{48})$ of daily average load data. $\gamma_1, \gamma_2, \ldots, \gamma_w$ are non-time-series features and may include load indices such as load density $\gamma_1$, daily average load rate $\gamma_2$, seasonal imbalance coefficient $\gamma_3$ and maximum load utilization hours $\gamma_4$.
Step 12: divide the data set $V_D$ into several different subspaces and cluster the data of each subspace with the AP automatic clustering algorithm, obtaining one subspace clustering result per subspace.
The number of subspaces is the number of dimensions. This embodiment may cluster in 4 dimensions: the data set $V_D$ is divided into 4 different subspaces $L_1$ to $L_4$, and AP clustering is performed in the four dimensions of annual electricity consumption series, monthly electricity consumption series, daily load data and load characteristic data, yielding r, s, k and t clusters respectively. The membership degree of each sample data point in each cluster of each subspace is then obtained, which can be expressed by formula (2) [equation image not available in this text].
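The memberships satisfy $\sum_{j=1}^{r} u_{\alpha,j} = 1$, $\sum_{j=1}^{s} u_{\beta,j} = 1$, $\sum_{j=1}^{k} u_{\delta,j} = 1$ and $\sum_{j=1}^{t} u_{\gamma,j} = 1$, with $u_{\alpha,j}, u_{\beta,j}, u_{\delta,j}, u_{\gamma,j} \in [0, 1]$, where $u_{\alpha,j}$, $u_{\beta,j}$, $u_{\delta,j}$ and $u_{\gamma,j}$ denote the membership degree of user u in the j-th cluster of the four subspace clustering results, which contain r, s, k and t clusters respectively.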
In step 2, the clustering results are combined in every possible way to obtain different user electricity consumption modes, and the electricity customers are classified by consumption mode to obtain different user groups.
The combination may be a permutation and combination of the clustering results: one cluster is taken from the clusters obtained in each subspace (each dimension), and the selected clusters together determine one user electricity consumption mode. The total number of user electricity consumption modes is the product of the numbers of clusters in the individual clustering results. In this embodiment, clustering in 4 dimensions yields r, s, k and t clusters, so the total number of user electricity consumption modes is N = r × s × k × t.
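The combination rule above can be illustrated with a minimal Python sketch. The cluster labels are placeholders loosely based on the worked example later in this description, and the dictionary keys and the function name `enumerate_modes` are assumptions made only for illustration:

```python
from itertools import product

# Hypothetical cluster labels for the four subspaces (illustrative names only).
subspace_clusters = {
    "annual": ["stable", "fluctuating", "growth"],
    "monthly": ["stable", "single_peak", "double_peak", "multi_peak"],
    "daily_load": ["municipal", "tertiary", "heavy_industry", "light_industry"],
    "load_index": ["class_1", "class_2", "class_3"],
}

def enumerate_modes(clusters_by_subspace):
    """Return every combination that takes exactly one cluster per subspace."""
    names = list(clusters_by_subspace)
    return [
        dict(zip(names, combo))
        for combo in product(*(clusters_by_subspace[n] for n in names))
    ]

modes = enumerate_modes(subspace_clusters)
print(len(modes))  # N = r * s * k * t = 3 * 4 * 4 * 3 = 144
```

Each element of `modes` names one candidate consumption mode; in practice many of these combinations may contain few or no users.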
The electricity customers may be classified by consumption mode into user groups as follows: from the electricity consumption data of each sample point (that is, each electricity customer), compute the customer's membership degree for the different groups and assign the customer to the user group with the largest membership degree. Users are assigned to user groups according to formula (3) [equation image not available in this text].
In formula (3), $u_{\alpha,\max}$, $u_{\beta,\max}$, $u_{\delta,\max}$ and $u_{\gamma,\max}$ denote the maximum membership degree of user u over the clusters of each of the four subspaces.
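A minimal sketch of this assignment rule follows, assuming that the rule selects, in each subspace, the cluster with the largest membership degree and that the resulting tuple of cluster indices identifies the user group. The membership values, user names and function names are invented for illustration:

```python
from collections import defaultdict

def assign_mode(memberships):
    """Pick the index of the max-membership cluster in every subspace.

    memberships maps a subspace name to a list of membership degrees in [0, 1].
    Ties are resolved in favour of the first cluster (an arbitrary choice).
    """
    return tuple(
        max(range(len(u)), key=u.__getitem__) for u in memberships.values()
    )

# Invented membership degrees for two users (each list sums to 1).
users = {
    "user_a": {
        "annual": [0.1, 0.7, 0.2],
        "monthly": [0.6, 0.2, 0.1, 0.1],
        "daily_load": [0.05, 0.05, 0.8, 0.1],
        "load_index": [0.3, 0.3, 0.4],
    },
    "user_b": {
        "annual": [0.2, 0.6, 0.2],
        "monthly": [0.7, 0.1, 0.1, 0.1],
        "daily_load": [0.1, 0.1, 0.7, 0.1],
        "load_index": [0.2, 0.2, 0.6],
    },
}

groups = defaultdict(list)
for name, memberships in users.items():
    groups[assign_mode(memberships)].append(name)
```

Both invented users land in the same group here because their strongest clusters coincide in all four subspaces.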
In step 3, for each user group, the strongly associated factors that influence the group's electricity consumption behavior are determined using mutual information theory. Specifically, mutual information theory is used to analyze the association between the user electricity consumption data and the potential association factors, and the factors strongly associated with user consumption behavior are identified. The specific steps may be as follows:
Step 31: compute the mutual information between a user's electricity consumption data X and a potential association factor Y to obtain the degree of association between the two. The mutual information is computed by formula (4) [equation image not available in this text].
In formula (4), M denotes the total number of values of X and Y; $N_i$ denotes the number of intervals of X; $M_i$ denotes the number of values of X falling in the i-th interval; $N_j$ denotes the number of intervals of Y; $P(y_u)$ denotes the probability that Y falls in the u-th interval; and $M_{uv}$ denotes the number of values of X that fall in the v-th interval when Y falls in the u-th interval.
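Because the patent's own formula is not available in this text, the following Python sketch uses the standard histogram (equal-width interval) estimate of mutual information, which matches the interval counts described above in spirit but is not claimed to be the patent's exact formula. The bin count and function names are assumptions:

```python
import math
from collections import Counter

def bin_index(value, lo, hi, n_bins):
    """Map a value to an equal-width interval index in [0, n_bins - 1]."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # the maximum value belongs to the last bin

def mutual_information(x, y, n_bins=4):
    """Histogram estimate of I(X; Y) in nats for two equal-length series."""
    n = len(x)
    bx = [bin_index(v, min(x), max(x), n_bins) for v in x]
    by = [bin_index(v, min(y), max(y), n_bins) for v in y]
    count_x, count_y = Counter(bx), Counter(by)
    count_xy = Counter(zip(bx, by))
    mi = 0.0
    for (i, j), c in count_xy.items():
        # p(x,y) * log( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log((c * n) / (count_x[i] * count_y[j]))
    return mi

consumption = [1, 2, 3, 4, 5, 6, 7, 8]
factor_linked = [2 * v for v in consumption]   # fully determined by consumption
factor_constant = [5] * len(consumption)       # carries no information

mi_linked = mutual_information(consumption, factor_linked)
mi_constant = mutual_information(consumption, factor_constant)
```

A factor that moves with consumption gets a large value, while a constant factor gets zero, which is the property the ranking in the later steps relies on.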
The association factor Y is a potential factor influencing user electricity consumption behavior. Users are broadly divided into residential and non-residential users. Non-residential users cover industries such as manufacturing, catering and transportation and are divided into non-manufacturing and manufacturing users. For non-manufacturing users, factors such as the gross industry output value of the region, fixed-asset investment, the industry prosperity index, the ex-factory price index of main products and product inventory are chosen as potential association factors. For manufacturing users, factors such as product output, the raw material price index and product inventory are chosen. At the regional level, factors such as overall GDP, the GDP of the primary, secondary and tertiary industries, and fixed investment are considered.
Step 32: from the mutual information between the electricity consumption of each user in the user group and each potential association factor, build the mutual information matrix, which may be written as:
$$I = \begin{bmatrix} I(X_1; Y_1) & I(X_1; Y_2) & \cdots & I(X_1; Y_l) \\ I(X_2; Y_1) & I(X_2; Y_2) & \cdots & I(X_2; Y_l) \\ \vdots & \vdots & \ddots & \vdots \\ I(X_p; Y_1) & I(X_p; Y_2) & \cdots & I(X_p; Y_l) \end{bmatrix} \tag{5}$$
where $\{X_1, X_2, \ldots, X_p\}$ is the data set formed by the electricity consumption data series of p users and $\{Y_1, Y_2, \ldots, Y_l\}$ is the data set formed by the potential association factors.
Step 33: for each association factor, compute the average mutual information between that factor and the users' electricity consumption data, giving the average mutual information between the users of each group and each association factor. The average mutual information between association factor $Y_j$ in the above matrix and $X_1, X_2, \ldots, X_p$ can be computed by formula (6):
$$\bar{I}(Y_j) = \frac{1}{p} \sum_{i=1}^{p} I(X_i; Y_j), \quad j = 1, 2, \ldots, l \tag{6}$$
Step 34: select the association factors whose average mutual information is greater than zero and sort them by mutual information value to obtain a ranked list of association factors. The factors with the larger mutual information values are taken as the strongly associated factors. If the list is sorted in descending order of mutual information, the leading factors are chosen; how many are chosen is configurable.
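Steps 33 and 34 can be sketched as follows, assuming the mutual information matrix is stored with one row per user and one column per association factor. The matrix values, factor names and the function name `rank_factors` are illustrative assumptions:

```python
def rank_factors(mi_matrix, factor_names, top_k):
    """Average each column over users, drop non-positive factors, keep top_k."""
    p = len(mi_matrix)
    averages = [
        sum(row[j] for row in mi_matrix) / p for j in range(len(factor_names))
    ]
    ranked = sorted(
        ((avg, name) for avg, name in zip(averages, factor_names) if avg > 0),
        reverse=True,  # largest average mutual information first
    )
    return [(name, avg) for avg, name in ranked[:top_k]]

# Two users (rows) against three hypothetical association factors (columns).
mi_matrix = [
    [0.40, 0.00, 0.10],
    [0.60, 0.00, 0.30],
]
factor_names = ["regional_gdp", "price_index", "inventory"]

strong_factors = rank_factors(mi_matrix, factor_names, top_k=2)
strong_names = [name for name, _ in strong_factors]
```

The factor whose average mutual information is zero is excluded before ranking, and `top_k` plays the role of the configurable number of strongly associated factors.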
The strongly associated factors, together with the user electricity consumption data, form the training sample sets $S_k$ (k = 1, ..., n), which are used with the different user groups $G_k$ (k = 1, ..., n) to predict electricity consumption.
In step 4, a multiple linear regression algorithm is used to build several linear regression models for each user group from the group's electricity consumption data and the strongly associated factors that influence the group's consumption. Linear regression models are built for every user group, so that several linear regression models are obtained in the end.
In step 4, the regression models are built with a multiple linear regression algorithm and may borrow the idea of the random forest algorithm: for each user group $G_k$, several training samples are drawn at random from the original training sample set $S_k$, a model is built on each sample, each model is then tested on a test set, and the electricity consumption prediction model is finally obtained. As shown in Fig. 2, the steps may be:
Step 41: from the data of the strongly associated factors and the electricity consumption data of each user group, build the data sample set $S_k$ of each user group.
Step 42: randomly draw w training samples $(S_{k1}, S_{k2}, \ldots, S_{kw})$ from $S_k$; the Bootstrap sampling method may be used. With the strongly associated factor data of each drawn sample as input and the electricity consumption data of the corresponding user group as output, train w multiple linear regression models. Building several linear regression models for the same group and predicting with them gives higher prediction accuracy.
Step 43: use the data in $S_k$ that were not drawn as the test set and input them into the w multiple linear regression models for testing. Because sampling is done with replacement, some samples never appear in a drawn training subset. These are called "out-of-bag data" and are used as the test set. A model whose test error lies within a given range is a qualified model.
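The sampling in steps 42 and 43 can be sketched with the standard library alone. The sample count, number of models, seed and function name are arbitrary assumptions for illustration:

```python
import random

def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; the rest are out-of-bag."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag

rng = random.Random(0)       # fixed seed so the sketch is repeatable
n_samples, w = 20, 5         # hypothetical size of S_k and number of models
splits = [bootstrap_split(n_samples, rng) for _ in range(w)]

for in_bag, out_of_bag in splits:
    # in_bag indexes the training subset; out_of_bag indexes the test set.
    print(len(set(in_bag)), len(out_of_bag))
```

Each of the w splits would train one regression model on its in-bag rows and measure its test error on its out-of-bag rows.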
Step 44: linearly combine all linear regression prediction models to obtain the electricity consumption prediction model for all users.
The original training data set $S_k$ consists of two kinds of data: the user electricity consumption data and the data of the corresponding M association factors. The model input is the data of the M association factors, and the output is the time series of user electricity consumption. The w multiple linear regression models are built and tested by simulation on the test set; the association factor data $X_k$ related to the electricity consumption data $Y_k$ are used as input, and the prediction result is computed with formula (7) [equation image not available in this text].
In step 44, $H_k$ is the electricity consumption prediction model for group $G_k$ and $f_{ki}$ is a single linear regression prediction model; linearly combining the $f_{ki}$ gives the electricity consumption prediction model for all users.
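The following sketch fits w multiple linear regression models on bootstrap samples and combines them. Since the combination weights of formula (7) are not available in this text, an equal-weight average is assumed here, as in random forest regression. The data are synthetic and all function names are illustrative:

```python
import random

def solve(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_linear(rows, targets):
    """Ordinary least squares via the normal equations; returns [b0, b1, ...]."""
    design = [[1.0] + list(r) for r in rows]
    d = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * t for r, t in zip(design, targets)) for i in range(d)]
    return solve(xtx, xty)

def predict(coefs, row):
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))

def fit_ensemble(rows, targets, w, rng):
    """Fit w models, each on a bootstrap sample drawn with replacement."""
    n = len(rows)
    models = []
    for _ in range(w):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_linear([rows[i] for i in idx], [targets[i] for i in idx]))
    return models

def ensemble_predict(models, row):
    """Equal-weight linear combination of the single-model predictions."""
    return sum(predict(m, row) for m in models) / len(models)

# Synthetic, noise-free data: consumption = 2 + 3 * factor1 - factor2.
rows = [(float(i), float((i * i) % 7)) for i in range(30)]
targets = [2.0 + 3.0 * x1 - 1.0 * x2 for x1, x2 in rows]
models = fit_ensemble(rows, targets, w=5, rng=random.Random(0))
forecast = ensemble_predict(models, (10.0, 2.0))
```

Because the synthetic data contain no noise, every bootstrap model recovers almost the same coefficients; with real data the models differ and averaging reduces the variance of the forecast.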
Step 5: acquire the strongly associated factor data of each user group, input them into the corresponding linear regression model, and predict the electricity consumption data of each user group.
The data of a user's strongly associated factors are acquired and input into the linear regression model built for the corresponding user group, which outputs the predicted electricity consumption of the user.
A specific example is given below:
Data such as electricity consumption and daily load curves between 2014 and 2017 were collected for 7832 typical users in a region of an eastern coastal province. The corresponding load characteristic index data were computed, including maximum load utilization hours $\gamma_1$, load density $\gamma_2$, peak-valley electricity ratio $\gamma_3$, seasonal imbalance coefficient $\gamma_4$ and daily average load rate $\gamma_5$, and were combined with the user electricity consumption time series to construct the user electricity consumption characteristic data set $V_D$.
Users are broadly divided into residential and non-residential users, and non-residential users cover industries such as manufacturing, catering and transportation. For non-manufacturing users, 40 factors such as the gross industry output value of the region, fixed-asset investment, the industry prosperity index, the ex-factory price index of main products and product inventory were chosen as potential association factors. For manufacturing users, 78 factors such as product output, the raw material price index and product inventory were chosen. At the regional level, 20 factors such as overall GDP, the GDP of the primary, secondary and tertiary industries, and fixed investment were considered. In total, 138 factors serve as association factors of user electricity consumption. The potential association factor data set $Y_D$ was built from the data of these factors, and the user electricity consumption data and $Y_D$ were normalized; see Table 1 for details.
Table 1. Normalization of electricity consumption data and association factor data
VDIncluding user's year electricity consumption time series data, the monthly electricity consumption in electricity consumption user each month, daily load data (the daily load data for selecting 2014~2017 years 7, August part) and part throttle characteristics data.On the basis of these four types of data AP cluster is carried out respectively, and the result of cluster is as follows:
As shown in Fig. 3 (a)-Fig. 3 (c), what is showed is the cluster result of user year electricity consumption, is divided into three classes: stable type, Wave type and growth form, Fig. 3 (a) are Wave type, specially fluctuation growth form, and Fig. 3 (b) is stable type, and Fig. 3 (c) is to increase Property, stable type includes three classes user: slowly growth, slowly decline and leveling style three classes Electricity customers;What fluctuation included is The client being affected by business environment, that rapid growth type includes is the good user of developing state.
What Fig. 4 (a)-Fig. 4 (d) showed is the cluster result of user's month electricity consumption, mainly there is four classes: stable type, unimodal Type, bimodal pattern and multimodal.Fig. 4 (a) is stable type, and Fig. 4 (b) is bimodal pattern, and Fig. 4 (c) is single peak type, and Fig. 4 (d) is more Peak type.Stable type mainly includes that manufacturing industry etc. uses the stronger user of electric continuity, and what single peak type included is that electricity consumption is high in 1 year Peak appears in the client in 5~October, and what bimodal pattern included is to be influenced by seasonal factor than heavier client, and multimodal includes Be the client influenced by multiple factors such as festivals or holidays, production cycle etc. factor.
What Fig. 5 (a)-Fig. 5 (d) showed is the synthesis daily load curve of different industries, can be sentenced according to correlative study data What disconnected Fig. 5 (a) out was indicated is the power load curve of municipal life kind, and what Fig. 5 (b) was indicated is bearing synthesis day for the tertiary industry Lotus curve, Fig. 5 (c) and Fig. 5 (d) respectively indicate the synthesis daily load curve of heavy industry and light industry.
Table 2 shows the clustering results for the load characteristic indices. Class 1 has a relatively small seasonal unbalance coefficient, indicating that these users are more sensitive to seasonal factors; it mainly comprises light-industry users. Class 2 has a relatively small average load rate, indicating that these users are generally low-energy-consumption users such as public services. In the consumption pattern of Class 3, the load utilization hours and the load density are relatively large; this class generally corresponds to high-energy-consumption users such as heavy industry.
Table 2. Clustering results of load characteristic indices
By clustering on the subspaces, users' consumption patterns can in total be divided into 3 × 4 × 4 × 3 = 144 kinds, so all users are divided into 144 groups. In practice there are mainly 50 groups, which contain 96% of the users. The reason is that the industry groups obtained from daily load curve clustering overlap with the user groups obtained from monthly consumption clustering, which reduces the number of consumption patterns. For example, for the heavy-industry class obtained by daily load curve clustering, the monthly load curve is of the stable type, and very few samples fall into the bimodal or multimodal classes, so these can be ignored. On this basis, the average mutual information of the associated factors is computed for each user group, and the 15 top-ranked associated factors are selected. A sample set is built with the monthly data of these 15 associated factors as input and the monthly electricity consumption as output. The Bootstrap method is used to draw m training sample subsets from the data set, and a multiple linear regression model is built on each; the remaining data serve as a test set for error testing of the prediction model. Fig. 6 shows the overall prediction error for different numbers of regression models under different consumption patterns, measured by MAPE: the more regression models are used, the smaller the prediction error.
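The Bootstrap step for one user group can be illustrated with a minimal sketch. It is not the embodiment's implementation: the function names, the toy data, and the equal-weight averaging of the m models are assumptions made for illustration, and an ordinary least-squares fit via the normal equations stands in for whatever fitting procedure is actually used.

```python
import random


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: resample has too few distinct points")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear(X, y):
    """Least-squares fit of y = w0 + w1*x1 + ... using the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(w, x):
    """Evaluate one fitted regression model at the factor vector x."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))


def bootstrap_models(X, y, m, rng):
    """Train m regressions, each on a resample drawn with replacement."""
    n = len(y)
    models = []
    for _ in range(m):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_linear([X[i] for i in idx], [y[i] for i in idx]))
    return models


def ensemble_predict(models, x):
    """Equal-weight combination of the model outputs (the weighting is an assumption)."""
    return sum(predict(w, x) for w in models) / len(models)


# Hypothetical group data: two factors, consumption exactly 2 + 3*x1 - x2.
X = [(float(i % 5), float(i // 5)) for i in range(20)]
y = [2.0 + 3.0 * x1 - x2 for x1, x2 in X]
models = bootstrap_models(X, y, m=5, rng=random.Random(0))
forecast = ensemble_predict(models, (2.0, 1.0))
```

In a fuller version, the rows not drawn into a given resample could serve as the test set mentioned above; that bookkeeping is omitted here to keep the sketch short.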
Monthly electricity consumption forecasts are obtained from the regression models. To evaluate the prediction performance of the algorithm proposed in this embodiment, it is compared on the same training data with a support vector machine prediction model (SVM) and the random forest algorithm (RF). The results are shown in Table 3, which gives, for randomly selected user groups, the mean absolute error between the predicted and true values over 6 months. It can clearly be seen that both the absolute percentage error and the mean absolute error of the proposed algorithm are better than those of the compared algorithms, indicating that the algorithm of this embodiment has higher accuracy.
Table 3. Comparison of prediction results

Group      SVM (%)   RF (%)   Proposed algorithm (%)
1          2.38      2.26     2.17
2          3.59      2.49     2.32
3          5.52      2.97     2.68
4          4.32      1.82     1.16
5          2.77      2.32     1.63
6          4.16      3.67     3.59
MAPE (%)   3.79      2.58     2.26
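For reference, the MAPE measure used in the comparison can be computed as in the following minimal sketch. The function name and the six-month figures are hypothetical and are not taken from the experiments reported here.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical six-month consumption of one user group (illustrative values only).
actual = [100.0, 120.0, 80.0, 110.0, 90.0, 100.0]
predicted = [98.0, 123.0, 80.0, 110.0, 90.0, 100.0]
group_mape = mape(actual, predicted)
```

Because each term is divided by the true value, the measure is undefined for months with zero consumption; such months would need separate handling.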
To further verify the effectiveness of the proposed algorithm, prediction models were built for the 50 user groups; the comparison is shown in Fig. 7. The overall performance of the proposed algorithm is better than that of the compared algorithms. The reason is that clustering on subspaces yields a finer division of user groups and a more accurate selection of associated factors for each group, which improves the prediction performance of the algorithm.
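The subspace-based grouping referred to above can be sketched as follows. The cluster names follow the description of Figs. 3 to 5 and Table 2; the function names, the sample users, and the coverage rule for picking the dominant groups are assumptions made for illustration, since the text does not say how the 50 main groups were identified.

```python
from itertools import product

# Cluster names per subspace, following the description of this embodiment.
ANNUAL = ["stable", "fluctuating", "growth"]
MONTHLY = ["stable", "unimodal", "bimodal", "multimodal"]
DAILY = ["municipal/residential", "tertiary industry", "heavy industry", "light industry"]
LOAD_CHARACTERISTIC = ["class 1", "class 2", "class 3"]


def all_patterns():
    """Every combination of one cluster per subspace (3 x 4 x 4 x 3)."""
    return list(product(ANNUAL, MONTHLY, DAILY, LOAD_CHARACTERISTIC))


def group_users(labels):
    """Map each consumption pattern to the list of users assigned to it."""
    groups = {}
    for user, pattern in labels.items():
        groups.setdefault(pattern, []).append(user)
    return groups


def dominant_groups(groups, coverage):
    """Largest groups that together reach the requested share of users (assumed rule)."""
    total = sum(len(users) for users in groups.values())
    kept, covered = [], 0
    for pattern, users in sorted(groups.items(), key=lambda kv: -len(kv[1])):
        if covered >= coverage * total:
            break
        kept.append(pattern)
        covered += len(users)
    return kept


# Hypothetical users with their four subspace labels.
heavy = ("stable", "stable", "heavy industry", "class 3")
light = ("growth", "unimodal", "light industry", "class 1")
labels = {"u1": heavy, "u2": heavy, "u3": heavy, "u4": light}
groups = group_users(labels)
main = dominant_groups(groups, coverage=0.75)
```

Only patterns that actually occur appear in `groups`, which mirrors the observation that far fewer than 144 groups are populated in practice.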
Embodiment 2
This embodiment provides a linear regression electricity consumption prediction system based on electricity-consumption characteristic clustering, comprising:
Clustering module: for clustering electricity customer data separately in multiple dimensions using the AP automatic clustering algorithm, to obtain multiple clustering results;
User classification module: for combining the clustering results to obtain different user electricity consumption patterns, and classifying electricity customers according to their consumption patterns to obtain different user groups;
Strongly associated factor determination module: for determining, for the different user groups and using mutual information theory, the strongly associated factors that influence the consumption behavior of each user group;
Consumption prediction model construction module: for establishing, according to each group's user consumption data and the strongly associated factors influencing that group's consumption, a corresponding multiple linear regression model for each user group, and establishing an electricity consumption prediction model from the established multiple linear regression models;
Prediction module: for collecting the strongly associated factor data of each user group, inputting them into the corresponding multiple linear regression model, and predicting the electricity consumption data of each user group.
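As an illustration of what the strongly associated factor determination module computes, the following minimal sketch ranks candidate factors by their average mutual information with the consumption series of the users in one group. The equal-width discretization, the bin count, the function names, and the toy data are all assumptions; the text does not specify how the mutual information is estimated.

```python
import math
from collections import Counter


def discretize(values, bins):
    """Equal-width binning of a numeric series into integer bin indices."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equally long discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def rank_factors(consumption_by_user, factors, bins=4, top_k=15):
    """Average each factor's mutual information over users, keep positives, sort descending."""
    scores = {}
    for name, series in factors.items():
        f = discretize(series, bins)
        values = [mutual_information(discretize(c, bins), f)
                  for c in consumption_by_user.values()]
        scores[name] = sum(values) / len(values)
    ranked = sorted((n for n, s in scores.items() if s > 0), key=lambda n: -scores[n])
    return ranked[:top_k], scores


# Hypothetical monthly data for one user group (illustrative values only).
consumption_by_user = {
    "u1": [10, 20, 30, 40, 50, 60, 70, 80],
    "u2": [12, 22, 31, 41, 52, 63, 71, 82],
}
factors = {
    "temperature": [1, 2, 3, 4, 5, 6, 7, 8],
    "constant_factor": [5, 5, 5, 5, 5, 5, 5, 5],
}
strong, scores = rank_factors(consumption_by_user, factors)
```

A factor that never varies carries no information about consumption, so its average mutual information is zero and it is filtered out before ranking.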
Embodiment 3
This embodiment provides an electronic device comprising a memory, a processor, and computer instructions stored in the memory and run on the processor; when the computer instructions are run by the processor, the steps of the method of Embodiment 1 are carried out.
Embodiment 4
This embodiment provides a computer-readable storage medium for storing computer instructions; when the computer instructions are executed by a processor, the steps of the method of Embodiment 1 are carried out.
The foregoing are merely preferred embodiments of the present disclosure and are not intended to limit the disclosure; for those skilled in the art, the disclosure may have various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principles of the disclosure shall fall within its scope of protection.
Although specific embodiments of the disclosure have been described above with reference to the accompanying drawings, they do not limit the scope of protection of the disclosure. Those skilled in the art should understand that, on the basis of the technical solution of the disclosure, the various modifications or variations that can be made without creative effort still fall within the scope of protection of the disclosure.

Claims (10)

1. A linear regression electricity consumption prediction method based on electricity-consumption characteristic clustering, characterized by comprising the following steps:
clustering electricity customer data separately in multiple dimensions to obtain multiple clustering results;
combining the clustering results to obtain different user electricity consumption patterns, and classifying electricity customers according to their consumption patterns to obtain different user groups;
for the different user groups, determining, using mutual information theory, the strongly associated factors that influence the consumption behavior of each user group;
according to each group's user consumption data and the strongly associated factors influencing that group's consumption, establishing a corresponding multiple linear regression model for each user group, and establishing an electricity consumption prediction model from the established multiple linear regression models;
collecting the strongly associated factor data of each user group, inputting them into the corresponding linear regression model, and predicting the electricity consumption data of each user group.
2. The linear regression electricity consumption prediction method based on electricity-consumption characteristic clustering according to claim 1, characterized in that: said clustering of electricity customer data in multiple dimensions using the AP automatic clustering algorithm to obtain multiple clustering results specifically comprises performing AP automatic clustering on four dimensions, namely the annual consumption series, the monthly consumption series, the daily load data, and the load characteristic data, to obtain four clustering results.
3. The linear regression electricity consumption prediction method based on electricity-consumption characteristic clustering according to claim 1, characterized in that said clustering of electricity customer data in multiple dimensions using the AP automatic clustering algorithm to obtain multiple clustering results specifically comprises the following steps:
collecting electricity consumption data and constructing a user electricity-consumption characteristic data set V_D;
dividing the data set V_D into multiple different subspaces, and clustering the data of each subspace separately using the AP automatic clustering algorithm, to obtain a subspace clustering result for the data of each subspace.
4. The linear regression electricity consumption prediction method based on electricity-consumption characteristic clustering according to claim 1, characterized in that said combination is a permutation and combination of the clustering results: one cluster is taken from the clusters obtained in each subspace or dimension, and these are combined to determine one user electricity consumption pattern; the total number of user electricity consumption patterns is the product of the numbers of clusters in the individual clustering results.
5. The linear regression electricity consumption prediction method based on electricity-consumption characteristic clustering according to claim 1, characterized in that said step of determining, for the different user groups and using mutual information theory, the strongly associated factors that influence the consumption behavior of each user group specifically comprises:
calculating the mutual information between users' electricity consumption data and the potential associated factors;
building a mutual information matrix from the computed mutual information between each user's electricity consumption and the potential associated factors within the user group;
for each associated factor, calculating the average mutual information between that factor and the users' electricity consumption data, to obtain the average mutual information between each user group and each associated factor;
selecting the associated factors whose average mutual information is greater than zero, sorting the selected factors by mutual information value to obtain a sorted list of associated factors, and choosing the factors with the larger mutual information values as the strongly associated factors.
6. The linear regression electricity consumption prediction method based on electricity-consumption characteristic clustering according to claim 1, characterized in that said step of establishing a corresponding multiple linear regression model for each user group, according to the group's user consumption data and the strongly associated factors influencing the group's consumption, specifically comprises:
establishing a data sample set for each user group from the data corresponding to the strongly associated factors and the electricity consumption data of that user group;
randomly drawing w training samples from the data sample set of each user group and, taking the data corresponding to the strongly associated factors in the drawn training samples as input and the electricity consumption data of the corresponding user group as output, training and constructing w multiple linear regression models.
7. The linear regression electricity consumption prediction method based on electricity-consumption characteristic clustering according to claim 1, characterized in that said establishing of an electricity consumption prediction model from the established multiple linear regression models specifically comprises linearly combining all the multiple linear regression prediction models to obtain the electricity consumption prediction model for all users.
8. A linear regression electricity consumption prediction system based on electricity-consumption characteristic clustering, characterized by comprising:
a clustering module: for clustering electricity customer data separately in multiple dimensions using the AP automatic clustering algorithm, to obtain multiple clustering results;
a user classification module: for combining the clustering results to obtain different user electricity consumption patterns, and classifying electricity customers according to their consumption patterns to obtain different user groups;
a strongly associated factor determination module: for determining, for the different user groups and using mutual information theory, the strongly associated factors that influence the consumption behavior of each user group;
a consumption prediction model construction module: for establishing, according to each group's user consumption data and the strongly associated factors influencing that group's consumption, a corresponding multiple linear regression model for each user group, and establishing an electricity consumption prediction model from the established multiple linear regression models;
a prediction module: for collecting the strongly associated factor data of each user group, inputting them into the corresponding multiple linear regression model, and predicting the electricity consumption data of each user group.
9. An electronic device, characterized by comprising a memory, a processor, and computer instructions stored in the memory and run on the processor; when the computer instructions are run by the processor, the steps of the method of any one of claims 1-7 are carried out.
10. A computer-readable storage medium, characterized in that it is used for storing computer instructions; when the computer instructions are executed by a processor, the steps of the method of any one of claims 1-7 are carried out.
CN201910507085.5A 2019-06-12 2019-06-12 A kind of linear regression power predicating method and system based on electricity consumption Specialty aggregation Pending CN110348604A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910507085.5A CN110348604A (en) 2019-06-12 2019-06-12 A kind of linear regression power predicating method and system based on electricity consumption Specialty aggregation

Publications (1)

Publication Number Publication Date
CN110348604A true CN110348604A (en) 2019-10-18

Family

ID=68181858

Country Status (1)

Country Link
CN (1) CN110348604A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170228661A1 (en) * 2014-04-17 2017-08-10 Sas Institute Inc. Systems and methods for machine learning using classifying, clustering, and grouping time series data
CN104933630A (en) * 2015-05-21 2015-09-23 国家电网公司 Load characteristic analysis method and system
CN105512768A (en) * 2015-12-14 2016-04-20 上海交通大学 User electricity consumption relevant factor identification and electricity consumption quantity prediction method under environment of big data
CN108171369A (en) * 2017-12-21 2018-06-15 国家电网公司 Short term combination forecasting method based on customer electricity differentiation characteristic

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
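Yang Juan et al.: "Composite Methods and Applications of Panel Data Clustering" (《面板数据聚类的复合方法与应用》), 31 August 2016, University of International Business and Economics Press (对外经济贸易大学出版社) *
Ji Mingjun et al.: "Forecasting and Decision-Making Methods" (《预测与决策方法》), 31 August 2018, Dalian Maritime University Press (大连海事大学出版社) *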

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112380489A (en) * 2020-11-03 2021-02-19 武汉光庭信息技术股份有限公司 Data processing time calculation method, data processing platform evaluation method and system
CN112380489B (en) * 2020-11-03 2024-04-16 武汉光庭信息技术股份有限公司 Data processing time calculation method, data processing platform evaluation method and system
CN113052385A (en) * 2021-03-29 2021-06-29 国网河北省电力有限公司经济技术研究院 Method, device, equipment and storage medium for predicting power consumption in steel industry

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191018