CN105447767A - Power consumer subdivision method based on combined matrix decomposition model - Google Patents
- Publication number: CN105447767A (CN201510801889.8A)
- Authority: CN (China)
- Prior art keywords: user, matrix, users, sigma, stage
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/06—Electricity, gas or water supply
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
Abstract
The invention discloses a power consumer subdivision method based on a combined matrix decomposition model. The method comprises the following steps: inputting the users' electricity usage record data and building the users' electricity usage record matrices; inputting the users' geographical location information, representing it hierarchically, building a geographical location similarity matrix for the users, and adjusting the weights of the components at the different levels of the geographical location information; building the objective function of the combined matrix decomposition model from the electricity usage record matrices and selecting a suitable algorithm to solve it, so as to obtain the users' electricity demand; and segmenting the users according to their electricity demand. Compared with conventional methods, the representation of users in the demand space carries richer meaning. The method uses a clustering algorithm that combines the users' characteristics in the latent demand space with the relationships between users, so that users within each cluster are more closely related, while users in different clusters differ more in electricity demand and geographical location.
Description
Technical field
The invention belongs to the technical fields of computer applications, data mining and electric power data analysis, and in particular relates to a power consumer subdivision method based on a combined matrix decomposition model.
Background
With the rising level of informatization in power grids, power systems generate large amounts of data, which brings new challenges to electric power data analysis. Traditional electric power data analysis focuses on data from the generation and supply links; user data are usually analyzed for all users as a whole, which ignores the characteristics of individual users and the relationships between users. User data usually include information such as electricity usage behavior, geographical location, time and date. The prevailing practice in user data analysis is to apply a clustering algorithm that partitions users by their electricity usage records alone, without a comprehensive analysis of the other information, which often contains key factors affecting users' electricity usage behavior.
For example, Wang Lei analyzed the behavioral characteristics of power industry customers and used the k-means algorithm to cluster users by their electricity usage behavior. Wu Ling et al. applied lifetime value theory and, through extensive surveys and expert consultation, built a value accounting system for power consumers, using the analytic hierarchy process to assess user value. Song Caihua et al. built a customer segmentation model using a comprehensive customer value assessment method, constructed a segmentation index system based on comprehensive customer value, and segmented users by combining the entropy weight method with expert experience. Other researchers built a segmentation index system along three dimensions, namely user reliability requirements, user value and user behavior, and clustered users with the k-means algorithm. Still others built a behavior and value assessment index system for large power customers, obtained their demand characteristics and economic value evaluations, and designed a comprehensive credit evaluation system for large customers, achieving a more comprehensive and accurate fine-grained classification of large power customers.
Matrix decomposition models are multivariate analysis models that have become popular in recent years. Because they offer good interpretability on data such as text and images, they are widely used in data mining. A matrix decomposition model factorizes the data matrix into the product of a latent feature matrix and a coefficient matrix, yielding a representation of the original input data in a low-dimensional latent feature space. Lee and Seung proposed a non-negative matrix factorization model that imposes non-negativity constraints on the input data matrix and on the output latent feature and coefficient matrices, obtains representations of text and images, and applies them to text clustering and image restoration. On this basis, Cai, He et al. used a relation graph to constrain the representation of the data in the latent space, improving the performance of non-negative matrix factorization in text clustering. Other researchers proposed a relation-constrained matrix decomposition model that fuses the relationships between data items with the data content to extract latent features, and obtained good results on text classification data. Takeuchi et al. jointly factorized multiple non-negative matrices, fusing user records, user social relations and song tags to obtain a unified representation of users and songs for song recommendation.
Summary of the invention
To solve the above problems, the object of the present invention is to provide a power consumer subdivision method based on a combined matrix decomposition model.
To achieve this object, the power consumer subdivision method based on a combined matrix decomposition model provided by the invention comprises the following steps, performed in order:
Step 1) Input the users' electricity usage record data and build each user's electricity usage record matrix from it; input the users' geographical location information, represent it hierarchically, build the users' geographical location similarity matrix, and adjust the weights of the components at the different levels of the geographical location information;
Step 2) Build the objective function of the combined matrix decomposition model from the electricity usage record matrices obtained in step 1), analyze the time factors and date factors that affect users' electricity usage behavior, and select a suitable algorithm to solve the objective function, so as to obtain the users' electricity demand;
Step 3) Segment the users according to the electricity demand obtained above:
During segmentation, two basic indicators must be calculated: 1) a measure of the similarity between the electricity demand matrices of different users; 2) the overall electricity demand matrix of all users in a user group.
In step 1), the method of inputting the users' electricity usage record data and building the electricity usage record matrices from it is as follows:
All users in the input electricity usage record data are represented as the set
$$U = \{u_1, u_2, \dots, u_N\}$$
where $N$ is the number of users in the data and $u_i$ denotes the $i$-th user.
The electricity usage record data of the $i$-th user are arranged into the electricity usage record matrix
$$U_i \in \mathbb{R}_{+}^{T \times D}$$
where $D$ is the number of days covered by the records, $T$ is the number of uniformly spaced sampling points in each user's daily record, and $\mathbb{R}_{+}^{T \times D}$ denotes the set of non-negative real matrices with $T$ rows and $D$ columns. In addition, $U_i^{t\cdot}$ and $U_i^{\cdot d}$ denote the $t$-th row and the $d$-th column of $U_i$, that is, all of user $u_i$'s records at time point $t$ of every day and the user's records on day $d$, and $U_i^{td}$ denotes the element in row $t$ and column $d$ of $U_i$;
Finally, the electricity usage record matrices of all users are output:
$$\{U_1, U_2, \dots, U_N\}$$
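The construction of one user's record matrix can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `build_record_matrix` and the assumption that readings arrive as a flat, day-ordered list are hypothetical.

```python
def build_record_matrix(readings, T, D):
    """Arrange a flat list of D * T day-ordered readings into a T x D matrix.

    Assumption: readings[d * T + t] is the consumption at sampling point t of day d.
    """
    if len(readings) != T * D:
        raise ValueError("expected exactly T * D readings")
    if any(r < 0 for r in readings):
        raise ValueError("readings must be non-negative")
    # Row t holds the same time point across all days; column d holds one day.
    return [[readings[d * T + t] for d in range(D)] for t in range(T)]


# Two days (D = 2) with three sampling points per day (T = 3).
U_1 = build_record_matrix([0.2, 0.9, 0.4, 0.3, 1.1, 0.5], T=3, D=2)
row_t1 = U_1[1]                    # records at time point 1 on every day
col_d0 = [row[0] for row in U_1]   # the whole first day
```

Rows index the time of day and columns index the day, matching the roles later played by the time factor and date factor matrices.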
In step 1), the method of inputting the users' geographical location information, representing it hierarchically, building the users' geographical location similarity matrix, and adjusting the weights of the components at the different levels is as follows:
The geographical location information of the $i$-th user is represented as an ordered structure of address components, where each component is the string representation of one part of the residential address, and the components are arranged by administrative unit from largest to smallest, i.e. province, city, district, township, street, community, and so on;
The similarity $e_{ij}$ between the geographical location information of the $i$-th user and the $j$-th user is computed by comparing the two addresses component by component. Here $\delta(\cdot)$ is an indicator function that takes the value 1 when the two strings compared are identical and 0 otherwise, and $\lambda_k \in (0,1)$ is a balance parameter that adjusts the weight of the $k$-th component of the geographical location information; the parameters are selected according to the system's results on a validation data set;
Finally, the geographical location similarity matrix of all users is output, i.e. the $N \times N$ matrix whose $(i,j)$ entry is $e_{ij}$.
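The exact similarity formula is not legible in this text, so the sketch below assumes one plausible form: a weighted sum of per-level matches, $e_{ij} = \sum_k \lambda_k\,\delta(\cdot,\cdot)$. The function names, the place names and the weight values are all hypothetical.

```python
def geo_similarity(loc_i, loc_j, weights):
    """Assumed form of e_ij: sum of the weights of the address levels that match."""
    if not (len(loc_i) == len(loc_j) == len(weights)):
        raise ValueError("addresses and weights must have the same number of levels")
    return sum(w for a, b, w in zip(loc_i, loc_j, weights) if a == b)


def similarity_matrix(locations, weights):
    """N x N matrix whose (i, j) entry is the similarity of users i and j."""
    return [[geo_similarity(a, b, weights) for b in locations] for a in locations]


# Addresses ordered from the largest administrative unit to the smallest.
locations = [
    ("Province P", "City A", "District X"),
    ("Province P", "City A", "District Y"),
    ("Province P", "City B", "District Z"),
]
weights = [0.5, 0.25, 0.25]   # hypothetical lambda_k values in (0, 1)
E = similarity_matrix(locations, weights)
```

Under this assumed form, users who share more address levels receive a larger $e_{ij}$, and changing `weights` shifts the emphasis between coarse and fine levels.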
In step 2), the objective function for the joint decomposition of the electricity usage record matrices is constructed as follows:
(1) Minimize the loss incurred when each user's electricity usage record matrix is decomposed into a time factor matrix and a date factor matrix:
$$\min\ l_1 = \|U_i - V_i S_i^{T}\|^2$$
where $V_i$ is the time factor matrix affecting the $i$-th user's electricity usage behavior and $S_i$ is the date factor matrix affecting the $i$-th user's electricity usage behavior;
(2) Keep the date factor matrices obtained from different users' electricity usage record matrices consistent:
$$\min\ l_2 = \sum_{i=1}^{N} \|U_i - V_i S^{T}\|^2$$
Here the decomposition objectives of all $N$ users are fused together and share a single date factor matrix $S$;
(3) Minimize the difference between the time factor matrices of geographically adjacent users:
$$\min\ l_3 = \sum_{i=1}^{N}\sum_{j=1}^{N} e_{ij}\,\|V_i - V_j\|^2$$
(4) Keep the time factor matrices as smooth as possible:
$$\min\ l_4 = \sum_{i=1}^{N} \|V_i\|^2$$
(5) Keep the date factor matrix as smooth as possible:
$$\min\ l_5 = \|S\|^2$$
Here the squared $L_2$ norm $\|\cdot\|^2$ of a matrix is used to keep the matrix smooth.
Finally, the objective function of the combined matrix decomposition model is obtained by fusing the objectives of items (2), (3), (4) and (5):
$$\min\ l = \sum_{i=1}^{N} \|U_i - V_i S^{T}\|^2 + \alpha \sum_{i=1}^{N} \|V_i\|^2 + \beta\,\|S\|^2 + \gamma \sum_{i=1}^{N}\sum_{j=1}^{N} e_{ij}\,\|V_i - V_j\|^2$$
where $\alpha$, $\beta$ and $\gamma$ are balance parameters that adjust the weights of the individual objectives; they are selected according to the system's results on a validation data set.
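To make the fused objective concrete, the sketch below evaluates it on a tiny example. It assumes the form $l = \sum_i \|U_i - V_i S^T\|^2 + \alpha \sum_i \|V_i\|^2 + \beta \|S\|^2 + \gamma \sum_{i,j} e_{ij} \|V_i - V_j\|^2$; the pairing of $\alpha$, $\beta$ and $\gamma$ with particular terms is an assumption, and all function names are hypothetical.

```python
def frob_sq(A):
    """Squared Frobenius norm of a list-of-lists matrix."""
    return sum(x * x for row in A for x in row)


def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def times_transpose(V, S):
    """Return V S^T for V of shape T x K and S of shape D x K."""
    return [[sum(v * s for v, s in zip(v_row, s_row)) for s_row in S] for v_row in V]


def objective(U, V, S, E, alpha, beta, gamma):
    """Assumed fused objective of the combined decomposition model."""
    n = len(U)
    fit = sum(frob_sq(sub(U[i], times_transpose(V[i], S))) for i in range(n))
    smooth_v = sum(frob_sq(v) for v in V)
    smooth_s = frob_sq(S)
    geo = sum(E[i][j] * frob_sq(sub(V[i], V[j])) for i in range(n) for j in range(n))
    return fit + alpha * smooth_v + beta * smooth_s + gamma * geo


# Two users, T = 2 time points, D = 2 days, K = 1 latent variable.
U = [[[1.0, 1.0], [0.0, 0.0]], [[0.0, 0.0], [1.0, 2.0]]]
V = [[[1.0], [0.0]], [[0.0], [1.0]]]
S = [[1.0], [1.0]]
E = [[0.0, 1.0], [1.0, 0.0]]
value = objective(U, V, S, E, alpha=0.5, beta=0.25, gamma=0.125)
```

In this example the reconstruction error is 1, the two smoothness terms contribute 1 and 0.5, and the geographical term contributes 0.5, so `value` is 3.0.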
In step 2), the method of selecting a suitable algorithm to solve the objective function comprises the following steps:
Step 2.1) Stage S2.1, initializing the matrices $V_i \in \mathbb{R}^{T \times K}$ and $S \in \mathbb{R}^{D \times K}$:
$K$ is the number of user demand variables in the latent space; each element of the matrices $V_1, V_2, \dots, V_N$ and $S$ is randomly initialized to a real number between 0 and 1;
Step 2.2) Stage S2.2, differentiating with respect to each element of $V_i$:
The final objective function is differentiated with respect to each element of $V_i$, giving the partial derivative $\partial l / \partial V_i^{tk}$ for every row $t$ and column $k$;
Step 2.3) Stage S2.3, updating each element of $V_i$:
The step size multiplied by the gradient is subtracted from each $V_i^{tk}$; the update formula is
$$V_i^{tk} \leftarrow V_i^{tk} - \eta\,\frac{\partial l}{\partial V_i^{tk}}$$
where $\eta$ is a manually set step size;
Step 2.4) Stage S2.4, judging whether all matrices $V_i$ have been updated:
If all matrices $V_i$ have been updated, proceed to step 2.5); otherwise return to step 2.2) and update the next $V_i$;
Step 2.5) Stage S2.5, differentiating with respect to each element of $S$:
The final objective function is differentiated with respect to each element of $S$, giving the partial derivative $\partial l / \partial S^{dk}$ for every row $d$ and column $k$;
Step 2.6) Stage S2.6, updating each element of $S$:
The step size multiplied by the gradient is subtracted from each $S^{dk}$; the update formula is
$$S^{dk} \leftarrow S^{dk} - \tau\,\frac{\partial l}{\partial S^{dk}}$$
where $\tau$ is a manually set step size;
Step 2.7) Stage S2.7, judging whether the algorithm has converged:
If the algorithm has converged, proceed to step 2.8); otherwise return to step 2.2);
Step 2.8) Stage S2.8, outputting the result:
The users' electricity demand is output, and the procedure ends.
In step 3), the two basic indicators are calculated as follows:
(1) The similarity between the electricity demand matrices of different users is measured by
$$\mathrm{Sim}(V_i, V_j) = \frac{\mathrm{tr}(V_i V_j^{T})}{\|V_i\|\,\|V_j\|}$$
where $\mathrm{tr}(\cdot)$ denotes the sum of the diagonal elements of a matrix and $\|\cdot\|$ denotes the $L_2$ norm of a matrix.
(2) The overall electricity demand matrix of all users in a user group $c$ is calculated as
$$V_c = \frac{1}{|c|} \sum_{u_i \in c} V_i$$
where $|c|$ is the number of users in group $c$.
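The two indicators can be sketched as below. The sketch assumes that the matrix norm is the Frobenius (element-wise $L_2$) norm and that the overall demand matrix of a group is the element-wise mean of its members' matrices; the function names are hypothetical.

```python
import math


def trace_inner(A, B):
    """tr(A B^T), which equals the sum of element-wise products of A and B."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def sim(A, B):
    """Cosine-style similarity of two demand matrices (assumes non-zero matrices)."""
    return trace_inner(A, B) / (math.sqrt(trace_inner(A, A)) * math.sqrt(trace_inner(B, B)))


def cluster_mean(mats):
    """Element-wise mean of the demand matrices of one user group."""
    n = len(mats)
    return [[sum(vals) / n for vals in zip(*rows)] for rows in zip(*mats)]


V1 = [[1.0, 0.0], [0.0, 1.0]]
V2 = [[2.0, 0.0], [0.0, 2.0]]   # same pattern as V1 at twice the scale
V3 = [[0.0, 1.0], [1.0, 0.0]]   # no overlap with V1
s12 = sim(V1, V2)
s13 = sim(V1, V3)
center = cluster_mean([V1, V3])
```

Because the measure is normalized, `V1` and `V2` are treated as almost identical despite their different magnitudes, so users are grouped by the shape of their demand rather than its scale.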
In step 3), the user segmentation method is as follows:
Step 3.1) Stage S3.1, randomly dividing the users into $k$ clusters $c_1, c_2, \dots, c_k$:
where $k$ is the manually set number of user clusters;
Step 3.2) Stage S3.2, calculating the overall representation of each cluster:
The overall representation of each cluster is calculated with the formula of item (2) in step 3) above;
Step 3.3) Stage S3.3, calculating the similarity between the $i$-th user and all clusters:
Using the formula of item (1) in step 3) above, the overall representation of each cluster is treated as a special "user", and the similarity between the user and each cluster is calculated;
Step 3.4) Stage S3.4, adjusting the cluster label of the $i$-th user:
According to the result of step 3.3), the cluster label of the $i$-th user is changed to the cluster most similar to that user;
Step 3.5) Stage S3.5, judging whether all users have been adjusted:
Judge whether the cluster labels of all users have been adjusted; if so, proceed to step 3.6); if not, return to step 3.2);
Step 3.6) Stage S3.6, judging whether the users' cluster labels have converged:
Judge whether any user's cluster label changed during the last round of adjustment; if so, return to step 3.2); if not, proceed to step 3.7);
Step 3.7) Stage S3.7, outputting the result:
The cluster labels of all users are output, and the procedure ends.
Effects of the power consumer subdivision method based on a combined matrix decomposition model provided by the invention:
In analyzing and representing user demand, the invention takes into account the user's electricity usage behavior, time, date and geographical location, so that, compared with conventional methods, the representation of users in the demand space carries richer meaning.
The invention uses a clustering algorithm that clusters users in the latent demand space by considering both the characteristics of each user and the relationships between users, so that users within each cluster are more closely related, while users in different clusters differ considerably in electricity demand, geographical location and so on.
Brief description of the drawings
Fig. 1 is a schematic diagram of the overall system structure of the power consumer subdivision method based on a combined matrix decomposition model provided by the invention.
Fig. 2 is a flow chart of the power consumer subdivision method based on a combined matrix decomposition model provided by the invention.
Fig. 3 is a flow chart of the method for solving the combined matrix decomposition model in the power consumer subdivision method provided by the invention.
Fig. 4 is a flow chart of the user segmentation method in the power consumer subdivision method provided by the invention.
Detailed description of the embodiments
The power consumer subdivision method based on a combined matrix decomposition model provided by the invention is described in detail below with reference to the drawings and specific embodiments.
As shown in Figure 1, the invention mainly uses data mining theory and methods to analyze the users in electric power data. To ensure that the system runs normally, a concrete implementation requires a computer platform with at least 8 GB of memory, a CPU with at least 4 cores and a clock frequency of at least 2.6 GHz, and a 64-bit operating system of Windows 7 or later, with the necessary software environment installed, including an Oracle database, Java 1.7 or later and Matlab 2011b or later.
As shown in Figure 2, the power consumer subdivision method based on a combined matrix decomposition model provided by the invention comprises the following steps, performed in order:
Step 1) Input the users' electricity usage record data and build each user's electricity usage record matrix from it; input the users' geographical location information, represent it hierarchically, build the users' geographical location similarity matrix, and adjust the weights of the components at the different levels of the geographical location information;
The electricity usage record matrix captures two key factors that affect users' electricity usage behavior: 1) the rows of the matrix represent the different sampling time points within each day; 2) the columns of the matrix represent the days covered by the user's record data.
Step 2) Build the objective function of the combined matrix decomposition model from the electricity usage record matrices obtained in step 1), analyze the time factors and date factors that affect users' electricity usage behavior, and select a suitable algorithm to solve the objective function, so as to obtain the users' electricity demand;
The objective function covers five aspects: 1) minimize the loss incurred when each user's electricity usage record matrix is decomposed into a time factor matrix and a date factor matrix; 2) keep the date factor matrices obtained from different users' record matrices consistent; 3) minimize the difference between the time factor matrices of geographically adjacent users; 4) keep the time factor matrices and the date factor matrix as smooth as possible; 5) keep the objective function convex, so that the model has a globally optimal solution. The solving algorithm should satisfy three basic conditions: 1) it converges within an acceptable time; 2) its storage requirement stays within a reasonable range; 3) it can be executed in parallel to improve computational efficiency.
Step 3) Segment the users according to the electricity demand obtained above:
Segmenting users by electricity demand should satisfy two basic conditions: 1) users in the same group should have similar electricity demand matrices; 2) the electricity demand matrices of users in different groups should differ as much as possible. Therefore, to improve the quality of the segmentation result, two basic indicators must be calculated during segmentation: 1) a measure of the similarity between the electricity demand matrices of different users; 2) the overall electricity demand matrix of all users in a user group. In addition, the algorithm should be able to optimize itself by reassigning users who have been placed in the wrong group.
In step 1), the method of inputting the users' electricity usage record data and building the electricity usage record matrices from it is as follows:
All users in the input electricity usage record data are represented as the set
$$U = \{u_1, u_2, \dots, u_N\}$$
where $N$ is the number of users in the data and $u_i$ denotes the $i$-th user.
The electricity usage record data of the $i$-th user are arranged into the electricity usage record matrix
$$U_i \in \mathbb{R}_{+}^{T \times D}$$
where $D$ is the number of days covered by the records, $T$ is the number of uniformly spaced sampling points in each user's daily record, and $\mathbb{R}_{+}^{T \times D}$ denotes the set of non-negative real matrices with $T$ rows and $D$ columns. In addition, $U_i^{t\cdot}$ and $U_i^{\cdot d}$ denote the $t$-th row and the $d$-th column of $U_i$, that is, all of user $u_i$'s records at time point $t$ of every day and the user's records on day $d$, and $U_i^{td}$ denotes the element in row $t$ and column $d$ of $U_i$;
Finally, the electricity usage record matrices of all users are output:
$$\{U_1, U_2, \dots, U_N\}$$
In step 1), the method of inputting the users' geographical location information, representing it hierarchically, building the users' geographical location similarity matrix, and adjusting the weights of the components at the different levels is as follows:
The geographical location information of the $i$-th user is represented as an ordered structure of address components, where each component is the string representation of one part of the residential address, and the components are arranged by administrative unit from largest to smallest, i.e. province, city, district, township, street, community, and so on;
The similarity $e_{ij}$ between the geographical location information of the $i$-th user and the $j$-th user is computed by comparing the two addresses component by component. Here $\delta(\cdot)$ is an indicator function that takes the value 1 when the two strings compared are identical and 0 otherwise, and $\lambda_k \in (0,1)$ is a balance parameter that adjusts the weight of the $k$-th component of the geographical location information; the parameters are selected according to the system's results on a validation data set;
Finally, the geographical location similarity matrix of all users is output, i.e. the $N \times N$ matrix whose $(i,j)$ entry is $e_{ij}$.
In step 2), the objective function for the joint decomposition of the electricity usage record matrices is constructed as follows:
(1) Minimize the loss incurred when each user's electricity usage record matrix is decomposed into a time factor matrix and a date factor matrix:
$$\min\ l_1 = \|U_i - V_i S_i^{T}\|^2$$
where $V_i$ is the time factor matrix affecting the $i$-th user's electricity usage behavior and $S_i$ is the date factor matrix affecting the $i$-th user's electricity usage behavior;
(2) Keep the date factor matrices obtained from different users' electricity usage record matrices consistent:
$$\min\ l_2 = \sum_{i=1}^{N} \|U_i - V_i S^{T}\|^2$$
Here the decomposition objectives of all $N$ users are fused together and share a single date factor matrix $S$;
(3) Minimize the difference between the time factor matrices of geographically adjacent users:
$$\min\ l_3 = \sum_{i=1}^{N}\sum_{j=1}^{N} e_{ij}\,\|V_i - V_j\|^2$$
(4) Keep the time factor matrices as smooth as possible:
$$\min\ l_4 = \sum_{i=1}^{N} \|V_i\|^2$$
(5) Keep the date factor matrix as smooth as possible:
$$\min\ l_5 = \|S\|^2$$
Here the squared $L_2$ norm $\|\cdot\|^2$ of a matrix is used to keep the matrix smooth.
Finally, the objective function of the combined matrix decomposition model is obtained by fusing the objectives of items (2), (3), (4) and (5):
$$\min\ l = \sum_{i=1}^{N} \|U_i - V_i S^{T}\|^2 + \alpha \sum_{i=1}^{N} \|V_i\|^2 + \beta\,\|S\|^2 + \gamma \sum_{i=1}^{N}\sum_{j=1}^{N} e_{ij}\,\|V_i - V_j\|^2$$
where $\alpha$, $\beta$ and $\gamma$ are balance parameters that adjust the weights of the individual objectives; they are selected according to the system's results on a validation data set.
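In step 2), the squared $L_2$ norm $\|\cdot\|^2$ of a matrix is used to measure both the loss in each step and the smoothness of the matrices. This guarantees the convexity of the final objective function $l$, so that the model has a globally optimal solution. A brief derivation of the convexity follows.
Proof 1: the objective function $l$ is a convex function of $V_i$.
First, the objective function is rewritten as
$$l = \|U_i - V_i S^{T}\|^2 + \alpha\,\|V_i\|^2 + \gamma \sum_{j} (e_{ij} + e_{ji})\,\|V_i - V_j\|^2 + C_1$$
where $C_1$ is a constant independent of $V_i$. Let $l = f_1 + \alpha f_2 + \gamma f_3 + C_1$, where
$$f_1 = \|U_i - V_i S^{T}\|^2,\qquad f_2 = \|V_i\|^2,\qquad f_3 = \sum_{j} (e_{ij} + e_{ji})\,\|V_i - V_j\|^2$$
It is shown below that $f_1$, $f_2$ and $f_3$ are each convex functions of $V_i$.
(1) Expanding $f_1$ gives
$$f_1 = \mathrm{tr}(V_i S^{T} S V_i^{T}) - 2\,\mathrm{tr}(U_i S V_i^{T}) + C_2$$
where $C_2$ is a constant independent of $V_i$. Differentiating with respect to each row $V_i^{t\cdot}$ of $V_i$ gives
$$\frac{\partial f_1}{\partial V_i^{t\cdot}} = 2\,V_i^{t\cdot} S^{T} S - 2\,U_i^{t\cdot} S \qquad\text{and}\qquad \frac{\partial^2 f_1}{\partial (V_i^{t\cdot})^2} = 2\,S^{T} S$$
Let $v_i$ denote the long row vector obtained by concatenating the rows of $V_i$. The Hessian matrix of $f_1$ with respect to $v_i$ is then the block-diagonal matrix $\mathrm{diag}(G_1, \dots, G_T)$ with $G_t = 2\,S^{T} S$. Since
$$z\,G_t\,z^{T} = 2\,\|z S^{T}\|^2 \ge 0$$
holds for any non-zero $K$-dimensional row vector $z$, each $G_t$ is positive semidefinite and $\det(G_t) \ge 0$, where $\det(\cdot)$ denotes the determinant of a matrix. Therefore $f_1$ is a convex function of $V_i$.
(2) Using $v_i$ from item (1), the Hessian matrix of $f_2$ with respect to $v_i$ is $2I$, which is positive definite, so $f_2$ is a convex function of $V_i$.
(3) Write each term as $\|V_i - V_j\|^2 = \|V_i\|^2 - 2\,\mathrm{tr}(V_i V_j^{T}) + C_2$, where $C_2$ is a constant independent of $V_i$. From item (2) of Proof 1, $\|V_i\|^2$ is a convex function of $V_i$, and adding a term that is linear in $V_i$ preserves convexity, so $\|V_i - V_j\|^2$ is a convex function of $V_i$. Since $f_3$ is a weighted sum of such terms and $e_{ij} \ge 0$, $f_3$ is a convex function of $V_i$.
In summary, since $\alpha, \gamma \ge 0$, the objective function $l$ is a convex function of $V_i$, which completes the proof.
Proof 2: the objective function $l$ is a convex function of $S$.
Likewise, the objective function is first rewritten as
$$l = \sum_{i=1}^{N} \|U_i - V_i S^{T}\|^2 + \beta\,\|S\|^2 + C_1$$
where $C_1$ is a constant independent of $S$. Let $l = \sum_{i=1}^{N} f_i + \beta\,\|S\|^2 + C_1$, where
$$f_i = \|U_i - V_i S^{T}\|^2 = \mathrm{tr}(S V_i^{T} V_i S^{T}) - 2\,\mathrm{tr}(U_i^{T} V_i S^{T}) + C_2$$
and $C_2$ is a constant independent of $S$. As in Proof 1, let $s$ denote the long row vector obtained by concatenating the rows of $S$. The Hessian matrix of $f_i$ with respect to $s$ is the block-diagonal matrix $\mathrm{diag}(G_1, \dots, G_D)$, where
$$G_d = 2\,V_i^{T} V_i$$
is positive semidefinite. Hence $f_i$ is a convex function of $S$. Moreover, $\|S\|^2$ is a convex function of $S$ and $\beta \ge 0$, so the objective function $l$ is a convex function of $S$, which completes the proof.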
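The convexity claim can also be checked numerically. The sketch below samples random pairs of date factor matrices and tests the midpoint inequality $f\big(\tfrac{A+B}{2}\big) \le \tfrac{1}{2}\big(f(A)+f(B)\big)$ for the part of the objective that depends on $S$, assuming that part is $\|U_i - V_i S^T\|^2 + \beta \|S\|^2$ for a single user. The function names, sizes and the value of `beta` are hypothetical, and a passing check only illustrates convexity, it does not prove it.

```python
import random


def loss_in_s(U, V, S, beta):
    """||U - V S^T||^2 + beta * ||S||^2 for one user, with V held fixed."""
    K = len(S[0])
    fit = sum(
        (U[t][d] - sum(V[t][k] * S[d][k] for k in range(K))) ** 2
        for t in range(len(U))
        for d in range(len(S))
    )
    return fit + beta * sum(x * x for row in S for x in row)


def rand_matrix(rng, rows, cols):
    return [[rng.random() for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
T, D, K, beta = 4, 3, 2, 0.1
U = rand_matrix(rng, T, D)
V = rand_matrix(rng, T, K)

violations = 0
for _ in range(200):
    A = rand_matrix(rng, D, K)
    B = rand_matrix(rng, D, K)
    mid = [[(a + b) / 2 for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
    average = 0.5 * (loss_in_s(U, V, A, beta) + loss_in_s(U, V, B, beta))
    # A convex function never lies above the chord at the midpoint.
    if loss_in_s(U, V, mid, beta) > average + 1e-12:
        violations += 1
```

No sampled pair should violate the inequality, which is what the positive semidefinite Hessian predicts for a quadratic function of $S$.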
In step 2), the objective function $l$ is a convex function of both $V_i$ and $S$. To speed up the solution and reduce the algorithm's storage requirement, $V_i$ and $S$ are solved by iterative updates based on gradient descent. As shown in Figure 3, the method of selecting a suitable algorithm to solve the objective function comprises the following steps:
Step 2.1) Stage S2.1, initializing the matrices $V_i \in \mathbb{R}^{T \times K}$ and $S \in \mathbb{R}^{D \times K}$:
$K$ is the number of user demand variables in the latent space; each element of the matrices $V_1, V_2, \dots, V_N$ and $S$ is randomly initialized to a real number between 0 and 1;
Step 2.2) Stage S2.2, differentiating with respect to each element of $V_i$:
The final objective function is differentiated with respect to each element of $V_i$, giving the partial derivative $\partial l / \partial V_i^{tk}$ for every row $t$ and column $k$;
Step 2.3) Stage S2.3, updating each element of $V_i$:
The step size multiplied by the gradient is subtracted from each $V_i^{tk}$; the update formula is
$$V_i^{tk} \leftarrow V_i^{tk} - \eta\,\frac{\partial l}{\partial V_i^{tk}}$$
where $\eta$ is a manually set step size;
Step 2.4) Stage S2.4, judging whether all matrices $V_i$ have been updated:
If all matrices $V_i$ have been updated, proceed to step 2.5); otherwise return to step 2.2) and update the next $V_i$;
Step 2.5) Stage S2.5, differentiating with respect to each element of $S$:
The final objective function is differentiated with respect to each element of $S$, giving the partial derivative $\partial l / \partial S^{dk}$ for every row $d$ and column $k$;
Step 2.6) Stage S2.6, updating each element of $S$:
The step size multiplied by the gradient is subtracted from each $S^{dk}$; the update formula is
$$S^{dk} \leftarrow S^{dk} - \tau\,\frac{\partial l}{\partial S^{dk}}$$
where $\tau$ is a manually set step size;
Step 2.7) Stage S2.7, judging whether the algorithm has converged:
If the algorithm has converged, proceed to step 2.8); otherwise return to step 2.2);
Step 2.8) Stage S2.8, outputting the result:
The users' electricity demand is output, and the procedure ends.
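Steps 2.1) to 2.8) can be sketched as the alternating gradient descent loop below. It assumes the objective $l = \sum_i \|U_i - V_i S^T\|^2 + \alpha \sum_i \|V_i\|^2 + \beta \|S\|^2 + \gamma \sum_{i,j} e_{ij} \|V_i - V_j\|^2$ and the gradients derived from that form; the function names, the stopping rule based on the change in the objective, and all parameter values are hypothetical.

```python
import random


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def axpby(A, B, ca, cb):
    """Element-wise ca * A + cb * B."""
    return [[ca * a + cb * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def frob_sq(A):
    return sum(x * x for row in A for x in row)


def residual(U_i, V_i, S):
    """U_i - V_i S^T."""
    return axpby(U_i, matmul(V_i, transpose(S)), 1.0, -1.0)


def objective(U, V, S, E, alpha, beta, gamma):
    n = len(U)
    fit = sum(frob_sq(residual(U[i], V[i], S)) for i in range(n))
    geo = sum(E[i][j] * frob_sq(axpby(V[i], V[j], 1.0, -1.0))
              for i in range(n) for j in range(n))
    return fit + alpha * sum(frob_sq(v) for v in V) + beta * frob_sq(S) + gamma * geo


def solve(U, E, K, alpha, beta, gamma, eta, tau, max_iter=500, tol=1e-8, seed=0):
    rng = random.Random(seed)
    n, T, D = len(U), len(U[0]), len(U[0][0])
    # Step 2.1: random initialisation of every V_i (T x K) and of S (D x K).
    V = [[[rng.random() for _ in range(K)] for _ in range(T)] for _ in range(n)]
    S = [[rng.random() for _ in range(K)] for _ in range(D)]
    history = [objective(U, V, S, E, alpha, beta, gamma)]
    for _ in range(max_iter):
        # Steps 2.2 to 2.4: differentiate and update each V_i in turn.
        for i in range(n):
            grad = axpby(matmul(residual(U[i], V[i], S), S), V[i], -2.0, 2.0 * alpha)
            for j in range(n):
                w = 2.0 * gamma * (E[i][j] + E[j][i])
                grad = axpby(grad, axpby(V[i], V[j], 1.0, -1.0), 1.0, w)
            V[i] = axpby(V[i], grad, 1.0, -eta)
        # Steps 2.5 and 2.6: differentiate and update the shared S.
        grad_s = [[2.0 * beta * s for s in row] for row in S]
        for i in range(n):
            r_t = transpose(residual(U[i], V[i], S))
            grad_s = axpby(grad_s, matmul(r_t, V[i]), 1.0, -2.0)
        S = axpby(S, grad_s, 1.0, -tau)
        # Step 2.7: treat a negligible change in the objective as convergence.
        history.append(objective(U, V, S, E, alpha, beta, gamma))
        if abs(history[-2] - history[-1]) < tol:
            break
    return V, S, history  # Step 2.8


U = [[[0.2, 0.3], [0.9, 1.1], [0.4, 0.5]],
     [[0.1, 0.2], [0.8, 1.0], [0.5, 0.4]]]
E = [[1.0, 0.75], [0.75, 1.0]]
V, S, history = solve(U, E, K=2, alpha=0.01, beta=0.01, gamma=0.01, eta=0.01, tau=0.01)
```

The returned `history` records the objective after each sweep, which gives a simple way to confirm that the chosen step sizes `eta` and `tau` are small enough for the objective to decrease.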
In step 3), the two basic indicators are calculated as follows:
(1) The similarity between the electricity demand matrices of different users is measured by
$$\mathrm{Sim}(V_i, V_j) = \frac{\mathrm{tr}(V_i V_j^{T})}{\|V_i\|\,\|V_j\|}$$
where $\mathrm{tr}(\cdot)$ denotes the sum of the diagonal elements of a matrix and $\|\cdot\|$ denotes the $L_2$ norm of a matrix.
(2) The overall electricity demand matrix of all users in a user group $c$ is calculated as
$$V_c = \frac{1}{|c|} \sum_{u_i \in c} V_i$$
where $|c|$ is the number of users in group $c$.
As shown in Figure 4, in step 3) the user segmentation method is as follows:
Step 3.1) Stage S3.1, randomly dividing the users into k clusters $c_1, c_2, \ldots, c_k$:
where k is the manually set number of user clusters;
Step 3.2) Stage S3.2, computing the overall representation of each cluster:
The overall representation of each cluster is computed with the formula in item (2) of step 3) above;
Step 3.3) Stage S3.3, computing the similarity between the i-th user and all clusters:
Using the formula in item (1) of step 3) above, the overall representation of each cluster is treated as a special "user", and the similarity between the user and each cluster is computed;
Step 3.4) Stage S3.4, adjusting the cluster label of the i-th user:
According to the result of step 3.3), the cluster label of the i-th user is set to the cluster most similar to that user;
Step 3.5) Stage S3.5, determining whether all users have been adjusted:
Determine whether the cluster labels of all users have been adjusted; if so, proceed to step 3.6); if not, return to step 3.2);
Step 3.6) Stage S3.6, determining whether the user cluster labels have converged:
Determine whether any user's cluster label changed during the last round of cluster label adjustment; if so, return to step 3.2); if not, proceed to step 3.7);
Step 3.7) Stage S3.7, outputting the result:
Output the cluster labels of all users; the procedure ends here.
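The loop in steps 3.1) to 3.7) resembles k-means with a similarity measure in place of a distance, and can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the invention: the cluster representation is taken to be the element-wise mean of its members, empty clusters are simply skipped, and `init_labels` and `max_rounds` are hypothetical parameters added for reproducibility and safety.

```python
import numpy as np


def trace_similarity(A, B):
    """Trace inner product divided by the product of Frobenius norms."""
    return np.trace(A @ B.T) / (np.linalg.norm(A) * np.linalg.norm(B))


def segment_users(V, k, init_labels=None, max_rounds=100, seed=0):
    """V: array of shape (N, T, K), one demand matrix per user."""
    V = np.asarray(V, dtype=float)
    n = len(V)
    if init_labels is None:
        rng = np.random.default_rng(seed)
        labels = rng.integers(0, k, size=n)        # step 3.1: random partition
    else:
        labels = np.array(init_labels)
    for _ in range(max_rounds):
        changed = False
        for i in range(n):                         # steps 3.3-3.5: one user at a time
            sims = np.full(k, -np.inf)
            for c in range(k):
                members = labels == c
                if members.any():                  # step 3.2: recomputed for each user
                    sims[c] = trace_similarity(V[i], V[members].mean(axis=0))
            best = int(np.argmax(sims))
            if best != labels[i]:                  # step 3.4: move to most similar cluster
                labels[i] = best
                changed = True
        if not changed:                            # step 3.6: labels have converged
            break
    return labels                                  # step 3.7
```

Because step 3.5) returns to step 3.2), the sketch recomputes the cluster representations before each user is reassigned rather than once per round.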
The present invention uses the users' electricity usage record data and residence information contained in electric power data to construct an electricity usage record matrix for each user, and computes the similarity of the users' geographic location information with a structured comparison algorithm. A combined matrix decomposition model is used to model the users' electricity demand and to analyze the time factors and date factors that influence electricity usage behavior. Then, according to the geographic location similarity of the users, geographic location information is further fused in, so that the representation of each user in the latent demand space contains both electricity demand information and geographic location information. Finally, a clustering algorithm segments the users according to their representations in the latent demand space, formally expressing the association between electricity usage behavior and place of residence. This provides an important reference for power departments in understanding the relationship between user demand and users, adjusting power production, maintaining electricity facilities, and other daily management activities.
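The structured comparison of residence information mentioned above can be illustrated with a short sketch. The similarity formula itself is not reproduced in this text, so the form used here, a weighted sum of exact matches between corresponding address components, is an assumption; the function names, the three-level placeholder addresses and the weight values are hypothetical.

```python
import numpy as np


def geo_similarity(addr_i, addr_j, weights):
    """Assumed form: sum of weights of the address levels whose strings match."""
    return sum(w * (a == b) for w, a, b in zip(weights, addr_i, addr_j))


def geo_similarity_matrix(addresses, weights):
    """Pairwise similarity matrix for a list of hierarchical addresses."""
    n = len(addresses)
    E = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            E[i, j] = geo_similarity(addresses[i], addresses[j], weights)
    return E


# Placeholder addresses ordered from the largest administrative unit down.
addresses = [
    ("province-A", "city-1", "district-x"),
    ("province-A", "city-1", "district-y"),
    ("province-A", "city-2", "district-x"),
]
E = geo_similarity_matrix(addresses, weights=(0.5, 0.3, 0.2))
```

A flat weighted sum also rewards a matching lower-level name under different higher-level units (the first and third address above); a variant that stops at the first mismatch would avoid this, and the source does not say which behavior is intended.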
It should be emphasized that the embodiments described here are illustrative rather than restrictive. The present invention is therefore not limited to the embodiments described in the detailed description; any other embodiment derived by a person skilled in the art from the technical solution of the present invention likewise falls within the scope of protection of the present invention.
Claims (7)
1. A power consumer subdivision method based on a combined matrix decomposition model, characterized in that the method comprises the following steps performed in order:
Step 1) inputting the users' electricity usage record data and constructing the users' electricity usage record matrices from the data; inputting the users' geographic location information, representing it hierarchically, constructing the users' geographic location similarity matrix, and adjusting the weights of the different components at each level of the geographic location information;
Step 2) constructing the objective function of the combined matrix decomposition model from the electricity usage record matrices obtained in step 1), analyzing the time factors and date factors that influence the users' electricity usage behavior, and selecting a suitable objective function solving algorithm to obtain the users' electricity demand;
Step 3) segmenting the users according to the above electricity demand:
During user segmentation, two basic indices need to be computed: 1) a measure of the similarity between the electricity demand matrices of different users; 2) the overall electricity demand matrix of all users in a user group.
2. The power consumer subdivision method based on a combined matrix decomposition model according to claim 1, characterized in that in step 1), the method of inputting the users' electricity usage record data and constructing the electricity usage record matrices from the data is:
All users in the input electricity usage record data are represented as the set:
$U = \{u_1, u_2, \ldots, u_N\}$
where N is the number of users contained in the data and $u_i$ denotes the i-th user;
The electricity usage record data of the i-th user are built into the electricity usage record matrix $U_i$, a nonnegative real matrix with T rows and D columns,
where D is the number of days covered by the electricity usage record data and T is the number of uniformly spaced sampling points in each user's daily record; row t and column d of $U_i$ hold, respectively, all of user $u_i$'s records at time point t of every day and all of the user's records on day d, and each element of $U_i$ is indexed by its row t and column d;
Finally, the electricity usage record matrices of all users are output.
3. The power consumer subdivision method based on a combined matrix decomposition model according to claim 1, characterized in that in step 1), the method of inputting the users' geographic location information, representing it hierarchically, constructing the users' geographic location similarity matrix, and adjusting the weights of the different components at each level of the geographic location information is:
The geographic location information of the i-th user is represented as the structure:
where each element of the structure is the string representation of one component of the place of residence, and the components are ordered by administrative unit from largest to smallest, i.e. province, city, district, township, street, community, and so on;
The geographic location similarity between the i-th user and the j-th user is computed as:
where $e_{ij}$ is the similarity value of the two users' geographic location information; δ(·) is a logical function that takes the value 1 when two strings are identical and 0 otherwise; $\lambda_k \in (0, 1)$ is a balance parameter that adjusts the weights of the different components of the geographic location information, and is selected according to the system's results on a validation data set;
Finally, the geographic location similarity matrix of all users is output:
4. The power consumer subdivision method based on a combined matrix decomposition model according to claim 1, characterized in that in step 2), the method of constructing the objective function for the joint decomposition of the electricity usage record matrices is:
(1) Minimize, as far as possible, the loss incurred when each user's electricity usage record matrix is decomposed into a time factor matrix and a date factor matrix:
where $V_i$ is the time factor matrix affecting the electricity usage behavior of the i-th user and $S_i$ is the date factor matrix affecting the electricity usage behavior of the i-th user;
(2) Keep the date factor matrices obtained by decomposing the electricity usage record matrices of different users consistent:
Here the decomposition objective functions of all N users are fused together and share one date factor matrix;
(3) Minimize, as far as possible, the difference between the time factor matrices of users who are geographically close:
(4) Keep the time factor matrices as smooth as possible:
(5) Keep the date factor matrix as smooth as possible:
$\min l_5 = \|S\|^2$
Here the squared L2 norm of the matrix, $\|\cdot\|^2$, is used to keep the matrix smooth.
Finally, the objective function of the combined matrix decomposition model is obtained by fusing the objective functions in items (2), (3), (4) and (5):
where α, β and γ are balance parameters that adjust the weights among the individual objective functions, and are selected according to the system's results on a validation data set.
5. The power consumer subdivision method based on a combined matrix decomposition model according to claim 1, characterized in that in step 2), the method of selecting a suitable objective function solving algorithm and solving comprises the following steps:
Step 2.1) Stage S2.1, initializing the matrices $V_1, V_2, \ldots, V_N$ and $S$:
where K is the number of user demand variables in the latent space, and every element of the matrices $V_1, V_2, \ldots, V_N$ and $S$ is randomly initialized to a real number between 0 and 1;
Step 2.2) Stage S2.2, differentiating with respect to each element of matrix $V_i$:
The final objective function is differentiated with respect to each element of $V_i$; the formula is:
Step 2.3) Stage S2.3, updating each element of matrix $V_i$:
Each element of $V_i$ is reduced by the step size multiplied by its gradient; the update formula is:
where η is the manually set step size;
Step 2.4) Stage S2.4, determining whether all matrices $V_i$ have been updated:
If all matrices $V_i$ have been updated, proceed to step 2.5); otherwise return to step 2.2) and update the next $V_i$;
Step 2.5) Stage S2.5, differentiating with respect to each element of matrix S:
The final objective function is differentiated with respect to each element of S; the formula is:
Step 2.6) Stage S2.6, updating each element of matrix S:
Each element of S is reduced by the step size multiplied by its gradient; the update formula is:
where τ is the manually set step size;
Step 2.7) Stage S2.7, determining whether the algorithm has converged:
If the algorithm has converged, proceed to step 2.8); otherwise return to step 2.2);
Step 2.8) Stage S2.8, outputting the result:
Output the users' electricity demand result; the procedure ends here.
6. The power consumer subdivision method based on a combined matrix decomposition model according to claim 1, characterized in that in step 3), the two basic indices are computed as follows:
(1) The similarity between the electricity demand matrices of different users is measured as:
$\mathrm{Sim}(V_i, V_j) = \mathrm{tr}(V_i V_j^{T}) / (\|V_i\| \, \|V_j\|)$
where tr(·) denotes the sum of the diagonal elements of a matrix and ‖·‖ denotes the L2 norm of a matrix;
(2) The overall electricity demand matrix of all users in a user group c is computed as follows:
where |c| denotes the number of users in group c.
7. The power consumer subdivision method based on a combined matrix decomposition model according to claim 1, characterized in that in step 3), the user segmentation method is as follows:
Step 3.1) Stage S3.1, randomly dividing the users into k clusters $c_1, c_2, \ldots, c_k$:
where k is the manually set number of user clusters;
Step 3.2) Stage S3.2, computing the overall representation of each cluster:
The overall representation of each cluster is computed with the formula in item (2) of step 3) above;
Step 3.3) Stage S3.3, computing the similarity between the i-th user and all clusters:
Using the formula in item (1) of step 3) above, the overall representation of each cluster is treated as a special "user", and the similarity between the user and each cluster is computed;
Step 3.4) Stage S3.4, adjusting the cluster label of the i-th user:
According to the result of step 3.3), the cluster label of the i-th user is set to the cluster most similar to that user;
Step 3.5) Stage S3.5, determining whether all users have been adjusted:
Determine whether the cluster labels of all users have been adjusted; if so, proceed to step 3.6); if not, return to step 3.2);
Step 3.6) Stage S3.6, determining whether the user cluster labels have converged:
Determine whether any user's cluster label changed during the last round of cluster label adjustment; if so, return to step 3.2); if not, proceed to step 3.7);
Step 3.7) Stage S3.7, outputting the result:
Output the cluster labels of all users; the procedure ends here.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510801889.8A CN105447767A (en) | 2015-11-19 | 2015-11-19 | Power consumer subdivision method based on combined matrix decomposition model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105447767A true CN105447767A (en) | 2016-03-30 |
Family
ID=55557901
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510801889.8A Pending CN105447767A (en) | 2015-11-19 | 2015-11-19 | Power consumer subdivision method based on combined matrix decomposition model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105447767A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108021538A (en) * | 2017-11-15 | 2018-05-11 | 国网甘肃省电力公司信息通信公司 | A kind of electric quantity data restoration methods based on joint Weather information matrix decomposition |
CN108734216A (en) * | 2018-05-22 | 2018-11-02 | 广东工业大学 | Classification of power customers method, apparatus and storage medium based on load curve form |
CN109740790A (en) * | 2018-11-28 | 2019-05-10 | 国网天津市电力公司 | A kind of user power consumption prediction technique extracted based on temporal aspect |
CN116805785A (en) * | 2023-08-17 | 2023-09-26 | 国网浙江省电力有限公司金华供电公司 | Power load hierarchy time sequence prediction method based on random clustering |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108021538A (en) * | 2017-11-15 | 2018-05-11 | 国网甘肃省电力公司信息通信公司 | A kind of electric quantity data restoration methods based on joint Weather information matrix decomposition |
CN108021538B (en) * | 2017-11-15 | 2021-06-04 | 国网甘肃省电力公司信息通信公司 | Electric quantity data recovery method based on joint weather information matrix decomposition |
CN108734216A (en) * | 2018-05-22 | 2018-11-02 | 广东工业大学 | Classification of power customers method, apparatus and storage medium based on load curve form |
CN109740790A (en) * | 2018-11-28 | 2019-05-10 | 国网天津市电力公司 | A kind of user power consumption prediction technique extracted based on temporal aspect |
CN116805785A (en) * | 2023-08-17 | 2023-09-26 | 国网浙江省电力有限公司金华供电公司 | Power load hierarchy time sequence prediction method based on random clustering |
CN116805785B (en) * | 2023-08-17 | 2023-11-28 | 国网浙江省电力有限公司金华供电公司 | Power load hierarchy time sequence prediction method based on random clustering |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113962364B (en) | Multi-factor power load prediction method based on deep learning | |
Domínguez-Muñoz et al. | Selection of typical demand days for CHP optimization | |
CN108446794A (en) | One kind being based on multiple convolutional neural networks combination framework deep learning prediction techniques | |
CN104951425A (en) | Cloud service performance adaptive action type selection method based on deep learning | |
CN105447767A (en) | Power consumer subdivision method based on combined matrix decomposition model | |
CN111724039B (en) | Recommendation method for recommending customer service personnel to power users | |
CN106952027A (en) | A kind of 10kV distribution network lines plan access capacity computational methods | |
CN104376502A (en) | Electric power customer credit comprehensive evaluation method based on grey relational degree | |
CN109117872A (en) | A kind of user power utilization behavior analysis method based on automatic Optimal Clustering | |
CN105335800A (en) | Method for forecasting electricity consumption of power consumers based on joint learning | |
CN112614011A (en) | Power distribution network material demand prediction method and device, storage medium and electronic equipment | |
CN108960488A (en) | A kind of accurate prediction technique of saturation loading spatial distribution based on deep learning and Multi-source Information Fusion | |
CN104636834B (en) | A kind of improved joint probability plan model system optimization method | |
CN111027741A (en) | Method for constructing space-time dimension-oriented generalized load model analysis library | |
CN113591368A (en) | Comprehensive energy system multi-energy load prediction method and system | |
Zhao et al. | Short-term microgrid load probability density forecasting method based on k-means-deep learning quantile regression | |
CN101807218B (en) | Heterogeneous network-based land pattern succession simulation system | |
CN109657846A (en) | Power grid alternative subsidy scale impact factor screening technique | |
Guan et al. | Customer load forecasting method based on the industry electricity consumption behavior portrait | |
CN116701965A (en) | BIRCH clustering algorithm-based panoramic carbon representation method for enterprise users | |
CN114839586B (en) | Low-voltage station metering device misalignment calculation method based on EM algorithm | |
Zhang et al. | A segmented evaluation model for building energy performance considering seasonal dynamic fluctuations | |
Guo et al. | Mobile user credit prediction based on lightgbm | |
CN108615091A (en) | Electric power meteorology load data prediction technique based on cluster screening and neural network | |
CN105741143A (en) | Load characteristic and cluster analysis based electric power commodity pricing model establishment method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20160330 |