CN105447767A - Power consumer subdivision method based on combined matrix decomposition model - Google Patents


Info

Publication number
CN105447767A
Authority
CN
China
Prior art keywords
user
matrix
users
sigma
stage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510801889.8A
Other languages
Chinese (zh)
Inventor
王扬
刘杰
吴凡
章斌
魏睐
杨得博
梅振鹏
郎赫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Tianjin Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Tianjin Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC and State Grid Tianjin Electric Power Co Ltd
Priority to CN201510801889.8A
Publication of CN105447767A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Electricity, gas or water supply
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques

Abstract

The invention discloses a power consumer subdivision method based on a combined matrix decomposition model. The method comprises: inputting user power consumption record data and constructing a power consumption record matrix for each user; inputting the geographic location information of the users, representing it hierarchically, constructing a geographic location similarity matrix of the users, and adjusting the weights of the components at different levels of the hierarchy; constructing the objective function of the combined matrix decomposition model from the power consumption record matrices and solving it with a suitable algorithm to obtain the users' power demand; and subdividing the users according to that demand. Compared with conventional methods, the representation of users in the demand space carries richer meaning. The method uses a clustering algorithm that combines the users' characteristics in the latent demand space with the relations among users, so that users within each cluster are more closely related, while users in different clusters differ more in power demand and geographic location.

Description

A power consumer subdivision method based on a combined matrix decomposition model
Technical field
The invention belongs to the technical fields of computer applications, data mining and electric power data analysis, and in particular relates to a power consumer subdivision method based on a combined matrix decomposition model.
Background art
As power grids become more informatized, power systems generate large volumes of data, which poses new challenges for electric power data analysis. Traditional power data analysis focuses on production and on the data generated in the supply link; user data are usually analyzed for all users as a whole, which ignores both the characteristics of individual users and the relations among users. User data typically include power consumption behavior, geographic location, time and date. In current user data analysis, the common practice is to partition users with a clustering algorithm based on power consumption records alone, without a comprehensive analysis of the other information, although that information often contains key factors affecting consumption behavior.
For example, Wang Lei analyzed the behavioral characteristics of power industry customers and used the k-means algorithm to cluster users according to their power consumption behavior. Wu Ling et al. applied customer lifetime value theory and, through extensive surveys and expert consultation, built a value accounting system for power consumers, using the analytic hierarchy process to assess user value. Song Caihua et al. established a customer segmentation model using a comprehensive customer value assessment method, built a segmentation index system based on comprehensive customer value, and subdivided users by combining the entropy weight method with expert experience. Some researchers built a segmentation index system along three dimensions (user reliability requirements, user value and user behavior) and clustered users with the k-means algorithm. Others built a behavior and value assessment index system for large power customers, obtained their demand characteristics and economic value assessments, and designed a comprehensive credit evaluation system for large customers, achieving a more complete and accurate fine-grained classification of large power customers.
Matrix decomposition is a multivariate analysis model that has become popular in recent years. Because it is readily interpretable on data such as text and images, it is widely used in data mining. A matrix decomposition model factors the data matrix into the product of a latent feature matrix and a coefficient matrix, yielding a representation of the input data in a low-dimensional latent feature space. Lee and Seung proposed a non-negative matrix factorization model that imposes non-negativity constraints on the input data matrix and on the output latent feature and coefficient matrices, obtaining representations of text and images for text clustering and image restoration. On this basis, Cai, He et al. used a relation graph to constrain the representation of the data in the latent space, improving the performance of non-negative matrix factorization in text clustering. Other researchers proposed a relation-constrained matrix decomposition model that fuses the relations among data with the data content to extract latent features, and obtained good results on text classification data. Takeuchi et al. decomposed several non-negative matrices jointly, fusing user records, users' social relations and song tags to obtain a unified representation of users and songs for song recommendation.
Summary of the invention
To solve the above problems, the object of the invention is to provide a power consumer subdivision method based on a combined matrix decomposition model.
To achieve this object, the power consumer subdivision method based on a combined matrix decomposition model provided by the invention comprises the following steps, performed in order:
Step 1) input the users' power consumption record data and construct a power consumption record matrix for each user from them; input the users' geographic location information, represent it hierarchically, construct the users' geographic location similarity matrix, and adjust the weights of the components at different levels of the geographic location information;
Step 2) construct the objective function of the combined matrix decomposition model from the power consumption record matrices obtained in step 1), analyze the time factors and date factors that affect power consumption behavior, and select a suitable algorithm to solve the objective function, so as to obtain the users' power demand;
Step 3) subdivide the users according to the power demand obtained above:
During subdivision, two basic indices must be calculated: 1) a measure of the similarity between the power demand matrices of different users; 2) the overall power demand matrix of all users in a user group.
In step 1), the power consumption record data are input and the power consumption record matrices are constructed as follows:
All users in the input power consumption record data are represented as the set:
$$U = \{u_1, u_2, \ldots, u_N\}$$
where $N$ is the number of users in the data and $u_i$ denotes the $i$-th user.
The power consumption record data of the $i$-th user are arranged into the power consumption record matrix:
$$U_i \in \mathbb{R}_{+}^{T \times D}$$
where $D$ is the number of days covered by the record data, $T$ is the number of uniformly spaced sampling points in each user's daily record, and $\mathbb{R}_{+}^{T \times D}$ denotes the non-negative real matrices with $T$ rows and $D$ columns. $U_{it*}$ and $U_{i*d}$ denote the $t$-th row and the $d$-th column of $U_i$, that is, all of user $u_i$'s records at the $t$-th time point of each day and all of the records on day $d$, respectively; $u_{itd}$ denotes the element in row $t$, column $d$ of $U_i$;
Finally, the power consumption record matrices of all users are output:
$$\{U_1, U_2, \ldots, U_N\}$$
In step 1), the users' geographic location information is input, represented hierarchically and used to construct the geographic location similarity matrix, and the weights of the components at different levels are adjusted, as follows:
The geographic location information of the $i$-th user is represented as the structure:
$$g_i = \{g_i^1, g_i^2, \ldots, g_i^n\}$$
where $g_i^k$ is the string for one component of the user's residential address; the components are ordered by administrative unit from largest to smallest (province, city, district, township, street, residential community, and so on);
The geographic location similarity of the $i$-th user and the $j$-th user is calculated as:
$$e_{ij} = \sum_{k=1}^{n} \lambda_k \prod_{m=1}^{k} \delta(g_i^k, g_j^k) \Big/ \sum_{k=1}^{n} \lambda_k$$
where $e_{ij}$ is the similarity of the two users' geographic location information, $\delta(\cdot)$ is an indicator function that equals 1 when the two strings are identical and 0 otherwise, and $\lambda_k \in (0,1)$ are balance parameters that adjust the weights of the different components of the geographic location information; they are selected according to the system's results on a validation data set;
Finally, the geographic location similarity matrix of all users is output:
$$[e_{ij}]_{N \times N}$$
In step 2), the objective function for the joint decomposition of the power consumption record matrices is constructed as follows:
(1) Minimize the loss incurred when each user's power consumption record matrix is decomposed into a time factor matrix and a date factor matrix:
$$\min l_1 = \|U_i - V_i S_i^T\|^2$$
where $V_i$ is the time factor matrix affecting the $i$-th user's power consumption behavior and $S_i$ is the date factor matrix affecting the $i$-th user's power consumption behavior;
(2) Keep the date factor matrices obtained from the record matrices of different users consistent:
$$\min l_2 = \sum_{i=1}^{N} \|U_i - V_i S^T\|^2$$
Here the decomposition objectives of all $N$ users are fused together and share a single date factor matrix $S$;
(3) Minimize the differences between the time factor matrices of users who are geographically close:
$$\min l_3 = \sum_{i=1}^{N} \sum_{j=1}^{N} e_{ij} \|V_i - V_j\|^2$$
(4) Keep the time factor matrices as smooth as possible:
$$\min l_4 = \sum_{i=1}^{N} \|V_i\|^2$$
(5) Keep the date factor matrix as smooth as possible:
$$\min l_5 = \|S\|^2$$
Here the squared $L_2$ norm of a matrix, $\|\cdot\|^2$, is used to keep the matrices smooth.
Finally, the objective function of the combined matrix decomposition model is obtained by fusing the objectives in items (2), (3), (4) and (5):
$$\min l = \frac{1}{2} \sum_{i=1}^{N} \|U_i - V_i S^T\|^2 + \frac{\alpha}{2} \sum_{i=1}^{N} \|V_i\|^2 + \frac{\beta}{2} \|S\|^2 + \frac{\gamma}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} e_{ij} \|V_i - V_j\|^2$$
where $\alpha$, $\beta$ and $\gamma$ are balance parameters that adjust the weights of the individual objectives; they are selected according to the system's results on a validation data set.
In step 2), solving the objective function with the selected algorithm comprises the following steps:
Step 2.1) stage S2.1, initializing the matrices $V_1, V_2, \ldots, V_N$ and $S$:
$K$ denotes the number of user demand variables in the latent space, so that each $V_i$ has $T$ rows and $K$ columns and $S$ has $D$ rows and $K$ columns; every element of $V_1, V_2, \ldots, V_N$ and $S$ is randomly initialized to a real number between 0 and 1;
Step 2.2) stage S2.2, differentiating with respect to each element of $V_i$:
The final objective function is differentiated with respect to each element of $V_i$:
$$\frac{\partial l}{\partial v_{itk}} = (V_{it*} S^T - U_{it*}) S_{*k} + \Big(\alpha + \gamma \sum_{j=1, j \neq i}^{N} e_{ij}\Big) v_{itk} - \gamma \sum_{j=1, j \neq i}^{N} e_{ij} v_{jtk}$$
Step 2.3) stage S2.3, updating each element of $V_i$:
Each $v_{itk}$ is decreased by the step size multiplied by the gradient, $v_{itk} \leftarrow v_{itk} - \eta \, \partial l / \partial v_{itk}$, where $\eta$ is a manually set step size;
Step 2.4) stage S2.4, judging whether all matrices $V_i$ have been updated:
If all matrices $V_i$ have been updated, go to step 2.5); otherwise return to step 2.2) and update the next $V_i$;
Step 2.5) stage S2.5, differentiating with respect to each element of $S$:
The final objective function is differentiated with respect to each element of $S$:
$$\frac{\partial l}{\partial s_{dk}} = \sum_{i=1}^{N} (V_{i*k})^T \big[ V_i (S_{d*})^T - U_{i*d} \big] + \beta s_{dk}$$
Step 2.6) stage S2.6, updating each element of $S$:
Each $s_{dk}$ is decreased by the step size multiplied by the gradient, $s_{dk} \leftarrow s_{dk} - \tau \, \partial l / \partial s_{dk}$, where $\tau$ is a manually set step size;
Step 2.7) stage S2.7, judging whether the algorithm has converged:
If the algorithm has converged, go to step 2.8); otherwise return to step 2.2);
Step 2.8) stage S2.8, outputting the result:
The users' power demand is output, and the procedure ends.
In step 3), the two basic indices are calculated as follows:
(1) The similarity between the power demand matrices of different users is measured by:
$$\mathrm{Sim}(V_i, V_j) = \mathrm{tr}(V_i V_j^T) / (\|V_i\| \, \|V_j\|)$$
where $\mathrm{tr}(\cdot)$ denotes the sum of the diagonal elements of a matrix and $\|\cdot\|$ denotes the $L_2$ norm of a matrix.
(2) The overall power demand matrix of all users in a user group $c$ is calculated by:
$$\mathrm{Center}(c) = \sum_{u_i \in c} V_i / |c|$$
where $|c|$ is the number of users in group $c$.
In step 3), the users are subdivided as follows:
Step 3.1) stage S3.1, randomly dividing the users into $k$ clusters $c_1, c_2, \ldots, c_k$:
where $k$ is the manually set number of user clusters;
Step 3.2) stage S3.2, calculating the overall representation of each cluster:
The overall representation of each cluster is calculated with the formula in item (2) of step 3) above;
Step 3.3) stage S3.3, calculating the similarity between the $i$-th user and every cluster:
Using the formula in item (1) of step 3) above, the overall representation of each cluster is treated as a special "user", and the similarity between the user and each cluster is calculated;
Step 3.4) stage S3.4, adjusting the cluster label of the $i$-th user:
According to the result of step 3.3), the cluster label of the $i$-th user is changed to the cluster most similar to that user;
Step 3.5) stage S3.5, judging whether all users have been adjusted:
Judge whether the cluster labels of all users have been adjusted; if so, go to step 3.6); if not, return to step 3.2);
Step 3.6) stage S3.6, judging whether the cluster labels have converged:
Judge whether any user's cluster label changed during the last round of label adjustment; if so, return to step 3.2); if not, go to step 3.7);
Step 3.7) stage S3.7, outputting the result:
The cluster labels of all users are output, and the procedure ends.
Effects of the power consumer subdivision method based on a combined matrix decomposition model provided by the invention:
The invention analyzes and represents user demand using the users' power consumption behavior, time, date and geographic location. Compared with conventional methods, the representation of users in the demand space carries richer meaning.
The invention uses a clustering algorithm that, in the latent demand space, takes into account both the characteristics of the users themselves and the relations among users, so that the users within each cluster are more closely related, while users in different clusters differ considerably in power demand, geographic location and other respects.
Brief description of the drawings
Fig. 1 is a schematic diagram of the overall system structure of the power consumer subdivision method based on a combined matrix decomposition model provided by the invention.
Fig. 2 is a flow chart of the power consumer subdivision method based on a combined matrix decomposition model provided by the invention.
Fig. 3 is a flow chart of the method for solving the combined matrix decomposition model in the power consumer subdivision method provided by the invention.
Fig. 4 is a flow chart of the user subdivision procedure in the power consumer subdivision method provided by the invention.
Detailed description of the embodiments
The power consumer subdivision method based on a combined matrix decomposition model provided by the invention is described in detail below with reference to the drawings and specific embodiments.
As shown in Fig. 1, the invention mainly applies data mining theory and methods to analyze the users in electric power data. To ensure normal operation of the system, the computer platform used in a concrete implementation should have at least 8 GB of memory, a CPU with at least 4 cores and a clock frequency of at least 2.6 GHz, and a 64-bit operating system of Windows 7 or later, with the necessary software environment installed: an Oracle database, Java 1.7 or later, and Matlab 2011b or later.
As shown in Fig. 2, the power consumer subdivision method based on a combined matrix decomposition model provided by the invention comprises the following steps, performed in order:
Step 1) input the users' power consumption record data and construct a power consumption record matrix for each user from them; input the users' geographic location information, represent it hierarchically, construct the users' geographic location similarity matrix, and adjust the weights of the components at different levels of the geographic location information;
The power consumption record matrix captures two key factors affecting power consumption behavior: 1) the rows of the matrix correspond to the different sampling time points within a day; 2) the columns of the matrix correspond to the days covered by the user's record data.
Step 2) construct the objective function of the combined matrix decomposition model from the power consumption record matrices obtained in step 1), analyze the time factors and date factors that affect power consumption behavior, and select a suitable algorithm to solve the objective function, so as to obtain the users' power demand;
The objective function covers five aspects: 1) minimize the loss incurred when each user's power consumption record matrix is decomposed into a time factor matrix and a date factor matrix; 2) keep the date factor matrices obtained from the record matrices of different users consistent; 3) minimize the differences between the time factor matrices of users who are geographically close; 4) keep the time factor matrices and the date factor matrix as smooth as possible; 5) preserve the convexity of the objective function, so that the model has a globally optimal solution. The solving algorithm should satisfy three basic conditions: 1) it converges within an acceptable time; 2) its storage requirements stay within a reasonable range; 3) it can be executed in parallel to improve computational efficiency.
Step 3) subdivide the users according to the power demand obtained above:
Subdividing users according to power demand should satisfy two basic conditions: 1) users in the same group should have similar power demand matrices; 2) the power demand matrices of users in different groups should differ as much as possible. Therefore, to improve the quality of the subdivision result, two basic indices must be calculated during subdivision: 1) a measure of the similarity between the power demand matrices of different users; 2) the overall power demand matrix of all users in a user group. In addition, the algorithm should be able to optimize itself, reassigning users who were placed in the wrong group.
In step 1), the power consumption record data are input and the power consumption record matrices are constructed as follows:
All users in the input power consumption record data are represented as the set:
$$U = \{u_1, u_2, \ldots, u_N\}$$
where $N$ is the number of users in the data and $u_i$ denotes the $i$-th user.
The power consumption record data of the $i$-th user are arranged into the power consumption record matrix:
$$U_i \in \mathbb{R}_{+}^{T \times D}$$
where $D$ is the number of days covered by the record data, $T$ is the number of uniformly spaced sampling points in each user's daily record, and $\mathbb{R}_{+}^{T \times D}$ denotes the non-negative real matrices with $T$ rows and $D$ columns. $U_{it*}$ and $U_{i*d}$ denote the $t$-th row and the $d$-th column of $U_i$, that is, all of user $u_i$'s records at the $t$-th time point of each day and all of the records on day $d$, respectively; $u_{itd}$ denotes the element in row $t$, column $d$ of $U_i$;
Finally, the power consumption record matrices of all users are output:
$$\{U_1, U_2, \ldots, U_N\}$$
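The construction of a single record matrix can be sketched as follows. This is a minimal illustration, not the patented implementation: the input layout (a list of `(day_index, time_index, value)` tuples) and the function name `build_usage_matrix` are assumptions made for the example, and indices are zero-based.

```python
import numpy as np

def build_usage_matrix(readings, T, D):
    """Arrange one user's readings into a T x D non-negative matrix U_i.

    readings: iterable of (day_index, time_index, value) tuples with
    0 <= day_index < D and 0 <= time_index < T (hypothetical input layout).
    """
    U_i = np.zeros((T, D))
    for d, t, value in readings:
        if value < 0:
            raise ValueError("power consumption readings must be non-negative")
        U_i[t, d] = value  # row = sampling point within the day, column = day
    return U_i

# Two sampling points per day over three days (made-up values).
readings = [(0, 0, 1.2), (0, 1, 0.8), (1, 0, 1.5), (1, 1, 0.4), (2, 0, 0.9), (2, 1, 0.7)]
U_1 = build_usage_matrix(readings, T=2, D=3)
row_t0 = U_1[0, :]  # readings at time point 0 across all days (the row U_{it*})
col_d1 = U_1[:, 1]  # all readings of day 1 (the column U_{i*d})
```

Stacking the matrices of all users, for example with `np.stack`, gives an array of shape `(N, T, D)`, which is a convenient layout for the later steps.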
In step 1), the users' geographic location information is input, represented hierarchically and used to construct the geographic location similarity matrix, and the weights of the components at different levels are adjusted, as follows:
The geographic location information of the $i$-th user is represented as the structure:
$$g_i = \{g_i^1, g_i^2, \ldots, g_i^n\}$$
where $g_i^k$ is the string for one component of the user's residential address; the components are ordered by administrative unit from largest to smallest (province, city, district, township, street, residential community, and so on);
The geographic location similarity of the $i$-th user and the $j$-th user is calculated as:
$$e_{ij} = \sum_{k=1}^{n} \lambda_k \prod_{m=1}^{k} \delta(g_i^k, g_j^k) \Big/ \sum_{k=1}^{n} \lambda_k$$
where $e_{ij}$ is the similarity of the two users' geographic location information, $\delta(\cdot)$ is an indicator function that equals 1 when the two strings are identical and 0 otherwise, and $\lambda_k \in (0,1)$ are balance parameters that adjust the weights of the different components of the geographic location information; they are selected according to the system's results on a validation data set;
Finally, the geographic location similarity matrix of all users is output:
$$[e_{ij}]_{N \times N}$$
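A small sketch of the similarity computation is given below. It rests on one interpretive assumption: the product over $m = 1, \ldots, k$ is read as requiring that every level from 1 up to $k$ matches, so a shared street name counts only if province, city and district also agree. The formula as printed indexes $\delta$ by $k$, so this reading is not confirmed by the text. The addresses, the weights `lam` and the function names are illustrative only.

```python
import numpy as np

def geo_similarity(g_i, g_j, lam):
    """Weighted hierarchical similarity of two addresses (largest unit first).

    Assumption: level k contributes lam[k] only if levels 1..k all match.
    """
    assert len(g_i) == len(g_j) == len(lam)
    score, prefix_match = 0.0, 1
    for a, b, w in zip(g_i, g_j, lam):
        prefix_match *= int(a == b)  # becomes 0 at the first mismatch and stays 0
        score += w * prefix_match
    return score / sum(lam)

def geo_similarity_matrix(addresses, lam):
    """Pairwise similarity matrix [e_ij] for a list of hierarchical addresses."""
    n = len(addresses)
    E = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            E[i, j] = geo_similarity(addresses[i], addresses[j], lam)
    return E

# Illustrative three-level addresses (city, district, street); values are made up.
addresses = [
    ("Tianjin", "Heping", "Street A"),
    ("Tianjin", "Heping", "Street B"),
    ("Tianjin", "Nankai", "Street A"),
]
lam = [0.5, 0.3, 0.2]
E = geo_similarity_matrix(addresses, lam)
```

Under this reading, the first and third users share only the city level, so the matching street name in a different district adds nothing to their similarity.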
In step 2), the objective function for the joint decomposition of the power consumption record matrices is constructed as follows:
(1) Minimize the loss incurred when each user's power consumption record matrix is decomposed into a time factor matrix and a date factor matrix:
$$\min l_1 = \|U_i - V_i S_i^T\|^2$$
where $V_i$ is the time factor matrix affecting the $i$-th user's power consumption behavior and $S_i$ is the date factor matrix affecting the $i$-th user's power consumption behavior;
(2) Keep the date factor matrices obtained from the record matrices of different users consistent:
$$\min l_2 = \sum_{i=1}^{N} \|U_i - V_i S^T\|^2$$
Here the decomposition objectives of all $N$ users are fused together and share a single date factor matrix $S$;
(3) Minimize the differences between the time factor matrices of users who are geographically close:
$$\min l_3 = \sum_{i=1}^{N} \sum_{j=1}^{N} e_{ij} \|V_i - V_j\|^2$$
(4) Keep the time factor matrices as smooth as possible:
$$\min l_4 = \sum_{i=1}^{N} \|V_i\|^2$$
(5) Keep the date factor matrix as smooth as possible:
$$\min l_5 = \|S\|^2$$
Here the squared $L_2$ norm of a matrix, $\|\cdot\|^2$, is used to keep the matrices smooth.
Finally, the objective function of the combined matrix decomposition model is obtained by fusing the objectives in items (2), (3), (4) and (5):
$$\min l = \frac{1}{2} \sum_{i=1}^{N} \|U_i - V_i S^T\|^2 + \frac{\alpha}{2} \sum_{i=1}^{N} \|V_i\|^2 + \frac{\beta}{2} \|S\|^2 + \frac{\gamma}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} e_{ij} \|V_i - V_j\|^2$$
where $\alpha$, $\beta$ and $\gamma$ are balance parameters that adjust the weights of the individual objectives; they are selected according to the system's results on a validation data set.
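The fused objective can be evaluated directly from its four terms. The sketch below assumes the matrices are stored as NumPy arrays of shapes `(N, T, D)` for the records, `(N, T, K)` for the time factors, `(D, K)` for the date factors and `(N, N)` for the geographic similarities, and it reads $\|\cdot\|^2$ as the sum of squared entries. The storage layout and the name `joint_objective` are assumptions of the example.

```python
import numpy as np

def joint_objective(U, V, S, E, alpha, beta, gamma):
    """Value of the fused objective l.

    U: (N, T, D) record matrices, V: (N, T, K) time factor matrices,
    S: (D, K) shared date factor matrix, E: (N, N) geographic similarities.
    """
    n_users = U.shape[0]
    # Reconstruction loss: 1/2 * sum_i ||U_i - V_i S^T||^2
    recon = 0.5 * sum(np.sum((U[i] - V[i] @ S.T) ** 2) for i in range(n_users))
    # Smoothness terms on the time factors and the date factors
    smooth_V = 0.5 * alpha * np.sum(V ** 2)
    smooth_S = 0.5 * beta * np.sum(S ** 2)
    # Geographic term: gamma/2 * sum_i sum_j e_ij ||V_i - V_j||^2
    diff = V[:, None, :, :] - V[None, :, :, :]           # shape (N, N, T, K)
    pair = 0.5 * gamma * np.sum(E * np.sum(diff ** 2, axis=(2, 3)))
    return recon + smooth_V + smooth_S + pair

rng = np.random.default_rng(0)
N, T, D, K = 3, 4, 5, 2
V = rng.random((N, T, K))
S = rng.random((D, K))
U = np.stack([V[i] @ S.T for i in range(N)])  # exact factorization: zero reconstruction loss
E = np.ones((N, N))
value = joint_objective(U, V, S, E, alpha=0.1, beta=0.1, gamma=0.01)
```

Because `U` is built as an exact product here, `value` consists only of the two smoothness terms and the geographic term.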
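In step 2), the squared $L_2$ norm of a matrix, $\|\cdot\|^2$, is used to measure both the loss in each step and the smoothness of the matrices, which ensures the convexity of the final objective function $l$, so that the model has a globally optimal solution. A brief derivation of the convexity of the objective function follows:
Proof 1: the objective function $l$ is a convex function of $V_i$.
First, the objective function is rewritten as:
$$l = \frac{1}{2} \|U_i - V_i S^T\|^2 + \frac{\alpha}{2} \|V_i\|^2 + \frac{\gamma}{2} \sum_{j=1, j \neq i}^{N} e_{ij} \|V_i - V_j\|^2 + C_1$$
where $C_1$ is a constant independent of $V_i$. Let $l = f_1 + \alpha f_2 + \gamma f_3 + C_1$, where:
$$f_1 = \frac{1}{2} \|U_i - V_i S^T\|^2, \quad f_2 = \frac{1}{2} \|V_i\|^2, \quad f_3 = \frac{1}{2} \sum_{j=1, j \neq i}^{N} e_{ij} \|V_i - V_j\|^2$$
It is shown below that $f_1$, $f_2$ and $f_3$ are each convex functions of $V_i$.
(1) Expanding $f_1$ gives
$$f_1 = \frac{1}{2} \sum_{t=1}^{T} \sum_{d=1}^{D} \big[ u_{itd} - V_{it*} (S_{d*})^T \big]^2 = C_2 + \frac{1}{2} \sum_{t=1}^{T} \Big\{ V_{it*} \Big[ \sum_{d=1}^{D} (S_{d*})^T S_{d*} \Big] (V_{it*})^T - 2 \Big( \sum_{d=1}^{D} u_{itd} S_{d*} \Big) (V_{it*})^T \Big\}$$
where $C_2$ is a constant independent of $V_i$. Differentiating twice with respect to the rows of $V_i$ gives
$$\frac{\partial^2 f_1}{\partial (V_{it*})^T \partial V_{it*}} = \sum_{d=1}^{D} (S_{d*})^T S_{d*} \triangleq G_t \quad \text{and} \quad \frac{\partial^2 f_1}{\partial (V_{it*})^T \partial V_{ip*}} = 0 \; (t \neq p).$$
Let $v_i = (V_{i1*}, V_{i2*}, \ldots, V_{iT*})^T$ denote the high-dimensional vector obtained by expanding $V_i$ row by row; the Hessian matrix of $f_1$ with respect to $v_i$ is then the block diagonal matrix $\mathrm{diag}[G_1, G_2, \ldots, G_T]$. Moreover,
$$z^T \Big[ \sum_{d=1}^{D} (S_{d*})^T S_{d*} \Big] z = \sum_{d=1}^{D} z^T (S_{d*})^T S_{d*} z = \sum_{d=1}^{D} (S_{d*} z)^2 \geq 0$$
holds for any non-zero $K$-dimensional vector $z$, so each $G_t$ is positive semidefinite and $\det(G_t) \geq 0$, where $\det(\cdot)$ denotes the determinant of a matrix. Therefore $f_1$ is a convex function of $V_i$.
(2) With $v_i$ as defined in item (1), the Hessian matrix of $f_2$ is the identity matrix, $\partial^2 f_2 / \partial v_i \partial v_i^T = I$, so $f_2$ is a convex function of $V_i$.
(3) Expanding each term of $f_3$ gives $\frac{1}{2} \|V_i - V_j\|^2 = \frac{1}{2} \|V_i\|^2 - \mathrm{tr}(V_i V_j^T) + C_2$, where $C_2$ is a constant independent of $V_i$. From item (2) of Proof 1, $\frac{1}{2} \|V_i\|^2$ is a convex function of $v_i$, and the trace term is linear in $V_i$, so each $\frac{1}{2} \|V_i - V_j\|^2$ is convex in $V_i$. Since $f_3$ is a sum of these terms weighted by $e_{ij} \geq 0$, $f_3$ is a convex function of $V_i$.
In summary, since $\alpha, \gamma \geq 0$, the objective function $l$ is a convex function of $V_i$, which completes the proof.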
Proof 2: the objective function $l$ is a convex function of $S$.
Likewise, the objective function is first rewritten as:
$$l = \frac{1}{2} \sum_{i=1}^{N} \|U_i - V_i S^T\|^2 + \frac{\beta}{2} \|S\|^2 + C_1$$
where $C_1$ is a constant independent of $S$. Let $l = \sum_{i=1}^{N} f_i + \frac{\beta}{2} \|S\|^2 + C_1$, where:
$$f_i = \frac{1}{2} \|U_i - V_i S^T\|^2 = \frac{1}{2} \sum_{d=1}^{D} \Big\{ S_{d*} \Big[ \sum_{t=1}^{T} (V_{it*})^T V_{it*} \Big] (S_{d*})^T - 2 \Big( \sum_{t=1}^{T} u_{itd} V_{it*} \Big) (S_{d*})^T \Big\} + C_2$$
and $C_2$ is a constant independent of $S$. As in Proof 1, let $s = (S_{1*}, S_{2*}, \ldots, S_{D*})^T$ denote the high-dimensional vector obtained by expanding $S$ row by row. The Hessian matrix of $f_i$ with respect to $S$ is then the block diagonal matrix $G_s \triangleq \partial^2 f_i / \partial s \partial s^T = \mathrm{diag}[G_1, G_2, \ldots, G_D]$, where $G_d \triangleq \sum_{t=1}^{T} (V_{it*})^T V_{it*}$, so $f_i$ is a convex function of $S$. Since $\|S\|^2$ is also a convex function of $S$ and $\beta \geq 0$, the objective function $l$ is a convex function of $S$, which completes the proof.
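The block structure used in the two proofs can be spot-checked numerically. The sketch below is only a sanity check under stated assumptions: it builds one Hessian block $\sum_d (S_{d*})^T S_{d*} = S^T S$ from a random $S$, confirms that its eigenvalues are non-negative, and tests the midpoint inequality for $f_1$ at two random points. It looks at each block of variables with the other held fixed, which is what the proofs establish; it says nothing about convexity in $V_i$ and $S$ jointly. All sizes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
T, D, K = 6, 7, 3
S = rng.random((D, K))
U_i = rng.random((T, D))

# One diagonal block of the Hessian of f_1: sum_d (S_d*)^T S_d* = S^T S (K x K).
G_t = S.T @ S
eigvals = np.linalg.eigvalsh(G_t)
is_psd = bool(np.all(eigvals >= -1e-10))

def f1(V_i):
    """Reconstruction loss for one user with S held fixed."""
    return 0.5 * np.sum((U_i - V_i @ S.T) ** 2)

# Midpoint convexity check at two random points A and B.
A, B = rng.random((T, K)), rng.random((T, K))
midpoint_ok = bool(f1(0.5 * (A + B)) <= 0.5 * f1(A) + 0.5 * f1(B) + 1e-12)
```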
In step 2), the objective function $l$ is convex in $V_i$ and in $S$. To speed up the solution and reduce the storage required by the algorithm, $V_i$ and $S$ are solved by iterative updates based on gradient descent. As shown in Fig. 3, solving the objective function with the selected algorithm comprises the following steps:
Step 2.1) stage S2.1, initializing the matrices $V_1, V_2, \ldots, V_N$ and $S$:
$K$ denotes the number of user demand variables in the latent space, so that each $V_i$ has $T$ rows and $K$ columns and $S$ has $D$ rows and $K$ columns; every element of $V_1, V_2, \ldots, V_N$ and $S$ is randomly initialized to a real number between 0 and 1;
Step 2.2) stage S2.2, differentiating with respect to each element of $V_i$:
The final objective function is differentiated with respect to each element of $V_i$:
$$\frac{\partial l}{\partial v_{itk}} = (V_{it*} S^T - U_{it*}) S_{*k} + \Big(\alpha + \gamma \sum_{j=1, j \neq i}^{N} e_{ij}\Big) v_{itk} - \gamma \sum_{j=1, j \neq i}^{N} e_{ij} v_{jtk}$$
Step 2.3) stage S2.3, updating each element of $V_i$:
Each $v_{itk}$ is decreased by the step size multiplied by the gradient, $v_{itk} \leftarrow v_{itk} - \eta \, \partial l / \partial v_{itk}$, where $\eta$ is a manually set step size;
Step 2.4) stage S2.4, judging whether all matrices $V_i$ have been updated:
If all matrices $V_i$ have been updated, go to step 2.5); otherwise return to step 2.2) and update the next $V_i$;
Step 2.5) stage S2.5, differentiating with respect to each element of $S$:
The final objective function is differentiated with respect to each element of $S$:
$$\frac{\partial l}{\partial s_{dk}} = \sum_{i=1}^{N} (V_{i*k})^T \big[ V_i (S_{d*})^T - U_{i*d} \big] + \beta s_{dk}$$
Step 2.6) stage S2.6, updating each element of $S$:
Each $s_{dk}$ is decreased by the step size multiplied by the gradient, $s_{dk} \leftarrow s_{dk} - \tau \, \partial l / \partial s_{dk}$, where $\tau$ is a manually set step size;
Step 2.7) stage S2.7, judging whether the algorithm has converged:
If the algorithm has converged, go to step 2.8); otherwise return to step 2.2);
Step 2.8) stage S2.8, outputting the result:
The users' power demand is output, and the procedure ends.
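Steps 2.1) to 2.8) can be sketched in NumPy as below. Several details are assumptions rather than part of the described method: the convergence test (the text does not define one, so the change in the objective between rounds is compared with a tolerance `tol`), the default step sizes and balance parameters, the synthetic test data, and the function names. The gradient of the geographic term follows the per-user form of the formulas above, with the sums running over $j \neq i$.

```python
import numpy as np

def objective(U, V, S, E, alpha, beta, gamma):
    """Fused objective l, used here only to monitor convergence."""
    recon = 0.5 * sum(np.sum((U[i] - V[i] @ S.T) ** 2) for i in range(len(U)))
    diff = V[:, None] - V[None, :]
    pair = 0.5 * gamma * np.sum(E * np.sum(diff ** 2, axis=(2, 3)))
    return recon + 0.5 * alpha * np.sum(V ** 2) + 0.5 * beta * np.sum(S ** 2) + pair

def solve_joint_factorization(U, E, K, alpha=0.1, beta=0.1, gamma=0.01,
                              eta=0.02, tau=0.02, max_iter=500, tol=1e-8, seed=0):
    """Alternating gradient-descent updates of V_1..V_N and the shared S."""
    N, T, D = U.shape
    rng = np.random.default_rng(seed)
    V = rng.random((N, T, K))        # step 2.1: random values between 0 and 1
    S = rng.random((D, K))
    W = E.copy()
    np.fill_diagonal(W, 0.0)         # the sums in the gradient exclude j == i
    history = []
    for _ in range(max_iter):
        for i in range(N):           # steps 2.2-2.4: update each V_i in turn
            grad_V = ((V[i] @ S.T - U[i]) @ S
                      + (alpha + gamma * W[i].sum()) * V[i]
                      - gamma * np.einsum("j,jtk->tk", W[i], V))
            V[i] -= eta * grad_V
        # steps 2.5-2.6: gradient with respect to S, then update
        grad_S = sum((V[i] @ S.T - U[i]).T @ V[i] for i in range(N)) + beta * S
        S -= tau * grad_S
        # step 2.7: assumed stopping rule based on the change in the objective
        history.append(objective(U, V, S, E, alpha, beta, gamma))
        if len(history) > 1 and abs(history[-2] - history[-1]) < tol:
            break
    return V, S, history             # step 2.8: V holds the power demand matrices

# Synthetic rank-2 data for four users (illustrative only).
rng = np.random.default_rng(1)
N, T, D, K = 4, 6, 5, 2
S_true = rng.random((D, K))
U = np.stack([rng.random((T, K)) @ S_true.T for _ in range(N)])
E = np.full((N, N), 0.5)
np.fill_diagonal(E, 1.0)
V, S, history = solve_joint_factorization(U, E, K)
```

Fixed step sizes are used because the text specifies manually set values $\eta$ and $\tau$; on real data they would need tuning, and values that are too large can make the updates diverge.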
In step 3), the two basic indices are computed as follows:
(1) The similarity between the electricity demand matrices of different users is measured by:
$$\mathrm{Sim}(V_i, V_j) = \mathrm{tr}\left(V_i V_j^{T}\right) / \left(\|V_i\| \, \|V_j\|\right)$$
where $\mathrm{tr}(\cdot)$ denotes the sum of the diagonal elements of a matrix and $\|\cdot\|$ denotes the L2 norm of a matrix.
(2) The overall electricity demand matrix of all users in a user group $c$ is computed by:
$$\mathrm{Center}(c) = \sum_{u_i \in c} V_i / |c|$$
where $|c|$ denotes the number of users in group $c$.
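The two indices can be sketched as follows. The function names `sim` and `center` are hypothetical, and the matrix L2 norm is interpreted here as the Frobenius norm (the default of `np.linalg.norm` for a 2-D array), which is an assumption. The trace $\mathrm{tr}(V_i V_j^{T})$ equals the sum of the element-wise products, so the $T \times T$ product is never formed.

```python
import numpy as np

def sim(Vi, Vj):
    """Sim(Vi, Vj) = tr(Vi Vj^T) / (||Vi|| ||Vj||), with the Frobenius norm."""
    denom = np.linalg.norm(Vi) * np.linalg.norm(Vj)
    if denom == 0:
        return 0.0  # assumption: an all-zero matrix is treated as dissimilar
    return float(np.sum(Vi * Vj) / denom)

def center(V, members):
    """Center(c): mean demand matrix of the users whose indices are in `members`."""
    return sum(V[i] for i in members) / len(members)
```

`center` expects a non-empty group; an empty `members` list raises `ZeroDivisionError`.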
As shown in Figure 4, in step 3) the user segmentation method is as follows:
Step 3.1) Stage S3.1, randomly dividing the users into k clusters $c_1, c_2, \ldots, c_k$:
where k is the manually set number of user clusters;
Step 3.2) Stage S3.2, computing the overall representation of each cluster:
Compute the overall representation of each cluster using formula (2) of step 3) above;
Step 3.3) Stage S3.3, computing the similarity between the i-th user and every cluster:
Using formula (1) of step 3) above, treat the overall representation of each cluster as a special "user" and compute the similarity between the user and each cluster;
Step 3.4) Stage S3.4, adjusting the cluster label of the i-th user:
According to the result of step 3.3), set the cluster label of the i-th user to the cluster most similar to that user;
Step 3.5) Stage S3.5, judging whether all users have been adjusted:
Determine whether the cluster labels of all users have been adjusted; if so, proceed to step 3.6); if not, return to step 3.2);
Step 3.6) Stage S3.6, judging whether the users' cluster labels have converged:
Determine whether any user's cluster label changed during the latest round of label adjustment; if so, return to step 3.2); if not, proceed to step 3.7);
Step 3.7) Stage S3.7, outputting the result:
Output the cluster labels of all users; the procedure ends here.
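Steps 3.1) to 3.7) amount to a k-means-style loop driven by the similarity index rather than a distance. The sketch below is one possible reading, not the patented implementation: the names `segment_users` and `_sim`, the `seed` and `max_rounds` parameters, the Frobenius-norm interpretation of the matrix norm, and the choice to skip clusters that become empty are all assumptions. Cluster centres are recomputed before each user's adjustment, following the return to step 3.2) in step 3.5).

```python
import numpy as np

def _sim(A, B):
    """tr(A B^T) / (||A|| ||B||) with the Frobenius norm (assumed)."""
    denom = np.linalg.norm(A) * np.linalg.norm(B)
    return np.sum(A * B) / denom if denom > 0 else 0.0

def segment_users(V, k, seed=0, max_rounds=100):
    """Assign each demand matrix V[i] a cluster label in 0..k-1."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=len(V))          # Step 3.1: random split
    for _ in range(max_rounds):                       # cap is an assumption
        changed = False
        for i in range(len(V)):
            # Step 3.2: overall representation of every non-empty cluster
            centers = {
                c: np.mean([V[u] for u in np.flatnonzero(labels == c)], axis=0)
                for c in range(k) if np.any(labels == c)
            }
            # Steps 3.3-3.4: move user i to the most similar cluster
            best = max(centers, key=lambda c: _sim(V[i], centers[c]))
            if best != labels[i]:
                labels[i] = best
                changed = True
        # Step 3.6: stop once a full round changes no label
        if not changed:
            break
    return labels                                     # Step 3.7
```

Because the initial split is random, different seeds can give different segmentations, and a cluster that empties out is never refilled in this sketch.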
The present invention uses the electricity usage records in the power data together with the users' residence information. It constructs an electricity usage record matrix for each user and computes the similarity of the users' geographic location information with a structured comparison algorithm. A combined matrix decomposition model is used to model the users' electricity demand and to analyze the time factors and date factors that influence electricity usage behavior. The geographic location information is then fused in according to the users' geographic similarity, so that each user's representation in the latent demand space carries both electricity demand information and geographic location information. Finally, a clustering algorithm segments the users according to their representations in the latent demand space, which formally expresses the association between electricity usage behavior and place of residence. The result is an important reference for power departments in understanding user demand and the relationships among users, adjusting power production, and carrying out routine management activities such as maintaining electricity facilities.
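The data preparation stage summarized above can be sketched as follows. The helper names (`record_matrix`, `geo_similarity`, `similarity_matrix`) and the day-major layout of the input readings are assumptions made for illustration. The similarity is computed on the assumption that a component at level k counts only when every coarser component also matches, that is, the product in $e_{ij}$ runs over levels 1 to k.

```python
import numpy as np

def record_matrix(readings, T):
    """Reshape a flat, day-major series of D*T readings into the T x D matrix U_i."""
    readings = np.asarray(readings, dtype=float)
    if readings.size % T != 0:
        raise ValueError("number of readings must be a multiple of T")
    # Rows of the reshaped array are days; transpose so rows are time points.
    return readings.reshape(-1, T).T

def geo_similarity(g_i, g_j, lam):
    """e_ij for two addresses listed from the largest administrative unit down."""
    score = 0.0
    prefix_equal = True
    for a, b, weight in zip(g_i, g_j, lam):
        # Level k contributes only if levels 1..k all match.
        prefix_equal = prefix_equal and (a == b)
        if prefix_equal:
            score += weight
    return score / sum(lam)

def similarity_matrix(addresses, lam):
    """N x N matrix of e_ij over all users."""
    n = len(addresses)
    return np.array([[geo_similarity(addresses[i], addresses[j], lam)
                      for j in range(n)] for i in range(n)])
```

For example, two hypothetical addresses that share province and city but differ in district score $(\lambda_1 + \lambda_2) / \sum_k \lambda_k$, while addresses in different provinces score 0 even if lower-level names coincide.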
It should be emphasized that the embodiments described herein are illustrative rather than restrictive. The present invention is therefore not limited to the embodiments described in the detailed description; any other embodiment derived by those skilled in the art from the technical scheme of the present invention likewise falls within the scope of protection of the present invention.

Claims (7)

1. A power consumer subdivision method based on a combined matrix decomposition model, characterized in that the method comprises the following steps performed in order:
Step 1) Input the users' electricity usage record data and construct the users' electricity usage record matrices from the record data; input the users' geographic location information, represent it hierarchically, construct the users' geographic location similarity matrix, and adjust the weights of the different components at the different levels of the geographic location information;
Step 2) Construct the objective function of the combined matrix decomposition model from the users' electricity usage record matrices obtained in step 1), analyze the time factors and date factors that influence electricity usage behavior, and select a suitable algorithm for solving the objective function, so as to obtain the users' electricity demand;
Step 3) Segment the users according to the above electricity demand:
The segmentation process requires two basic indices: 1) a measure of the similarity between the electricity demand matrices of different users, and 2) the overall electricity demand matrix of all users in a user group.
2. The power consumer subdivision method based on a combined matrix decomposition model according to claim 1, characterized in that, in step 1), the method of inputting the users' electricity usage record data and constructing the electricity usage record matrices from the record data is:
All users in the input electricity usage record data are represented as the set:
$$U = \{u_1, u_2, \ldots, u_N\}$$
where N is the number of users in the data and $u_i$ denotes the i-th user;
The electricity usage record data of the i-th user is built into the electricity usage record matrix $U_i \in \mathbb{R}_{+}^{T \times D}$,
where D is the number of days covered by the record data, T is the number of uniformly spaced sampling points in each user's daily record, and $\mathbb{R}_{+}^{T \times D}$ denotes the non-negative real matrices with T rows and D columns; in addition, $U_{it*}$ and $U_{i*d}$ denote the t-th row and the d-th column of the matrix $U_i$, i.e. user $u_i$'s records at time point t of every day and on day d, respectively, and $u_{itd}$ denotes the element in row t and column d of $U_i$;
Finally, the electricity usage record matrices of all users, $U_1, U_2, \ldots, U_N$, are output.
3. The power consumer subdivision method based on a combined matrix decomposition model according to claim 1, characterized in that, in step 1), the method of inputting the users' geographic location information, representing it hierarchically, constructing the users' geographic location similarity matrix, and adjusting the weights of the different components at the different levels of the geographic location information is:
The geographic location information of the i-th user is represented as the structure:
$$g_i = \{g_i^{1}, g_i^{2}, \ldots, g_i^{n}\}$$
where $g_i^{k}$ is the string representation of one component of the place of residence, with the components arranged from the largest administrative unit to the smallest, i.e. province, city, district, township, street, community, and so on;
The similarity between the geographic location information of the i-th user and that of the j-th user is computed as:
$$e_{ij} = \sum_{k=1}^{n} \lambda_k \prod_{m=1}^{k} \delta\left(g_i^{m}, g_j^{m}\right) \Big/ \sum_{k=1}^{n} \lambda_k$$
where $e_{ij}$ is the similarity value of the two users' geographic location information, $\delta(\cdot)$ is an indicator function that takes the value 1 when the two strings are identical and 0 otherwise, and $\lambda_k \in (0, 1)$ are balance parameters that adjust the weights of the different components of the geographic location information; these parameters are selected according to the system's results on a validation data set;
Finally, the geographic location similarity matrix of all users, the $N \times N$ matrix with entries $e_{ij}$, is output.
4. The power consumer subdivision method based on a combined matrix decomposition model according to claim 1, characterized in that, in step 2), the method of constructing the objective function for the joint decomposition of the electricity usage record matrices is:
(1) Minimize, as far as possible, the loss incurred when each user's electricity usage record matrix is decomposed into a time factor matrix and a date factor matrix:
$$\min l_1 = \| U_i - V_i S_i^{T} \|^{2}$$
where $V_i$ is the time factor matrix that influences the electricity usage behavior of the i-th user and $S_i$ is the date factor matrix that influences the electricity usage behavior of the i-th user;
(2) Keep the date factor matrices obtained by decomposing the record matrices of different users consistent:
$$\min l_2 = \sum_{i=1}^{N} \| U_i - V_i S^{T} \|^{2}$$
Here the decomposition objectives of all N users are fused together and share a single date factor matrix;
(3) Minimize, as far as possible, the difference between the time factor matrices of users who are geographically close:
$$\min l_3 = \sum_{i=1}^{N} \sum_{j=1}^{N} e_{ij} \| V_i - V_j \|^{2}$$
(4) Keep the time factor matrices as smooth as possible:
$$\min l_4 = \sum_{i=1}^{N} \| V_i \|^{2}$$
(5) Keep the date factor matrix as smooth as possible:
$$\min l_5 = \| S \|^{2}$$
Here the squared L2 norm of a matrix, $\|\cdot\|^{2}$, is used to keep the matrix smooth.
Finally, the objective function of the combined matrix decomposition model is obtained by fusing the objectives in (2), (3), (4) and (5):
$$\min l = \frac{1}{2} \sum_{i=1}^{N} \| U_i - V_i S^{T} \|^{2} + \frac{\alpha}{2} \sum_{i=1}^{N} \| V_i \|^{2} + \frac{\beta}{2} \| S \|^{2} + \frac{\gamma}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} e_{ij} \| V_i - V_j \|^{2}$$
where $\alpha$, $\beta$ and $\gamma$ are balance parameters that adjust the weights among the individual objectives; they are selected according to the system's results on a validation data set.
5. The power consumer subdivision method based on a combined matrix decomposition model according to claim 1, characterized in that, in step 2), the method of selecting a suitable algorithm for solving the objective function comprises the following steps:
Step 2.1) Stage S2.1, initializing the matrices $V_i \in \mathbb{R}^{T \times K}$ and $S \in \mathbb{R}^{D \times K}$:
K denotes the number of user demand variables in the latent space; each element of the matrices $V_1, V_2, \ldots, V_N$ and $S$ is randomly initialized to a real number between 0 and 1;
Step 2.2) Stage S2.2, differentiating with respect to each element of the matrix $V_i$:
Differentiate the final objective function with respect to each element of $V_i$; the formula is:
$$\frac{\partial l}{\partial v_{itk}} = \left(V_{it*} S^{T} - U_{it*}\right) S_{*k} + \Big(\alpha + \gamma \sum_{j=1, j \neq i}^{N} e_{ij}\Big) v_{itk} - \gamma \sum_{j=1, j \neq i}^{N} e_{ij} v_{jtk};$$
Step 2.3) Stage S2.3, updating each element of the matrix $V_i$:
Subtract the step size multiplied by the gradient from each element; the update formula is $v_{itk} \leftarrow v_{itk} - \eta \, \partial l / \partial v_{itk}$, where $\eta$ is a manually set step size;
Step 2.4) Stage S2.4, judging whether all matrices $V_i$ have been updated:
If all matrices $V_i$ have been updated, proceed to step 2.5); otherwise return to step 2.2) and update the next $V_i$;
Step 2.5) Stage S2.5, differentiating with respect to each element of the matrix $S$:
Differentiate the final objective function with respect to each element of $S$; the formula is
$$\frac{\partial l}{\partial s_{dk}} = \sum_{i=1}^{N} \left(V_{i*k}\right)^{T} \left[ V_i \left(S_{d*}\right)^{T} - U_{i*d} \right] + \beta s_{dk};$$
Step 2.6) Stage S2.6, updating each element of the matrix $S$:
Subtract the step size multiplied by the gradient from each element; the update formula is $s_{dk} \leftarrow s_{dk} - \tau \, \partial l / \partial s_{dk}$, where $\tau$ is a manually set step size;
Step 2.7) Stage S2.7, judging whether the algorithm has converged:
If the algorithm has converged, proceed to step 2.8); otherwise return to step 2.2);
Step 2.8) Stage S2.8, outputting the result:
Output the user electricity demand result; the procedure ends here.
6. The power consumer subdivision method based on a combined matrix decomposition model according to claim 1, characterized in that, in step 3), the two basic indices are computed as follows:
(1) The similarity between the electricity demand matrices of different users is measured by:
$$\mathrm{Sim}(V_i, V_j) = \mathrm{tr}\left(V_i V_j^{T}\right) / \left(\|V_i\| \, \|V_j\|\right)$$
where $\mathrm{tr}(\cdot)$ denotes the sum of the diagonal elements of a matrix and $\|\cdot\|$ denotes the L2 norm of a matrix;
(2) The overall electricity demand matrix of all users in a user group $c$ is computed by:
$$\mathrm{Center}(c) = \sum_{u_i \in c} V_i / |c|$$
where $|c|$ denotes the number of users in group $c$.
7. The power consumer subdivision method based on a combined matrix decomposition model according to claim 1, characterized in that, in step 3), the user segmentation method is as follows:
Step 3.1) Stage S3.1, randomly dividing the users into k clusters $c_1, c_2, \ldots, c_k$:
where k is the manually set number of user clusters;
Step 3.2) Stage S3.2, computing the overall representation of each cluster:
Compute the overall representation of each cluster using formula (2) of step 3) above;
Step 3.3) Stage S3.3, computing the similarity between the i-th user and every cluster:
Using formula (1) of step 3) above, treat the overall representation of each cluster as a special "user" and compute the similarity between the user and each cluster;
Step 3.4) Stage S3.4, adjusting the cluster label of the i-th user:
According to the result of step 3.3), set the cluster label of the i-th user to the cluster most similar to that user;
Step 3.5) Stage S3.5, judging whether all users have been adjusted:
Determine whether the cluster labels of all users have been adjusted; if so, proceed to step 3.6); if not, return to step 3.2);
Step 3.6) Stage S3.6, judging whether the users' cluster labels have converged:
Determine whether any user's cluster label changed during the latest round of label adjustment; if so, return to step 3.2); if not, proceed to step 3.7);
Step 3.7) Stage S3.7, outputting the result:
Output the cluster labels of all users; the procedure ends here.
CN201510801889.8A 2015-11-19 2015-11-19 Power consumer subdivision method based on combined matrix decomposition model Pending CN105447767A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510801889.8A CN105447767A (en) 2015-11-19 2015-11-19 Power consumer subdivision method based on combined matrix decomposition model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510801889.8A CN105447767A (en) 2015-11-19 2015-11-19 Power consumer subdivision method based on combined matrix decomposition model

Publications (1)

Publication Number Publication Date
CN105447767A true CN105447767A (en) 2016-03-30

Family

ID=55557901

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510801889.8A Pending CN105447767A (en) 2015-11-19 2015-11-19 Power consumer subdivision method based on combined matrix decomposition model

Country Status (1)

Country Link
CN (1) CN105447767A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108021538A (en) * 2017-11-15 2018-05-11 国网甘肃省电力公司信息通信公司 A kind of electric quantity data restoration methods based on joint Weather information matrix decomposition
CN108734216A (en) * 2018-05-22 2018-11-02 广东工业大学 Classification of power customers method, apparatus and storage medium based on load curve form
CN109740790A (en) * 2018-11-28 2019-05-10 国网天津市电力公司 A kind of user power consumption prediction technique extracted based on temporal aspect
CN116805785A (en) * 2023-08-17 2023-09-26 国网浙江省电力有限公司金华供电公司 Power load hierarchy time sequence prediction method based on random clustering

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108021538A (en) * 2017-11-15 2018-05-11 国网甘肃省电力公司信息通信公司 A kind of electric quantity data restoration methods based on joint Weather information matrix decomposition
CN108021538B (en) * 2017-11-15 2021-06-04 国网甘肃省电力公司信息通信公司 Electric quantity data recovery method based on joint weather information matrix decomposition
CN108734216A (en) * 2018-05-22 2018-11-02 广东工业大学 Classification of power customers method, apparatus and storage medium based on load curve form
CN109740790A (en) * 2018-11-28 2019-05-10 国网天津市电力公司 A kind of user power consumption prediction technique extracted based on temporal aspect
CN116805785A (en) * 2023-08-17 2023-09-26 国网浙江省电力有限公司金华供电公司 Power load hierarchy time sequence prediction method based on random clustering
CN116805785B (en) * 2023-08-17 2023-11-28 国网浙江省电力有限公司金华供电公司 Power load hierarchy time sequence prediction method based on random clustering

Similar Documents

Publication Publication Date Title
CN113962364B (en) Multi-factor power load prediction method based on deep learning
Domínguez-Muñoz et al. Selection of typical demand days for CHP optimization
CN108446794A (en) One kind being based on multiple convolutional neural networks combination framework deep learning prediction techniques
CN104951425A (en) Cloud service performance adaptive action type selection method based on deep learning
CN105447767A (en) Power consumer subdivision method based on combined matrix decomposition model
CN111724039B (en) Recommendation method for recommending customer service personnel to power users
CN106952027A (en) A kind of 10kV distribution network lines plan access capacity computational methods
CN104376502A (en) Electric power customer credit comprehensive evaluation method based on grey relational degree
CN109117872A (en) A kind of user power utilization behavior analysis method based on automatic Optimal Clustering
CN105335800A (en) Method for forecasting electricity consumption of power consumers based on joint learning
CN112614011A (en) Power distribution network material demand prediction method and device, storage medium and electronic equipment
CN108960488A (en) A kind of accurate prediction technique of saturation loading spatial distribution based on deep learning and Multi-source Information Fusion
CN104636834B (en) A kind of improved joint probability plan model system optimization method
CN111027741A (en) Method for constructing space-time dimension-oriented generalized load model analysis library
CN113591368A (en) Comprehensive energy system multi-energy load prediction method and system
Zhao et al. Short-term microgrid load probability density forecasting method based on k-means-deep learning quantile regression
CN101807218B (en) Heterogeneous network-based land pattern succession simulation system
CN109657846A (en) Power grid alternative subsidy scale impact factor screening technique
Guan et al. Customer load forecasting method based on the industry electricity consumption behavior portrait
CN116701965A (en) BIRCH clustering algorithm-based panoramic carbon representation method for enterprise users
CN114839586B (en) Low-voltage station metering device misalignment calculation method based on EM algorithm
Zhang et al. A segmented evaluation model for building energy performance considering seasonal dynamic fluctuations
Guo et al. Mobile user credit prediction based on lightgbm
CN108615091A (en) Electric power meteorology load data prediction technique based on cluster screening and neural network
CN105741143A (en) Load characteristic and cluster analysis based electric power commodity pricing model establishment method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160330
