CN105447767A

CN105447767A - Power consumer subdivision method based on combined matrix decomposition model

Info

Publication number: CN105447767A
Application number: CN201510801889.8A
Authority: CN
Inventors: 王扬; 刘杰; 吴凡; 章斌; 魏睐; 杨得博; 梅振鹏; 郎赫
Original assignee: State Grid Corp of China SGCC; State Grid Tianjin Electric Power Co Ltd
Current assignee: State Grid Corp of China SGCC; State Grid Tianjin Electric Power Co Ltd
Priority date: 2015-11-19
Filing date: 2015-11-19
Publication date: 2016-03-30

Abstract

The invention discloses a power consumer subdivision method based on a combined matrix decomposition model. The method comprises the steps: inputting user power utilization record data, and constructing a user power utilization record matrix; inputting the geographical position information of the user, representing the geographical position information through hierarchy, constructing a user geographical position information similarity matrix, and regulating the weights of different parts in the geographical position information at different hierarchies; constructing a target function of the combined matrix decomposition model according to the user power utilization record matrix, and selecting a reasonable target function solving algorithm for solving, so as to obtain the power utilization demands of users; and carrying out the subdividing of the users according to the power utilization demands of the users. Compared with a conventional method, the method enables the representation of the users in a demand space to be more abundant in meaning. The method employs a clustering algorithm, integrates the characteristics of the users in a demand hiding space and the relation among the users for clustering of the users, and enables the correlation of users in each cluster to be closer. The power utilization demands and geographic positions of the users of different clusters are more different.

Description

A power consumer subdivision method based on a combined matrix decomposition model

Technical field

The invention belongs to the technical fields of computer application technology, data mining and electric power data analysis, and in particular relates to a power consumer subdivision method based on a combined matrix decomposition model.

Background art

As power grids become more information-driven, power systems generate large amounts of data, which poses new challenges for electric power data analysis. Traditional electric power data analysis concentrates on data from the production and supply links. User data is usually analyzed for all users as a whole, which ignores both the characteristics of individual users and the relations among users. User data typically contains the user's power utilization behavior, geographical position, time, date and other information. The common practice in current user data analysis is to divide users with a clustering algorithm according to their power utilization records alone. This lacks a comprehensive analysis of the other information, which often contains key factors that affect power utilization behavior.

For example, Wang Lei analyzed the behavioral characteristics of power industry customers and clustered users with the k-means algorithm according to their power utilization behavior. Wu Ling et al. applied lifetime value theory and, through extensive investigation and expert consultation, constructed a power consumer value evaluation system, using the analytic hierarchy process to assess user value. Song Caihua et al. established a customer subdivision model with a comprehensive customer value assessment method, built a subdivision index system based on comprehensive customer value, and subdivided users by combining the entropy weight method with expert knowledge. Other researchers set up a subdivision index system along three dimensions (user reliability requirements, user value and user behavior) and clustered users with the k-means algorithm. Still others built a behavior and value evaluation index system for large power customers, obtained the demand characteristics and economic value evaluation of those customers, and designed a comprehensive credit evaluation system for large customers, achieving a more comprehensive and accurate fine-grained classification of large power customers.

The matrix decomposition model is a multivariate analysis model that has become popular in recent years. Because it is easy to interpret on data such as text and images, it is widely used in data mining. A matrix decomposition model decomposes a data matrix into the product of a hidden feature matrix and a coefficient matrix, yielding a representation of the original input data in a low-dimensional hidden feature space. Lee and Seung proposed a non-negative matrix factorization model that imposes non-negativity constraints on the input data matrix and on the output hidden feature matrix and coefficient matrix, obtains representations of text and images, and applies them to text clustering and image restoration. On this basis, Cai and He et al. used a relation graph to constrain the representation of the data in the hidden space, improving the performance of non-negative matrix factorization in text clustering. Other researchers proposed a relation-constrained matrix decomposition model that fuses the relations among data items with the data content to extract hidden features, and obtained good results on text classification data. Takeuchi et al. decomposed multiple non-negative matrices jointly, fusing user records, user social relations and song tags to obtain a unified representation of users and songs for song recommendation.

Summary of the invention

To solve the above problems, the object of the present invention is to provide a power consumer subdivision method based on a combined matrix decomposition model.

To achieve this object, the power consumer subdivision method based on a combined matrix decomposition model provided by the invention comprises the following steps performed in order:

Step 1) Input the users' power utilization record data and construct the user power utilization record matrices from it; input the users' geographical position information, represent it hierarchically, construct the user geographical position information similarity matrix, and adjust the weights of the different components of the geographical position information at different levels;

Step 2) Construct the objective function of the combined matrix decomposition model from the user power utilization record matrices obtained in step 1), analyze the time factors and date factors that affect user power utilization behavior, and select a reasonable objective function solving algorithm to obtain the users' power utilization demands;

Step 3) Subdivide the users according to the above power utilization demands:

During user subdivision, two basic indexes need to be calculated: 1) a measure of the similarity between the power utilization demand matrices of different users, and 2) the overall power utilization demand matrix of all users in a user group.

In step 1), the method of inputting the users' power utilization record data and constructing the power utilization record matrices from it is as follows:

All users in the input power utilization record data are represented as the set:

U = \{u_{1}, u_{2}, \ldots, u_{N}\}

where N is the number of users in the data and u_i denotes the i-th user.

The power utilization record data of the i-th user is built into the power utilization record matrix:

U_{i} \in \mathbb{R}_{+}^{T \times D}

where D is the number of days covered by the power utilization record data, T is the number of uniformly spaced sampling points in each user's daily record, and \mathbb{R}_{+}^{T \times D} denotes the set of nonnegative real matrices with T rows and D columns. In addition, U_i^{t*} and U_i^{*d} denote the t-th row and the d-th column of U_i, that is, the records of user u_i at time point t on every day and all records of user u_i on day d, and u_i^{td} denotes the element in row t and column d of U_i;

Finally, the power utilization record matrices of all users are output:

U_{1}, U_{2}, \ldots, U_{N}
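The construction of one user's T × D matrix can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `build_usage_matrix` and the assumption that readings arrive as a flat, day-major sequence (all T samples of day 1, then day 2, and so on) are hypothetical.

```python
import numpy as np

def build_usage_matrix(readings, num_days, samples_per_day):
    """Arrange one user's readings into a T x D matrix U_i.

    Assumption: `readings` is a flat, day-major sequence, i.e. the T samples
    of day 1 followed by the T samples of day 2, and so on.
    """
    values = np.asarray(readings, dtype=float)
    if values.size != num_days * samples_per_day:
        raise ValueError("expected exactly D * T readings")
    if np.any(values < 0):
        raise ValueError("power utilization readings must be nonnegative")
    # Reshape to (D, T), then transpose so that rows are time points
    # and columns are days.
    return values.reshape(num_days, samples_per_day).T

# Toy example: D = 2 days, T = 3 samples per day.
U_i = build_usage_matrix([1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
                         num_days=2, samples_per_day=3)
row_t = U_i[0, :]   # readings at the first time point on every day
col_d = U_i[:, 0]   # all readings on the first day
```

Row slices of the result correspond to U_i^{t*} and column slices to U_i^{*d}.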

In step 1), the method of inputting the users' geographical position information, representing it hierarchically, constructing the user geographical position information similarity matrix, and adjusting the weights of the different components at different levels is as follows:

The geographical position information of the i-th user is represented as the structure:

g_{i} = \{g_{i}^{1}, g_{i}^{2}, \ldots, g_{i}^{n}\}

where g_i^k is the character string of one component of the residential address. The components follow the administrative units (province, city, district, town, street, community and so on) and are arranged in order from largest to smallest;

The geographical position information similarity of the i-th user and the j-th user is calculated as:

e^{ij} = \frac{\sum_{k=1}^{n} \lambda_{k} \prod_{m=1}^{k} \delta(g_{i}^{k}, g_{j}^{k})}{\sum_{k=1}^{n} \lambda_{k}}

where e^{ij} is the similarity value of the two users' geographical position information; δ(·,·) is an indicator function that takes the value 1 when the two character strings are identical and 0 otherwise; λ_k ∈ (0, 1) is a balance parameter that adjusts the weight of each component of the geographical position information and is selected according to the results of the system on a validation data set;

Finally, the geographical position information similarity matrix of all users is output, that is, the N × N matrix whose element in row i and column j is e^{ij}.
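A small sketch of the hierarchical similarity computation is given below. It rests on one interpretive assumption: the product over m = 1..k is read as requiring all levels up to k to match, so that a matching street name under different cities contributes nothing. The function names, the placeholder addresses and the weight values are hypothetical.

```python
import numpy as np

def geo_similarity(g_i, g_j, weights):
    """Weighted hierarchical match between two address tuples.

    Assumed reading of the product term: level k contributes weights[k]
    only if levels 1..k all match.
    """
    total = 0.0
    prefix_match = True
    for part_i, part_j, lam in zip(g_i, g_j, weights):
        prefix_match = prefix_match and (part_i == part_j)
        if prefix_match:
            total += lam
    return total / sum(weights)

def similarity_matrix(addresses, weights):
    """Build the N x N matrix whose (i, j) entry is e^{ij}."""
    n = len(addresses)
    E = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            E[i, j] = geo_similarity(addresses[i], addresses[j], weights)
    return E

# Placeholder addresses ordered from the largest unit to the smallest.
addresses = [
    ("city-1", "district-1", "street-1"),
    ("city-1", "district-1", "street-2"),
    ("city-1", "district-2", "street-1"),
]
weights = [0.2, 0.3, 0.5]   # illustrative lambda_k values in (0, 1)
E = similarity_matrix(addresses, weights)
```

Under this reading the first two users share city and district, and the first and third share only the city, so the second pair is less similar even though their street strings coincide.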

In step 2), the method of constructing the objective function for the combined decomposition of the power utilization record matrices is as follows:

(1) Minimize the loss incurred when each user's power utilization record matrix is decomposed into a time factor matrix and a date factor matrix:

\min l_{1} = \|U_{i} - V_{i} S_{i}^{T}\|^{2}

where V_i is the time factor matrix affecting the power utilization behavior of the i-th user and S_i is the date factor matrix affecting the power utilization behavior of the i-th user;

(2) Keep the date factor matrices obtained by decomposing the power utilization record matrices of different users consistent:

\min l_{2} = \sum_{i=1}^{N} \|U_{i} - V_{i} S^{T}\|^{2}

Here the decomposition objectives of all N users are fused together and share one date factor matrix S;

(3) Minimize the difference between the time factor matrices of users who are geographically close:

\min l_{3} = \sum_{i=1}^{N} \sum_{j=1}^{N} e^{ij} \|V_{i} - V_{j}\|^{2}

(4) Keep the time factor matrices as smooth as possible:

\min l_{4} = \sum_{i=1}^{N} \|V_{i}\|^{2}

(5) Keep the date factor matrix as smooth as possible:

\min l_{5} = \|S\|^{2}

Here the squared L₂ norm of a matrix, \|\cdot\|^{2}, is used to keep the matrices smooth.

Finally, the objective function of the combined matrix decomposition model is obtained by fusing the objective functions in steps (2), (3), (4) and (5):

\min l = \frac{1}{2} \sum_{i=1}^{N} \|U_{i} - V_{i} S^{T}\|^{2} + \frac{\alpha}{2} \sum_{i=1}^{N} \|V_{i}\|^{2} + \frac{\beta}{2} \|S\|^{2} + \frac{\gamma}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} e^{ij} \|V_{i} - V_{j}\|^{2}

where α, β and γ are balance parameters that adjust the weights among the individual objective functions and are selected according to the results of the system on a validation data set.
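To make the fused objective concrete, the sketch below evaluates l for given factor matrices. It assumes the users' matrices are stacked into arrays of shape (N, T, D) for the records and (N, T, K) for the time factors, with S of shape (D, K), and it treats the matrix norm as the Frobenius norm. The function name `joint_objective` and the array layout are illustrative choices, not taken from the patent.

```python
import numpy as np

def joint_objective(U, V, S, E, alpha, beta, gamma):
    """Evaluate the fused objective l.

    Assumed shapes: U (N, T, D), V (N, T, K), S (D, K), E (N, N).
    """
    # Reconstruction V_i S^T for every user at once.
    recon = np.einsum("ntk,dk->ntd", V, S)
    fit = 0.5 * np.sum((U - recon) ** 2)
    reg_v = 0.5 * alpha * np.sum(V ** 2)
    reg_s = 0.5 * beta * np.sum(S ** 2)
    # Pairwise squared distances ||V_i - V_j||^2 between time factor matrices.
    diff = V[:, None, :, :] - V[None, :, :, :]
    pair_sq = np.sum(diff ** 2, axis=(2, 3))
    geo = 0.5 * gamma * np.sum(E * pair_sq)
    return fit + reg_v + reg_s + geo

# Tiny hand-checkable case: one user, T = D = K = 1.
value = joint_objective(
    U=np.array([[[3.0]]]), V=np.array([[[1.0]]]), S=np.array([[2.0]]),
    E=np.array([[1.0]]), alpha=2.0, beta=1.0, gamma=5.0,
)
```

In the toy case the fit term is 0.5, the V penalty 1.0, the S penalty 2.0 and the geographic term 0, which makes the weighting of the four parts easy to verify by hand.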

In step 2), the method of selecting a reasonable objective function solving algorithm and solving comprises the following steps:

Step 2.1) Stage S2.1, initialize the matrices V_i \in \mathbb{R}^{T \times K} and S \in \mathbb{R}^{D \times K}:

K is the number of user demand variables in the hidden space; each element of the matrices V_1, V_2, ..., V_N and S is randomly initialized to a real number between 0 and 1;

Step 2.2) Stage S2.2, differentiate with respect to each element of the matrix V_i:

The final objective function is differentiated with respect to each element of V_i:

\frac{\partial l}{\partial v_{i}^{tk}} = (V_{i}^{t*} S^{T} - U_{i}^{t*}) S^{*k} + \left(\alpha + \gamma \sum_{j=1, j \neq i}^{N} e^{ij}\right) v_{i}^{tk} - \gamma \sum_{j=1, j \neq i}^{N} e^{ij} v_{j}^{tk};

Step 2.3) Stage S2.3, update each element of the matrix V_i:

The step size multiplied by the gradient is subtracted from each element:

v_{i}^{tk} \leftarrow v_{i}^{tk} - \eta \frac{\partial l}{\partial v_{i}^{tk}}

where η is a manually set step size;

Step 2.4) Stage S2.4, judge whether all matrices V_i have been updated:

If all matrices V_i have been updated, proceed to step 2.5); otherwise return to step 2.2) and update the next V_i;

Step 2.5) Stage S2.5, differentiate with respect to each element of the matrix S:

The final objective function is differentiated with respect to each element of S:

\frac{\partial l}{\partial s^{dk}} = \sum_{i=1}^{N} (V_{i}^{*k})^{T} [V_{i} (S^{d*})^{T} - U_{i}^{*d}] + \beta s^{dk};

Step 2.6) Stage S2.6, update each element of the matrix S:

The step size multiplied by the gradient is subtracted from each element:

s^{dk} \leftarrow s^{dk} - \tau \frac{\partial l}{\partial s^{dk}}

where τ is a manually set step size;

Step 2.7) Stage S2.7, judge whether the algorithm has converged:

If the algorithm has converged, proceed to step 2.8); otherwise return to step 2.2);

Step 2.8) Stage S2.8, output the result:

Output the users' power utilization demand result; the procedure ends here.
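Steps 2.1) to 2.8) can be sketched in vectorized form as below. Several details are assumptions made for illustration, since the text does not fix them: the convergence test (largest absolute parameter change below `tol`), the iteration cap `max_iter`, the random seed, and the stacked array layout U of shape (N, T, D). The gradients follow the element-wise formulas above written in matrix form.

```python
import numpy as np

def solve_joint_factorization(U, E, K, alpha, beta, gamma,
                              eta=1e-3, tau=1e-3, max_iter=500,
                              tol=1e-6, seed=0):
    """Gradient-descent sketch of steps 2.1) to 2.8).

    Assumed shapes: U (N, T, D), E (N, N). Returns V (N, T, K), S (D, K).
    """
    N, T, D = U.shape
    rng = np.random.default_rng(seed)
    V = rng.random((N, T, K))          # step 2.1: entries drawn from [0, 1)
    S = rng.random((D, K))
    E_off = E - np.diag(np.diag(E))    # the sums run over j != i
    for _ in range(max_iter):
        V_old, S_old = V.copy(), S.copy()
        for i in range(N):             # steps 2.2 to 2.4: one V_i at a time
            residual = V[i] @ S.T - U[i]
            neighbours = np.einsum("j,jtk->tk", E_off[i], V)
            grad_V = (residual @ S
                      + (alpha + gamma * E_off[i].sum()) * V[i]
                      - gamma * neighbours)
            V[i] = V[i] - eta * grad_V
        grad_S = beta * S              # steps 2.5 and 2.6
        for i in range(N):
            grad_S = grad_S + (V[i] @ S.T - U[i]).T @ V[i]
        S = S - tau * grad_S
        # Step 2.7: assumed convergence test on the parameter change.
        change = max(np.abs(V - V_old).max(), np.abs(S - S_old).max())
        if change < tol:
            break
    return V, S                        # step 2.8

# Synthetic toy data: 2 users, T = 3, D = 4, exact rank K = 2.
data_rng = np.random.default_rng(1)
V_true = data_rng.random((2, 3, 2)) + 1.0
S_true = data_rng.random((4, 2)) + 1.0
U = np.einsum("ntk,dk->ntd", V_true, S_true)
E = np.array([[1.0, 0.5], [0.5, 1.0]])
V, S = solve_joint_factorization(U, E, K=2, alpha=0.01, beta=0.01,
                                 gamma=0.1, eta=0.01, tau=0.01)
```

The step sizes here were chosen small enough for this toy problem; on real data they would need tuning, as the text notes that η and τ are set manually.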

In step 3), the two basic indexes are calculated as follows:

(1) The similarity between the power utilization demand matrices of different users is measured by:

Sim(V_{i}, V_{j}) = \frac{tr(V_{i} V_{j}^{T})}{\|V_{i}\| \|V_{j}\|}

where tr(·) denotes the sum of the elements on the diagonal of a matrix and \|\cdot\| denotes the L₂ norm of a matrix.

(2) The overall power utilization demand matrix of all users in a user group c is calculated as:

Center(c) = \frac{\sum_{u_{i} \in c} V_{i}}{|c|}

where |c| denotes the number of users in group c.
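The two indexes can be illustrated with a few lines of NumPy. The sketch assumes the matrix norm is the Frobenius norm (the default of `np.linalg.norm` for a 2-D array), which makes the similarity a cosine similarity between the flattened demand matrices. The function names are hypothetical, and the code does not guard against all-zero matrices.

```python
import numpy as np

def demand_similarity(V_i, V_j):
    """Sim(V_i, V_j) = tr(V_i V_j^T) / (||V_i|| ||V_j||), Frobenius norm assumed."""
    return np.trace(V_i @ V_j.T) / (np.linalg.norm(V_i) * np.linalg.norm(V_j))

def cluster_center(V, members):
    """Center(c): mean of the demand matrices of the users indexed by `members`."""
    return V[list(members)].mean(axis=0)

# Toy demand matrices with T = 1 and K = 2.
V_a = np.array([[1.0, 2.0]])
V_b = np.array([[3.0, 4.0]])
stacked = np.stack([V_a, V_b])          # shape (N, T, K) = (2, 1, 2)
center = cluster_center(stacked, [0, 1])
self_sim = demand_similarity(V_a, V_a)
```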

In step 3), the user subdivision method is as follows:

Step 3.1) Stage S3.1, randomly divide the users into k clusters c_1, c_2, ..., c_k:

where k is the manually set number of user clusters;

Step 3.2) Stage S3.2, calculate the overall representation of each cluster:

The overall representation of each cluster is calculated with the formula in item (2) of step 3) above;

Step 3.3) Stage S3.3, calculate the similarity between the i-th user and all clusters:

Using the formula in item (1) of step 3) above, the overall representation of each cluster is treated as a special "user" and the similarity between the user and each cluster is calculated;

Step 3.4) Stage S3.4, adjust the cluster label of the i-th user:

According to the result of step 3.3), the cluster label of the i-th user is changed to the cluster most similar to that user;

Step 3.5) Stage S3.5, judge whether all users have been adjusted:

Judge whether the cluster labels of all users have been adjusted; if so, proceed to step 3.6); if not, return to step 3.2);

Step 3.6) Stage S3.6, judge whether the cluster labels have converged:

Judge whether any user's cluster label changed during the last round of label adjustment; if so, return to step 3.2); if not, proceed to step 3.7);

Step 3.7) Stage S3.7, output the result:

Output the cluster labels of all users; the procedure ends here.
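The loop in steps 3.1) to 3.7) resembles k-means with a cosine-style similarity, with cluster centers recomputed before each user is reassigned. A minimal sketch follows. The parameters `init_labels`, `max_rounds` and `seed`, and the choice to skip empty clusters, are assumptions added for reproducibility and safety; they are not specified in the text.

```python
import numpy as np

def segment_users(V, k, init_labels=None, max_rounds=100, seed=0):
    """Sketch of steps 3.1) to 3.7). V has assumed shape (N, T, K)."""
    N = V.shape[0]
    flat = V.reshape(N, -1)   # tr(V_i V_j^T) equals the dot product of flattened matrices
    if init_labels is None:
        labels = np.random.default_rng(seed).integers(0, k, size=N)  # step 3.1
    else:
        labels = np.array(init_labels, dtype=int)
    for _ in range(max_rounds):
        changed = False
        for i in range(N):
            sims = np.full(k, -np.inf)
            for c in range(k):
                members = flat[labels == c]
                if len(members) == 0:
                    continue                      # assumption: skip empty clusters
                center = members.mean(axis=0)     # step 3.2
                denom = np.linalg.norm(flat[i]) * np.linalg.norm(center)
                if denom > 0:
                    sims[c] = flat[i] @ center / denom   # step 3.3
            best = int(np.argmax(sims))
            if best != labels[i]:                 # step 3.4
                labels[i] = best
                changed = True
        if not changed:                           # step 3.6
            break
    return labels                                 # step 3.7

# Four toy users forming two obvious groups (T = 1, K = 2).
V_demo = np.array([[[1.0, 0.1]], [[0.9, 0.0]], [[0.0, 1.0]], [[0.1, 0.9]]])
labels = segment_users(V_demo, k=2, init_labels=[0, 1, 0, 1])
```

Starting from a deliberately mixed labeling, the first two users and the last two users end up in separate clusters after two rounds.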

Effects of the power consumer subdivision method based on a combined matrix decomposition model provided by the invention:

The analysis and representation of user demand in the invention incorporate the user's power utilization behavior, time, date, geographical position and other information, so that, compared with traditional methods, the representation of users in the demand space carries richer meaning.

The invention uses a clustering algorithm that clusters users in the hidden demand space by considering both the characteristics of the users themselves and the relations among users, so that the users within each cluster are more closely related, while users in different clusters differ more in power utilization demand, geographical position and so on.

Description of the drawings

Fig. 1 is a schematic diagram of the overall system structure of the power consumer subdivision method based on a combined matrix decomposition model provided by the invention.

Fig. 2 is a flow chart of the power consumer subdivision method based on a combined matrix decomposition model provided by the invention.

Fig. 3 is a flow chart of the method for solving the combined matrix decomposition model in the power consumer subdivision method provided by the invention.

Fig. 4 is a flow chart of the user subdivision method in the power consumer subdivision method based on a combined matrix decomposition model provided by the invention.

Detailed description of the embodiments

The power consumer subdivision method based on a combined matrix decomposition model provided by the invention is described in detail below with reference to the drawings and specific embodiments.

As shown in Fig. 1, the invention mainly uses data mining theory and methods to analyze the users in electric power data. To ensure normal operation of the system, a specific implementation requires a computer platform with at least 8 GB of memory, a CPU with at least 4 cores and a clock frequency of at least 2.6 GHz, and a 64-bit operating system (Windows 7 or later), with the necessary software environment installed, including an Oracle database, Java 1.7 or later and Matlab 2011b or later.

As shown in Fig. 2, the power consumer subdivision method based on a combined matrix decomposition model provided by the invention comprises steps 1) to 3) described above, performed in order.

The user power utilization record matrix captures two key factors that affect user power utilization behavior: 1) the rows of the matrix represent the different sampling time points within each day, and 2) the columns of the matrix represent the days covered by the user's record data.

The objective function covers five aspects: 1) minimizing the loss incurred when each user's power utilization record matrix is decomposed into a time factor matrix and a date factor matrix; 2) keeping the date factor matrices obtained by decomposing different users' power utilization record matrices consistent; 3) minimizing the difference between the time factor matrices of users who are geographically close; 4) keeping the time factor matrices and the date factor matrix as smooth as possible; 5) preserving the convexity of the objective function so that the model has a globally optimal solution. The solving algorithm should meet three basic conditions: 1) it converges within an acceptable time; 2) its storage requirements stay within a reasonable range; 3) it can be executed in parallel to improve computational efficiency.

Subdividing users according to their power utilization demands should satisfy two basic conditions: 1) users in the same group should have similar power utilization demand matrices, and 2) the power utilization demand matrices of users in different groups should differ as much as possible. Therefore, to improve the quality of the subdivision result, two basic indexes need to be calculated during user subdivision: 1) a measure of the similarity between the power utilization demand matrices of different users, and 2) the overall power utilization demand matrix of all users in a user group. In addition, the algorithm should be able to optimize itself by reassigning users that were placed in the wrong group.

In step 2), the squared L₂ norm of a matrix, \|\cdot\|^{2}, is used to measure both the loss in each step and the smoothness of the matrices. This ensures the convexity of the final objective function l, so that the model has a globally optimal solution. A simple derivation of the convexity of the objective function follows:

Proof 1: the objective function l is a convex function of V_i.

First, the objective function is rewritten as:

l = \frac{1}{2} \|U_{i} - V_{i} S^{T}\|^{2} + \frac{\alpha}{2} \|V_{i}\|^{2} + \frac{\gamma}{2} \sum_{j=1, j \neq i}^{N} e^{ij} \|V_{i} - V_{j}\|^{2} + C_{1}

where C_1 is a constant independent of V_i. Let l = f_1 + α f_2 + γ f_3 + C_1, where:

f_{1} = \frac{1}{2} \|U_{i} - V_{i} S^{T}\|^{2}, \quad f_{2} = \frac{1}{2} \|V_{i}\|^{2}, \quad f_{3} = \frac{1}{2} \sum_{j=1, j \neq i}^{N} e^{ij} \|V_{i} - V_{j}\|^{2}

It is shown below that f_1, f_2 and f_3 are each convex functions of V_i.

(1)

f_{1} = \frac{1}{2} \sum_{t=1}^{T} \sum_{d=1}^{D} [u_{i}^{td} - V_{i}^{t*} (S^{d*})^{T}]^{2} = C_{2} + \frac{1}{2} \sum_{t=1}^{T} \left\{ V_{i}^{t*} \left[\sum_{d=1}^{D} (S^{d*})^{T} S^{d*}\right] (V_{i}^{t*})^{T} - 2 \left(\sum_{d=1}^{D} u_{i}^{td} S^{d*}\right) (V_{i}^{t*})^{T} \right\},

where C_2 is a constant independent of V_i. Differentiating twice with respect to the rows of V_i gives

G_{t} \overset{\Delta}{=} \frac{\partial^{2} f_{1}}{\partial (V_{i}^{t*})^{T} \partial V_{i}^{t*}} = \sum_{d=1}^{D} (S^{d*})^{T} S^{d*}

and

\frac{\partial^{2} f_{1}}{\partial (V_{i}^{t*})^{T} \partial V_{i}^{p*}} = 0 \quad (t \neq p).

Let

v_{i} = (V_{i}^{1*}, V_{i}^{2*}, \ldots, V_{i}^{T*})^{T}

denote the high-dimensional vector obtained by expanding V_i row by row. Then the Hessian matrix of f_1 with respect to v_i is the block diagonal matrix diag[G_1, G_2, ..., G_T]. Furthermore, because

z^{T} \left[\sum_{d=1}^{D} (S^{d*})^{T} S^{d*}\right] z = \sum_{d=1}^{D} z^{T} (S^{d*})^{T} S^{d*} z = \sum_{d=1}^{D} (S^{d*} z)^{2} \geq 0

holds for any nonzero K-dimensional vector z, every block G_t is positive semidefinite and det(G_t) ≥ 0, where det(·) denotes the determinant of a matrix. Therefore f_1 is a convex function of V_i.

(2) With v_i as defined in step (1), f_2 = \frac{1}{2} v_{i}^{T} v_{i}, whose Hessian matrix with respect to v_i is the identity matrix, so f_2 is a convex function of V_i.

(3) Expanding each term gives \|V_{i} - V_{j}\|^{2} = \|V_{i}\|^{2} - 2 tr(V_{i} V_{j}^{T}) + C_{2}, where C_2 is a constant independent of V_i. By step (2) of Proof 1, \|V_{i} - V_{j}\|^{2} is therefore a convex function of v_i. Since e^{ij} ≥ 0, f_3 is a convex function of V_i.

In summary, since α, γ ≥ 0, the objective function l is a convex function of V_i, which completes the proof.

Proof 2: the objective function l is a convex function of S.

Likewise, the objective function is first rewritten as:

l = \frac{1}{2} \sum_{i=1}^{N} \|U_{i} - V_{i} S^{T}\|^{2} + \frac{\beta}{2} \|S\|^{2} + C_{1}

where C_1 is a constant independent of S. Let l = \sum_{i=1}^{N} f_{i} + \frac{\beta}{2} \|S\|^{2} + C_{1}, where:

f_{i} = \frac{1}{2} \|U_{i} - V_{i} S^{T}\|^{2} = \frac{1}{2} \sum_{d=1}^{D} \left\{ S^{d*} \left[\sum_{t=1}^{T} (V_{i}^{t*})^{T} V_{i}^{t*}\right] (S^{d*})^{T} - 2 \left(\sum_{t=1}^{T} u_{i}^{td} V_{i}^{t*}\right) (S^{d*})^{T} \right\} + C_{2}

and C_2 is a constant independent of S. As in Proof 1, let s = (S^{1*}, S^{2*}, \ldots, S^{D*})^{T} denote the high-dimensional vector obtained by expanding S row by row. The Hessian matrix of f_i with respect to s is then the block diagonal matrix

G_{s} \overset{\Delta}{=} \frac{\partial^{2} f_{i}}{\partial s \partial s^{T}} = diag[G_{1}, G_{2}, \ldots, G_{D}],

where

G_{d} \overset{\Delta}{=} \sum_{t=1}^{T} (V_{i}^{t*})^{T} V_{i}^{t*},

so f_i is a convex function of S. Since \|S\|^{2} is also a convex function of S and β ≥ 0, the objective function l is a convex function of S, which completes the proof.
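The positive semidefiniteness that both proofs rely on can be checked numerically for a concrete matrix. The sketch below builds the block \sum_{d} (S^{d*})^{T} S^{d*} from a randomly generated S (illustrative data only) and inspects its eigenvalues; the same check applies to the blocks G_d built from the rows of V_i. The tolerance of 1e-12 is an arbitrary allowance for rounding error.

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.random((5, 3))                       # D = 5 days, K = 3 hidden variables

# Hessian block: sum over days of the outer product of each row of S with itself.
G_block = sum(np.outer(row, row) for row in S)

eigenvalues = np.linalg.eigvalsh(G_block)    # symmetric matrix, real eigenvalues
is_psd = bool(np.all(eigenvalues >= -1e-12))
```

The block equals S^T S, which is why its eigenvalues cannot be negative.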

In step 2), the objective function l is convex in V_i and convex in S. To speed up the solution and reduce the storage required by the algorithm, V_i and S are solved by iterative updates based on gradient descent. As shown in Fig. 3, the method of selecting a reasonable objective function solving algorithm and solving comprises steps 2.1) to 2.8) described above.

As shown in Fig. 4, in step 3), the user subdivision method follows steps 3.1) to 3.7) described above.

The invention uses the users' power utilization record data and residential address information in electric power data, constructs a power utilization record matrix for each user, and calculates the similarity of users' geographical position information with a structured comparison algorithm. The combined matrix decomposition model is used to model the users' power utilization demands and to analyze the time factors and date factors that affect power utilization behavior. Geographical position information is then fused in according to the users' geographical similarity, so that the representation of each user in the hidden demand space contains both power utilization demand information and geographical position information. Finally, a clustering algorithm subdivides users according to their representation in the hidden demand space, giving a formal representation of the relations between power utilization behavior and residential address. This provides an important reference for power departments in understanding user demand and the relations among users, adjusting power production, maintaining power facilities and other daily management activities.

It should be emphasized that the embodiments of the invention are illustrative rather than restrictive. The invention is therefore not limited to the embodiments described in the detailed description; any other embodiment that a person skilled in the art derives from the technical solution of the invention likewise falls within the scope of protection of the invention.

Claims

1. A power consumer subdivision method based on a combined matrix decomposition model, characterized in that the method comprises the following steps performed in order:

2. The power consumer subdivision method based on a combined matrix decomposition model according to claim 1, characterized in that, in step 1), the method of inputting the users' power utilization record data and constructing the power utilization record matrices from it is as follows:

U = \{u_{1}, u_{2}, \ldots, u_{N}\}

where N is the number of users in the data and u_i denotes the i-th user;

where D is the number of days covered by the power utilization record data, T is the number of uniformly spaced sampling points in each user's daily record, and \mathbb{R}_{+}^{T \times D} denotes the set of nonnegative real matrices with T rows and D columns; in addition, U_i^{t*} and U_i^{*d} denote the t-th row and the d-th column of the matrix U_i, that is, the records of user u_i at time point t on every day and all records of user u_i on day d, and u_i^{td} denotes the element in row t and column d of U_i;

Finally, the power utilization record matrices of all users are output:

U_{1}, U_{2}, \ldots, U_{N}

3. The power consumer subdivision method based on a combined matrix decomposition model according to claim 1, characterized in that, in step 1), the method of inputting the users' geographical position information, representing it hierarchically, constructing the user geographical position information similarity matrix, and adjusting the weights of the different components of the geographical position information at different levels is as follows:

The geographical position information of the i-th user is represented as the structure:

g_{i} = \{g_{i}^{1}, g_{i}^{2}, \ldots, g_{i}^{n}\}

e^{ij} = \frac{\sum_{k=1}^{n} \lambda_{k} \prod_{m=1}^{k} \delta(g_{i}^{k}, g_{j}^{k})}{\sum_{k=1}^{n} \lambda_{k}}

4. The power consumer subdivision method based on a combined matrix decomposition model according to claim 1, characterized in that, in step 2), the method of constructing the objective function for the combined decomposition of the power utilization record matrices is as follows:

\min l_{1} = \|U_{i} - V_{i} S_{i}^{T}\|^{2}

\min l_{2} = \sum_{i=1}^{N} \|U_{i} - V_{i} S^{T}\|^{2}

\min l_{3} = \sum_{i=1}^{N} \sum_{j=1}^{N} e^{ij} \|V_{i} - V_{j}\|^{2}

(4) Keep the time factor matrices as smooth as possible:

\min l_{4} = \sum_{i=1}^{N} \|V_{i}\|^{2}

(5) Keep the date factor matrix as smooth as possible:

\min l_{5} = \|S\|^{2}

\min l = \frac{1}{2} \sum_{i=1}^{N} \|U_{i} - V_{i} S^{T}\|^{2} + \frac{\alpha}{2} \sum_{i=1}^{N} \|V_{i}\|^{2} + \frac{\beta}{2} \|S\|^{2} + \frac{\gamma}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} e^{ij} \|V_{i} - V_{j}\|^{2}

5. The power consumer subdivision method based on a combined matrix decomposition model according to claim 1, characterized in that, in step 2), the method of selecting a reasonable objective function solving algorithm and solving comprises the following steps:

Step 2.1) Stage S2.1, initialize the matrices V_i \in \mathbb{R}^{T \times K} and S \in \mathbb{R}^{D \times K}:

Step 2.2) Stage S2.2, differentiate with respect to each element of the matrix V_i:

\frac{\partial l}{\partial v_{i}^{tk}} = (V_{i}^{t*} S^{T} - U_{i}^{t*}) S^{*k} + \left(\alpha + \gamma \sum_{j=1, j \neq i}^{N} e^{ij}\right) v_{i}^{tk} - \gamma \sum_{j=1, j \neq i}^{N} e^{ij} v_{j}^{tk};

Step 2.3) Stage S2.3, update each element of the matrix V_i:

Step 2.4) Stage S2.4, judge whether all matrices V_i have been updated:

\frac{\partial l}{\partial s^{dk}} = \sum_{i=1}^{N} (V_{i}^{*k})^{T} [V_{i} (S^{d*})^{T} - U_{i}^{*d}] + \beta s^{dk};

Step 2.6) Stage S2.6, update each element of the matrix S:

Step 2.7) Stage S2.7, judge whether the algorithm has converged:

If the algorithm has converged, proceed to step 2.8); otherwise return to step 2.2);

Step 2.8) Stage S2.8, output the result:

6. The power consumer subdivision method based on a combined matrix decomposition model according to claim 1, characterized in that, in step 3), the two basic indexes are calculated as follows:

Sim(V_{i}, V_{j}) = \frac{tr(V_{i} V_{j}^{T})}{\|V_{i}\| \|V_{j}\|}

where tr(·) denotes the sum of the elements on the diagonal of a matrix and \|\cdot\| denotes the L₂ norm of a matrix;

Center(c) = \frac{\sum_{u_{i} \in c} V_{i}}{|c|}

where |c| denotes the number of users in group c.

7. The power consumer subdivision method based on a combined matrix decomposition model according to claim 1, characterized in that, in step 3), the user subdivision method is as follows:

where k is the manually set number of user clusters;

Step 3.4) Stage S3.4, adjust the cluster label of the i-th user:

Step 3.5) Stage S3.5, judge whether all users have been adjusted:

Step 3.6) Stage S3.6, judge whether the cluster labels have converged:

Step 3.7) Stage S3.7, output the result:

Output the cluster labels of all users; the procedure ends here.