Summary of the invention
The technical problem to be solved by the present invention, and the technical task it sets out, is to improve on the prior art by providing a method for locating title-change electricity customers (customers whose property title has changed) for power marketing. To this end, the present invention adopts the following technical solution.
A method for locating title-change electricity customers for power marketing, comprising the following steps:
1) Preliminary indicator selection: based on the title-change business, obtain the target data set needed for modeling, preprocess the acquired data, and make a preliminary selection of modeling indicators;
Seven indicators are extracted from three dimensions (basic information, payment behavior, and electricity-use characteristics) for model construction: urban/rural category, account age, whether the contact information changed, whether the collecting department changed, whether the payment method changed, electricity consumption in the 3 months before transfer, and maximum fluctuation of electricity consumption before and after transfer;
2) Indicator analysis
201) Analysis of changes in collecting department, payment method, and contact information
For title-change electricity customers and ordinary electricity customers, compare the proportion of customers whose collecting department, payment method, or contact information changed within the three months before and after the time node, and obtain the degree of difference between title-change customers and ordinary electricity customers on each of these indicators;
202) Analysis of electricity consumption before and after transfer
For title-change electricity customers and ordinary electricity customers, compare the proportion of customers whose electricity consumption was continuously 0 in the 3 months before transfer, the proportion whose consumption was continuously 0 in the 3 months after transfer, and the proportion whose consumption was continuously 0 in the 3 months both before and after transfer; obtain the degree of difference between the two groups on these indicators, and identify the vacancy-period characteristics of title-change customers;
203) Account age analysis
Compare the average account age of title-change electricity customers with that of ordinary electricity customers, and obtain the degree of difference in average account age between the two groups;
3) Indicator determination
Adjust the preliminarily selected indicators according to the results of the indicator analysis: the indicators on which the difference between title-change electricity customers and ordinary electricity customers exceeds a set value are selected as the determined indicators, giving the final modeling indicators;
4) Construction of the potential title-change electricity customer prediction model
401) According to the determined modeling indicators, randomly select one part of the sample set as the training set and the other part as the test set, and construct the potential title-change electricity customer prediction model using the logistic regression algorithm;
402) Model output: based on the logistic regression algorithm, train on the training set to generate the coefficients of the potential title-change electricity customer prediction model, and obtain each indicator's degree of influence on the model together with the prediction confusion matrix;
403) Model testing: based on the model's prediction results on the training set, apply the model to the test set and judge whether the prediction performance on both the training set and the test set reaches the desired level; if so, the model is adopted as the potential title-change electricity customer prediction model; otherwise, return to step 1) and readjust the data and indicators to rebuild the model;
5) Using the determined potential title-change electricity customer prediction model, output the prediction results and locate the potential title-change electricity customers.
This technical solution is based on detailed data from the marketing business system and the electricity-use information collection system, combined with 95598 work orders and data from the integrated payment platform. It analyzes the characteristics of users who have already transferred their accounts, starting from three major dimensions (basic information, payment behavior, and electricity-use characteristics), extracts the relevant influencing indicators, and then builds a logistic regression model. The fitted model parameters reveal characteristics of users with high title-change potential. Finally, the validity of the prediction model is verified on a case, and the potential account-transfer customers are analyzed and checked. If a title change has actually occurred, the customer can be reminded to complete the electricity account transfer and update the basic information. If no title change has actually occurred, it can be confirmed whether the customer intends to transfer, and the customer can be informed of the transfer procedure and the convenient channels, and reminded to handle the electricity account transfer together with the property title transfer.
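As a preferred technical means: in step 401),
the logistic function is:
$$g(z) = \frac{1}{1 + e^{-z}} \qquad (1)$$
where $e$ is the base of the natural logarithm and $z$ is the argument of the function, namely the linear score given by formula (2);
For linear classification, the boundary takes the following form:
$$z = \theta^{T}x = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n \qquad (2)$$
where $\theta_0$ is a constant, $\theta_1, \theta_2, \dots, \theta_n$ are the coefficients of the variables, and $x_1, x_2, \dots, x_n$ are the specific variables; the variables are the specific electricity-use indicators that can affect the prediction of a user's transfer potential, including urban/rural category, account age, electricity-use category, and payment method;
Combining formulas (1) and (2), the prediction function is:
$$h_\theta(x) = g(\theta^{T}x) = \frac{1}{1 + e^{-\theta^{T}x}} \qquad (3)$$
where $g(\theta^{T}x) = g(z)$. For the two outcomes of whether a transfer occurs, $h_\theta(x)$ is the probability that $y$ takes the value 1 and $1 - h_\theta(x)$ is the probability that $y$ takes the value 0, so the above can be rewritten as:
$$P(y=1 \mid x;\theta) = h_\theta(x) \qquad (4)$$
$$P(y=0 \mid x;\theta) = 1 - h_\theta(x) \qquad (5)$$
Formulas (4) and (5) combine into:
$$P(y \mid x;\theta) = \left(h_\theta(x)\right)^{y}\left(1 - h_\theta(x)\right)^{1-y} \qquad (6)$$
Taking the logarithm of formula (6) over the $m$ training samples, the maximum-likelihood Cost function can be derived as:
$$\mathrm{Cost}(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right] \qquad (7)$$
The Cost loss function here is the negative log-likelihood of the maximum likelihood estimate, so the smaller the Cost value, the better the function converges and the better the model estimate;
The minimum of the Cost function is found by gradient descent, which gives the update of $\theta$ with learning rate $\alpha$:
$$\theta_j := \theta_j - \alpha\frac{\partial}{\partial\theta_j}\mathrm{Cost}(\theta) \qquad (8)$$
Substituting formula (7) into the gradient descent update gives:
$$\theta_j := \theta_j - \alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} \qquad (9)$$
From formula (9), when Cost is at its minimum the sum of the residuals $\left(h_\theta(x^{(i)}) - y^{(i)}\right)$ is smallest, and the model function fits the actual results best.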
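The gradient-descent fitting described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the invention: the toy data, the learning rate `alpha`, the iteration count, and all function names are introduced here for illustration only.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(theta, x):
    """h_theta(x) for one sample; x[0] must be 1.0 so theta[0] acts as the constant."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

def cost(theta, X, y):
    """Mean negative log-likelihood over all samples (smaller is better)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for xi, yi in zip(X, y):
        h = predict(theta, xi)
        total += yi * math.log(h + eps) + (1 - yi) * math.log(1 - h + eps)
    return -total / len(X)

def gradient_step(theta, X, y, alpha):
    """One update: theta_j -= alpha * mean((h(x) - y) * x_j)."""
    m = len(X)
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        residual = predict(theta, xi) - yi
        for j, xij in enumerate(xi):
            grad[j] += residual * xij
    return [t - alpha * g / m for t, g in zip(theta, grad)]

def fit(X, y, alpha=0.5, iterations=2000):
    """Run gradient descent from theta = 0 (alpha and iterations are assumed values)."""
    theta = [0.0] * len(X[0])
    for _ in range(iterations):
        theta = gradient_step(theta, X, y, alpha)
    return theta

# Hypothetical toy data: one binary indicator, first column is the constant 1.0.
X = [[1.0, 0.0]] * 4 + [[1.0, 1.0]] * 4
y = [0, 0, 0, 1, 1, 1, 1, 0]
theta = fit(X, y)
print("theta:", theta, "cost:", cost(theta, X, y))
```

On this toy set the fitted probabilities approach the observed class frequencies for each indicator value, which is the behavior the residual argument above describes.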
As a preferred technical means: in step 402), the generated model includes:
A) z(x) = -3.165 - 0.594 * (whether the collecting department changed) + 6.183 * (whether the contact method changed) + 3.873 * (whether electricity consumption in the previous 3 months was 0) + 0.753 * (whether the account age is less than 15 years) - 0.417 * (urban/rural category) - 0.636 * (whether the payment method changed);
If [...], the customer is considered a title-change electricity customer; if [...], the customer is considered a non-title-change electricity customer.
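As an illustration, the stated coefficients can be evaluated as follows. The coefficients are those given in model A); everything else is an assumption made for this sketch: the key names, the 0/1 encoding of every indicator (including the urban/rural category), and in particular the 0.5 probability cut-off, since the decision condition itself is not recoverable from the text.

```python
import math

INTERCEPT = -3.165
# Coefficients as stated for model A); key names are hypothetical.
COEFFICIENTS = {
    "collecting_dept_changed": -0.594,
    "contact_changed": 6.183,
    "zero_usage_prev_3_months": 3.873,
    "account_age_under_15": 0.753,
    "urban_rural": -0.417,  # ASSUMPTION: numeric encoding is not given in the text
    "payment_method_changed": -0.636,
}

def linear_score(features):
    """z(x): intercept plus the weighted sum of the supplied indicator values."""
    return INTERCEPT + sum(COEFFICIENTS[name] * value
                           for name, value in features.items())

def transfer_probability(features):
    """Logistic transform of z(x)."""
    return 1.0 / (1.0 + math.exp(-linear_score(features)))

def is_potential_title_change(features, threshold=0.5):
    """ASSUMPTION: 0.5 is an illustrative cut-off, not the one in the source."""
    return transfer_probability(features) >= threshold

customer = {"contact_changed": 1, "zero_usage_prev_3_months": 1}
print(linear_score(customer), transfer_probability(customer))
```

Indicators omitted from the dictionary are treated as 0, so an empty dictionary evaluates the intercept alone.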
As a preferred technical means: in step 401), 70% of the sample set is randomly selected as the training set and the remaining 30% as the test set.
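A random 70/30 split of this kind might look like the following sketch. The function name, the fixed `seed`, and the use of a simple shuffle are assumptions; the text does not say how the random selection is carried out.

```python
import random

def split_sample_set(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and cut it into training and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded only to make the sketch repeatable
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

train_set, test_set = split_sample_set(range(100))
print(len(train_set), len(test_set))
```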
As a preferred technical means: in step 4), a model parameter for the region where the house is located is added, and further analysis is carried out according to the location of that region in combination with the actual development of each region, so as to improve the precision and recall of the model.
As a preferred technical means: in step 5), the potential title-change electricity customers are confirmed against the customers' actual situation, feature labels are generated for the confirmed title-change customers, and the derived label information is used to support precision marketing activities.
Beneficial effects: this technical solution is based on detailed data from the marketing business system and the electricity-use information collection system, combined with 95598 work orders and data from the integrated payment platform. It analyzes the characteristics of users who have transferred their accounts, starting from the three major dimensions of basic information, payment behavior, and electricity-use characteristics, extracts the relevant influencing indicators, identifies potential transfer customers, verifies customers' basic information, and improves the accuracy of that information. This is of great significance for reducing the customer complaint rate, increasing customer satisfaction, and achieving accurate delivery of electricity-bill and consumption SMS messages, power-outage notifications, and channel promotion. It also allows the results of workflow optimization to be applied across the board, achieving the best balance between operational efficiency and user experience.
Specific embodiment
The technical solution of the present invention is described in further detail below with reference to the accompanying drawings.
As shown in Figure 1, the present invention includes the following steps:
1) Preliminary indicator selection: based on the title-change business, obtain the target data set needed for modeling, preprocess the acquired data, and make a preliminary selection of modeling indicators;
Seven indicators are extracted from three dimensions (basic information, payment behavior, and electricity-use characteristics) for model construction: urban/rural category, account age, whether the contact information changed, whether the collecting department changed, whether the payment method changed, electricity consumption in the 3 months before transfer, and maximum fluctuation of electricity consumption before and after transfer. Here, urban/rural category: urban resident or rural resident; account age: 2017 minus the year the account was opened; whether the contact information changed: whether the contact method changed within the 3 months before and after the time node; whether the collecting department changed: whether the collecting department changed within the 3 months before and after the time node; whether the payment method changed: whether the payment method changed within the 3 months before and after the time node; electricity consumption in the 3 months before transfer: whether consumption in the 3 months before transfer was continuously 0, mainly considering that a vacancy period may occur before a transfer; maximum fluctuation of electricity consumption before and after transfer: (maximum consumption in the 3 months before and after transfer - minimum consumption in the 3 months before and after transfer) / minimum consumption in the 3 months before and after transfer;
2) Indicator analysis
201) Analysis of changes in collecting department, payment method, and contact information
For title-change electricity customers and ordinary electricity customers, compare the proportion of customers whose collecting department, payment method, or contact information changed within the three months before and after the time node, and obtain the degree of difference between title-change customers and ordinary electricity customers on each of these indicators;
202) Analysis of electricity consumption before and after transfer
For title-change electricity customers and ordinary electricity customers, compare the proportion of customers whose electricity consumption was continuously 0 in the 3 months before transfer, the proportion whose consumption was continuously 0 in the 3 months after transfer, and the proportion whose consumption was continuously 0 in the 3 months both before and after transfer; obtain the degree of difference between the two groups on these indicators, and identify the vacancy-period characteristics of title-change customers;
203) Account age analysis
Compare the average account age of title-change electricity customers with that of ordinary electricity customers, and obtain the degree of difference in average account age between the two groups;
3) Indicator determination
Adjust the preliminarily selected indicators according to the results of the indicator analysis: the indicators on which the difference between title-change electricity customers and ordinary electricity customers exceeds a set value are selected as the determined indicators, giving the final modeling indicators;
4) Construction of the potential title-change electricity customer prediction model
401) According to the determined modeling indicators, randomly select one part of the sample set as the training set and the other part as the test set, and construct the potential title-change electricity customer prediction model using the logistic regression algorithm;
402) Model output: based on the logistic regression algorithm, train on the training set to generate the coefficients of the potential title-change electricity customer prediction model, and obtain each indicator's degree of influence on the model together with the prediction confusion matrix;
403) Model testing: based on the model's prediction results on the training set, apply the model to the test set and judge whether the prediction performance on both the training set and the test set reaches the desired level; if so, the model is adopted as the potential title-change electricity customer prediction model; otherwise, return to step 1) and readjust the data and indicators to rebuild the model;
5) Using the determined potential title-change electricity customer prediction model, output the prediction results and locate the potential title-change electricity customers.
A specific embodiment is described below, taking Ningbo as an example:
1 Analysis of model influencing factors and determination of variables
Using power customer data, this technical solution extracts the relevant influencing indicators from dimensions such as basic information, payment behavior, and electricity-use behavior, and builds a model. Data on users with similar characteristics who have not handled a transfer are extracted to form a test sample, and the model is used to find customers who may already have undergone a title change or may be about to.
1.1 Indicator description
After repeated adjustment, seven indicators were finally extracted from three dimensions (basic information, payment behavior, and electricity-use characteristics) for model construction: urban/rural category, account age, whether the contact information changed, whether the collecting department changed, whether the payment method changed, electricity consumption in the 3 months before transfer, and maximum fluctuation of electricity consumption before and after transfer.
Urban/rural category: urban resident or rural resident;
Account age: 2017 minus the year the account was opened;
Whether the contact information changed: whether the contact method changed within the 3 months before and after the time node;
Whether the collecting department changed: whether the collecting department changed within the 3 months before and after the time node;
Whether the payment method changed: whether the payment method changed within the 3 months before and after the time node;
Electricity consumption in the 3 months before transfer: whether consumption in the 3 months before transfer was continuously 0, mainly considering that a vacancy period may occur before a transfer;
Maximum fluctuation of electricity consumption before and after transfer: (maximum consumption in the 3 months before and after transfer - minimum consumption in the 3 months before and after transfer) / minimum consumption in the 3 months before and after transfer.
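The three computed indicators defined above can be sketched as small helper functions. The function names and the input layout (lists of monthly consumption values) are assumptions; the text also does not say how the fluctuation is handled when the minimum consumption is 0, so this sketch returns `None` in that case.

```python
def account_age(opening_year, reference_year=2017):
    """Account age = reference year minus the year the account was opened."""
    return reference_year - opening_year

def zero_usage_before_transfer(monthly_kwh_before):
    """True when all 3 months before the transfer show zero consumption."""
    return len(monthly_kwh_before) == 3 and all(v == 0 for v in monthly_kwh_before)

def max_fluctuation(monthly_kwh_around):
    """(max - min) / min over the months before and after the transfer.

    ASSUMPTION: returns None when the minimum is 0, because the ratio is
    undefined and the source does not specify a rule for that case.
    """
    lowest, highest = min(monthly_kwh_around), max(monthly_kwh_around)
    if lowest == 0:
        return None
    return (highest - lowest) / lowest

# Hypothetical customer: account opened in 2010, six months of readings.
print(account_age(2010))
print(zero_usage_before_transfer([0, 0, 0]))
print(max_fluctuation([100, 150, 120, 80, 200, 90]))
```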
1.2 Indicator analysis
Customers in Ningbo who underwent a title change in the year from July 2016 to June 2017 are taken as the research object and analyzed in depth from six aspects: collecting department, payment method, contact method, electricity consumption before and after transfer, account age, and urban/rural category. Ordinary non-transfer low-voltage residential customers in Ningbo as of June 2017 serve as the control sample, in order to find the typical characteristics of transfer users. Here, ordinary non-transfer customers include both customers whose title has not changed and customers whose title has changed but who have not handled the electricity account transfer. The number of low-voltage residential customers in Ningbo who transferred in the past year is 103644 households; as of June 2017, there were 2857571 ordinary low-voltage residential customers in Ningbo who had not handled a transfer. See Figures 2, 3 and 4.
(1) Analysis of changes in collecting department
Among the customers who underwent a title change in the past year, 75602 households had their collecting department information change before and after the transfer, a change rate of 72.93%. As of June 2017, among ordinary low-voltage residential customers in Ningbo, 315321 households had their collecting department information change within the 3 months before and after the time node, a change rate of 11.03%. Transfer customers and ordinary customers therefore differ markedly on the indicator of whether the collecting department changed.
(2) Analysis of changes in payment method
Among the customers who underwent a title change in the past year, 69049 households had their payment method change before and after the transfer, a change rate of 66.62%. Among ordinary non-transfer low-voltage residential customers in Ningbo, 248992 households had their payment method change within the 3 months before and after the time node, a change rate of 8.71%. Transfer customers and ordinary customers therefore differ markedly on the indicator of whether the payment method changed.
(3) Analysis of changes in contact information
Among the customers who underwent a title change in the past year, 88925 households had their contact method change before and after the transfer, a change rate of 85.79%. Among ordinary non-transfer low-voltage residential customers in Ningbo, 30333 households had their contact method information change within the 3 months before and after the time node, a change rate of 1.06%. Transfer customers and ordinary customers therefore differ markedly on the indicator of whether the contact information changed.
The changes in collecting department, payment method, and contact information for transfer customers and ordinary customers are shown in Table 1.
Table 1 Changes in collecting department, payment method, and contact information
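The comparison above, and the threshold test used in the indicator determination step, can be illustrated with the household counts quoted in the text. Measuring the "degree of difference" as a simple difference of change rates, and the threshold values used here, are assumptions of this sketch; the source does not define either.

```python
TRANSFER_TOTAL = 103644    # households that transferred in the past year
ORDINARY_TOTAL = 2857571   # ordinary non-transfer households

# (changed among transfer customers, changed among ordinary customers)
CHANGED_COUNTS = {
    "collecting_department": (75602, 315321),
    "payment_method": (69049, 248992),
    "contact_information": (88925, 30333),
}

def change_rates(transfer_changed, ordinary_changed):
    """Share of each group whose indicator changed."""
    return transfer_changed / TRANSFER_TOTAL, ordinary_changed / ORDINARY_TOTAL

def rate_difference(transfer_changed, ordinary_changed):
    """ASSUMPTION: 'degree of difference' taken as the gap between the two rates."""
    t_rate, o_rate = change_rates(transfer_changed, ordinary_changed)
    return t_rate - o_rate

def select_indicators(counts, threshold):
    """Keep indicators whose rate gap exceeds the (hypothetical) set value."""
    return [name for name, (t, o) in counts.items()
            if rate_difference(t, o) > threshold]

for name, (t, o) in CHANGED_COUNTS.items():
    print(name, round(rate_difference(t, o), 4))
print(select_indicators(CHANGED_COUNTS, threshold=0.3))
```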
(4) Analysis of electricity consumption before and after transfer
Among the customers who underwent a title change in the past year, 62799 households had continuously 0 electricity consumption in the 3 months before the transfer, 9390 households had continuously 0 consumption in the 3 months after the transfer, and 7236 households had continuously 0 consumption in the 3 months both before and after the transfer. Among ordinary non-transfer low-voltage residential customers in Ningbo, 63912 households had continuously 0 consumption in the 3 months before the time node, 138949 households had continuously 0 consumption in the 3 months after it, and 11178 households had continuously 0 consumption in the 3 months both before and after it. Transfer customers therefore tend to have a vacancy period before the transfer, while the vacancy period after the transfer is not obvious; ordinary non-transfer customers show no obvious vacancy period.
Table 2 Analysis of electricity consumption before and after transfer
(5) Account age analysis
The average account age of customers in Ningbo who underwent a title change in the year from July 2016 to June 2017 is 6.95 years. The average account age of ordinary non-transfer low-voltage residential customers in Ningbo is 20.43 years. The average account age of transfer customers is thus far lower than that of ordinary non-transfer customers. The account ages of transfer customers and ordinary customers are shown in Table 3 below. About 73% of the customers who underwent a title change have an account age of no more than 10 years, whereas only about 29% of ordinary customers who have not handled a transfer have an account age of less than 10 years. Transfer customers and ordinary customers therefore differ markedly in account age.
Table 3 Account age analysis of electricity customers
(6) Urban/rural category analysis
Among the customers who underwent a title change in the past year, 45491 households are urban residents and 58153 households are rural residents. Among ordinary non-transfer low-voltage residential customers in Ningbo, 738266 households are urban residents and 2119305 households are rural residents. Transfer customers and ordinary customers therefore differ to some extent in urban/rural category. The urban/rural category analysis for transfer customers and ordinary customers is shown in Table 4 below.
Table 4 Urban/rural category analysis of electricity customers
As shown in Figure 5, after repeated trials, the indicators finally entered into the model are listed in Table 5 below:
Table 5 Model variables
2 Construction of the potential title-change electricity customer prediction model
2.1 Principles of the modeling technique
The logistic function (also called the sigmoid function) is given by:
$$g(z) = \frac{1}{1 + e^{-z}} \qquad (1)$$
where $e$ is the base of the natural logarithm and $z$ is the argument of the function, namely the linear score given by formula (2).
For linear classification, the boundary takes the following form:
$$z = \theta^{T}x = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n \qquad (2)$$
Here $\theta_0$ is a constant, $\theta_1, \theta_2, \dots, \theta_n$ are the coefficients of the variables, and $x_1, x_2, \dots, x_n$ are the specific variables. In this technical solution, the variables are the specific electricity-use indicators that can affect the prediction of a user's transfer potential, such as urban/rural category, account age, electricity-use category, and payment method.
Combining formulas (1) and (2), the prediction function is:
$$h_\theta(x) = g(\theta^{T}x) = \frac{1}{1 + e^{-\theta^{T}x}} \qquad (3)$$
where $g(\theta^{T}x) = g(z)$. In this project the two outcomes are whether or not a transfer occurs: $h_\theta(x)$ is the probability that $y$ takes the value 1 and $1 - h_\theta(x)$ is the probability that $y$ takes the value 0, so the above can be rewritten as:
$$P(y=1 \mid x;\theta) = h_\theta(x) \qquad (4)$$
$$P(y=0 \mid x;\theta) = 1 - h_\theta(x) \qquad (5)$$
Formulas (4) and (5) can be combined and written as:
$$P(y \mid x;\theta) = \left(h_\theta(x)\right)^{y}\left(1 - h_\theta(x)\right)^{1-y} \qquad (6)$$
Two points deserve attention in variable processing. 1. The linear form of the logistic regression model makes the direction of influence of a continuous variable monotonic, but according to practical business experience some variables do not behave this way and instead show a peak, high in the middle and low on both sides. This problem can be avoided by manually converting such variables into factors or by binning them, but doing so inevitably distorts part of the data, so the accuracy and number of bins are important factors in model performance. 2. Large differences in the order of magnitude of the variables can easily give a single variable excessive weight in $\theta^{T}x$, so continuous variables are normalized before they are input to the model.
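The two preprocessing steps mentioned above (binning a non-monotonic variable and normalizing continuous variables) can be sketched as follows. Min-max scaling is one common choice of normalization and is an assumption here, as are the bin edges and function names; the text does not state which normalization or which cut points were used.

```python
def min_max_normalize(values):
    """Rescale a continuous variable to [0, 1] (ASSUMPTION: min-max scaling)."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        return [0.0 for _ in values]  # constant variable carries no information
    return [(v - lowest) / (highest - lowest) for v in values]

def bin_index(value, edges):
    """Index of the bin a value falls in: the number of edges it reaches."""
    return sum(value >= edge for edge in edges)

def one_hot(index, size):
    """Factor encoding of a bin so each bin gets its own coefficient."""
    return [1 if i == index else 0 for i in range(size)]

# Hypothetical account ages (years) and hypothetical bin edges at 5 and 15 years.
ages = [0, 5, 10]
print(min_max_normalize(ages))
print(one_hot(bin_index(7, [5, 15]), 3))
```

Encoding each bin separately lets the model assign a high weight to a middle range and low weights to both ends, which a single linear coefficient cannot do.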
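Taking the logarithm of formula (6) over the $m$ training samples, the maximum-likelihood Cost function can be derived as:
$$\mathrm{Cost}(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right] \qquad (7)$$
The Cost loss function here is the negative log-likelihood of the maximum likelihood estimate, so the smaller the Cost value, the better the function converges and the better the model estimate.
Gradient descent can be used to find the minimum of the Cost function; it gives the update of $\theta$ with learning rate $\alpha$:
$$\theta_j := \theta_j - \alpha\frac{\partial}{\partial\theta_j}\mathrm{Cost}(\theta) \qquad (8)$$
Substituting formula (7) into the gradient descent update gives:
$$\theta_j := \theta_j - \alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} \qquad (9)$$
From formula (9), when Cost is at its minimum the sum of the residuals $\left(h_\theta(x^{(i)}) - y^{(i)}\right)$ is smallest, and the model function fits the actual results best.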
2.2 Model building and validation
Customers in Ningbo who underwent a title change in June 2017 are taken as target samples, and a portion of ordinary customers is randomly selected as control samples to form the sample set. Based on the logistic regression algorithm, the typical characteristics of transfer customers are extracted and the potential title-change electricity customer prediction model is built. In this modeling, 70% of the sample set data is taken as the training set and 30% as the test set.
The variables finally retained by the model may differ from the initial input variables, because the regression uses the stepwise regression principle to screen independent variables automatically. Stepwise regression decides whether to retain an independent variable according to the Akaike information criterion (AIC); it can improve goodness of fit while also addressing multicollinearity among the independent variables.
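The AIC comparison that drives stepwise selection can be illustrated with the standard definition AIC = 2k - 2 ln L, where k is the number of fitted parameters and ln L is the maximized log-likelihood. The candidate models and their log-likelihood values below are hypothetical numbers chosen only to show the comparison; they are not results from the text.

```python
def aic(num_params, log_likelihood):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * num_params - 2 * log_likelihood

def best_by_aic(candidates):
    """candidates maps a model name to (num_params, log_likelihood)."""
    return min(candidates, key=lambda name: aic(*candidates[name]))

# Hypothetical candidates: dropping one variable barely lowers the likelihood,
# so the smaller model wins on AIC and the variable is removed.
candidates = {
    "all_variables": (8, -100.0),
    "one_variable_dropped": (7, -100.4),
}
print(best_by_aic(candidates))
```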
2.2.1 Model coefficient output
In the coefficient output of the potential title-change electricity customer discrimination model, all variables have a significant effect on the model result (P <= 0.05). Among them, whether the contact method changed and whether electricity consumption in the previous 3 months was 0 have the greatest influence on the transfer model: customers whose contact method has changed and who have used no electricity for 3 consecutive months may already have undergone a title change.
Table 6 Model coefficient output
The fitted model parameters reveal some characteristics of users with high title-change potential:
(1) the payment method and collecting department information associated with the same account number tend to change;
(2) most customers who undergo a title change have an account age of no more than 15 years;
(3) customers generally have a vacancy period of some length before a title change;
(4) urban residents are more likely to undergo a title change.
2.2.2 Variable importance output
The variable importance results in Figure 5 show that two indicators, whether the contact method changed and whether electricity consumption in the previous 3 months was 0, have the greatest influence on the model; whether the account age is less than 15 years also has a considerable influence. Combined with the model coefficients, it follows that urban users whose contact method has changed, who have used no electricity for 3 consecutive months or longer, and whose account age does not exceed 15 years are more likely to have undergone a title change.
2.2.3 Analysis of sample set results
The output for the sample set is shown in Table 7.
Table 7 Prediction confusion matrix for the sample set
Rows are actual values and columns are predicted values. The confusion matrix shows that the sample set contains 10547 households that are actually transfer customers, of which 9622 households were correctly predicted as transfer customers and 925 households were wrongly predicted as ordinary customers. The recall and precision of the sample set are given in Table 8:
Table 8 Recall and precision for the sample set
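Recall can be recomputed directly from the counts quoted above. The sketch below uses only those counts for recall; the number of ordinary customers wrongly predicted as transfer customers is not stated in the text, so precision is shown as a general function with hypothetical inputs only.

```python
def recall(true_positive, false_negative):
    """Share of actual transfer customers that the model identified."""
    return true_positive / (true_positive + false_negative)

def precision(true_positive, false_positive):
    """Share of predicted transfer customers that really transferred."""
    return true_positive / (true_positive + false_positive)

# Counts from the text: 10547 actual transfer customers,
# 9622 predicted correctly and 925 predicted as ordinary customers.
sample_recall = recall(true_positive=9622, false_negative=925)
print(round(sample_recall, 4))

# Hypothetical inputs, only to show the call shape for precision.
print(precision(true_positive=1, false_positive=3))
```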
2.2.4 Analysis of validation results
The prediction model built on the sample set is applied to the June 2017 data of all electricity customers in Ningbo for validation; the full-data validation output is shown in Table 9.
Table 9 Prediction confusion matrix for the validation set
Rows are actual values and columns are predicted values. The confusion matrix shows that 10547 households are actually transfer customers, of which 9622 households were correctly predicted as transfer customers and 925 households were wrongly predicted as ordinary customers. The recall and precision of the full-data validation set are given in Table 10:
Table 10 Recall and precision for the validation set
When the model is applied to all residential customers in Ningbo, its precision is 20.78%. The June 2017 data for all Ningbo electricity customers cover 2868118 households, of which 10547 households are actually transfer customers, so transfer customers account for only 0.37%; the lift of the model is therefore about 56 times.
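The lift figure above follows from dividing the model's precision by the share of transfer customers in the whole population. The sketch below reproduces that arithmetic from the numbers in the text; the function names are introduced here for illustration.

```python
def base_rate(positives, population):
    """Share of actual transfer customers in the whole population."""
    return positives / population

def lift(model_precision, population_base_rate):
    """How many times better the model is than selecting customers at random."""
    return model_precision / population_base_rate

rate = base_rate(positives=10547, population=2868118)
print(round(rate * 100, 2))          # share of transfer customers, in percent
print(lift(0.2078, rate))            # precision 20.78% relative to that share
```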
Reasons why customers who are actually ordinary non-transfer customers are predicted as potential title-change customers: (1) ordinary non-transfer customers include both customers whose title has changed and customers whose title has not; those predicted as title-change customers may well be customers whose title has in fact changed but who have not handled the electricity account transfer in the power system, and this is precisely the group of potential title-change customers that the model is meant to find; (2) these customers may be tenants of rented housing, where the registered title holder and the actual electricity user are not the same person, so their electricity-use behavior and payment characteristics resemble those of title-change customers, for example changeable payment methods and the appearance of vacancy periods. In the next round of model optimization, the region where the house is located can be added and analyzed further in combination with the actual development of each region.
Reasons why customers who are actually title-change users are predicted as ordinary customers: these may be cases where a house is transferred within a family by gift or inheritance. Such customers show no actual change in electricity-use characteristics or payment behavior, so customers who have in fact undergone a title change are wrongly predicted as ordinary customers.
With the potential title-change electricity customer prediction model, this technical solution can respond quickly to business needs, help business staff screen high-potential transfer customers out of large volumes of data, formulate corresponding development strategies, and carry out routine work in a targeted way. The model can of course be optimized further: the variable indicators and model parameters can be refined according to the model results, for example by adding the region where the house is located and analyzing it in combination with the actual development of each region, so as to improve the precision and recall of the model. On the basis of model optimization, the scope of operational activities can be expanded at the appropriate time and multiple marketing modes can be used to improve the application effect. Meanwhile, feature labels can be generated from the actual situation of title-change customers, and the derived label information can be used to support precision marketing activities in other scenarios.
The method for locating title-change electricity customers for power marketing shown in Figure 1 above is a specific embodiment of the present invention and embodies its substantive features and progress. Equivalent modifications made to it under the inspiration of the present invention, according to actual needs of use, all fall within the protection scope of this solution.