CN106651024A - Tariff recovery prediction model construction method - Google Patents

Tariff recovery prediction model construction method

Info

Publication number
CN106651024A
Authority
CN
China
Prior art keywords
arrearage
forecast model
year
electricity
electricity consumption
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611180509.4A
Other languages
Chinese (zh)
Inventor
刘刚
付薇薇
高迪
李钢
张雪梅
潘林利
董振祥
刘洋
王建华
王亚强
杨峰
王骏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING BOWANG HUAKE TECHNOLOGY Co Ltd
State Grid North Hebei Power Co Ltd Operations Monitoring (control) Center
University of Science and Technology Beijing USTB
Original Assignee
BEIJING BOWANG HUAKE TECHNOLOGY Co Ltd
State Grid North Hebei Power Co Ltd Operations Monitoring (control) Center
University of Science and Technology Beijing USTB
Application filed by BEIJING BOWANG HUAKE TECHNOLOGY Co Ltd, State Grid North Hebei Power Co Ltd Operations Monitoring (control) Center, University of Science and Technology Beijing USTB filed Critical BEIJING BOWANG HUAKE TECHNOLOGY Co Ltd
Priority to CN201611180509.4A priority Critical patent/CN106651024A/en
Publication of CN106651024A publication Critical patent/CN106651024A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides a method for constructing a tariff recovery prediction model. The method can predict when a user who is in arrears will next fall into arrears. The method comprises: determining the index system of a first prediction model; obtaining data of users in arrears as the training-test set of the first prediction model, wherein the training-test set comprises a training set and a test set; training the first prediction model on its training set; and predicting the next arrears time of users in arrears with the trained first prediction model. The method is applicable to the technical field of electric power systems.

Description

A method for constructing a tariff recovery forecast model
Technical field
The present invention relates to the technical field of electric power systems, and in particular to a method for constructing a tariff recovery forecast model.
Background art
Tariff recovery management is a vital task for an electric power company: it ensures that electricity fees are collected normally and supports the company's sustainable development. As customers' power consumption grows rapidly and the external environment keeps changing, the risk and uncertainty the company faces in tariff recovery increase year by year. Research on accurately predicting late fee recovery nevertheless remains one-sided: existing methods can only predict whether a user will fall into arrears.
Summary of the invention
The technical problem to be solved by the present invention is to provide a method for constructing a tariff recovery forecast model, so as to solve the problem that the prior art can only predict whether a user will fall into arrears.
To solve the above technical problem, an embodiment of the present invention provides a method for constructing a tariff recovery forecast model, comprising:
determining the index system of a first forecast model;
obtaining, according to the determined index system of the first forecast model, data of defaulting users as the training-test set of the first forecast model, wherein the training-test set comprises a training set and a test set;
training the first forecast model on the training set of the first forecast model;
predicting the next arrears time of a defaulting user with the trained first forecast model.
Further, the index system of the first forecast model comprises one or more of: superior unit, power supply unit, industry category, electricity-use category, voltage level, load importance level, contract capacity, operating capacity, inspection cycle, outage flag, account-opening duration, power-delivery duration, number of metering points, number of isoelectric points, number of power supply points, whether a tripartite agreement exists, number of outages, annual power consumption, annual transformer-loss energy, annual reactive power usage rate, annual line loss rate, annual total number of arrears, average annual arrears-consumption interval, annual minimum arrears-consumption interval, interval from the most recent arrears consumption to the present, annual arrears-consumption share, monthly arrears fee, and total number of arrears over 2.5 years.
Further, training the first forecast model on the training set of the first forecast model comprises:
training the first forecast model on its training set using the SVM algorithm.
Further, the method also comprises:
determining the index system of a second forecast model;
obtaining, according to the determined index system of the second forecast model, data of defaulting users as the training-test set of the second forecast model, wherein the training-test set comprises a training set and a test set;
training the second forecast model on the training set of the second forecast model;
predicting the next arrears amount of a defaulting user with the trained second forecast model.
Further, the index system of the second forecast model comprises one or more of: average monthly arrears consumption over the most recent year, average monthly number of arrears over the most recent year, annual average arrears-consumption interval, minimum arrears-consumption interval in the most recent year, average yearly arrears-consumption interval, interval from the most recent arrears consumption to the present, arrears-consumption share, average arrears fee, time from the account-opening date to the first arrears consumption, first arrears consumption, arrears fee in the most recent year, arrears fee in the most recent three months, arrears fee in the most recent two years, arrears fee in the most recent six months, annual arrears fee growth rate, month-on-month arrears fee growth rate, quarterly arrears fee growth rate, annual maximum arrears fee, annual minimum arrears fee, month of the annual maximum arrears fee, month of the annual minimum arrears fee, user classification, whether a tripartite agreement exists, whether payment is transferred in installments, safety hazard interval, monthly power consumption, quarterly power consumption, annual power consumption, account-opening duration, power-delivery duration, contract capacity, electricity-use category, high-energy-consumption industry category, number of metering points, number of power supplies, monthly line loss growth rate, annual line loss rate, average monthly reactive utilization rate, current-month active utilization rate, current reactive-to-active ratio, current-month peak/valley/flat consumption ratio, annual peak/valley/flat consumption ratio, annual prepaid electricity fee amount, number of outages, outage interval, average outage interval, annual outage frequency, previous arrears fee in the current year, sequential growth rate of the previous arrears fee in the current year, and current arrears fee.
Further, the method also comprises:
determining the index system of a third forecast model;
obtaining, according to the determined index system of the third forecast model, data of all users as the training-test set of the third forecast model, wherein the training-test set comprises a training set and a test set;
training the third forecast model on the training set of the third forecast model;
predicting with the trained third forecast model whether a user will be in arrears next month.
Further, the index system of the third forecast model comprises one or more of: industry category, location, user classification, whether a tripartite agreement exists, whether payment is transferred in installments, number of safety hazards, annual power consumption, account-opening duration, power-delivery duration, contract capacity, electricity-use category, high-energy-consumption industry category, number of metering points, number of power supplies, monthly line loss growth rate, annual number of line losses, average monthly reactive utilization rate, current-month active utilization rate, reactive-to-active ratio, monthly peak/valley/flat consumption ratio, annual peak/valley/flat consumption ratio, annual prepaid electricity fee amount, number of outages, outage interval, average outage interval, annual outage frequency, and whether the user has ever been in arrears.
Further, before the training-test sets are generated, the method also comprises:
performing data cleansing, data conversion and data normalization on the obtained data of defaulting users or of all users.
Further, the method also comprises:
generating, from the normalized data, the training-test sets of the first, second and third forecast models by stratified sampling on the current arrears fee / the interval from the most recent arrears consumption to the present / whether the user has ever been in arrears.
The above technical solution of the present invention has the following beneficial effects:
In the above solution, the index system of the first forecast model is determined; data of defaulting users are obtained as the training-test set of the first forecast model, wherein the training-test set comprises a training set and a test set; the first forecast model is trained on its training set; and the next arrears time of a defaulting user is predicted with the trained first forecast model. In this way, the first forecast model can predict the next arrears time of a defaulting user.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the method for constructing a tariff recovery forecast model provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the principle of the method for constructing a tariff recovery forecast model provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of the user data preprocessing provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the technical problem to be solved, the technical solution and the advantages of the present invention clearer, a detailed description follows with reference to the accompanying drawings and specific embodiments.
To address the problem that existing methods can only predict whether a user will fall into arrears, the present invention provides a method for constructing a tariff recovery forecast model.
Referring to Fig. 1, the method for constructing a tariff recovery forecast model provided by an embodiment of the present invention comprises:
S101, determining the index system of a first forecast model;
S102, obtaining, according to the determined index system of the first forecast model, data of defaulting users as the training-test set of the first forecast model, wherein the training-test set comprises a training set and a test set;
S103, training the first forecast model on the training set of the first forecast model;
S104, predicting the next arrears time of a defaulting user with the trained first forecast model.
In the method for constructing a tariff recovery forecast model according to this embodiment, the index system of the first forecast model is determined; data of defaulting users are obtained as the training-test set of the first forecast model, wherein the training-test set comprises a training set and a test set; the first forecast model is trained on its training set; and the next arrears time of a defaulting user is predicted with the trained first forecast model. In this way, the first forecast model can predict the next arrears time of a defaulting user.
As shown in Fig. 2 in the present embodiment, whether having arrearage according to user, user can be divided into defaulting subscriber and non- Defaulting subscriber/non-defaulting subscriber this two class, to defaulting subscriber, can be by the first forecast model prediction defaulting subscriber's next time When arrearage can be, and the promise breaking user in Fig. 2 refers to defaulting subscriber.
In the present embodiment, in order to obtain the first forecast model, need to first determine that time taking direct efficiency index is owed in prediction.By In the user of, arrearage have greatly may arrearage again, in the present embodiment, to defaulting subscriber, user will be predicted When there is arrearage and be converted to the time interval for predicting adjacent arrearage twice.
In the present embodiment, prediction can thus utilize its adjacent twice for the user's design objective for having had arrearage The time interval of arrearage (for example, second arrearage and first time arrearage) does a checking.Here only once arrearage record User arranged its promise breaking at intervals of 30 days.The user data of whole arrearages is taken, is taken and is related to arrearage and basic electricity consumption behavioral indicator Index as the first forecast model index system, the index system of first forecast model can more accurately find out use The electricity consumption arrearage behavioural characteristic at family, and then the time interval of the user of more scientific prediction arrearage arrearage again, so as to be inferred to Defaulting subscriber next time can be in when arrearage again.
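The conversion from arrears dates to adjacent-arrears intervals can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frame `arrears`, its columns `user_id` and `date`, and the helper `arrears_intervals` are hypothetical names, and the dates are invented toy values. Only the 30-day default for single-record users comes from the embodiment.

```r
# Toy arrears records, one row per arrears event (hypothetical names and values).
arrears <- data.frame(
  user_id = c("u1", "u1", "u1", "u2", "u3", "u3"),
  date = as.Date(c("2015-01-10", "2015-03-11", "2015-04-10",
                   "2015-06-01", "2014-12-01", "2015-12-01")),
  stringsAsFactors = FALSE
)

# Days between adjacent arrears events of one user.
# A user with a single record gets the 30-day default described in the embodiment.
arrears_intervals <- function(dates, default_days = 30) {
  dates <- sort(dates)
  if (length(dates) < 2) return(default_days)
  as.numeric(diff(dates))
}

# Named list: one vector of intervals (in days) per user.
intervals <- lapply(split(arrears$date, arrears$user_id), arrears_intervals)
```

Each element of `intervals` could then serve as the prediction target for the corresponding user, for instance by keeping only the most recent interval.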
In this embodiment, the index system of the first forecast model comprises one or more of: superior unit, power supply unit, industry category, electricity-use category, voltage level, load importance level, contract capacity, operating capacity, inspection cycle, outage flag, account-opening duration, power-delivery duration, number of metering points, number of isoelectric points, number of power supply points, whether a tripartite agreement exists, number of outages, annual power consumption, annual transformer-loss energy, annual reactive power usage rate, annual line loss rate, annual total number of arrears, average annual arrears-consumption interval, annual minimum arrears-consumption interval, interval from the most recent arrears consumption to the present, annual arrears-consumption share, monthly arrears fee, and total number of arrears over 2.5 years. For example, if the current time is July 2016, the index system of the first forecast model is as shown in Table 1. Taking annual power consumption as an example, there are three annual power consumption values: those of the three consecutive years 2014, 2015 and 2016.
Table 1: Index system of the first forecast model
In this embodiment, when an indicator's time period does not apply, the value is filled in with an average. For example, for the indicator annual power consumption, if a user's account has been open for less than one year, the value is replaced by the power consumption since account opening divided by the time since account opening.
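The fill-in rule for accounts younger than one year can be sketched as below. The embodiment does not state the unit of the account-opening duration; this sketch assumes it is expressed in years, so that the quotient is an annualized figure. The function `fill_annual_consumption` and the column names are hypothetical.

```r
# Replace the annual consumption of accounts open for less than one year by
# (consumption since opening) / (time since opening).
# Assumption: years_open is measured in years, so the result is annualized.
fill_annual_consumption <- function(annual_kwh, kwh_since_opening, years_open) {
  short <- years_open < 1
  annual_kwh[short] <- kwh_since_opening[short] / years_open[short]
  annual_kwh
}

# Toy users (invented values): the last two accounts are younger than one year.
users <- data.frame(
  annual_kwh        = c(12000, NA, NA),
  kwh_since_opening = c(30000, 3000, 500),
  years_open        = c(2.5, 0.5, 0.25)
)
users$annual_kwh <- fill_annual_consumption(
  users$annual_kwh, users$kwh_since_opening, users$years_open
)
```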
In a specific embodiment of the foregoing method for constructing a tariff recovery forecast model, further, training the first forecast model on the training set of the first forecast model comprises:
training the first forecast model on its training set using the SVM algorithm.
In this embodiment, the first forecast model can be trained on its training set using the support vector machine (Support Vector Machine, SVM) algorithm to obtain the trained first forecast model, which is then used to predict the next arrears time of the defaulting users in the test set of the first forecast model or of other defaulting users.
In this embodiment, the test set of the first forecast model is used to test the trained first forecast model and check the accuracy of its predictions.
In a specific embodiment of the foregoing method for constructing a tariff recovery forecast model, further, the method also comprises:
determining the index system of a second forecast model;
obtaining, according to the determined index system of the second forecast model, data of defaulting users as the training-test set of the second forecast model, wherein the training-test set comprises a training set and a test set;
training the second forecast model on the training set of the second forecast model;
predicting the next arrears amount of a defaulting user with the trained second forecast model.
As shown in Fig. 2, in this embodiment the second forecast model can predict, for a defaulting user, how much that user is likely to owe next time.
In this embodiment, the amount a user owes is, empirically, generally considered to be strongly associated with the amounts owed in the past. The indicators are therefore designed for users who have been in arrears, and historical arrears data together with basic electricity-use behavior indicators form the index system of the second forecast model.
In this embodiment, the index system of the second forecast model comprises one or more of: average monthly arrears consumption over the most recent year, average monthly number of arrears over the most recent year, annual average arrears-consumption interval, minimum arrears-consumption interval in the most recent year, average yearly arrears-consumption interval, interval from the most recent arrears consumption to the present, arrears-consumption share, average arrears fee, time from the account-opening date to the first arrears consumption, first arrears consumption, arrears fee in the most recent year, arrears fee in the most recent three months, arrears fee in the most recent two years, arrears fee in the most recent six months, annual arrears fee growth rate, month-on-month arrears fee growth rate, quarterly arrears fee growth rate, annual maximum arrears fee, annual minimum arrears fee, month of the annual maximum arrears fee, month of the annual minimum arrears fee, user classification, whether a tripartite agreement exists, whether payment is transferred in installments, safety hazard interval, monthly power consumption, quarterly power consumption, annual power consumption, account-opening duration, power-delivery duration, contract capacity, electricity-use category, high-energy-consumption industry category, number of metering points, number of power supplies, monthly line loss growth rate, annual line loss rate, average monthly reactive utilization rate, current-month active utilization rate, current reactive-to-active ratio, current-month peak/valley/flat consumption ratio, annual peak/valley/flat consumption ratio, annual prepaid electricity fee amount, number of outages, outage interval, average outage interval, annual outage frequency, previous arrears fee in the current year, sequential growth rate of the previous arrears fee in the current year, and current arrears fee. The index system of the second forecast model is as shown in Table 2.
Table 2: Index system of the second forecast model
In this embodiment, after the index system of the second forecast model is determined, data of defaulting users are obtained according to that index system as the training-test set of the second forecast model, wherein the training-test set comprises a training set and a test set. The second forecast model can be trained on its training set using the SVM algorithm, and the trained second forecast model is then used to predict the next arrears amount of the defaulting users in the test set of the second forecast model or of other defaulting users.
In this embodiment, the test set of the second forecast model is used to test the trained second forecast model and check the accuracy of its predictions.
In a specific embodiment of the foregoing method for constructing a tariff recovery forecast model, further, the method also comprises:
determining the index system of a third forecast model;
obtaining, according to the determined index system of the third forecast model, data of all users as the training-test set of the third forecast model, wherein the training-test set comprises a training set and a test set;
training the third forecast model on the training set of the third forecast model;
predicting with the trained third forecast model whether a user will be in arrears next month.
As shown in Fig. 2, in this embodiment the third forecast model can predict, for any user, whether that user will fall into arrears.
In this embodiment, the arrears status of every user for the next month is predicted. Since users without arrears have no arrears-related indicators, the data of all users are taken in order to find the relevant feature patterns, and indicators not involving arrears form the index system of the third forecast model. This index system captures users' electricity-use behavior more objectively, so that whether a user who has never been in arrears is likely to fall into arrears can be predicted on a sounder basis.
In this embodiment, the index system of the third forecast model can comprise one or more of: industry category, location, user classification, whether a tripartite agreement exists, whether payment is transferred in installments, number of safety hazards, annual power consumption, account-opening duration, power-delivery duration, contract capacity, electricity-use category, high-energy-consumption industry category, number of metering points, number of power supplies, monthly line loss growth rate, annual number of line losses, average monthly reactive utilization rate, current-month active utilization rate, reactive-to-active ratio, monthly peak/valley/flat consumption ratio, annual peak/valley/flat consumption ratio, annual prepaid electricity fee amount, number of outages, outage interval, average outage interval, annual outage frequency, and whether the user has ever been in arrears, as shown in Table 3.
Table 3: Index system of the third forecast model
In this embodiment, after the index system of the third forecast model is determined, data of all users can be obtained according to that index system as the training-test set of the third forecast model, wherein the training-test set comprises a training set and a test set. The third forecast model can be trained on its training set using the SVM algorithm, and the trained third forecast model is then used to predict whether the users in the test set of the third forecast model, or other users, will be in arrears next month.
In this embodiment, the test set of the third forecast model is used to test the trained third forecast model and check the accuracy of its predictions.
In a specific embodiment of the foregoing method for constructing a tariff recovery forecast model, further, before the training-test sets are generated, the method also comprises:
performing data cleansing, data conversion and data normalization on the obtained data of defaulting users or of all users.
In this embodiment, according to the index systems of the different forecast models and the user types, the corresponding user data can be extracted from a predetermined marketing system to form the training-test sets of the different forecast models.
In this embodiment, after the corresponding raw user data are extracted from the marketing system, they must also be preprocessed and divided into training and test sets.
In this embodiment, based on the raw user data obtained, the user data that are unrelated to the analysis target or that the forecast models need processed are identified and preprocessed. The preprocessing comprises data cleansing, data conversion and data normalization, as shown in Fig. 3.
In this embodiment, data found during analysis to be unrelated to the analysis target are cleansed. As shown in Fig. 3, data cleansing comprises: removing duplicate data, removing empty data, and removing data of other types.
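Two of the cleansing steps, removing duplicates and removing empty records, might look like this in base R. It is a sketch on invented toy data: `raw`, `clean_records` and the column names are hypothetical, and the third step (removing data of other types) is not shown because the text does not say which types are meant.

```r
# Toy raw extract (hypothetical names and values): u1 is duplicated, u2 is empty.
raw <- data.frame(
  user_id     = c("u1", "u1", "u2", "u3", "u4"),
  annual_kwh  = c(1200, 1200, NA, 800, 950),
  arrears_fee = c(50, 50, NA, 0, 20),
  stringsAsFactors = FALSE
)

# Drop exact duplicate rows, then drop rows whose value columns are all NA.
clean_records <- function(df, value_cols) {
  df <- unique(df)
  all_empty <- rowSums(!is.na(df[, value_cols, drop = FALSE])) == 0
  df[!all_empty, , drop = FALSE]
}

cleaned <- clean_records(raw, c("annual_kwh", "arrears_fee"))
```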
In this embodiment, as shown in Fig. 3, data conversion comprises: discretizing character data, converting data types, and converting NA values to 0. For example, the arrears amount of all users in March 2014 is 0, which is treated as an empty record; the average annual arrears-consumption interval (2.5 years / total number of arrears - 1) contains many NA values (because some users have no arrears), and these NA values are replaced with 0. As another example, the imported user data are converted to num/int types with the following conversion code:
allData$`superior unit` <- as.numeric(as.factor(allData$`superior unit`))
allData$`industry category` <- as.numeric(as.factor(allData$`industry category`))
allData$`power supply unit` <- as.numeric(as.factor(allData$`power supply unit`))
allData$`electricity-use category` <- as.numeric(as.factor(allData$`electricity-use category`))
allData$`high-energy-consumption industry category` <- as.numeric(as.factor(allData$`high-energy-consumption industry category`))
allData$`tripartite agreement` <- as.numeric(as.factor(allData$`tripartite agreement`))
allData$`voltage level` <- as.numeric(as.factor(allData$`voltage level`))
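The effect of these conversions can be seen on a small example. The sketch below uses an invented two-column `allData` (the column names `industry` and `avg_arrears_interval` are hypothetical) and applies the same `as.numeric(as.factor(...))` pattern together with the NA-to-0 replacement described above.

```r
# Toy data (hypothetical columns): one character column, one column with NA values.
allData <- data.frame(
  industry = c("steel", "retail", "steel", "farming"),
  avg_arrears_interval = c(45, NA, 30, NA),
  stringsAsFactors = FALSE
)

# Character column -> numeric codes (factor levels are sorted alphabetically).
allData$industry <- as.numeric(as.factor(allData$industry))

# NA -> 0, as the embodiment does for users without arrears.
allData$avg_arrears_interval[is.na(allData$avg_arrears_interval)] <- 0
```

Note that the numeric codes simply follow the alphabetical order of the category labels, so they carry no meaningful ranking; whether that matters depends on the model that consumes them.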
In this embodiment, data normalization means restricting the data to a required range by some algorithm. Normalization first makes subsequent data processing easier and second helps the program converge faster. Its specific role is to unify the statistical distribution of the samples. Normalizing the data eliminates the influence of differing orders of magnitude. Take the normalization built into the R language as an example: it is a centering normalization, which according to this embodiment maps the data into (-1:1): allData1 = as.data.frame(scale(allData)). (scale() centers each column on its mean and, by default, divides it by its standard deviation, so negative values occur.)
To eliminate the adverse effect caused by negative values, min-max normalization can be used instead. The code for min-max normalization is as follows:
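The listing itself is missing from this text. As a stand-in, a minimal base-R sketch of min-max normalization is given here; it is not the patent's code. The names `min_max` and `allData2` and the toy values are hypothetical, and the handling of constant columns is an added assumption.

```r
# Min-max normalization: map each column linearly onto [0, 1].
min_max <- function(x) {
  rng <- range(x, na.rm = TRUE)
  # Assumption: a constant column is mapped to 0 to avoid division by zero.
  if (rng[1] == rng[2]) return(rep(0, length(x)))
  (x - rng[1]) / (rng[2] - rng[1])
}

# Toy numeric data (invented values).
allData <- data.frame(
  annual_kwh    = c(100, 300, 500),
  arrears_count = c(0, 2, 4)
)
allData2 <- as.data.frame(lapply(allData, min_max))
```

Unlike `scale()`, this mapping never produces negative values, which matches the motivation stated above.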
In a specific implementation of the aforementioned method for constructing a tariff recovery prediction model, the method further includes:
Based on the normalized data, performing stratified sampling by the current arrears electricity charge / the interval since the most recent arrears electricity use / whether the user has ever been in arrears, to generate the training/test sets of the first, second and third prediction models.
In this embodiment, preprocessing the user data yields three data sets, one for each prediction model. Each data set serves as a training/test set, which comprises a training set and a test set. The ratio between the training set and the test set is set to a predetermined value, for example 0.10, and stratified sampling on three indicators (the current arrears electricity charge, the interval since the most recent arrears electricity use, and whether the user has ever been in arrears) generates the training/test sets of the three prediction models.
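The stratified sampling described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patent's implementation: the value 0.10 is interpreted here as the share of each stratum placed in the test set, and the function name stratified_split, the field name ever_in_arrears and the sample rows are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(rows, key, test_fraction=0.10, seed=0):
    """Split rows into (train, test), drawing test_fraction from every stratum."""
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(row)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(strata):
        members = strata[label][:]
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


# Hypothetical users: 20 have been in arrears before, 80 have not.
rows = [{"id": i, "ever_in_arrears": i % 5 == 0} for i in range(100)]
train, test = stratified_split(rows, key=lambda row: row["ever_in_arrears"])
print(len(train), len(test))
```

Sampling within each stratum keeps the proportion of each class the same in the training and test sets, which matters when one class is rare.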
In this embodiment, to make the SVM-based prediction model easier to understand, predicting when a defaulting user will next fall into arrears (that is, predicting the arrears interval) is taken as an example to explain in detail how the first prediction model is generated.
S1: Set the SVM parameters
S11: Set the kernel function
Set the kernel parameter of the SVM function to kernel="radial", where radial denotes the radial basis function (RBF); setting the kernel parameter to "radial" selects the RBF as the kernel function. With a "radial" kernel, the resulting hyperplane no longer needs to be linear. Typically a curved region is defined to separate the classes, which often yields higher accuracy on the same training data.
S12: Set the classification output
Set the type parameter of the SVM function to type="C-classification", which means the result is treated as a class label to be decided rather than a numeric value to be predicted.
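For reference, the radial kernel selected in S11 computes a similarity of the form exp(-gamma * ||x - y||^2) between two samples. The Python sketch below is an illustration of that formula only and does not reproduce the internals of the R package; the function name rbf_kernel and the sample vectors are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Identical samples have similarity 1; similarity decays with distance,
# and a larger gamma makes the decay faster.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5))
print(rbf_kernel([0.0, 0.0], [1.0, 0.0], gamma=1.0))
print(rbf_kernel([0.0, 0.0], [1.0, 0.0], gamma=10.0))
```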
S13: Select the cost and gamma parameters
The optimal parameters of the svm() function are found with the tune.svm() function. tune.svm() is a function provided by the e1071 package, which implements the SVM algorithm, and is used to search for the optimal parameters of svm(): by setting different parameter ranges, the error rates of the different prediction results are obtained, and the optimal parameters are determined from them. After screening, cost=10000 and gamma=1e-5 are set. gamma is the parameter of the RBF kernel once radial is chosen as the kernel; it implicitly determines how the data are distributed after being mapped to the new feature space. cost is the penalty factor: the larger the cost, the fewer the misclassified samples, the narrower the margin, and the weaker the generalization ability; the smaller the cost, the more the misclassified samples, the wider the margin, and the stronger the generalization ability. The selection code for the cost and gamma parameters is as follows:
tuned <- tune.svm(`last arrears interval` ~ ., data = train, gamma = 10^(-10:-1), cost = 10^(1:4))
Here <- denotes assignment. The result of running the code is shown in Table 4.
Table 4: Code execution result
In Table 4, error denotes the error rate. The best parameter values are those at which error is smallest, as shown in Table 5.
Table 5: Optimal values of the parameters gamma and cost
gamma cost
1e-07 100
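The search performed in S13 amounts to evaluating every (gamma, cost) pair on a grid and keeping the pair with the lowest error. The Python sketch below illustrates that idea only and does not reproduce how tune.svm() works internally. The function name grid_search is hypothetical, and toy_error is a made-up stand-in for training and validating an SVM, constructed so that its minimum falls at the Table 5 values purely for demonstration.

```python
import math
from itertools import product


def grid_search(gammas, costs, error_of):
    """Return (best_gamma, best_cost, best_error) over the full grid."""
    best = None
    for gamma, cost in product(gammas, costs):
        error = error_of(gamma, cost)
        if best is None or error < best[2]:
            best = (gamma, cost, error)
    return best


# Same grid as the R call: gamma = 10^(-10:-1), cost = 10^(1:4).
gammas = [10.0 ** exponent for exponent in range(-10, 0)]
costs = [10.0 ** exponent for exponent in range(1, 5)]


def toy_error(gamma, cost):
    # Hypothetical error surface, NOT a real SVM evaluation.
    return abs(math.log10(gamma) + 7) + abs(math.log10(cost) - 2)


best_gamma, best_cost, best_error = grid_search(gammas, costs, toy_error)
print(best_gamma, best_cost)
```

In practice error_of would train a model on part of the training set and return its validation error rate, which makes the cost of the search proportional to the grid size (here 10 x 4 = 40 evaluations).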
S2: Train the SVM model using the R package
The data for predicting the arrears interval (the training set of the first prediction model) is fed into the first prediction model (the svmfit prediction model) and trained in R; the code is as follows:
The summary of the resulting SVM classifier is as follows:
At this step, the first prediction model, which predicts the arrears interval (the next arrears time) of a defaulting user, has been trained.
S3: Using the generated classifier, input the test set data of the first prediction model into the trained first prediction model to obtain the predictions, and check their accuracy.
The test set data of the first prediction model is input into the trained svmfit prediction model, and the predictions are stored in pred.
The following code forms a confusion matrix from the predicted results and the actual results, as shown in Table 6.
> svm1 = max.col(iris$targetsTest)
> table(svm1, pred)
Table 6: Confusion matrix
pre 1 2 3 4 5 6 10 25 26 30
1 1 0 1 1 0 1 0 0 0 1
2 0 1 0 0 0 0 0 0 0 0
4 2 0 0 0 0 0 0 0 0 1
7 0 0 0 0 0 1 0 0 0 0
10 0 0 1 0 0 0 0 0 0 0
12 1 0 0 0 0 0 0 0 0 0
16 0 0 0 0 0 0 0 1 0 0
26 0 0 0 0 0 0 0 0 1 0
30 0 0 0 0 0 0 2 0 0 60
In Table 6, the underlined cells are those where the predicted result coincides with the actual result, and the number in such a cell is the number of correctly predicted users. Let the row coordinate of the confusion matrix be x, the actual arrears interval, and the column coordinate be y, the predicted arrears interval.
Let the value of a cell in the confusion matrix be P_{x,y}, where x is the interval the users in the cell actually belong to and y is the interval predicted for them. For example, in the neural-network matrix, P_{1,1}=2 means that 2 users with an actual arrears interval of 1 were predicted as 1, and P_{12,30}=1 means that 1 user with an actual arrears interval of 12 was predicted as 30. The cells with P_{x,y}≠0 and x=y give exactly the number of correctly predicted users.
Compute the error rate of this SVM, from which the accuracy follows:
> mean(svm2 != pred)
[1] 0.1710526
The classifier predicts 6 kinds of results in total, corresponding to arrears intervals of 1, 2, 3, 4, 5, 6, 15, 26 and 30. The error rate is 0.1710526, so the prediction accuracy is 1 - 0.1710526 = 82.89474%.
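The accuracy figure can be checked directly against Table 6. The Python sketch below is an illustration only: it copies the row labels, column labels and counts of Table 6 and sums the cells whose row label equals their column label. The function name accuracy_from_confusion is hypothetical.

```python
# Labels and counts copied from Table 6.
row_labels = [1, 2, 4, 7, 10, 12, 16, 26, 30]
col_labels = [1, 2, 3, 4, 5, 6, 10, 25, 26, 30]
counts = [
    [1, 0, 1, 1, 0, 1, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [2, 0, 0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 2, 0, 0, 60],
]


def accuracy_from_confusion(row_labels, col_labels, counts):
    """Return (correct, total, accuracy); a cell is correct when its labels match."""
    total = sum(sum(row) for row in counts)
    correct = sum(
        counts[i][j]
        for i, row_label in enumerate(row_labels)
        for j, col_label in enumerate(col_labels)
        if row_label == col_label
    )
    return correct, total, correct / total


correct, total, accuracy = accuracy_from_confusion(row_labels, col_labels, counts)
print(correct, total, round(accuracy, 7))
```

With the Table 6 counts, 63 of the 76 test users lie on matching labels, which agrees with the reported error rate of 0.1710526 (13 of 76).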
In this embodiment, the first prediction model predicts the next arrears time of a defaulting user (the time interval between the next arrears and the previous arrears), the second prediction model predicts the next arrears amount of a defaulting user, and the third prediction model predicts whether a user will be in arrears next month. Based on the prediction results, any prediction model whose accuracy is below 70% has its parameter settings and training set adjusted and is run again until its accuracy reaches 70% or above. Combining the prediction results yields:
1. whether a user will be in arrears next month;
2. for a defaulting user, when the next arrears will occur;
3. for a defaulting user, the amount likely to be owed at the next arrears.
In this way, the future tariff recovery situation of grid users is predicted from multiple directions and angles, which can effectively help power enterprises formulate electricity-use and electricity-charge forecasting policies.
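The rule of re-adjusting any model whose accuracy falls below 70% can be pictured as a simple loop. The Python sketch below is an illustration under stated assumptions and not the patent's procedure: the function name tune_until_accurate, the idea of a pre-listed sequence of candidate settings, and the toy accuracy figures are all hypothetical.

```python
def tune_until_accurate(candidate_settings, train_and_score, threshold=0.70):
    """Try settings in order; return the first (setting, accuracy) that reaches
    the threshold, or None if no candidate does."""
    for setting in candidate_settings:
        accuracy = train_and_score(setting)
        if accuracy >= threshold:
            return setting, accuracy
    return None


# Hypothetical accuracies for three parameter settings, for illustration only.
toy_scores = {"setting A": 0.55, "setting B": 0.68, "setting C": 0.83}
result = tune_until_accurate(list(toy_scores), toy_scores.get)
print(result)
```

Returning None when no candidate reaches the threshold makes the failure case explicit, so the caller can widen the parameter ranges or revise the training set.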
The above is the preferred embodiment of the present invention. It should be noted that a person of ordinary skill in the art may make further improvements and refinements without departing from the principle of the present invention, and these improvements and refinements shall also be regarded as falling within the scope of protection of the present invention.

Claims (9)

1. A method for constructing a tariff recovery prediction model, characterized by comprising:
determining the index system of a first prediction model;
obtaining, according to the determined index system of the first prediction model, data of defaulting users as the training/test set of the first prediction model, wherein the training/test set comprises a training set and a test set;
training the first prediction model on the training set of the first prediction model;
predicting the next arrears time of a defaulting user with the trained first prediction model.
2. The method for constructing a tariff recovery prediction model according to claim 1, characterized in that the index system of the first prediction model comprises one or more of: superior unit, power supply unit, industry category, electricity-use category, voltage level, load importance level, contract capacity, operating capacity, verification cycle, power outage flag, account-opening duration, power transmission duration, number of metering points, isoelectric point number, number of power supply points, whether there is a tripartite agreement, number of power outages, annual electricity consumption, annual transmission-loss electricity, annual reactive electricity-use rate, annual line loss rate, annual total number of arrears, average annual arrears electricity-use interval, annual minimum arrears electricity-use interval, interval since the most recent arrears electricity use, annual arrears electricity-use share, monthly arrears electricity charge, and total number of arrears in 2.5 years.
3. The method for constructing a tariff recovery prediction model according to claim 1, characterized in that training the first prediction model on the training set of the first prediction model comprises:
training the first prediction model on the training set of the first prediction model using the SVM algorithm.
4. The method for constructing a tariff recovery prediction model according to claim 1, characterized in that the method further comprises:
determining the index system of a second prediction model;
obtaining, according to the determined index system of the second prediction model, data of defaulting users as the training/test set of the second prediction model, wherein the training/test set comprises a training set and a test set;
training the second prediction model on the training set of the second prediction model;
predicting the next arrears amount of a defaulting user with the trained second prediction model.
5. The method for constructing a tariff recovery prediction model according to claim 4, characterized in that the index system of the second prediction model comprises one or more of: average monthly arrears electricity consumption over the most recent year, average monthly number of arrears over the most recent year, annual average arrears electricity-use interval, minimum arrears electricity-use interval within the most recent year, average annual arrears electricity-use interval, interval since the most recent arrears electricity use, arrears electricity-use share, average arrears electricity charge, duration from the account-opening date to the first arrears electricity use, first arrears electricity use, arrears electricity charge in the most recent year, arrears electricity charge in the most recent three months, arrears electricity charge in the most recent two years, arrears electricity charge in the most recent 6 months, annual growth rate of the arrears electricity charge, month-on-month growth rate of the arrears electricity charge, quarterly growth rate of the arrears electricity charge, annual maximum arrears electricity charge, annual minimum arrears electricity charge, month of the annual maximum arrears electricity charge, month of the annual minimum arrears electricity charge, user category, whether there is a tripartite agreement, whether transferred in installments, safety hazard interval, monthly electricity consumption, quarterly electricity consumption, annual electricity consumption, account-opening duration, power transmission duration, contract capacity, electricity-use category, high energy-consuming industry category, number of metering points, number of power supply points, monthly line loss growth rate, annual line loss rate, average monthly reactive utilization rate, current-month active utilization rate, current reactive-to-active ratio, current-month peak-valley/flat electricity ratio, annual peak-valley/flat electricity ratio, annual prepaid electricity charge amount, number of power outages, outage interval, average outage interval, annual outage frequency, previous arrears electricity charge of the current year, month-on-month growth rate of the previous arrears electricity charge of the current year, and current arrears electricity charge.
6. The method for constructing a tariff recovery prediction model according to claim 1 or 4, characterized in that the method further comprises:
determining the index system of a third prediction model;
obtaining, according to the determined index system of the third prediction model, all user data as the training/test set of the third prediction model, wherein the training/test set comprises a training set and a test set;
training the third prediction model on the training set of the third prediction model;
predicting whether a user will be in arrears next month with the trained third prediction model.
7. The method for constructing a tariff recovery prediction model according to claim 6, characterized in that the index system of the third prediction model comprises one or more of: industry category, location, user category, whether there is a tripartite agreement, whether transferred in installments, number of safety hazards, annual electricity consumption, account-opening duration, power transmission duration, contract capacity, electricity-use category, high energy-consuming industry category, number of metering points, number of power supply points, monthly line loss growth rate, annual line loss count, average monthly reactive utilization rate, current-month active utilization rate, reactive-to-active ratio, monthly peak-valley/flat electricity ratio, annual peak-valley/flat electricity ratio, annual prepaid electricity charge amount, number of power outages, outage interval, average outage interval, annual outage frequency, and whether the user has ever been in arrears.
8. The method for constructing a tariff recovery prediction model according to claim 6, characterized in that, before the training/test set is generated, the method further comprises:
performing data cleaning, data conversion and data normalization on the obtained data of defaulting users or all user data.
9. The method for constructing a tariff recovery prediction model according to claim 8, characterized in that the method further comprises:
based on the normalized data, performing stratified sampling by the current arrears electricity charge / the interval since the most recent arrears electricity use / whether the user has ever been in arrears, to generate the training/test sets of the first, second and third prediction models.
CN201611180509.4A 2016-12-19 2016-12-19 Tariff recovery prediction model construction method Pending CN106651024A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611180509.4A CN106651024A (en) 2016-12-19 2016-12-19 Tariff recovery prediction model construction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611180509.4A CN106651024A (en) 2016-12-19 2016-12-19 Tariff recovery prediction model construction method

Publications (1)

Publication Number Publication Date
CN106651024A true CN106651024A (en) 2017-05-10

Family

ID=58833907

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611180509.4A Pending CN106651024A (en) 2016-12-19 2016-12-19 Tariff recovery prediction model construction method

Country Status (1)

Country Link
CN (1) CN106651024A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107180392A (en) * 2017-05-18 2017-09-19 北京科技大学 A kind of electric power enterprise tariff recovery digital simulation method
CN111199493A (en) * 2018-11-19 2020-05-26 国家电网有限公司客户服务中心 Arrearage risk identification method based on customer payment information and credit investigation information
CN111198907A (en) * 2019-12-24 2020-05-26 深圳供电局有限公司 Method and device for identifying potential defaulting user, computer equipment and storage medium
CN112488421A (en) * 2020-12-15 2021-03-12 国网雄安金融科技集团有限公司 Tracking and predicting method and device for electric charge account receivable
CN112488421B (en) * 2020-12-15 2023-04-28 国网雄安金融科技集团有限公司 Tracking and predicting method and device for accounts receivable of electric charge
CN113592140A (en) * 2021-06-22 2021-11-02 国网宁夏电力有限公司吴忠供电公司 Electric charge payment prediction model training system and electric charge payment prediction model

Similar Documents

Publication Publication Date Title
CN106651024A (en) Tariff recovery prediction model construction method
CN106127363A (en) A kind of user credit appraisal procedure and device
CN109360084A (en) Appraisal procedure and device, storage medium, the computer equipment of reference default risk
CN107392479A (en) The power customer power failure susceptibility scorecard implementation of logic-based regression model
CN106339942A (en) Financial information processing method and system
CN106095639A (en) A kind of cluster subhealth state method for early warning and system
CN104424598A (en) Cash demand quantity predicating device and method
CN104750861B (en) A kind of energy-accumulating power station mass data cleaning method and system
CN106251049A (en) A kind of electricity charge risk model construction method of big data
US20060100957A1 (en) Electronic data processing system and method of using an electronic data processing system for automatically determining a risk indicator value
CN109376924A (en) A kind of method, apparatus, equipment and the readable storage medium storing program for executing of material requirements prediction
CN111178675A (en) LR-Bagging algorithm-based electric charge recycling risk prediction method, system, storage medium and computer equipment
CN104766144A (en) Order forecasting method and system
CN103995899A (en) Analysis system for KPI
CN104376418A (en) System alteration risk control method based on business
CN108389069A (en) Top-tier customer recognition methods based on random forest and logistic regression and device
CN102081781A (en) Finance modeling optimization method based on information self-circulation
CN101567069A (en) Processing method of evaluation data of legal risk and query system
CN102262664A (en) Quality estimating method and quality estimating device
CN104574141A (en) Service influence degree analysis method
CN105447082A (en) Distributed clustering method for mass load curves
CN106447198A (en) Power consumption checking method based on business expanding installation data
CN110532301A (en) Auditing method, system and readable storage medium storing program for executing
CN109428760B (en) User credit evaluation method based on operator data
CN110363384A (en) Exception electric detection method based on depth weighted neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170510