CN106651024A - Tariff recovery prediction model construction method - Google Patents
- Publication number: CN106651024A
- Application number: CN201611180509.4A
- Authority: CN (China)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Abstract
The present invention provides a method for constructing a tariff recovery prediction model, which can predict the next arrears time of users who are in arrears. The method comprises: determining the index system of a first prediction model; obtaining data of users in arrears as the training-test set of the first prediction model, wherein the training-test set comprises a training set and a test set; training the first prediction model according to its training set; and predicting the next arrears time of users in arrears according to the trained first prediction model. The method is applicable to the technical field of electric power systems.
Description
Technical field
The present invention relates to the technical field of electric power systems, and in particular to a method for constructing a tariff recovery prediction model.
Background
Tariff recovery management is a vital task for an electric power company: it ensures that electricity charges are collected normally and supports the company's sustainable development. As customers' electricity consumption grows rapidly and the external environment keeps changing, the tariff recovery risk and uncertainty that power companies face increase year by year. Research on accurately predicting late payment of electricity charges still remains at one-sided prediction, which can only predict whether a user will be in arrears.
Summary of the invention
The technical problem to be solved by the present invention is to provide a method for constructing a tariff recovery prediction model, so as to solve the problem that the prior art can only predict whether a user will be in arrears.
To solve the above technical problem, an embodiment of the present invention provides a method for constructing a tariff recovery prediction model, comprising:
determining the index system of a first prediction model;
obtaining, according to the determined index system of the first prediction model, data of users in arrears as the training-test set of the first prediction model, wherein the training-test set comprises a training set and a test set;
training the first prediction model according to the training set of the first prediction model;
predicting the next arrears time of users in arrears according to the trained first prediction model.
Further, the index system of the first prediction model comprises one or more of: superior unit, power supply unit, industry category, electricity consumption category, voltage level, load importance level, contract capacity, operating capacity, inspection cycle, power outage flag, account-opening duration, power delivery duration, number of metering points, number of power-receiving points, number of power source points, whether there is a tripartite agreement, number of power outages, annual electricity consumption, annual transformer loss energy, annual reactive power consumption rate, annual line loss rate, annual total number of arrears, annual average arrears interval, annual minimum arrears interval, interval since the most recent arrears, annual arrears proportion, average monthly arrears fee, and total number of arrears in 2.5 years.
Further, training the first prediction model according to the training set of the first prediction model comprises:
training the first prediction model with the SVM algorithm according to the training set of the first prediction model.
Further, the method further comprises:
determining the index system of a second prediction model;
obtaining, according to the determined index system of the second prediction model, data of users in arrears as the training-test set of the second prediction model, wherein the training-test set comprises a training set and a test set;
training the second prediction model according to the training set of the second prediction model;
predicting the next arrears amount of users in arrears according to the trained second prediction model.
Further, the index system of the second prediction model comprises one or more of: average monthly arrears electricity consumption in the most recent year, average monthly number of arrears in the most recent year, annual average arrears interval, minimum arrears interval in the most recent year, average yearly arrears interval, interval since the most recent arrears, arrears proportion, average arrears fee, time from account opening to the first arrears, first arrears, arrears fee in the most recent year, arrears fee in the most recent three months, arrears fee in the most recent two years, arrears fee in the most recent 6 months, annual arrears fee growth rate, month-on-month arrears fee growth rate, quarterly arrears fee growth rate, annual maximum arrears fee, annual minimum arrears fee, month of the annual maximum arrears fee, month of the annual minimum arrears fee, user classification, whether there is a tripartite agreement, whether payment is transferred in installments, safety hazard interval, monthly electricity consumption, quarterly electricity consumption, annual electricity consumption, account-opening duration, power delivery duration, contract capacity, electricity consumption category, high-energy-consuming industry category, number of metering points, number of power sources, monthly line loss growth rate, annual line loss rate, average monthly reactive power usage rate, current-month active power usage rate, current reactive-to-active ratio, current-month peak/valley/flat electricity ratio, annual peak/valley/flat electricity ratio, annual prepaid electricity fee amount, number of power outages, outage interval, average outage interval, annual outage frequency, previous arrears fee this year, sequential growth rate of the previous arrears fee this year, and current arrears fee.
Further, the method further comprises:
determining the index system of a third prediction model;
obtaining, according to the determined index system of the third prediction model, data of all users as the training-test set of the third prediction model, wherein the training-test set comprises a training set and a test set;
training the third prediction model according to the training set of the third prediction model;
predicting whether a user will be in arrears in the next month according to the trained third prediction model.
Further, the index system of the third prediction model comprises one or more of: industry category, location, user classification, whether there is a tripartite agreement, whether payment is transferred in installments, number of safety hazards, annual electricity consumption, account-opening duration, power delivery duration, contract capacity, electricity consumption category, high-energy-consuming industry category, number of metering points, number of power sources, monthly line loss growth rate, annual line loss count, average monthly reactive power usage rate, current-month active power usage rate, reactive-to-active ratio, monthly peak/valley/flat electricity ratio, annual peak/valley/flat electricity ratio, annual prepaid electricity fee amount, number of power outages, outage interval, average outage interval, annual outage frequency, and whether the user has ever been in arrears.
Further, before the training-test set is generated, the method further comprises:
performing data cleaning, data conversion and data normalization on the obtained data of users in arrears or data of all users.
Further, the method further comprises:
performing, on the normalized data, stratified sampling by the current arrears fee / the interval since the most recent arrears / whether the user has ever been in arrears, to generate the training-test sets of the first, second and third prediction models.
The above technical solution of the present invention has the following beneficial effects:
In the above solution, the index system of the first prediction model is determined; data of users in arrears is obtained as the training-test set of the first prediction model, wherein the training-test set comprises a training set and a test set; the first prediction model is trained according to its training set; and the next arrears time of users in arrears is predicted according to the trained first prediction model. In this way, the next arrears time of users in arrears can be predicted by the first prediction model.
Description of the drawings
Fig. 1 is a schematic flowchart of the method for constructing a tariff recovery prediction model provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the principle of the method for constructing a tariff recovery prediction model provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of user data preprocessing provided by an embodiment of the present invention.
Specific embodiments
To make the technical problem to be solved, the technical solutions and the advantages of the present invention clearer, a detailed description is given below with reference to the accompanying drawings and specific embodiments.
To address the problem that existing approaches can only predict whether a user will be in arrears, the present invention provides a method for constructing a tariff recovery prediction model.
Referring to Fig. 1, the method for constructing a tariff recovery prediction model provided by an embodiment of the present invention comprises:
S101, determining the index system of a first prediction model;
S102, obtaining, according to the determined index system of the first prediction model, data of users in arrears as the training-test set of the first prediction model, wherein the training-test set comprises a training set and a test set;
S103, training the first prediction model according to the training set of the first prediction model;
S104, predicting the next arrears time of users in arrears according to the trained first prediction model.
In the method described in this embodiment, the index system of the first prediction model is determined; data of users in arrears is obtained as the training-test set of the first prediction model, wherein the training-test set comprises a training set and a test set; the first prediction model is trained according to its training set; and the next arrears time of users in arrears is predicted according to the trained first prediction model. In this way, the next arrears time of users in arrears can be predicted by the first prediction model.
As shown in Fig. 2, in this embodiment users can be divided into two classes according to whether they have been in arrears: users in arrears and users not in arrears. For users in arrears, the first prediction model can predict when the next arrears will occur. The "defaulting users" in Fig. 2 refer to users in arrears.
In this embodiment, to obtain the first prediction model, the direct indicator for predicting the arrears time must first be determined. Since a user who has been in arrears is very likely to fall into arrears again, this embodiment converts, for users in arrears, the prediction of when arrears will occur into a prediction of the time interval between two adjacent arrears.
In this embodiment, the prediction is designed for users who have already been in arrears, so the time interval between two adjacent arrears (for example, the second arrears and the first arrears) can be used for validation. For users with only one arrears record, the interval is set to 30 days. The data of all users in arrears is taken, and the indicators related to arrears and to basic electricity-use behavior are taken as the index system of the first prediction model. This index system can identify a user's electricity-use and arrears behavior characteristics more accurately, and thus predict more scientifically the time interval after which a user in arrears will be in arrears again, from which it can be inferred when the next arrears will occur.
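The conversion from arrears dates to the interval between adjacent arrears can be sketched in R as follows. This is only an illustration under stated assumptions: the data frame arrears, its columns user_id and date, and the helper last_arrears_interval are hypothetical names, and taking the most recent adjacent interval as the prediction target is one possible reading of the text. Only the 30-day value for users with a single record comes from the description.

```r
# Hypothetical arrears records: one row per arrears event (names are assumptions).
arrears <- data.frame(
  user_id = c("u1", "u1", "u1", "u2", "u3", "u3"),
  date = as.Date(c("2015-01-10", "2015-03-11", "2015-04-25",
                   "2015-06-01", "2016-01-05", "2016-02-24"))
)

# Interval in days between the two most recent adjacent arrears of one user.
# Users with only one arrears record get the 30-day interval from the text.
last_arrears_interval <- function(dates, default_days = 30) {
  dates <- sort(dates)
  if (length(dates) < 2) return(default_days)
  gaps <- as.numeric(diff(dates), units = "days")
  gaps[length(gaps)]
}

# One interval per user, named by user_id.
intervals <- sapply(split(arrears$date, arrears$user_id), last_arrears_interval)
print(intervals)
```

Adding a predicted interval to a user's most recent arrears date then gives an estimate of when the next arrears would occur.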
In this embodiment, the index system of the first prediction model comprises one or more of: superior unit, power supply unit, industry category, electricity consumption category, voltage level, load importance level, contract capacity, operating capacity, inspection cycle, power outage flag, account-opening duration, power delivery duration, number of metering points, number of power-receiving points, number of power source points, whether there is a tripartite agreement, number of power outages, annual electricity consumption, annual transformer loss energy, annual reactive power consumption rate, annual line loss rate, annual total number of arrears, annual average arrears interval, annual minimum arrears interval, interval since the most recent arrears, annual arrears proportion, average monthly arrears fee, and total number of arrears in 2.5 years. For example, if the current time is July 2016, the index system of the first prediction model is as shown in Table 1. Taking annual electricity consumption as an example, there are 3 annual consumption values, namely those of the 3 adjacent years 2014, 2015 and 2016.
Table 1. Index system of the first prediction model
In this embodiment, when the time period of an indicator does not apply, the gap is filled with an average. For example, for the annual electricity consumption indicator, if a user's account has been open for less than 1 year, the part short of one year is replaced by the consumption since account opening divided by the duration since account opening.
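One way to read this filling rule is sketched below in R. It is an assumption-laden illustration, not the patent's own code: the function name fill_annual_consumption, the month granularity and the kWh unit are hypothetical, and the missing months are filled with the average monthly consumption since account opening.

```r
# Estimate a full-year consumption figure for an account open for less than a year.
# consumption_kwh: electricity used since account opening (unit assumed: kWh)
# months_open:     duration since account opening, in months (granularity assumed)
fill_annual_consumption <- function(consumption_kwh, months_open) {
  stopifnot(months_open > 0)
  # Accounts open for a year or more are assumed to already carry an annual figure.
  if (months_open >= 12) return(consumption_kwh)
  monthly_avg <- consumption_kwh / months_open        # average since account opening
  consumption_kwh + monthly_avg * (12 - months_open)  # fill the missing months
}

estimate <- fill_annual_consumption(600, 6)
print(estimate)
```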
In a specific embodiment of the foregoing method for constructing a tariff recovery prediction model, further, training the first prediction model according to the training set of the first prediction model comprises:
training the first prediction model with the SVM algorithm according to the training set of the first prediction model.
In this embodiment, the first prediction model can be trained on its training set with the support vector machine (Support Vector Machine, SVM) algorithm to obtain the trained first prediction model, which is then used to predict the next arrears time of the users in arrears in the test set of the first prediction model, or of other users in arrears.
In this embodiment, the test set of the first prediction model is used to test the trained first prediction model and check the accuracy of its predictions.
In a specific embodiment of the foregoing method for constructing a tariff recovery prediction model, further, the method further comprises:
determining the index system of a second prediction model;
obtaining, according to the determined index system of the second prediction model, data of users in arrears as the training-test set of the second prediction model, wherein the training-test set comprises a training set and a test set;
training the second prediction model according to the training set of the second prediction model;
predicting the next arrears amount of users in arrears according to the trained second prediction model.
As shown in Fig. 2, in this embodiment, for users in arrears, the second prediction model can predict how much a user is likely to owe the next time.
In this embodiment, experience generally holds that the amount a user owes is strongly associated with the amounts the user has owed in the past. The prediction is therefore designed for users in arrears, and historical data together with basic electricity-use behavior indicators are used as the index system of the second prediction model.
In this embodiment, the index system of the second prediction model comprises one or more of: average monthly arrears electricity consumption in the most recent year, average monthly number of arrears in the most recent year, annual average arrears interval, minimum arrears interval in the most recent year, average yearly arrears interval, interval since the most recent arrears, arrears proportion, average arrears fee, time from account opening to the first arrears, first arrears, arrears fee in the most recent year, arrears fee in the most recent three months, arrears fee in the most recent two years, arrears fee in the most recent 6 months, annual arrears fee growth rate, month-on-month arrears fee growth rate, quarterly arrears fee growth rate, annual maximum arrears fee, annual minimum arrears fee, month of the annual maximum arrears fee, month of the annual minimum arrears fee, user classification, whether there is a tripartite agreement, whether payment is transferred in installments, safety hazard interval, monthly electricity consumption, quarterly electricity consumption, annual electricity consumption, account-opening duration, power delivery duration, contract capacity, electricity consumption category, high-energy-consuming industry category, number of metering points, number of power sources, monthly line loss growth rate, annual line loss rate, average monthly reactive power usage rate, current-month active power usage rate, current reactive-to-active ratio, current-month peak/valley/flat electricity ratio, annual peak/valley/flat electricity ratio, annual prepaid electricity fee amount, number of power outages, outage interval, average outage interval, annual outage frequency, previous arrears fee this year, sequential growth rate of the previous arrears fee this year, and current arrears fee. The index system of the second prediction model is shown in Table 2.
Table 2. Index system of the second prediction model
In this embodiment, after the index system of the second prediction model is determined, data of users in arrears is obtained according to that index system as the training-test set of the second prediction model, wherein the training-test set comprises a training set and a test set. The second prediction model can be trained on its training set with the SVM algorithm to obtain the trained second prediction model, which is then used to predict the next arrears amount of the users in arrears in the test set of the second prediction model, or of other users in arrears.
In this embodiment, the test set of the second prediction model is used to test the trained second prediction model and check the accuracy of its predictions.
In a specific embodiment of the foregoing method for constructing a tariff recovery prediction model, further, the method further comprises:
determining the index system of a third prediction model;
obtaining, according to the determined index system of the third prediction model, data of all users as the training-test set of the third prediction model, wherein the training-test set comprises a training set and a test set;
training the third prediction model according to the training set of the third prediction model;
predicting whether a user will be in arrears in the next month according to the trained third prediction model.
As shown in Fig. 2, in this embodiment, for all users, the third prediction model can predict whether a user will be in arrears.
In this embodiment, the arrears state of every user in the next month is predicted. Since users who have never been in arrears have no arrears-related indicators, the data of all users is taken in order to find the relevant feature patterns, and indicators that do not involve arrears are taken as the index system of the third prediction model. This index system can identify a user's electricity-use and arrears behavior characteristics more objectively, and thus predict more scientifically whether a user who has never been in arrears is likely to fall into arrears.
In this embodiment, the index system of the third prediction model may comprise one or more of: industry category, location, user classification, whether there is a tripartite agreement, whether payment is transferred in installments, number of safety hazards, annual electricity consumption, account-opening duration, power delivery duration, contract capacity, electricity consumption category, high-energy-consuming industry category, number of metering points, number of power sources, monthly line loss growth rate, annual line loss count, average monthly reactive power usage rate, current-month active power usage rate, reactive-to-active ratio, monthly peak/valley/flat electricity ratio, annual peak/valley/flat electricity ratio, annual prepaid electricity fee amount, number of power outages, outage interval, average outage interval, annual outage frequency, and whether the user has ever been in arrears, as shown in Table 3.
Table 3. Index system of the third prediction model
In this embodiment, after the index system of the third prediction model is determined, data of all users can be obtained according to that index system as the training-test set of the third prediction model, wherein the training-test set comprises a training set and a test set. The third prediction model can be trained on its training set with the SVM algorithm to obtain the trained third prediction model, which is then used to predict whether the users in the test set of the third prediction model, or other users, will be in arrears in the next month.
In this embodiment, the test set of the third prediction model is used to test the trained third prediction model and check the accuracy of its predictions.
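The accuracy check on the test set can be illustrated in R for the yes/no output of the third prediction model. The description does not say which accuracy measure is used, so the overall accuracy and confusion table below are assumptions, and the vectors actual and predicted are made-up values standing in for test-set labels and model output (1 = in arrears next month, 0 = not in arrears).

```r
# Hypothetical test-set labels and model predictions (1 = arrears, 0 = no arrears).
actual    <- c(1, 0, 0, 1, 0, 1, 0, 0)
predicted <- c(1, 0, 1, 1, 0, 0, 0, 0)

# Share of test-set users whose arrears state was predicted correctly.
accuracy <- mean(predicted == actual)

# Cross-tabulation of actual against predicted states.
confusion <- table(actual = actual, predicted = predicted)

print(accuracy)
print(confusion)
```

The off-diagonal cells of the table separate users wrongly flagged as future arrears from arrears that the model missed, which a single accuracy figure hides.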
In a specific embodiment of the foregoing method for constructing a tariff recovery prediction model, further, before the training-test set is generated, the method further comprises:
performing data cleaning, data conversion and data normalization on the obtained data of users in arrears or data of all users.
In this embodiment, according to the index systems and user types of the different prediction models, the corresponding user data can be extracted from a preset marketing system to form the training-test sets of the different prediction models.
In this embodiment, after the corresponding raw user data is extracted from the marketing system, the extracted raw user data still has to be preprocessed and divided into a training set and a test set.
In this embodiment, the raw user data obtained is examined for user data that is unrelated to the target or that the prediction model requires to be processed, and such user data is preprocessed. The preprocessing comprises data cleaning, data conversion and data normalization, as shown in Fig. 3.
In this embodiment, data found during data analysis to be unrelated to the analysis target is cleaned. As shown in Fig. 3, data cleaning comprises: removing duplicate data, removing empty data, and removing data of other types.
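The first two cleaning steps can be sketched in base R as follows. The data frame raw and its columns are hypothetical, and treating a record with a missing user identifier as "empty data" is an assumption made for the illustration.

```r
# Hypothetical raw extract; column names are illustrative only.
raw <- data.frame(
  user_id  = c("u1", "u1", "u2", "u3", NA),
  industry = c("steel", "steel", "retail", NA, "retail"),
  usage    = c(120, 120, 80, 60, 50),
  stringsAsFactors = FALSE
)

# Remove duplicate records (rows identical in every column).
deduped <- raw[!duplicated(raw), ]

# Remove empty records, here taken to mean rows without a user identifier.
cleaned <- deduped[!is.na(deduped$user_id), ]

print(cleaned)
```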
In this embodiment, as shown in Fig. 3, data conversion comprises: discretizing character data, converting data types, and converting NA data to 0. For example, the arrears amount of all users in March 2014 is 0, which is treated as an empty data record. The average yearly arrears interval (2.5 years/total number of arrears − 1) has many NA values (because some users have no arrears), and these NA values are replaced with 0. As another example, the imported user data is converted to the num/int type; the conversion code is as follows:
allData$`superior unit` <- as.numeric(as.factor(allData$`superior unit`))
allData$`industry category` <- as.numeric(as.factor(allData$`industry category`))
allData$`power supply unit` <- as.numeric(as.factor(allData$`power supply unit`))
allData$`electricity consumption classification` <- as.numeric(as.factor(allData$`electricity consumption classification`))
allData$`highly energy-consuming industry classification` <- as.numeric(as.factor(allData$`highly energy-consuming industry classification`))
allData$`whether tripartite agreement` <- as.numeric(as.factor(allData$`whether tripartite agreement`))
allData$`voltage level` <- as.numeric(as.factor(allData$`voltage level`))
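The cleansing and conversion steps described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions: the data frame `raw` and its column names (`user_id`, `industry`, `avg_arrears_interval`) are hypothetical toy stand-ins, not fields of the actual marketing system.

```r
# Hypothetical toy data; column names are illustrative only.
raw <- data.frame(
  user_id = c(1, 2, 2, 3, 4),
  industry = c("steel", "retail", "retail", NA, "steel"),
  avg_arrears_interval = c(3.5, NA, NA, 2.0, NA),
  stringsAsFactors = FALSE
)

# Data cleansing: drop exact duplicate rows, then rows whose text field is empty.
clean <- raw[!duplicated(raw), ]
clean <- clean[!is.na(clean$industry), ]

# Data conversion: replace NA in the numeric index with 0 ...
clean$avg_arrears_interval[is.na(clean$avg_arrears_interval)] <- 0
# ... and discretize the character column into numeric codes.
clean$industry <- as.numeric(as.factor(clean$industry))

print(clean)
```

Note that `as.factor` assigns codes in alphabetical order of the distinct values, so the codes carry no ordinal meaning.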
In this embodiment, data normalization means limiting the data to a required range by some algorithm. Normalization first makes subsequent data processing more convenient, and second speeds up convergence when the program runs. Its specific role is to unify the statistical distribution of the samples. Normalizing the data eliminates the influence of differing orders of magnitude. Take the normalization built into the R language as an example: it is a centering normalization that subtracts each column's mean and divides by its standard deviation, so the normalized values lie on both sides of 0 rather than in a fixed interval:
allData1 = as.data.frame(scale(allData))
To eliminate the adverse effect caused by negative values, the max-min normalization method can be used instead.
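A minimal sketch of max-min normalization in R, assuming every column is numeric. The helper name `minmax` and the toy data frame are hypothetical and are not taken from the original listing.

```r
# Hypothetical helper: rescale one numeric vector to [0, 1].
minmax <- function(x) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) return(rep(0, length(x)))  # constant column: avoid 0/0
  (x - rng[1]) / (rng[2] - rng[1])
}

# Toy stand-in for the converted user data.
allData <- data.frame(a = c(10, 20, 30), b = c(-5, 0, 5))

# Apply the helper column by column; no value is negative afterwards.
allData2 <- as.data.frame(lapply(allData, minmax))
print(allData2)
```

Unlike `scale()`, this maps the smallest value of each column to 0 and the largest to 1, so no negative values remain.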
In a specific embodiment of the foregoing construction method of the tariff recovery forecast model, the method further includes:
According to the normalized data, performing stratified sampling by this arrearage electricity charge / the interval from the most recent arrearage electricity consumption to the present / whether the user has ever been in arrears, to generate the training test sets of the first forecast model, the second forecast model and the third forecast model.
In this embodiment, pre-processing the user data yields three data sets, one for each forecast model. Each data set serves as a training test set, and each training test set includes a training set and a test set. The ratio of the training set to the test set is set to a predetermined value, for example 0.10, and stratified sampling is performed by three indices, namely this arrearage electricity charge, the interval from the most recent arrearage electricity consumption to the present, and whether the user has ever been in arrears, to generate the training test sets of the three forecast models.
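The stratified split can be sketched in base R as follows. This is only an illustration: it assumes the predetermined value 0.10 is the share of each stratum held out for testing, and the data frame `users` with its column `ever_arrears` is a hypothetical stand-in for the "whether the user has ever been in arrears" index.

```r
set.seed(42)  # reproducible split

# Hypothetical toy data: 30 users with past arrears, 70 without.
users <- data.frame(
  id = 1:100,
  ever_arrears = rep(c("yes", "no"), times = c(30, 70))
)

test_fraction <- 0.10  # assumed meaning of the predetermined value

# Within each stratum, draw the same fraction of row indices for the test set.
strata <- split(seq_len(nrow(users)), users$ever_arrears)
test_idx <- unlist(lapply(strata, function(rows) {
  rows[sample.int(length(rows), size = round(length(rows) * test_fraction))]
}))

test  <- users[test_idx, ]
train <- users[-test_idx, ]

print(table(test$ever_arrears))
```

Sampling inside each stratum keeps the proportion of users with and without past arrears the same in the training set and the test set.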
In this embodiment, to give a better understanding of the forecast model based on the SVM algorithm, predicting when a defaulting subscriber will next be in arrears, that is, predicting the arrearage interval, is taken as an example to explain in detail how the first forecast model is generated.
S1, set the SVM parameters
S11, set the kernel function
Set the kernel parameter kernel="radial" of the SVM function, where radial denotes the radial basis function (RBF), which is used as the kernel function by setting the parameter kernel to "radial". With a "radial" kernel, the resulting separating hyperplane no longer needs to be linear. A curved region is usually defined to separate the classes, which often yields higher accuracy on the same training data.
S12, set the classification output
Set type="C-classification" of the SVM function, which means that the result is judged as a class rather than predicted as a numeric value.
S13, select the cost and gamma parameters
The optimal parameters of the svm() function are found with the tune.svm() function, which is a built-in function of the e1071 package that provides the svm algorithm. By setting different parameter ranges, different prediction error rates are obtained, from which the optimal parameters are determined. After screening, cost=10000 and gamma=1e-5 are set. Here, gamma is the parameter that the radial basis function (radial) carries once it is selected as the kernel; gamma implicitly determines the distribution of the data after they are mapped to the new feature space. cost is the penalty factor: the larger cost is, the fewer samples are misclassified, the narrower the margin, and the weaker the generalization ability; the smaller cost is, the more samples are misclassified, the wider the margin, and the stronger the generalization ability. The selection process and code for the cost and gamma parameters are as follows:
tuned <- tune.svm(`last time arrearage interval` ~ ., data = train, gamma = 10^(-10:-1), cost = 10^(1:4))
where <- denotes assignment; the code execution result is shown in Table 4.
Table 4: Code execution result
In Table 4, error denotes the error rate; the parameter values at which error is smallest are the best ones, as shown in Table 5.
Table 5: Optimal values of the parameters gamma and cost
gamma | cost |
---|---|
1e-07 | 100 |
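How the grid search selects the best pair can be sketched as follows. This is a hedged illustration, not the original run: it assumes the e1071 package is installed, uses the built-in `iris` data as a stand-in for the arrearage training set, and searches a much smaller grid than the one above so that it finishes quickly.

```r
if (requireNamespace("e1071", quietly = TRUE)) {
  library(e1071)
  set.seed(1)
  # iris is a stand-in; Species plays the role of the arrearage-interval class.
  tuned <- tune.svm(Species ~ ., data = iris,
                    gamma = 10^(-3:0), cost = 10^(0:2))
  # Each row of `performances` is one (gamma, cost) pair with its
  # cross-validated error rate.
  print(head(tuned$performances))
  # The pair with the smallest error is reported as the best parameters.
  print(tuned$best.parameters)
}
```

The `performances` table has one row per grid point, and `best.parameters` is simply the row with the lowest `error`.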
S2, train the SVM model using the package provided with the R language
The data for predicting the arrearage interval (the training set of the first forecast model) are imported into the first forecast model (the svmfit forecast model) and trained using the R language, after which a summary of the resulting SVM classifier can be viewed.
At this step, the first forecast model, which predicts the arrearage interval, that is, the next arrearage time of a defaulting subscriber, has been trained.
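The training step S2 can be sketched as follows, using the settings from S11 and S12. This is a minimal illustration under stated assumptions: the e1071 package is assumed to be installed, `iris` is a stand-in for the arrearage data, and the `cost` and `gamma` values are illustrative only.

```r
if (requireNamespace("e1071", quietly = TRUE)) {
  library(e1071)
  set.seed(1)
  # Stand-in training set: 120 randomly chosen rows of iris.
  idx   <- sample(nrow(iris), 120)
  train <- iris[idx, ]

  # Radial kernel and classification output, as set in S11 and S12.
  svmfit <- svm(Species ~ ., data = train,
                kernel = "radial", type = "C-classification",
                cost = 100, gamma = 0.01)  # illustrative values

  # The summary lists the kernel, cost, gamma and the number of support vectors.
  print(summary(svmfit))
}
```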
S3, input the test set data of the first forecast model into the trained first forecast model to obtain the prediction results, and check the accuracy of the prediction results.
The test set data of the first forecast model are input into the trained svmfit forecast model, and the predicted results are stored in pred.
With the code below, the predicted results and the actual results are combined into a confusion matrix, as shown in Table 6.
> svm1 = max.col(iris$targetsTest)
> table(svm1, pred)
Table 6: Confusion matrix
pre | 1 | 2 | 3 | 4 | 5 | 6 | 10 | 25 | 26 | 30 |
---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 1 |
2 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
4 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
7 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
10 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
12 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
26 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
30 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 60 |
In Table 6, the underlined cells are those where the predicted result coincides with the actual result, and the number in such a cell is the number of correctly predicted users. Let the row coordinate of the confusion matrix be x, the value of the actual arrearage interval, and the column coordinate be y, the value of the predicted arrearage interval.
Let the value of a cell in the confusion matrix be $P_{x,y}$, where x is the interval the users in this cell actually belong to and y is the interval that was predicted. For example, in the matrix of the neural network, $P_{1,1}=2$ means that 2 users whose actual arrearage interval is 1 were also predicted as 1; $P_{12,30}=1$ means that 1 user whose actual arrearage interval is 12 was predicted as 30. The cells with $P_{x,y}\neq 0$ and $x=y$ give exactly the number of correctly predicted users.
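Calculate the accuracy of this SVM:
> mean(svm2 != pred)
[1] 0.1710526
This value is the share of test users whose predicted result differs from the actual result, that is, the error rate. It can be seen that the classifier predicted 6 kinds of results in total, corresponding to arrearage intervals of 1, 2, 3, 4, 5, 6, 15, 26 and 30.
The prediction accuracy is therefore 1 - 0.1710526, that is, 82.89474%.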
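The relation between the confusion matrix, the error rate and the accuracy can be sketched in base R as follows. The vectors `actual` and `pred` are hypothetical toy values for eight users, not the data behind Table 6.

```r
# Hypothetical actual and predicted arrearage intervals for eight users.
actual <- c(1, 1, 2, 4, 30, 30, 30, 12)
pred   <- c(1, 3, 2, 1, 30, 30, 10, 1)

# Use a shared set of levels so the table is square and its diagonal
# holds exactly the cells with x = y.
lv <- sort(unique(c(actual, pred)))
cm <- table(actual = factor(actual, levels = lv),
            pred   = factor(pred,   levels = lv))
print(cm)

error_rate <- mean(actual != pred)     # share of misclassified users
accuracy   <- sum(diag(cm)) / sum(cm)  # share on the diagonal
print(c(error_rate = error_rate, accuracy = accuracy))
```

Forcing both factors onto the same levels matters because a plain `table(actual, pred)` drops intervals that occur on only one side, which makes the matrix non-square and the diagonal meaningless.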
In this embodiment, the first forecast model can predict the next arrearage time of a defaulting subscriber (the time interval between the next arrearage and the previous one), the second forecast model can predict the next arrearage amount of a defaulting subscriber, and the third forecast model can predict whether a user will be in arrears next month. According to the prediction results, for any forecast model whose accuracy is below 70%, its parameters and training set are adjusted and the prediction is repeated until the accuracy reaches 70% or above. Combining the prediction results yields:
1. whether a user will be in arrears next month;
2. for a defaulting subscriber, when the next arrearage will occur;
3. for a defaulting subscriber, how much the next arrearage is likely to be.
In this way, the future tariff recovery of power grid users is predicted from multiple directions and angles, which can effectively help power enterprises formulate electricity consumption and electricity charge forecasting policies.
The above is a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art can make several improvements and modifications without departing from the principle of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.
Claims (9)
1. A construction method of a tariff recovery forecast model, characterized by comprising:
determining the index system of a first forecast model;
obtaining, according to the determined index system of the first forecast model, defaulting subscriber data as the training test set of the first forecast model, wherein the training test set includes a training set and a test set;
training the first forecast model according to the training set of the first forecast model;
predicting the next arrearage time of a defaulting subscriber according to the trained first forecast model.
2. The construction method of a tariff recovery forecast model according to claim 1, characterized in that the index system of the first forecast model includes one or more of: superior unit, power supply unit, industry category, electricity consumption classification, voltage level, load importance level, contract capacity, operating capacity, proof cycle, power outage flag, account-opening duration, power transmission duration, number of metering points, number of power-receiving points, number of power source points, whether a tripartite agreement exists, number of power outages, annual power consumption, annual transmission-loss electricity, annual reactive power consumption rate, annual line loss rate, annual total number of arrearages, average annual arrearage electricity consumption interval, annual minimum arrearage electricity consumption interval, interval from the most recent arrearage electricity consumption to the present, annual arrearage electricity consumption share, monthly arrearage electricity charge, and total number of arrearages in 2.5 years.
3. The construction method of a tariff recovery forecast model according to claim 1, characterized in that training the first forecast model according to the training set of the first forecast model comprises:
training the first forecast model using the SVM algorithm according to the training set of the first forecast model.
4. The construction method of a tariff recovery forecast model according to claim 1, characterized in that the method further comprises:
determining the index system of a second forecast model;
obtaining, according to the determined index system of the second forecast model, defaulting subscriber data as the training test set of the second forecast model, wherein the training test set includes a training set and a test set;
training the second forecast model according to the training set of the second forecast model;
predicting the next arrearage amount of a defaulting subscriber according to the trained second forecast model.
5. The construction method of a tariff recovery forecast model according to claim 4, characterized in that the index system of the second forecast model includes one or more of: monthly arrearage electricity consumption in the most recent year, monthly number of arrearage electricity consumption events in the most recent year, average annual arrearage electricity consumption interval, minimum arrearage electricity consumption interval in the most recent year, average annual arrearage electricity consumption interval, interval from the most recent arrearage electricity consumption to the present, arrearage electricity consumption share, average arrearage electricity charge, duration from the account-opening date to the first arrearage electricity consumption, first arrearage electricity consumption, arrearage electricity charge in the most recent year, arrearage electricity charge in the most recent three months, arrearage electricity charge in the most recent 2 years, arrearage electricity charge in the most recent 6 months, annual growth rate of the arrearage electricity charge, month-on-month growth rate of the arrearage electricity charge, quarterly growth rate of the arrearage electricity charge, annual maximum arrearage electricity charge, annual minimum arrearage electricity charge, month of the annual maximum arrearage electricity charge, month of the annual minimum arrearage electricity charge, user classification, whether a tripartite agreement exists, whether the charge is transferred in installments, potential safety hazard interval, monthly power consumption, quarterly power consumption, annual power consumption, account-opening duration, power transmission duration, contract capacity, electricity consumption classification, highly energy-consuming industry classification, number of metering points, number of power sources, monthly line loss growth rate, annual line loss rate, average monthly reactive utilization rate, active utilization rate of the current month, current reactive-to-active ratio, peak-valley/flat electricity ratio of the current month, annual peak-valley/flat electricity ratio, annual amount of electricity charges collected in advance, number of power outages, outage interval, average outage interval, annual outage frequency, previous arrearage electricity charge in the current year, month-on-month growth rate of the previous arrearage electricity charge in the current year, and this arrearage electricity charge.
6. The construction method of a tariff recovery forecast model according to claim 1 or 4, characterized in that the method further comprises:
determining the index system of a third forecast model;
obtaining, according to the determined index system of the third forecast model, all user data as the training test set of the third forecast model, wherein the training test set includes a training set and a test set;
training the third forecast model according to the training set of the third forecast model;
predicting whether a user will be in arrears next month according to the trained third forecast model.
7. The construction method of a tariff recovery forecast model according to claim 6, characterized in that the index system of the third forecast model includes one or more of: industry category, location, user classification, whether a tripartite agreement exists, whether the charge is transferred in installments, number of potential safety hazards, annual power consumption, account-opening duration, power transmission duration, contract capacity, electricity consumption classification, highly energy-consuming industry classification, number of metering points, number of power sources, monthly line loss growth rate, annual line loss number of times, average monthly reactive utilization rate, active utilization rate of the current month, reactive-to-active ratio, monthly peak-valley/flat electricity ratio, annual peak-valley/flat electricity ratio, annual amount of electricity charges collected in advance, number of power outages, outage interval, average outage interval, annual outage frequency, and whether the user has ever been in arrears.
8. The construction method of a tariff recovery forecast model according to claim 6, characterized in that, before the training test set is generated, the method further comprises:
performing data cleansing, data conversion and data normalization on the obtained defaulting subscriber data or all user data.
9. The construction method of a tariff recovery forecast model according to claim 8, characterized in that the method further comprises:
according to the normalized data, performing stratified sampling by this arrearage electricity charge / the interval from the most recent arrearage electricity consumption to the present / whether the user has ever been in arrears, to generate the training test sets of the first forecast model, the second forecast model and the third forecast model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611180509.4A CN106651024A (en) | 2016-12-19 | 2016-12-19 | Tariff recovery prediction model construction method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106651024A true CN106651024A (en) | 2017-05-10 |
Family
ID=58833907
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611180509.4A Pending CN106651024A (en) | 2016-12-19 | 2016-12-19 | Tariff recovery prediction model construction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106651024A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20170510 |