CN111178957A - Method for early warning sudden increase of electric quantity of electricity consumption customer - Google Patents

Method for early warning sudden increase of electric quantity of electricity consumption customer

Info

Publication number
CN111178957A
Authority
CN
China
Prior art keywords
electric quantity
data
power
ratio
sudden increase
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911341505.3A
Other languages
Chinese (zh)
Other versions
CN111178957B (en)
Inventor
杨倩
黄梦喜
农惠清
李娟娟
陈巧
韦瑜君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangxi Power Grid Co Ltd
Original Assignee
Guangxi Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangxi Power Grid Co Ltd
Priority to CN201911341505.3A
Publication of CN111178957A
Application granted
Publication of CN111178957B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • General Health & Medical Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Human Resources & Organizations (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Game Theory and Decision Science (AREA)
  • Quality & Reliability (AREA)
  • Primary Health Care (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for early warning of a sudden increase in the electric quantity of an electricity consumption customer, comprising the following steps: acquiring electric quantity data, customer service work order data and weather data, and cleaning the acquired data to obtain cleaned data; constructing, from the cleaned data, an electric quantity level angle characteristic, an electric quantity ring ratio (month-over-month) and same ratio (year-over-year) angle characteristic, an air temperature influence degree characteristic and a historical appeal condition characteristic; screening the constructed characteristics to obtain screened features; performing data balancing processing based on the screened features; constructing an XGBoost model based on the features after the data balancing processing; and issuing early warnings of sudden increases in the electric quantity of electricity consumption customers based on the XGBoost model. The invention thereby makes it possible to warn of a sudden increase in a customer's electric quantity in advance.

Description

Method for early warning sudden increase of electric quantity of electricity consumption customer
Technical Field
The invention relates to the technical field of electricity utilization early warning, in particular to a method for early warning sudden increase of electric quantity of electricity utilization customers.
Background
With the reform of the power system, the power distribution and retail business is gradually being opened up to the market, and customers have become the object of competition in the distribution and retail market; for power supply companies, improving customer service quality is a powerful means of competing for customers. As the service requirements of power customers become more diversified, the traditional 'passive' customer service mode can no longer adapt to them, and customers increasingly need power grid enterprises to provide diversified and differentiated services; expectations of the power supply service are therefore higher, and so are the demands placed on power grid enterprises. At present, power grid enterprises urgently need to mine multidimensional analysis parameters of customer appeals, construct multidimensional feature indexes, weigh the amount of information contained in each feature index, the correlation among the indexes and their degree of influence on appeals, and establish a mathematical model for multidimensional analysis and early warning of customer appeals.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a method for early warning the sudden increase of the electric quantity of a power consumption customer, which can be used for early warning the sudden increase of the electric quantity of the power consumption customer.
In order to solve the above technical problem, an embodiment of the present invention provides a method for warning sudden increase in power consumption of a power consumer, where the method includes:
acquiring electric quantity data, customer service work order data and weather data, and performing data cleaning on the acquired electric quantity data, customer service work order data and weather data to obtain data after data cleaning;
constructing an electric quantity level angle characteristic, an electric quantity ring ratio and same ratio angle characteristic, an air temperature influence degree characteristic and a historical appeal condition characteristic by using the data after data cleaning;
after the characteristics are constructed, screening the constructed electric quantity level angle characteristics, electric quantity ring ratio and same ratio angle characteristics, air temperature influence degree characteristics and historical appeal condition characteristics to obtain screened characteristics;
carrying out data balancing processing based on the screened features;
constructing an XGBoost model based on the features after the data balancing processing;
and issuing early warnings of sudden increases in the electric quantity of electricity consumption customers based on the XGBoost model.
Optionally, the acquiring the electric quantity data, the customer service work order data and the weather data, and performing data cleaning on the acquired electric quantity data, the customer service work order data and the weather data to obtain data after data cleaning includes:
acquiring the electric quantity data and the customer service work order data by stratified sampling, and acquiring the weather data by technical means;
performing data cleaning on the acquired electric quantity data, the customer service work order data and the weather data, wherein the data cleaning comprises filling of a monthly electric quantity missing value, identification of abnormal small monthly electric quantity, filling of incoming call missing and filling of a monthly average air temperature missing value;
and after the data cleaning is finished, obtaining the data after the data cleaning.
Optionally, the electric quantity level angle characteristic is constructed from three aspects, namely the average electric quantity level, the fluctuation of the electric quantity and the variation of the electric quantity, and includes: the monthly average electric quantity, the monthly electric quantity variance, the coefficient of variation of the electric quantity, the ratio of the maximum to the minimum monthly electric quantity, the ratio of the maximum monthly electric quantity to the monthly average, the ratio of the monthly average to the minimum monthly electric quantity, and the ratio of the current month's electric quantity to the monthly average.
Optionally, the electric quantity ring ratio (month-over-month) and same ratio (year-over-year) angle characteristic includes: the mean of the ring ratio/same ratio, the variance of the ring ratio/same ratio, the coefficient of variation of the ring ratio/same ratio, the maximum of the ring ratio/same ratio, the minimum of the ring ratio/same ratio, the current-period ring ratio/same ratio, the previous-period ring ratio/same ratio, the difference between the current-period ring ratio/same ratio and the mean ring ratio, and the difference between the previous-period ring ratio and the mean ring ratio.
Optionally, the air temperature influence degree characteristic is obtained by using the Pearson correlation coefficient to calculate the correlation between the electric quantity of the remaining months and the average air temperature of the corresponding months, which gives the degree of correlation between air temperature and electricity consumption; the Pearson correlation coefficient is calculated as follows:
$$\rho_{x,y} = \frac{\sum_{j=1}^{k}(x_j-\bar{x})(y_j-\bar{y})}{\sqrt{\sum_{j=1}^{k}(x_j-\bar{x})^2}\,\sqrt{\sum_{j=1}^{k}(y_j-\bar{y})^2}}$$
where $\rho_{x,y}$ is the Pearson correlation coefficient, $x_j$ is the electric quantity of the j-th month after the abnormally small monthly electric quantities are removed, $y_j$ is the average air temperature of the month corresponding to $x_j$, $\bar{x}$ and $\bar{y}$ are the means of $x_j$ and $y_j$, and k is the number of monthly electric quantities remaining after the abnormally small monthly electric quantities are removed.
Optionally, the historical appeal condition characteristic includes: the number of incoming calls made by each user in the last year because of a sudden increase in electric quantity; the total number of incoming calls within each monthly average air temperature interval; whether the user called in the previous month because of a sudden increase in electric quantity; the maximum monthly number of such incoming calls in the last year; and whether the user has ever called because of a sudden increase in electric quantity.
Optionally, after the feature is constructed, the constructed electric quantity level angle feature, electric quantity ring ratio and same ratio angle feature, air temperature influence degree feature and historical appeal condition feature are screened, and the obtained screened features include:
after the characteristics are constructed, screening discrete variables in the constructed electric quantity level angle characteristics, electric quantity ring ratio and same ratio angle characteristics, air temperature influence degree characteristics and historical appeal condition characteristics based on a zero variance algorithm;
after the screening, excluding high correlation between the discrete variables by using an algorithm of a Pearson correlation coefficient;
after the high correlation among the discrete variables is eliminated, performing a multicollinearity test and eliminating the variables that cause multicollinearity;
and after the variables causing multicollinearity are eliminated, selecting the important features among the electric quantity level angle characteristic, the electric quantity ring ratio and same ratio angle characteristic, the air temperature influence degree characteristic and the historical appeal condition characteristic based on the random forest machine learning algorithm.
Optionally, the performing data balancing processing based on the screened features includes:
based on the screened features, calculating the distance from each sample a in the electric quantity sudden increase incoming call user to all samples in the electric quantity sudden increase incoming call user sample set by taking the Euclidean distance as a standard to obtain k neighbors of the electric quantity sudden increase incoming call user;
setting a sampling ratio according to the sample unbalance ratio to determine a sampling multiplying power N, and randomly selecting a plurality of samples b from k neighbors of the obtained electric quantity sudden increase incoming call user;
constructing a new sample c with each sample a in the power surge incoming call users respectively based on each randomly selected neighbor b; the new sample c is the result of the data balancing process, and the specific formula is as follows:
c = a + rand(0,1) * |a - b|.
Optionally, an XGBoost model is constructed based on the features after the data balancing processing, wherein the objective function of XGBoost is:
Obj(θ)=L(θ)+Ω(θ);
which can be written as:
$$Obj(\theta) = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$$
wherein Obj(θ) is the objective function of XGBoost, L(θ) is the error function, Ω(θ) is the regularization term, l is the loss between the label $y_i$ and the prediction $\hat{y}_i$ of the i-th of n samples, and $f_k$ is the k-th of K trees.
Optionally, the early warning of the sudden increase in the electric quantity of the electricity consumption customer based on the XGBoost model includes:
training the XGBoost model to obtain a training result;
and issuing early warnings of sudden increases in the electric quantity of electricity consumption customers by combining the training result with the evaluation index AUC.
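The AUC evaluation index named in this step can be computed directly from labels and model scores. The following is a minimal sketch, assuming binary labels (1 for a customer who called because of a sudden increase, 0 otherwise) and a higher score meaning a higher predicted risk; the function name `auc` is illustrative and not taken from the patent.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    AUC equals the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample,
    with ties counted as one half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The pairwise form is quadratic in the number of samples, which is acceptable for illustration; a rank-based implementation would be preferable on a data set of the size described in the embodiment.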
In the implementation of the invention, external weather data is introduced on top of the customers' electric quantity data and customer service work order data, and cleaning yields information such as the monthly electric quantity, whether the customer called with an appeal and the monthly average air temperature. Using big data analysis techniques, multidimensional feature indexes are constructed from aspects such as the fluctuation and seasonality of the electric quantity, the frequency of incoming-call appeals and the level of electric quantity change at the time of an appeal. The amount of information contained in each feature index, the correlation among the indexes and their degree of influence on whether an appeal occurs are all considered, so that the information in the feature indexes is fully used, and an XGBoost model is used to warn whether an electricity consumption customer will make an appeal. The method can be effectively applied to early warning of sudden increases in customers' electric quantity, so that the relevant business personnel can take corresponding action in advance and user complaints caused by sudden increases in electric quantity are reduced.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic flow chart illustrating a method for warning sudden increase in power consumption of a power consumer according to an embodiment of the present invention;
Fig. 2 is a box plot of the AUC evaluation index of the early warning results for electric quantity sudden-increase appeals under different models in the embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Examples
Referring to fig. 1, fig. 1 is a schematic flow chart illustrating a method for warning sudden increase in power consumption of a power consumption customer according to an embodiment of the present invention.
As shown in fig. 1, a method for early warning of sudden power increase of a power consumption customer includes:
S11: acquiring electric quantity data, customer service work order data and weather data, and performing data cleaning on the acquired electric quantity data, customer service work order data and weather data to obtain data after data cleaning;
In a specific implementation of the present invention, the data sources are shown in Table 1 (traffic prediction source table). Acquiring the electric quantity data, the customer service work order data and the weather data and cleaning them includes: acquiring the electric quantity data and the customer service work order data by stratified sampling, and acquiring the weather data by technical means; performing data cleaning on the acquired data, the data cleaning comprising filling missing monthly electric quantity values, identifying abnormally small monthly electric quantities, filling missing incoming-call records and filling missing monthly average air temperature values; and, after the data cleaning is finished, obtaining the data after data cleaning.
Specifically, statistics on the acquired data show that missing values account for 5.3%; the missing values of the monthly electric quantity for each month are shown in Table 2. Monthly electric quantities below 3 kWh are set to NA, and users whose monthly electric quantity is missing for every month are deleted. Missing values are then filled according to the degree of missingness: when the proportion of missing monthly electric quantities for a user is greater than 0.5, the historical average electric quantity is used for filling; otherwise, a vector matrix is constructed from the historical electric quantity in time order, and the K-nearest-neighbor algorithm is used to fill the missing values of the historical electric quantity. Abnormally small monthly electric quantities are identified and set to missing values: the monthly average electric quantity of each user is calculated, and any monthly electric quantity less than 0.1 times that average is set to a missing value. For the external air temperature data, the daily average air temperature is taken as the mean of the daily maximum and daily minimum air temperatures, missing values in the daily average air temperature are removed, and the monthly mean of the daily average air temperature is calculated. For the appeal data, a month is recorded as 1 if the user called in that month because of a sudden increase in the electric quantity or the electricity charge, and as 0 otherwise.
Table 1 Traffic prediction source table
Source table     Monthly electric quantity data   Customer service work orders   Weather data (daily maximum/minimum temperature)
Number of users  110128                           110128                         730 records
Time span        2017.06.01-2019.05.31            2017.06.01-2019.05.31          2017.06.01-2019.05.31
Table 2 Statistical results of monthly electric quantity missing values
Billing month  Number missing  Missing rate
201706 0 0.00
201707 0 0.00
201708 9618 0.09
201709 0 0.00
201710 0 0.00
201711 0 0.00
201712 0 0.00
201801 8990 0.08
201802 8476 0.08
201803 7981 0.07
201804 8235 0.07
201805 0 0.00
201806 8475 0.08
201807 8431 0.08
201808 8294 0.07
201809 8410 0.08
201810 8594 0.08
201811 9105 0.08
201812 8982 0.08
201901 8622 0.08
201902 8300 0.07
201903 8448 0.08
201904 9049 0.08
201905 9053 0.08
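The cleaning rules described for the monthly electric quantity can be sketched for a single user as follows. This is only an illustration under stated assumptions: the thresholds (3 kWh, 0.5, 0.1) come from the text, while the function names and the nearest-months fill are hypothetical. In particular, `fill_from_neighbours` averages the k observed months closest in time and is a simplified stand-in for the K-nearest-neighbor filling over a user-by-month matrix that the text describes.

```python
from statistics import mean

MIN_VALID_KWH = 3.0        # readings below this are treated as missing (from the text)
MEAN_FILL_THRESHOLD = 0.5  # above this missing ratio, fill with the historical mean
SMALL_FACTOR = 0.1         # "abnormally small" = below 0.1 x the user's monthly mean


def fill_from_neighbours(series, k=2):
    """Fill each gap with the mean of the k observed months nearest in time.

    Simplified stand-in (an assumption) for the K-nearest-neighbor step.
    """
    observed = [(i, v) for i, v in enumerate(series) if v is not None]
    filled = list(series)
    for i, v in enumerate(series):
        if v is None:
            nearest = sorted(observed, key=lambda iv: abs(iv[0] - i))[:k]
            filled[i] = mean(val for _, val in nearest)
    return filled


def clean_user_series(monthly_kwh):
    """Clean one user's monthly series; return None if the user is dropped."""
    # Step 1: readings under 3 kWh become missing (None).
    series = [v if v is not None and v >= MIN_VALID_KWH else None
              for v in monthly_kwh]
    observed = [v for v in series if v is not None]
    if not observed:
        return None  # every month is missing, so the user is deleted

    # Step 2: fill gaps, choosing the method by the share of missing months.
    missing_ratio = 1 - len(observed) / len(series)
    if missing_ratio > MEAN_FILL_THRESHOLD:
        hist_mean = mean(observed)
        filled = [hist_mean if v is None else v for v in series]
    else:
        filled = fill_from_neighbours(series)

    # Step 3: mark abnormally small months as missing again so that later
    # steps (for example the temperature correlation) can skip them.
    user_mean = mean(filled)
    return [v if v >= SMALL_FACTOR * user_mean else None for v in filled]
```

For example, `clean_user_series([100, None, 120, 2, 110, 5])` fills the two gaps from neighbouring months and then marks the final reading of 5 kWh as abnormally small.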
S12: constructing an electric quantity level angle characteristic, an electric quantity ring ratio and same ratio angle characteristic, an air temperature influence degree characteristic and a historical appeal condition characteristic by using the data after data cleaning;
In a specific implementation of the present invention, the electric quantity level angle characteristic is constructed from three aspects, namely the average electric quantity level, the fluctuation of the electric quantity and the variation of the electric quantity, and includes: the monthly average electric quantity, the monthly electric quantity variance, the coefficient of variation of the electric quantity, the ratio of the maximum to the minimum monthly electric quantity, the ratio of the maximum monthly electric quantity to the monthly average, the ratio of the monthly average to the minimum monthly electric quantity, and the ratio of the current month's electric quantity to the monthly average.
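The level features listed above reduce to a few summary statistics per user. A minimal sketch, assuming a list of strictly positive monthly values in time order with the current month last; the dictionary keys are illustrative names, and the use of the population variance is an assumption since the text does not say which variance is meant.

```python
from statistics import mean, pvariance


def level_features(monthly_kwh):
    """Electric quantity level features for one user (values must be > 0)."""
    avg = mean(monthly_kwh)
    var = pvariance(monthly_kwh)          # population variance (assumption)
    highest, lowest = max(monthly_kwh), min(monthly_kwh)
    return {
        "monthly_mean": avg,
        "monthly_variance": var,
        "coeff_variation": var ** 0.5 / avg,   # standard deviation / mean
        "max_over_min": highest / lowest,
        "max_over_mean": highest / avg,
        "mean_over_min": avg / lowest,
        "current_over_mean": monthly_kwh[-1] / avg,
    }
```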
In a specific implementation of the present invention, the electric quantity ring ratio (month-over-month) and same ratio (year-over-year) angle characteristic includes: the mean of the ring ratio/same ratio, the variance of the ring ratio/same ratio, the coefficient of variation of the ring ratio/same ratio, the maximum of the ring ratio/same ratio, the minimum of the ring ratio/same ratio, the current-period ring ratio/same ratio, the previous-period ring ratio/same ratio, the difference between the current-period ring ratio/same ratio and the mean ring ratio, and the difference between the previous-period ring ratio and the mean ring ratio.
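The ring ratio and same ratio features can be sketched in the same way. The text does not define the ratios, so this sketch assumes the ring ratio is the current month divided by the previous month and the same ratio is the current month divided by the same month one year earlier; a growth-rate definition (ratio minus 1) would work equally well. All function names and dictionary keys are illustrative.

```python
from statistics import mean, pvariance


def ring_ratios(monthly_kwh):
    """Month-over-month ratio: each month divided by the month before it."""
    return [cur / prev for prev, cur in zip(monthly_kwh, monthly_kwh[1:])]


def same_ratios(monthly_kwh, period=12):
    """Year-over-year ratio: each month divided by the month `period` earlier."""
    return [cur / ref for ref, cur in zip(monthly_kwh, monthly_kwh[period:])]


def ratio_features(ratios):
    """Summary features of a ratio series (needs at least two ratios)."""
    avg = mean(ratios)
    var = pvariance(ratios)
    return {
        "mean": avg,
        "variance": var,
        "coeff_variation": var ** 0.5 / avg,
        "max": max(ratios),
        "min": min(ratios),
        "current": ratios[-1],
        "previous": ratios[-2],
        "current_minus_mean": ratios[-1] - avg,
        "previous_minus_mean": ratios[-2] - avg,
    }
```

The same `ratio_features` function is applied once to the ring ratio series and once to the same ratio series.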
In a specific implementation of the invention, the air temperature influence degree characteristic is obtained by using the Pearson correlation coefficient to calculate the correlation between the electric quantity of the remaining months and the average air temperature of the corresponding months, which gives the degree of correlation between air temperature and electricity consumption; the Pearson correlation coefficient is calculated as follows:
$$\rho_{x,y} = \frac{\sum_{j=1}^{k}(x_j-\bar{x})(y_j-\bar{y})}{\sqrt{\sum_{j=1}^{k}(x_j-\bar{x})^2}\,\sqrt{\sum_{j=1}^{k}(y_j-\bar{y})^2}}$$
where $\rho_{x,y}$ is the Pearson correlation coefficient, $x_j$ is the electric quantity of the j-th month after the abnormally small monthly electric quantities are removed, $y_j$ is the average air temperature of the month corresponding to $x_j$, $\bar{x}$ and $\bar{y}$ are the means of $x_j$ and $y_j$, and k is the number of monthly electric quantities remaining after the abnormally small monthly electric quantities are removed. Specifically, $\rho_{x,y}$ takes values in [-1, 1]: if $\rho_{x,y}$ is less than 0, the electric quantity decreases as the air temperature rises; if $\rho_{x,y}$ is equal to 0, the air temperature and the electric quantity are linearly uncorrelated; if $\rho_{x,y}$ is greater than 0, the electric quantity increases as the air temperature rises.
Considering that the change in electricity consumption caused by the change in temperature is seasonal, the monthly average air temperature is divided into three intervals, (−∞, 24), [24, 25] and (25, +∞), which are designated in order as the low-temperature season, the moderate-temperature season and the high-temperature season, and the mean electric quantity ring ratio is calculated for each monthly average air temperature interval.
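The temperature feature and the three temperature intervals can be illustrated with a short sketch. It assumes that months removed as abnormally small are represented by `None` and are skipped together with their temperatures; the function names and the season labels are illustrative.

```python
from math import sqrt


def pearson_kwh_temperature(monthly_kwh, monthly_avg_temp):
    """Pearson correlation between monthly electric quantity and temperature.

    Months whose electric quantity is None (removed as abnormally small)
    are dropped together with their temperature before computing.
    """
    pairs = [(x, y) for x, y in zip(monthly_kwh, monthly_avg_temp)
             if x is not None]
    k = len(pairs)
    mean_x = sum(x for x, _ in pairs) / k
    mean_y = sum(y for _, y in pairs) / k
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    dev_x = sqrt(sum((x - mean_x) ** 2 for x, _ in pairs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for _, y in pairs))
    return cov / (dev_x * dev_y)


def temperature_season(avg_temp):
    """Map a monthly average temperature to (-inf, 24), [24, 25] or (25, +inf)."""
    if avg_temp < 24:
        return "low"
    if avg_temp <= 25:
        return "moderate"
    return "high"
```

The correlation is undefined when either series is constant, so a production version would need to guard against a zero denominator.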
In a specific implementation of the invention, the historical appeal condition characteristic includes: the number of incoming calls made by each user in the last year because of a sudden increase in electric quantity; the total number of incoming calls within each monthly average air temperature interval (−∞, 24), [24, 25], (25, +∞); whether the user called in the previous month because of a sudden increase in electric quantity; the maximum monthly number of such incoming calls in the last year; and whether the user has ever called because of a sudden increase in electric quantity.
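A possible way to derive these history features for one user is sketched below. It assumes a chronological list of `(monthly_average_temperature, number_of_calls)` pairs, one per month, where a 0/1 flag as described in the cleaning step is simply a call count of 0 or 1; the record layout, function names and keys are hypothetical.

```python
def temperature_season(avg_temp):
    """Intervals from the text: (-inf, 24), [24, 25], (25, +inf)."""
    if avg_temp < 24:
        return "low"
    if avg_temp <= 25:
        return "moderate"
    return "high"


def appeal_history_features(records):
    """records: chronological list of (avg_temp, n_calls), one entry per month."""
    calls_by_season = {"low": 0, "moderate": 0, "high": 0}
    for avg_temp, n_calls in records:
        calls_by_season[temperature_season(avg_temp)] += n_calls

    last_year = records[-12:]  # the most recent 12 months (or fewer)
    return {
        "calls_last_year": sum(n for _, n in last_year),
        "calls_by_season": calls_by_season,
        "called_last_month": int(records[-1][1] > 0),
        "max_monthly_calls_last_year": max(n for _, n in last_year),
        "ever_called": int(any(n > 0 for _, n in records)),
    }
```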
S13: after the characteristics are constructed, screening the constructed electric quantity level angle characteristics, electric quantity ring ratio and same ratio angle characteristics, air temperature influence degree characteristics and historical appeal condition characteristics to obtain screened characteristics;
In a specific implementation of the present invention, screening the constructed electric quantity level angle characteristic, electric quantity ring ratio and same ratio angle characteristic, air temperature influence degree characteristic and historical appeal condition characteristic to obtain the screened features includes: after the characteristics are constructed, screening the discrete variables among the constructed characteristics based on a zero-variance algorithm; after the screening, eliminating high correlation between the discrete variables by means of the Pearson correlation coefficient; after the high correlation among the discrete variables is eliminated, performing a multicollinearity test and eliminating the variables that cause multicollinearity; and after the variables causing multicollinearity are eliminated, selecting the important features among the four groups of characteristics based on the random forest machine learning algorithm.
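The first two screening stages can be sketched with the standard library alone. This is an illustration under assumptions: the correlation threshold of 0.9 and the greedy keep-first strategy are hypothetical choices, since the text gives neither, and the multicollinearity test and the random forest importance ranking are not shown.

```python
from math import sqrt
from statistics import pvariance


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant columns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


def screen_features(columns, corr_threshold=0.9):
    """columns: dict mapping feature name -> list of values (one per user).

    Stage 1 drops zero-variance columns. Stage 2 walks the remaining
    columns in order and keeps one only if its absolute correlation with
    every column already kept is below the threshold.
    """
    non_constant = {name: col for name, col in columns.items()
                    if pvariance(col) > 0}
    selected = []
    for name, col in non_constant.items():
        if all(abs(pearson(col, non_constant[kept])) < corr_threshold
               for kept in selected):
            selected.append(name)
    return selected
```

With the greedy strategy the result depends on column order, so features considered more informative should be listed first.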
S14: carrying out data balancing processing based on the screened features;
in a specific implementation process of the present invention, the performing data balancing processing based on the screened features includes: based on the screened features, calculating the distance from each sample a in the electric quantity sudden increase incoming call user to all samples in the electric quantity sudden increase incoming call user sample set by taking the Euclidean distance as a standard to obtain k neighbors of the electric quantity sudden increase incoming call user; setting a sampling ratio according to the sample unbalance ratio to determine a sampling multiplying power N, and randomly selecting a plurality of samples b from k neighbors of the obtained electric quantity sudden increase incoming call user; constructing a new sample c with each sample a in the power surge incoming call users respectively based on each randomly selected neighbor b; the new sample c is the result of the data balancing process, and the specific formula is as follows:
c = a + rand(0,1) * |a - b|.
Specifically, the data balancing processing based on the screened features adopts the SMOTE algorithm, which synthesizes data to generate feature indexes for new incoming-call appeal users. It should be noted that SMOTE (Synthetic Minority Over-sampling Technique) is an improvement on the random oversampling algorithm and is currently a common means of handling imbalanced data. The basic idea of the SMOTE algorithm is to analyze the minority-class samples and add artificially synthesized new samples to the dataset, so that the classes in the original data are no longer severely imbalanced. The strategy of the algorithm is, for each minority-class sample a, to randomly select a sample b from its nearest neighbors and then randomly select a point on the line connecting a and b as a newly synthesized minority-class sample.
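The strategy described above can be sketched as follows. Note the assumption: this sketch uses the usual SMOTE interpolation a + rand(0,1) * (b − a), which places the new sample on the segment between a and b as the description states, whereas the formula given earlier is written with |a − b|. The function name, the parameter names and the fixed default seed are illustrative.

```python
import random
from math import dist


def smote(minority, k=5, n_new_per_sample=1, rng=None):
    """Generate synthetic minority samples by neighbor interpolation.

    minority: list of feature vectors (lists of floats), at least two.
    k: number of nearest neighbors (Euclidean distance) to choose from.
    n_new_per_sample: sampling rate N, new samples created per original.
    """
    rng = rng or random.Random(0)
    synthetic = []
    for a in minority:
        # k nearest minority neighbors of a, excluding a itself
        neighbours = sorted((s for s in minority if s is not a),
                            key=lambda s: dist(a, s))[:k]
        for _ in range(n_new_per_sample):
            b = rng.choice(neighbours)
            gap = rng.random()  # uniform in [0, 1)
            # point on the segment from a towards b
            synthetic.append([ai + gap * (bi - ai) for ai, bi in zip(a, b)])
    return synthetic
```

Only the minority class (users who called because of a sudden increase) is oversampled; the majority class is left unchanged.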
S15: constructing an XGBoost model based on the features after the data balancing processing;
In a specific implementation of the present invention, the XGBoost model is constructed based on the features after the data balancing processing, wherein the objective function of XGBoost is:
Obj(θ)=L(θ)+Ω(θ);
which can be written as:
$$Obj(\theta) = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$$
wherein Obj(θ) is the objective function of XGBoost, L(θ) is the error function, Ω(θ) is the regularization term, l is the loss between the label $y_i$ and the prediction $\hat{y}_i$ of the i-th of n samples, and $f_k$ is the k-th of K trees.
Specifically, in the boosting modeling process, the original model is kept unchanged, and then a new model is added to the prediction model, wherein the new model is as follows
Figure BDA0002331485670000092
Indicates the t-th wheelPredicting the value:
Figure BDA0002331485670000093
substituting the above equation into the original objective function, i.e.:
Figure BDA0002331485670000094
substituting the Taylor expansion into the objective function according to the definition of the Taylor expansion, the objective function is:
Figure BDA0002331485670000095
suppose that:
Figure BDA0002331485670000096
the above expression is:
Figure BDA0002331485670000097
After the constant terms are removed, the optimization depends only on the first and second derivatives of the error function at each point, so the objective function becomes:
Figure BDA0002331485670000101
From the prediction result, $f_t(x) = \omega_{q(x)}$, where $f_t(x)$ denotes the t-th tree, $q(x)$ denotes the tree structure of the t-th tree, and $\omega_{q(x)}$ denotes the weight of the corresponding leaf node on that tree structure, with $\omega \in \mathbb{R}^T$ and $q: \mathbb{R}^d \to \{1, 2, \ldots, T\}$; the complexity function is generally set to
Figure BDA0002331485670000102
Substituting it into the objective function, then there are:
Figure BDA0002331485670000103
Given a known tree structure, minimizing the objective function reduces to minimizing a quadratic function of a single variable, from which the optimal value can be obtained:
Figure BDA0002331485670000104
Thereby obtaining an optimal solution of the objective function.
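The equation images of the original are not reproduced in this text. As a reading aid, the standard derivation from the XGBoost literature that matches the steps described above is sketched below. The symbols $g_i$, $h_i$, $G_j$, $H_j$, $I_j$, $\gamma$ and $\lambda$ follow that literature and are assumptions here; they are not taken from the patent's figures.

```latex
\begin{align*}
\mathrm{Obj}(\theta) &= \sum_{i=1}^{n} l\bigl(y_i, \hat{y}_i\bigr) + \sum_{k} \Omega(f_k) \\
\hat{y}_i^{(t)} &= \hat{y}_i^{(t-1)} + f_t(x_i) \\
\mathrm{Obj}^{(t)} &= \sum_{i=1}^{n} l\bigl(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\bigr) + \Omega(f_t) + \text{const} \\
&\approx \sum_{i=1}^{n} \Bigl[ l\bigl(y_i, \hat{y}_i^{(t-1)}\bigr) + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^{2}(x_i) \Bigr] + \Omega(f_t) + \text{const} \\
g_i &= \frac{\partial\, l\bigl(y_i, \hat{y}_i^{(t-1)}\bigr)}{\partial\, \hat{y}_i^{(t-1)}}, \qquad
h_i = \frac{\partial^{2} l\bigl(y_i, \hat{y}_i^{(t-1)}\bigr)}{\partial \bigl(\hat{y}_i^{(t-1)}\bigr)^{2}} \\
\widetilde{\mathrm{Obj}}^{(t)} &= \sum_{i=1}^{n} \Bigl[ g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^{2}(x_i) \Bigr] + \Omega(f_t) \\
\Omega(f_t) &= \gamma T + \tfrac{1}{2} \lambda \sum_{j=1}^{T} \omega_j^{2} \\
\widetilde{\mathrm{Obj}}^{(t)} &= \sum_{j=1}^{T} \Bigl[ G_j \omega_j + \tfrac{1}{2} (H_j + \lambda)\, \omega_j^{2} \Bigr] + \gamma T,
\qquad G_j = \sum_{i \in I_j} g_i,\; H_j = \sum_{i \in I_j} h_i,\; I_j = \{\, i : q(x_i) = j \,\} \\
\omega_j^{*} &= -\frac{G_j}{H_j + \lambda}, \qquad
\mathrm{Obj}^{*} = -\frac{1}{2} \sum_{j=1}^{T} \frac{G_j^{2}}{H_j + \lambda} + \gamma T
\end{align*}
```

Each leaf contributes an independent quadratic in $\omega_j$, which is why the minimum can be written in closed form once the tree structure $q$ is fixed.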
In a specific implementation, for the early warning of electricity customers' appeals about sudden increases in electric quantity, y = 0 indicates that the customer does not call in because of an increase in electric quantity, y = 1 indicates that the customer calls in because of an increase in electric quantity, and x denotes the factors influencing whether an appeal occurs, that is, the feature indexes selected in feature engineering, $x = (x_1, x_2, x_3, \ldots, x_m)$.
It should be noted that Xgboost is a large-scale, distributed, general-purpose gradient boosting library developed on the basis of GBDT (Gradient Boosted Decision Tree). It implements GBDT and some generalized linear machine learning algorithms under the gradient boosting framework. GBDT is a decision tree algorithm based on iterative accumulation: it constructs a set of weak learners (trees) and accumulates the results of the multiple decision trees as the final prediction output.
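For the binary label y defined above, the leaf-weight step and the accumulation of tree outputs can be illustrated with a few lines of plain Python. This is a sketch under stated assumptions: it uses the logistic loss, the function names and the parameter `reg_lambda` are hypothetical, and details of the real library such as the learning rate and base score are omitted.

```python
import math


def sigmoid(z):
    """Map a raw score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_grad_hess(y, raw_score):
    """First and second derivatives of the logistic loss w.r.t. the raw score."""
    p = sigmoid(raw_score)
    return p - y, p * (1.0 - p)


def optimal_leaf_weight(grads, hessians, reg_lambda=1.0):
    """Closed-form minimiser of the per-leaf quadratic: -G / (H + lambda)."""
    return -sum(grads) / (sum(hessians) + reg_lambda)


def predict_proba(tree_outputs):
    """Accumulate the leaf weights one sample receives from each tree."""
    return sigmoid(sum(tree_outputs))


# One leaf holding four samples, all starting from a raw score of 0.
labels = [1, 1, 0, 1]
pairs = [logistic_grad_hess(y, 0.0) for y in labels]
weight = optimal_leaf_weight([g for g, _ in pairs], [h for _, h in pairs])
```

Because three of the four samples in the leaf are positive, the resulting weight is positive, which pushes the predicted probability for samples in that leaf above 0.5.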
S16: giving early warning of sudden increases in the electric quantity of electricity customers based on the Xgboost model.
In a specific implementation of the present invention, giving early warning of sudden increases in the electric quantity of electricity customers based on the Xgboost model includes: training the Xgboost model to obtain a training result; and giving early warning of sudden increases in customers' electric quantity by combining the training result with the evaluation index AUC.
Specifically, the AUC value is defined as the area enclosed by the ROC curve and the horizontal axis. The ROC curve is a comprehensive index reflecting sensitivity and specificity for a continuous variable, and it reveals the relationship between sensitivity and specificity graphically: a series of sensitivity and specificity values is calculated by applying a number of different thresholds to the continuous variable, and a curve is then drawn with sensitivity as the ordinate and (1 - specificity) as the abscissa. On the ROC curve, the point closest to the upper-left corner corresponds to the threshold at which both sensitivity and specificity are high.
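The construction of the ROC curve and its area can be sketched in plain Python as follows. The function names `roc_points` and `auc` are hypothetical, and the area is computed with the trapezoidal rule, which is one common choice rather than something the text specifies.

```python
def roc_points(labels, scores):
    """Return (1 - specificity, sensitivity) pairs, one per distinct threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    # Lowering the threshold step by step is the same as walking down the ranking.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(labels, scores):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = roc_points(labels, scores)
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 0.5 corresponds to a ranking no better than chance, and 1.0 to a ranking that places every appeal user above every non-appeal user.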
To select a model that performs well at early warning of electric quantity surge appeals, after feature screening the samples were balanced with SMOTE, and a logistic regression model, a naive Bayes model, an Xgboost model and their combined model were each trained on 3 months of sample data starting from January 2019 and then used to give early warning, month by month, of surge appeals for the 2 months starting from April 2019. Throughout training and prediction, following the bootstrap idea, a certain number of samples were randomly drawn from the non-appeal users for each of 30 trials. The ability of each model to identify surge appeals in the different months is shown in Table 3, which gives the distribution of the AUC index of the different models for surge early warning.
TABLE 3 AUC index distribution of different models for electric quantity sudden increase warning
Figure BDA0002331485670000111
Note: the data in table 3 are AUC index values; the combined model is obtained by averaging the prediction results of the logistic regression model, the naive Bayes model and the Xgboost model.
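The combined model mentioned in the note averages the predictions of the three individual models. A minimal sketch, assuming each model outputs one predicted probability per user and using the hypothetical function name `combine_by_average`:

```python
def combine_by_average(*model_scores):
    """Average per-sample predicted probabilities across several models."""
    if not model_scores:
        raise ValueError("at least one model's predictions are required")
    n = len(model_scores[0])
    if any(len(scores) != n for scores in model_scores):
        raise ValueError("all models must score the same number of samples")
    # zip(*...) groups the predictions that belong to the same sample.
    return [sum(column) / len(column) for column in zip(*model_scores)]


# Illustrative numbers only, not results from the patent.
logistic = [0.2, 0.8]
naive_bayes = [0.4, 0.6]
xgb = [0.6, 1.0]
combined = combine_by_average(logistic, naive_bayes, xgb)
```

The averaged scores can then be evaluated with AUC in the same way as the scores of any single model.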
As can be seen from Table 3: (1) the standard deviation of each model lies within the range 0.0037 to 0.0212, which reflects reasonable stability of the feature indexes; (2) the Xgboost model has the most stable early-warning capability for surge appeals across months, with its AUC staying at about 0.81, which indicates that the algorithm generalizes well. Further, to show the prediction effect of each model more clearly, the prediction results are plotted as box plots in Fig. 2, which shows box plots of the AUC evaluation index for the different models' early-warning results on surge appeals. From this analysis, the feature indexes constructed from the fluctuation and seasonality of electric quantity and from the history of surge appeals measure well the likelihood that a surge appeal will or will not occur, and the Xgboost model shows the strongest and most stable overall early-warning capability for surge appeals. It can therefore be applied to subsequent early warning of surge appeals, so that the relevant business personnel can prepare in advance and user complaints caused by sudden increases in electric quantity are reduced.
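The repeated-trial evaluation behind the means and standard deviations discussed above (30 random draws from the non-appeal users) can be sketched as follows. The function name `repeated_subsample_eval` and the `evaluate` callback are hypothetical, and drawing with replacement is an assumption based on the phrase "bootstrap idea"; the text does not say how the draw was made.

```python
import random
import statistics


def repeated_subsample_eval(positives, negatives, n_draw, evaluate,
                            trials=30, seed=None):
    """Run `evaluate` on repeated random draws of the non-appeal users.

    `evaluate(positives, drawn_negatives)` is a caller-supplied function that
    trains and scores a model and returns one number, for example an AUC.
    Returns the mean and sample standard deviation over all trials.
    """
    if trials < 2:
        raise ValueError("at least two trials are needed for a standard deviation")
    rng = random.Random(seed)
    scores = []
    for _ in range(trials):
        # Assumption: bootstrap-style draw, i.e. sampling with replacement.
        drawn = rng.choices(negatives, k=n_draw)
        scores.append(evaluate(positives, drawn))
    return statistics.mean(scores), statistics.stdev(scores)
```

A small standard deviation across trials indicates that the result does not depend strongly on which non-appeal users happened to be drawn.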
In the implementation of the invention, external weather data are introduced on top of customers' electricity consumption data and customer service work order data. After cleaning, information such as monthly electric quantity, whether an incoming-call appeal occurred, and monthly average temperature is obtained. Using big data analysis techniques, multi-dimensional feature indexes are constructed from aspects such as the fluctuation and seasonality of electric quantity, the frequency of incoming-call appeals, and the level of electric quantity change at the time of appeal. The amount of information contained in each feature index, the correlation among the indexes, and their degree of influence on whether an appeal occurs are all considered, so that the information in the feature indexes is fully used, and an Xgboost model is used to give early warning of whether an electricity customer will make an appeal. The method can be effectively applied to early warning of sudden increases in the electric quantity of electricity customers, so that the relevant business personnel can prepare in advance and user complaints caused by sudden increases in electric quantity are reduced.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, and the storage medium may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic or optical disk, or the like.
In addition, the above detailed description is provided for the method for warning sudden increase in electricity consumption of a customer according to the embodiment of the present invention, and a specific example is used herein to explain the principle and the implementation manner of the present invention, and the description of the above embodiment is only used to help understanding the method and the core idea of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (10)

1. A method for early warning sudden increase of electric quantity of electricity customers is characterized by comprising the following steps:
acquiring electric quantity data, customer service work order data and weather data, and performing data cleaning on the acquired electric quantity data, customer service work order data and weather data to obtain data after data cleaning;
constructing an electric quantity level angle characteristic, an electric quantity ring ratio and same ratio angle characteristic, an air temperature influence degree characteristic and a historical appeal condition characteristic by using the data after data cleaning;
after the characteristics are constructed, screening the constructed electric quantity level angle characteristics, electric quantity ring ratio and same ratio angle characteristics, air temperature influence degree characteristics and historical appeal condition characteristics to obtain screened characteristics;
carrying out data balancing processing based on the screened features;
constructing an Xgboost model based on the features after the data balancing processing;
and giving early warning of the sudden increase in electric quantity of the electricity customers based on the Xgboost model.
2. The method for warning sudden increase in electric quantity of a power consumer according to claim 1, wherein the step of acquiring electric quantity data, customer service work order data and weather data and performing data cleaning on the acquired electric quantity data, customer service work order data and weather data to obtain data after data cleaning comprises the steps of:
acquiring electric quantity data and customer service work order data based on a layered sampling mode, and acquiring weather data by a technical means;
performing data cleaning on the acquired electric quantity data, the customer service work order data and the weather data, wherein the data cleaning comprises filling of a monthly electric quantity missing value, identification of abnormal small monthly electric quantity, filling of incoming call missing and filling of a monthly average air temperature missing value;
and after the data cleaning is finished, obtaining the data after the data cleaning.
3. The method of claim 1, wherein the electric quantity level angle feature comprises the following, constructed from the average consumption level, the fluctuation of electric quantity, and the variation of electric quantity change: the monthly average electric quantity, the variance of monthly electric quantity, the coefficient of variation of electric quantity, the ratio of the maximum to the minimum monthly electric quantity, the ratio of the maximum monthly electric quantity to the monthly average electric quantity, the ratio of the monthly average electric quantity to the minimum monthly electric quantity, and the ratio of the current month's electric quantity to the monthly average electric quantity.
4. The method of claim 1, wherein the electric quantity ring ratio and same ratio angle feature comprises: the average value of the electric quantity ring ratio/same ratio, the variance of the electric quantity ring ratio/same ratio, the coefficient of variation of the electric quantity ring ratio/same ratio, the maximum value of the electric quantity ring ratio/same ratio, the minimum value of the electric quantity ring ratio/same ratio, the current electric quantity ring ratio/same ratio, the previous electric quantity ring ratio/same ratio, the difference between the current electric quantity ring ratio/same ratio and the average value of the electric quantity ring ratio, and the difference between the previous electric quantity ring ratio and the average value of the electric quantity ring ratio.
5. The method for warning sudden increase of electric quantity of electricity customers according to claim 1, wherein the temperature influence degree characteristic is calculated through a Pearson correlation coefficient to obtain the correlation between the electric quantity of the rest months and the average temperature of the corresponding month, and further obtain the correlation degree between the temperature and the electricity consumption; wherein, the specific calculation formula of the Pearson correlation coefficient is as follows:
Figure FDA0002331485660000021
where $\rho_{x,y}$ is the Pearson correlation coefficient, $x_j$ is the electric quantity of the j-th month after the months with abnormally small electric quantity are removed, $y_j$ is the average air temperature of the month corresponding to $x_j$, and k is the number of months remaining after the months with abnormally small electric quantity are removed.
6. The method of claim 1, wherein the historical appeal condition feature comprises: the number of incoming calls made by each user in the last year because of a sudden increase in electric quantity, the total number of incoming calls within each monthly average temperature interval, whether the user called in during the previous month because of a sudden increase in electric quantity, the maximum monthly number of incoming calls in the last year because of a sudden increase in electric quantity, and whether the user has ever called in because of a sudden increase in electric quantity.
7. The method for early warning of sudden increase in electric quantity of electricity customers according to claim 1, wherein, after the features are constructed, screening the constructed electric quantity level angle feature, electric quantity ring ratio and same ratio angle feature, air temperature influence degree feature and historical appeal condition feature to obtain the screened features comprises:
after the characteristics are constructed, screening discrete variables in the constructed electric quantity level angle characteristics, electric quantity ring ratio and same ratio angle characteristics, air temperature influence degree characteristics and historical appeal condition characteristics based on a zero variance algorithm;
after the screening, excluding high correlation between the discrete variables by using an algorithm of a Pearson correlation coefficient;
after the high correlation among the discrete variables is eliminated, utilizing multiple collinearity to check, and eliminating the variables causing the multiple collinearity;
and after the variables causing the multiple collinearity are eliminated, selecting important features in the electric quantity level angle feature, the electric quantity ring ratio and same ratio angle feature, the air temperature influence degree feature and the historical appeal condition feature based on a machine learning algorithm of a random forest.
8. The method of claim 1, wherein the performing data balancing based on the filtered features comprises:
based on the screened features, calculating the distance from each sample a in the electric quantity sudden increase incoming call user to all samples in the electric quantity sudden increase incoming call user sample set by taking the Euclidean distance as a standard to obtain k neighbors of the electric quantity sudden increase incoming call user;
setting a sampling ratio according to the sample unbalance ratio to determine a sampling multiplying power N, and randomly selecting a plurality of samples b from k neighbors of the obtained electric quantity sudden increase incoming call user;
constructing a new sample c with each sample a in the power surge incoming call users respectively based on each randomly selected neighbor b; the new sample c is the result of the data balancing process, and the specific formula is as follows:
c = a + rand(0,1) * |a - b|.
9. The method for early warning of sudden increase of electric quantity of electricity consumption customers according to claim 1, wherein an Xgboost model is constructed based on the features after the data balancing process, wherein an objective function of Xgboost is specifically formulated as follows:
Obj(θ)=L(θ)+Ω(θ);
can be converted into:
Figure FDA0002331485660000041
where Obj(θ) is the objective function of Xgboost, L(θ) is the error function, and Ω(θ) is the regularization term.
10. The method for early warning of sudden increase in electric quantity of electricity customers according to claim 1, wherein giving early warning of the sudden increase in electric quantity of the electricity customers based on the Xgboost model comprises:
training based on the Xgboost model, and obtaining a training result;
and early warning is carried out on the sudden increase of the electric quantity of the electricity consumption customer by combining the training result and the evaluation index AUC.
CN201911341505.3A 2019-12-23 2019-12-23 Method for early warning sudden increase of electric quantity of electricity consumption customer Active CN111178957B (en)

Publications (2)

Publication Number Publication Date
CN111178957A true CN111178957A (en) 2020-05-19
CN111178957B CN111178957B (en) 2023-04-14

Family

ID=70657386

Country Status (1)

Country Link
CN (1) CN111178957B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant