CN114139960A - Work order complaint risk pre-control method - Google Patents
- Publication number
- CN114139960A
- Authority
- CN
- China
- Prior art keywords
- variables
- regression model
- data set
- work order
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0631—Resource planning, allocation, distributing or scheduling for enterprises or organisations
- G06Q10/06311—Scheduling, planning or task assignment for a person or group
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06395—Quality analysis or management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/01—Customer relationship services
- G06Q30/015—Providing customer assistance, e.g. assisting a customer within a business location or via helpdesk
- G06Q30/016—After-sales
Landscapes
- Business, Economics & Management (AREA)
- Human Resources & Organizations (AREA)
- Engineering & Computer Science (AREA)
- Strategic Management (AREA)
- Economics (AREA)
- Development Economics (AREA)
- Entrepreneurship & Innovation (AREA)
- General Physics & Mathematics (AREA)
- Marketing (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- Educational Administration (AREA)
- Theoretical Computer Science (AREA)
- Game Theory and Decision Science (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- Accounting & Taxation (AREA)
- Finance (AREA)
- Complex Calculations (AREA)
Abstract
The invention relates to risk control, and in particular to a work order complaint risk pre-control method comprising: inputting a training data set and a test data set, examining the data distribution and summary statistics, and preprocessing the data; screening the variables and deleting those that have no significant influence on the regression model or are strongly autocorrelated; binning the variables using a chi-square method, applying WOE conversion to the variables, and judging their stability; establishing a regression model and evaluating its accuracy on the training and test data sets; and converting the regression model into a standard scorecard and outputting the score of each variable after binning. The technical scheme addresses the defect in the prior art that customer sensitivity cannot be distinguished from work order content, which prevents targeted, differentiated service from being provided to customers.
Description
Technical Field
The invention relates to risk control, in particular to a work order complaint risk pre-control method.
Background
The power grid supplies electricity to users and plays an important role in ensuring safe and reliable power supply. Dispatching grid maintenance work orders promptly and handling them effectively is therefore an important factor in improving the maintenance efficiency and supply reliability of the distribution network. A grid maintenance work order is generated by the State Grid customer service system from fault information reported by users after a grid fault, and is ultimately dispatched to the operation and maintenance personnel stationed in the corresponding area; it is the document that guides and records the fault handling.
With the continuing development of State Grid customer services and the steady growth in manual call volume, customer needs, including needs in typical scenarios, must be analyzed effectively in order to better understand users' implicit characteristics and needs and to raise the service level of the State Grid customer service system, so that the needs of sensitive users can be predicted and more attentive service provided.
Disclosure of Invention
(I) Technical problem to be solved
To address the above defects in the prior art, the invention provides a work order complaint risk pre-control method that overcomes the inability of the prior art to distinguish customer sensitivity from work order content, an inability that prevents targeted, differentiated service from being provided to customers.
(II) Technical solution
In order to achieve the purpose, the invention is realized by the following technical scheme:
A work order complaint risk pre-control method comprises the following steps:
S1, inputting a training data set and a test data set, examining the data distribution and summary statistics, and preprocessing the data;
S2, screening the variables, and deleting the variables that have no significant influence on the regression model or are strongly autocorrelated;
S3, binning the variables using a chi-square method, applying WOE conversion to the variables, and judging the stability of the variables;
S4, establishing a regression model, and evaluating the accuracy of the regression model using the training data set and the test data set;
S5, converting the regression model into a standard scorecard, and outputting the score of each variable after binning.
Preferably, examining the data distribution and summary statistics and preprocessing the data in S1 includes:
computing the missing ratio, maximum value, and minimum value of the data, determining the length and type of each field, and converting fields with abnormal data into normal data.
Preferably, screening the variables in S2 and deleting the variables that have no significant influence on the regression model or are strongly autocorrelated includes:
calculating the information value of each variable, screening the variables according to the information value and the autocorrelation among the variables, and, in combination with stepwise regression, deleting the variables that have no significant influence on the regression model or are strongly autocorrelated.
Preferably, screening the variables according to the information value and the autocorrelation among the variables includes:
retaining variables whose information value is greater than 0.01 and whose autocorrelation coefficient between variables is greater than 0.7.
Preferably, binning the variables using the chi-square method in S3 includes:
inspecting the binning result after binning, and manually adjusting the bins when the result is not satisfactory.
Preferably, applying WOE conversion to the variables in S3 includes:
performing WOE conversion of the variables using the following formula:
$$WOE_i = \ln\left(\frac{Bad_i / Bad_T}{Good_i / Good_T}\right)$$
where i denotes the i-th bin of a feature, $Bad_i$ is the number of bad labels in the i-th bin, $Bad_T$ is the total number of bad labels, $Good_i$ is the number of good labels in the i-th bin, and $Good_T$ is the total number of good labels.
Preferably, judging the stability of the variables in S3 includes:
calculating the PSI value of each feature column, and judging the stability of the variable based on the PSI value.
Preferably, establishing the regression model in S4 and evaluating its accuracy using the training data set and the test data set includes:
establishing the regression model using logistic regression, predicting on the training data set and the test data set, calculating the KS value and the AUC value for each of the two data sets, and evaluating the accuracy of the regression model.
(III) Advantageous effects
Compared with the prior art, the work order complaint risk pre-control method provided by the invention builds a scorecard model for each variable from the data of an electric power service work order system, through variable screening and logistic regression modeling. It evaluates how well the different value intervals of each field explain user sensitivity, and it retains good explanatory and predictive power for new customers and new work orders. The sensitivity of each user can therefore be scored and identified, service resources can be allocated and scheduled in a targeted way, response measures can be taken promptly in an emergency, the pressure of work order complaints is reduced, and service quality is ultimately improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 is a schematic flow chart of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A work order complaint risk pre-control method is shown in FIG. 1. A training data set and a test data set are input, the data distribution and summary statistics are examined, and the data are preprocessed.
Examining the data distribution and summary statistics and preprocessing the data includes:
computing the missing ratio, maximum value, and minimum value of the data, determining the length and type of each field, and converting fields with abnormal data (abnormal values or missing values) into normal data.
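The profiling step can be sketched roughly as follows. This is only an illustration: the function names `profile_field` and `fill_missing` are hypothetical, and replacing missing values with a fixed default is an assumption, since the text does not say how abnormal data are converted.

```python
def profile_field(values):
    """Summarise one field: missing ratio, min, max, inferred type, max length."""
    present = [v for v in values if v is not None and v != ""]
    missing_ratio = 1 - len(present) / len(values) if values else 0.0
    numeric = [v for v in present if isinstance(v, (int, float))]
    is_numeric = bool(present) and len(numeric) == len(present)
    return {
        "missing_ratio": missing_ratio,
        "min": min(numeric) if is_numeric else None,
        "max": max(numeric) if is_numeric else None,
        "type": "numeric" if is_numeric else "text",
        "max_length": max((len(str(v)) for v in present), default=0),
    }


def fill_missing(values, default):
    """Replace missing entries (None or empty string) with a default value.

    Assumption: a fixed default is one simple way to turn abnormal data
    into normal data; other strategies are equally possible.
    """
    return [default if v is None or v == "" else v for v in values]


summary = profile_field([1, None, 3, 5])
cleaned = fill_missing([1, None, 3], 0)
```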
In this technical scheme, the data of the electric power service work order system are divided into a training data set and a test data set at a ratio of 4:1. The training data set is used to train the model, and the test data set is used to test how well the trained model fits.
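A minimal sketch of a 4:1 split is given below. The shuffling and the fixed seed are assumptions, as the text does not state whether the split is random; `split_4_to_1` is a hypothetical helper name.

```python
import random


def split_4_to_1(records, seed=0):
    """Shuffle the records and split them 4:1 into (training, test) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # assumption: random split, fixed seed
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]


train, test = split_4_to_1(range(100))
```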
Screening the variables and deleting those that have no significant influence on the regression model or are strongly autocorrelated specifically includes:
calculating the information value of each variable, screening the variables according to the information value and the autocorrelation among the variables, retaining the variables whose information value is greater than 0.01 and whose autocorrelation coefficient between variables is greater than 0.7, and, in combination with bidirectional stepwise regression, deleting the variables that have no significant influence on the regression model or are strongly autocorrelated.
In this technical scheme, the information value (IV) measures the predictive power of a variable: the larger the IV, the stronger the predictive power of the variable.
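The text does not give the IV formula. A sketch using the commonly used definition, the sum over bins of (bad share minus good share) times the log of their ratio, might look like this. The function name, the `eps` smoothing for empty bins, and the example counts are assumptions.

```python
import math


def information_value(bins, eps=0.5):
    """IV of one variable from per-bin (bad_count, good_count) pairs.

    Zero counts are replaced by `eps` to keep the logarithm finite
    (an assumption, not something the text specifies).
    """
    bad_total = sum(b for b, _ in bins)
    good_total = sum(g for _, g in bins)
    iv = 0.0
    for bad, good in bins:
        bad_share = max(bad, eps) / bad_total
        good_share = max(good, eps) / good_total
        iv += (bad_share - good_share) * math.log(bad_share / good_share)
    return iv


# Hypothetical binned variables: name -> [(bad, good), ...]
binned = {
    "informative": [(10, 40), (40, 10)],
    "uninformative": [(10, 10), (20, 20)],
}
kept = [name for name, bins in binned.items() if information_value(bins) > 0.01]
```

Only the IV threshold of 0.01 comes from the text; the correlation screen and the stepwise regression are not shown here.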
The variables are binned using a chi-square method, WOE conversion is applied to the variables, and the stability of the variables is judged.
Binning the variables using the chi-square method includes:
inspecting the binning result after binning, and manually adjusting the bins when the result is not satisfactory.
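The text names only "a chi-square method". One common reading is ChiMerge-style bottom-up merging, sketched below under that assumption: every distinct value starts as its own bin, and the adjacent pair with the smallest chi-square statistic is merged until `max_bins` remain. The function names and the `max_bins` stopping rule are hypothetical.

```python
def chi2_pair(a, b):
    """Chi-square statistic of the 2x2 table formed by two bins of (bad, good)."""
    total = a[0] + a[1] + b[0] + b[1]
    chi2 = 0.0
    for row in (a, b):
        row_sum = row[0] + row[1]
        for col in (0, 1):
            col_sum = a[col] + b[col]
            expected = row_sum * col_sum / total
            if expected > 0:
                chi2 += (row[col] - expected) ** 2 / expected
    return chi2


def chimerge(values, labels, max_bins=5):
    """Return bins as [upper_bound, bad_count, good_count]; label 1 means bad."""
    counts = {}
    for v, y in zip(values, labels):
        bad_good = counts.setdefault(v, [0, 0])
        bad_good[0 if y == 1 else 1] += 1
    bins = [[v, c[0], c[1]] for v, c in sorted(counts.items())]
    while len(bins) > max_bins:
        scores = [chi2_pair(bins[i][1:], bins[i + 1][1:])
                  for i in range(len(bins) - 1)]
        i = scores.index(min(scores))  # most similar adjacent pair
        right = bins.pop(i + 1)
        bins[i] = [right[0], bins[i][1] + right[1], bins[i][2] + right[2]]
    return bins


result = chimerge([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1], max_bins=2)
```

The manual adjustment mentioned in the text would then amount to editing the returned upper bounds by hand.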
Applying WOE (weight of evidence) conversion to the variables includes:
performing WOE conversion of the variables using the following formula:
$$WOE_i = \ln\left(\frac{Bad_i / Bad_T}{Good_i / Good_T}\right)$$
where i denotes the i-th bin of a feature, $Bad_i$ is the number of bad labels in the i-th bin, $Bad_T$ is the total number of bad labels, $Good_i$ is the number of good labels in the i-th bin, and $Good_T$ is the total number of good labels.
WOE is a way of encoding the original independent variables; both the training data set and the test data set must be WOE-converted.
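As a sketch, WOE values can be computed from the training bins and then applied unchanged to both data sets. The code assumes the convention in which the bad share is the numerator, WOE_i = ln((Bad_i / Bad_T) / (Good_i / Good_T)); the helper names, the bin bounds, and the counts are hypothetical, and bins with zero counts are not handled.

```python
import bisect
import math


def woe_table(bins):
    """WOE per bin from (bad_count, good_count) pairs; counts must be non-zero."""
    bad_total = sum(b for b, _ in bins)
    good_total = sum(g for _, g in bins)
    return [math.log((bad / bad_total) / (good / good_total))
            for bad, good in bins]


def encode(value, upper_bounds, woes):
    """Map a raw value to the WOE of the bin whose upper bound covers it."""
    i = min(bisect.bisect_left(upper_bounds, value), len(woes) - 1)
    return woes[i]


upper_bounds = [3, 6]                      # hypothetical bin upper bounds
woes = woe_table([(10, 40), (40, 10)])     # fitted on the training set only
train_encoded = [encode(v, upper_bounds, woes) for v in [1, 5]]
test_encoded = [encode(v, upper_bounds, woes) for v in [2, 9]]
```

Fitting the table on the training set and reusing it for the test set keeps the two encodings comparable.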
Judging the stability of the variables includes:
calculating the PSI value of each feature column, and judging the stability of the variable based on the PSI value.
The smaller the PSI value, the smaller the difference between the two distributions being compared and the more stable the variable. The PSI values of all variables in the model of this technical scheme are less than 0.001.
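The text does not spell out the PSI formula or which two distributions are compared. The sketch below uses the usual population stability index, the sum over bins of (actual share minus expected share) times the log of their ratio, and assumes the comparison is between the training set (expected) and the test set (actual). The counts are made up for illustration.

```python
import math


def psi(expected_counts, actual_counts, eps=1e-6):
    """Population stability index between two binned distributions.

    Shares are floored at `eps` so that empty bins do not break the
    logarithm (an assumption for this sketch).
    """
    e_total = sum(expected_counts)
    a_total = sum(actual_counts)
    total = 0.0
    for e, a in zip(expected_counts, actual_counts):
        e_share = max(e / e_total, eps)
        a_share = max(a / a_total, eps)
        total += (a_share - e_share) * math.log(a_share / e_share)
    return total


stable = psi([50, 50], [50, 50])      # identical distributions
shifted = psi([80, 20], [20, 80])     # strongly shifted distribution
```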
Establishing the regression model and evaluating its accuracy using the training data set and the test data set specifically includes:
establishing the regression model using logistic regression, predicting on the training data set and the test data set, calculating the KS value and the AUC value for each of the two data sets, and evaluating the accuracy of the regression model.
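For illustration, a logistic regression on WOE-encoded inputs can be fitted with plain batch gradient descent, as sketched below. The optimizer, learning rate, epoch count, and toy data are all assumptions; the text does not say how the model is fitted, and a statistics library would normally be used instead.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit bias and weights by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            grad_b += error
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return bias, weights


def predict_proba(bias, weights, x):
    """Predicted probability of the bad label (1) for one encoded row."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Toy data: one WOE-encoded feature; label 1 means bad.
rows = [[-1.0]] * 4 + [[1.0]] * 4
labels = [0, 0, 0, 1, 1, 1, 1, 0]
bias, weights = fit_logistic(rows, labels)
```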
The KS value lies in the range [0, 1] and is conventionally expressed as a percentage. In general, the larger the KS value, the better the model separates positive and negative samples: a KS value of 41% to 50% indicates a good ability to distinguish good samples from bad ones, and a KS value of 51% to 60% indicates a strong ability. The KS value of the model is 49.92% on the training data set and 53.65% on the test data set, which shows that the model separates the samples strongly.
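The KS value can be sketched as the largest gap between the cumulative share of bad samples and the cumulative share of good samples when the samples are ranked by score. The function name and example scores below are hypothetical.

```python
def ks_statistic(scores, labels):
    """Maximum gap between cumulative bad and good shares; label 1 means bad."""
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    bad_total = sum(1 for y in labels if y == 1)
    good_total = len(labels) - bad_total
    bad_seen = good_seen = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Consume all samples sharing the same score before measuring the gap.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                bad_seen += 1
            else:
                good_seen += 1
            j += 1
        best = max(best, abs(bad_seen / bad_total - good_seen / good_total))
        i = j
    return best


ks = ks_statistic([0.9, 0.8, 0.7, 0.3, 0.2, 0.1], [1, 1, 0, 1, 0, 0])
```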
The AUC is the probability that, when one positive sample and one negative sample are chosen at random, the current classifier scores the positive sample ahead of the negative sample. The AUC value generally lies between 0.5 and 1; the closer it is to 1, the more reliable the classification and the better the model fits. The AUC value of the model is 81.63% on the training data set and 81.92% on the test data set, which shows that the model classifies well.
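The pairwise definition above translates directly into code. The sketch below compares every positive score with every negative score and counts ties as half a win, which is a common convention rather than something the text states; it is quadratic in the sample size and is meant only to illustrate the definition.

```python
def auc(scores, labels):
    """Probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


value = auc([0.9, 0.8, 0.7, 0.3, 0.2, 0.1], [1, 1, 0, 1, 0, 0])
```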
The regression model is converted into a standard scorecard, and the score of each variable after binning is output. The scorecard evaluates how well the different value intervals of each field explain user sensitivity and retains good explanatory and predictive power for new customers and new work orders. The sensitivity of each user can therefore be scored and identified, service resources can be allocated and scheduled in a targeted way, response measures can be taken promptly in an emergency, the pressure of work order complaints is reduced, and service quality is ultimately improved.
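The text does not give the scaling used for the scorecard. A sketch of the conventional credit-scoring scaling, score = offset - factor * ln(odds of bad), is shown below. The base score of 600, base odds of 1:50, and 20 points to double the odds are purely hypothetical parameters, as is the choice that a higher score means lower risk.

```python
import math


def scorecard(bias, weights, woe_by_var,
              base_score=600.0, base_odds=1 / 50, pdo=20.0):
    """Turn logistic coefficients and per-bin WOE values into points.

    bias, weights: fitted logistic regression on WOE-encoded variables.
    woe_by_var: variable name -> list of WOE values, one per bin.
    Returns (base_points, {variable: [points per bin]}).
    """
    factor = pdo / math.log(2)
    offset = base_score + factor * math.log(base_odds)
    base_points = offset - factor * bias
    bin_points = {
        name: [-factor * weights[name] * woe for woe in woes]
        for name, woes in woe_by_var.items()
    }
    return base_points, bin_points


# Hypothetical model: one variable "x" with two bins.
base_points, points = scorecard(
    bias=math.log(1 / 50),
    weights={"x": 1.0},
    woe_by_var={"x": [0.0, math.log(2)]},
)
```

A work order's total score is then the base points plus the points of the bin each of its variables falls into.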
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.
Claims (8)
1. A work order complaint risk pre-control method, characterized by comprising the following steps:
S1, inputting a training data set and a test data set, examining the data distribution and summary statistics, and preprocessing the data;
S2, screening the variables, and deleting the variables that have no significant influence on the regression model or are strongly autocorrelated;
S3, binning the variables using a chi-square method, applying WOE conversion to the variables, and judging the stability of the variables;
S4, establishing a regression model, and evaluating the accuracy of the regression model using the training data set and the test data set;
S5, converting the regression model into a standard scorecard, and outputting the score of each variable after binning.
2. The work order complaint risk pre-control method of claim 1, characterized in that: examining the data distribution and summary statistics and preprocessing the data in S1 includes:
computing the missing ratio, maximum value, and minimum value of the data, determining the length and type of each field, and converting fields with abnormal data into normal data.
3. The work order complaint risk pre-control method of claim 1, characterized in that: screening the variables in S2 and deleting the variables that have no significant influence on the regression model or are strongly autocorrelated includes:
calculating the information value of each variable, screening the variables according to the information value and the autocorrelation among the variables, and, in combination with stepwise regression, deleting the variables that have no significant influence on the regression model or are strongly autocorrelated.
4. The work order complaint risk pre-control method of claim 3, characterized in that: screening the variables according to the information value and the autocorrelation among the variables includes:
retaining variables whose information value is greater than 0.01 and whose autocorrelation coefficient between variables is greater than 0.7.
5. The work order complaint risk pre-control method of claim 1, characterized in that: binning the variables using the chi-square method in S3 includes:
inspecting the binning result after binning, and manually adjusting the bins when the result is not satisfactory.
6. The work order complaint risk pre-control method of claim 5, characterized in that: applying WOE conversion to the variables in S3 includes:
performing WOE conversion of the variables using the following formula:
$$WOE_i = \ln\left(\frac{Bad_i / Bad_T}{Good_i / Good_T}\right)$$
where i denotes the i-th bin of a feature, $Bad_i$ is the number of bad labels in the i-th bin, $Bad_T$ is the total number of bad labels, $Good_i$ is the number of good labels in the i-th bin, and $Good_T$ is the total number of good labels.
7. The work order complaint risk pre-control method of claim 6, characterized in that: judging the stability of the variables in S3 includes:
calculating the PSI value of each feature column, and judging the stability of the variable based on the PSI value.
8. The work order complaint risk pre-control method of claim 1, characterized in that: establishing the regression model in S4 and evaluating its accuracy using the training data set and the test data set includes:
establishing the regression model using logistic regression, predicting on the training data set and the test data set, calculating the KS value and the AUC value for each of the two data sets, and evaluating the accuracy of the regression model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111457735.3A CN114139960A (en) | 2021-12-01 | 2021-12-01 | Work order complaint risk pre-control method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114139960A true CN114139960A (en) | 2022-03-04 |
Family
ID=80386615
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111457735.3A Pending CN114139960A (en) | 2021-12-01 | 2021-12-01 | Work order complaint risk pre-control method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114139960A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106600455A (en) * | 2016-11-25 | 2017-04-26 | 国网河南省电力公司电力科学研究院 | Electric charge sensitivity assessment method based on logistic regression |
CN107392479A (en) * | 2017-07-27 | 2017-11-24 | 国网河南省电力公司电力科学研究院 | The power customer power failure susceptibility scorecard implementation of logic-based regression model |
CN109636591A (en) * | 2018-12-28 | 2019-04-16 | 浙江工业大学 | A kind of credit scoring card development approach based on machine learning |
CN113435627A (en) * | 2021-05-27 | 2021-09-24 | 国网冀北电力有限公司计量中心 | Work order track information-based electric power customer complaint prediction method and device |
CN113469536A (en) * | 2021-07-06 | 2021-10-01 | 云南电网有限责任公司 | Power supply service customer complaint risk grade identification method |
Non-Patent Citations (2)
Title |
---|
崔成泉等: "《大数据 大文化》", vol. 1, 31 October 2014, 云南大学出版社, pages: 122 - 126 * |
王青天等: "《Python大数据分析与机器学习商业案例实战》", vol. 1, 30 June 2020, 机械工业出版社, pages: 136 - 140 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wang et al. | Feature trend extraction and adaptive density peaks search for intelligent fault diagnosis of machines | |
CN109583680B (en) | Power stealing identification method based on support vector machine | |
CN111738364A (en) | Electricity stealing detection method based on combination of user load and electricity consumption parameter | |
CN114235825B (en) | Steel wire rope quality detection method based on computer vision | |
CN107862467A (en) | A kind of electric network synthetic data target monitoring method and system based on big data platform | |
CN110930198A (en) | Electric energy substitution potential prediction method and system based on random forest, storage medium and computer equipment | |
CN111539585B (en) | Random forest-based power customer appeal sensitivity supervision and early warning method | |
CN116382217A (en) | Intelligent operation and maintenance monitoring system for production line | |
CN113434575B (en) | Data attribution processing method, device and storage medium based on data warehouse | |
CN113869721A (en) | Substation equipment health state classification method and apparatus | |
CN114638688A (en) | Interception strategy derivation method and system for credit anti-fraud | |
CN112417763A (en) | Defect diagnosis method, device and equipment for power transmission line and storage medium | |
CN116796271A (en) | Resident energy abnormality identification method | |
CN113362044B (en) | Method for improving approval efficiency process based on automobile retail | |
CN112488361B (en) | Transformer area low voltage prediction method and device based on big data | |
CN110320337A (en) | A kind of automatic evaluating system of strip surface quality and method | |
CN114169709A (en) | State evaluation method and device for secondary equipment of transformer substation, storage medium and equipment | |
CN114022205A (en) | Power consumer payment channel preference matching method and system based on improved clustering method | |
CN114139960A (en) | Work order complaint risk pre-control method | |
CN117196630A (en) | Transaction risk prediction method, device, terminal equipment and storage medium | |
CN116502084A (en) | Risk identification method and device, storage medium and electronic equipment | |
CN115760246A (en) | Storefront satisfaction intelligent analysis method and system based on Internet of things | |
CN112734208B (en) | Fire coal acceptance monitoring device, method, equipment and readable storage medium | |
CN113919937A (en) | KS monitoring system based on loan assessment wind control | |
CN112348066A (en) | Line uninterrupted power rating evaluation method based on gray clustering algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||