CN114139960A - Work order complaint risk pre-control method - Google Patents

Publication number
CN114139960A
Authority
CN
China
Prior art keywords
variables
regression model
data set
work order
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111457735.3A
Other languages
Chinese (zh)
Inventor
刘峰
罗玉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Shusheng Data Technology Co ltd
Original Assignee
Anhui Shusheng Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Shusheng Data Technology Co ltd
Priority to CN202111457735.3A
Publication of CN114139960A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06311 Scheduling, planning or task assignment for a person or group
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395 Quality analysis or management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/01 Customer relationship services
    • G06Q30/015 Providing customer assistance, e.g. assisting a customer within a business location or via helpdesk
    • G06Q30/016 After-sales

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Theoretical Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Complex Calculations (AREA)

Abstract

The invention relates to risk control, and in particular to a work order complaint risk pre-control method. The method comprises: inputting a training data set and a test data set, examining the data distribution, compiling statistics, and preprocessing the data; screening the variables and deleting those that have no significant influence on the regression model or that are strongly autocorrelated; binning the variables with the chi-square method, applying WOE conversion to the variables, and judging the stability of the variables; establishing a regression model and evaluating its accuracy with the training data set and the test data set; and converting the regression model into a standard scorecard and outputting the score of every variable after binning. The technical scheme addresses a defect of the prior art, in which customer sensitivity cannot be distinguished from the work order content and targeted, differentiated service therefore cannot be provided to the customer.

Description

Work order complaint risk pre-control method
Technical Field
The invention relates to risk control, in particular to a work order complaint risk pre-control method.
Background
The power grid supplies electricity to users and plays an important role in keeping that supply safe and reliable. Dispatching power grid maintenance work orders promptly and handling them effectively has therefore become an important factor in improving the maintenance efficiency and the power supply reliability of the distribution network. A power grid maintenance work order is a document that the national power grid customer service system generates from the fault information reported by users after a grid fault. It is ultimately dispatched to the operation and maintenance personnel stationed in the corresponding area, where it guides and records the fault handling.
With the continued development of national grid customer service and the gradual increase in manual call traffic, the service level of the national grid customer service system needs to improve through a better understanding and analysis of users' implicit characteristics and needs. This requires effective analysis of customer needs, including needs in typical scenarios, so that the needs of sensitive users can be predicted and more attentive service can be provided.
Disclosure of Invention
(I) Technical problem to be solved
Aiming at the defects in the prior art, the invention provides a work order complaint risk pre-control method. It addresses the defect that, in the prior art, customer sensitivity cannot be distinguished from the content of the work order, so that targeted, differentiated service cannot be provided to the customer.
(II) Technical scheme
To achieve this purpose, the invention is realized by the following technical scheme:
A work order complaint risk pre-control method comprises the following steps:
S1, inputting a training data set and a test data set, examining the data distribution, compiling statistics, and preprocessing the data;
S2, screening the variables, and deleting the variables that have no significant influence on the regression model or that are strongly autocorrelated;
S3, binning the variables with the chi-square method, applying WOE conversion to the variables, and judging the stability of the variables;
S4, establishing a regression model, and evaluating the accuracy of the regression model with the training data set and the test data set;
S5, converting the regression model into a standard scorecard, and outputting the score of every variable after binning.
Preferably, examining the data distribution, compiling statistics, and preprocessing the data in S1 includes:
compiling the missing proportion, the maximum value and the minimum value of the data, determining the length and type of each field, and converting fields that contain abnormal data into normal data.
Preferably, screening the variables in S2 and deleting the variables that have no significant influence on the regression model or that are strongly autocorrelated includes:
calculating the information value of each variable, screening the variables according to the information value and the autocorrelation among the variables, and, in combination with stepwise regression, deleting the variables that have no significant influence on the regression model or that are strongly autocorrelated.
Preferably, screening the variables according to the information value and the autocorrelation among the variables includes:
retaining the variables whose information value is greater than 0.01 and whose autocorrelation coefficient between variables is greater than 0.7.
Preferably, binning the variables with the chi-square method in S3 includes:
inspecting the binning result after binning, and adjusting the bins manually when the result is not satisfactory.
Preferably, applying WOE conversion to the variables in S3 includes:
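WOE conversion of the variables is performed using the following formula:
$$\mathrm{WOE}_i = \ln\left(\frac{\mathrm{Bad}_i / \mathrm{Bad}_T}{\mathrm{Good}_i / \mathrm{Good}_T}\right)$$
where $i$ denotes the $i$-th bin of a given feature, $\mathrm{Bad}_i$ is the number of bad labels in the $i$-th bin, $\mathrm{Bad}_T$ is the total number of bad labels, $\mathrm{Good}_i$ is the number of good labels in the $i$-th bin, and $\mathrm{Good}_T$ is the total number of good labels.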
Preferably, judging the stability of the variables in S3 includes:
calculating the PSI value of each feature column, and judging the stability of the variable from its PSI value.
Preferably, establishing the regression model in S4 and evaluating its accuracy with the training data set and the test data set includes:
establishing the regression model with logistic regression, making predictions on the training data set and the test data set, calculating the KS value and the AUC value for each of the two data sets, and evaluating the accuracy of the regression model.
(III) Advantageous effects
Compared with the prior art, the work order complaint risk pre-control method provided by the invention is based on data from an electric power service work order system. A scorecard model covering each variable is established through variable screening and logistic regression modeling, and the degree to which the different interval attributes of each field explain user sensitivity is scientifically evaluated. The method has good explanatory and predictive power for new customers and new work orders, so that the sensitivity of each user can be scored and identified, service resources can be allocated and scheduled in a targeted way, response measures can be taken promptly in an emergency, work order complaint pressure can be reduced, and service quality can ultimately be improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 is a schematic flow chart of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the work order complaint risk pre-control method, as shown in FIG. 1, a training data set and a test data set are input, the data distribution is examined, statistics are compiled, and the data are preprocessed.
Examining the data distribution, compiling statistics, and preprocessing the data includes:
compiling the missing proportion, the maximum value and the minimum value of the data, determining the length and type of each field, and converting fields that contain abnormal data (outliers or missing values) into normal data.
In this technical scheme, data from the electric power service work order system are divided into a training data set and a test data set in a 4:1 ratio. The training data set is used to train the model, and the test data set is used to test how well the trained model fits.
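The profiling and the 4:1 split described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function names, the use of `None` for a missing value, and the fixed shuffle seed are all assumptions made for the example.

```python
import random

def profile_field(values):
    """Return the missing ratio, minimum and maximum of one field (None = missing)."""
    present = [v for v in values if v is not None]
    missing_ratio = 1 - len(present) / len(values) if values else 0.0
    return {
        "missing_ratio": missing_ratio,
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }

def split_4_to_1(records, seed=0):
    """Shuffle the records and split them 4:1 into training and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seed is a hypothetical choice for repeatability
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]

stats = profile_field([3, None, 7, 1, None])
train, test = split_4_to_1(range(10))
```

How abnormal values are repaired (for example by imputation or capping) is not specified in the text, so the sketch stops at detection.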
The variables are screened, and the variables that have no significant influence on the regression model or that are strongly autocorrelated are deleted, specifically as follows:
The information value of each variable is calculated, and the variables are screened according to the information value and the autocorrelation among the variables: variables whose information value is greater than 0.01 and whose autocorrelation coefficient between variables is greater than 0.7 are retained, and, in combination with bidirectional stepwise regression, the variables that have no significant influence on the regression model or that are strongly autocorrelated are deleted.
In the technical scheme of the application, the information value (namely, the IV value) is used for measuring the prediction capability of the variable, and the larger the IV value is, the stronger the prediction capability of the variable is.
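A sketch of the IV calculation and the screening step follows. It assumes the common definition IV = Σ (Bad_i/Bad_T − Good_i/Good_T) · WOE_i over the bins of a variable, and it assumes every bin holds at least one good and one bad label. The text states that variables with a coefficient "greater than 0.7" are retained; the sketch instead takes the conventional reading, in which a variable is dropped when its absolute Pearson correlation with an already kept variable exceeds 0.7. That reading, the greedy order by IV, and all names are assumptions; stepwise regression is not shown.

```python
import math

def information_value(bins):
    """bins: list of (bad_count, good_count) per bin; returns the IV of the variable."""
    bad_total = sum(b for b, _ in bins)
    good_total = sum(g for _, g in bins)
    iv = 0.0
    for bad, good in bins:
        bad_share = bad / bad_total
        good_share = good / good_total
        iv += (bad_share - good_share) * math.log(bad_share / good_share)
    return iv

def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def screen(columns, ivs, iv_min=0.01, corr_max=0.7):
    """Keep variables with IV > iv_min that are not strongly correlated with a kept one."""
    kept = []
    for name in sorted(columns, key=lambda n: ivs[n], reverse=True):
        if ivs[name] <= iv_min:
            continue  # too little predictive power
        if all(abs(pearson(columns[name], columns[k])) <= corr_max for k in kept):
            kept.append(name)
    return kept

iv_example = information_value([(10, 40), (20, 30)])
columns = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8.5], "c": [1, -1, -1, 1], "d": [0, 1, 0, 1]}
ivs = {"a": 0.3, "b": 0.2, "c": 0.1, "d": 0.005}  # hypothetical IV values
selected = screen(columns, ivs)
```

In the example, `b` is dropped because it is almost collinear with `a`, and `d` is dropped because its IV does not exceed 0.01.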
The variables are binned with the chi-square method, WOE conversion is applied to the variables, and the stability of the variables is judged.
Binning the variables with the chi-square method includes:
inspecting the binning result after binning, and adjusting the bins manually when the result is not satisfactory.
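The text names the chi-square method without giving the procedure. One common variant, assumed here, is bottom-up merging in the style of ChiMerge: start from fine-grained bins ordered by value and repeatedly merge the adjacent pair whose good/bad distributions differ least. The stopping rule (`max_bins`) and the function names are hypothetical, and the manual adjustment mentioned above is left to the analyst.

```python
def chi2_pair(a, b):
    """Chi-square statistic of the 2x2 table formed by two adjacent bins (bad, good)."""
    total = sum(a) + sum(b)
    chi2 = 0.0
    for row in (a, b):
        for col in range(2):
            expected = sum(row) * (a[col] + b[col]) / total
            if expected > 0:
                chi2 += (row[col] - expected) ** 2 / expected
    return chi2

def chi_merge(bins, max_bins):
    """Merge adjacent (bad, good) bins until at most max_bins remain."""
    bins = [tuple(b) for b in bins]
    while len(bins) > max_bins:
        scores = [chi2_pair(bins[i], bins[i + 1]) for i in range(len(bins) - 1)]
        i = scores.index(min(scores))  # the most similar neighbours are merged first
        merged = (bins[i][0] + bins[i + 1][0], bins[i][1] + bins[i + 1][1])
        bins[i:i + 2] = [merged]
    return bins

merged_bins = chi_merge([(1, 9), (2, 18), (8, 2), (9, 1)], max_bins=2)
```

In the example the first two bins share the same bad rate, so they merge first; the last two merge next, leaving one low-risk and one high-risk bin.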
Applying WOE (weight of evidence) conversion to the variables includes:
WOE conversion of the variables is performed using the following formula:
$$\mathrm{WOE}_i = \ln\left(\frac{\mathrm{Bad}_i / \mathrm{Bad}_T}{\mathrm{Good}_i / \mathrm{Good}_T}\right)$$
where $i$ denotes the $i$-th bin of a given feature, $\mathrm{Bad}_i$ is the number of bad labels in the $i$-th bin, $\mathrm{Bad}_T$ is the total number of bad labels, $\mathrm{Good}_i$ is the number of good labels in the $i$-th bin, and $\mathrm{Good}_T$ is the total number of good labels.
WOE is a form of encoding of the original independent variables; both the training data set and the test data set need to undergo WOE conversion.
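A minimal sketch of the WOE encoding, assuming WOE_i = ln((Bad_i/Bad_T) / (Good_i/Good_T)) with the symbols defined above and assuming no bin has a zero count. The table is built from the training counts and then applied unchanged to both data sets; the bin labels and function names are hypothetical.

```python
import math

def woe_table(bins):
    """bins: bin label -> (bad_count, good_count). Returns bin label -> WOE."""
    bad_total = sum(b for b, _ in bins.values())
    good_total = sum(g for _, g in bins.values())
    return {
        label: math.log((bad / bad_total) / (good / good_total))
        for label, (bad, good) in bins.items()
    }

def apply_woe(labels, table):
    """Replace each sample's bin label with the WOE of that bin."""
    return [table[label] for label in labels]

table = woe_table({"low": (10, 40), "high": (20, 30)})  # counts from the training set
train_encoded = apply_woe(["low", "high", "high"], table)
test_encoded = apply_woe(["high", "low"], table)         # same table, no refitting
```

With this orientation a positive WOE marks a bin whose share of bad labels exceeds its share of good labels.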
Judging the stability of the variables includes:
calculating the PSI value of each feature column, and judging the stability of the variable from its PSI value.
The smaller the PSI value, the smaller the difference between the two distributions of the feature and the more stable the variable. The PSI values of all variables in the model of this technical scheme are less than 0.001.
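The text does not spell out the PSI formula or which two samples are compared. The sketch below assumes the usual population stability index, PSI = Σ (a_i − e_i) · ln(a_i / e_i), where e_i and a_i are the shares of samples falling into bin i in the training set and the test set respectively, and it assumes no bin is empty in either set.

```python
import math

def psi(expected_counts, actual_counts):
    """Population stability index of one binned feature across two samples."""
    e_total = sum(expected_counts)
    a_total = sum(actual_counts)
    value = 0.0
    for e, a in zip(expected_counts, actual_counts):
        e_share, a_share = e / e_total, a / a_total
        value += (a_share - e_share) * math.log(a_share / e_share)
    return value

stable = psi([50, 50], [50, 50])   # identical distributions
shifted = psi([50, 50], [25, 75])  # the test set drifts toward the second bin
```

Each term of the sum is non-negative, so PSI is zero only when the two distributions coincide.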
A regression model is established and its accuracy is evaluated with the training data set and the test data set, specifically as follows:
The regression model is established with logistic regression, predictions are made on the training data set and the test data set, the KS value and the AUC value are calculated for each of the two data sets, and the accuracy of the regression model is evaluated.
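As an illustration of the modeling step, the following sketch fits a logistic regression on WOE-encoded features with plain batch gradient descent. The optimizer, learning rate, epoch count and toy data are assumptions chosen to keep the example dependency-free; the text does not say how the model is fitted, and a library implementation would normally be used in practice.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit bias and weights by batch gradient descent on the log-loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            grad_b += err
            for k, v in enumerate(x):
                grad_w[k] += err * v
        bias -= lr * grad_b / len(rows)
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
    return bias, weights

def predict(bias, weights, x):
    """Predicted probability of the bad label (1) for one sample."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))

# One WOE-encoded feature; label 1 = bad (complaint), 0 = good.
rows = [[-0.5], [-0.5], [-0.5], [0.6], [0.6], [0.6]]
labels = [0, 0, 1, 1, 1, 0]
bias, weights = fit_logistic(rows, labels)
```

Because the feature is already WOE-encoded with bad in the numerator, a positive fitted weight is the expected outcome.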
The KS value lies in the range [0, 1] and is conventionally expressed as a percentage. In general, the larger the KS value, the better the model separates positive and negative samples: a KS value between 41% and 50% indicates a good ability to distinguish good samples from bad ones, and a KS value between 51% and 60% indicates a strong ability. The KS value of the model is 49.92% on the training data set and 53.65% on the test data set, which indicates that the model distinguishes the samples strongly.
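The KS value can be computed as the largest gap between the cumulative share of bad samples and the cumulative share of good samples when the samples are ordered by model score. The sketch below assumes label 1 marks a bad sample and that both classes are present; the function name and the toy data are illustrative only.

```python
def ks_statistic(scores, labels):
    """labels: 1 = bad, 0 = good. Returns the KS value in [0, 1]."""
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    bad_total = sum(labels)
    good_total = len(labels) - bad_total
    bad_seen = good_seen = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Consume all samples sharing one score so that ties are handled together.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            bad_seen += pairs[j][1]
            good_seen += 1 - pairs[j][1]
            j += 1
        best = max(best, abs(bad_seen / bad_total - good_seen / good_total))
        i = j
    return best

ks = ks_statistic([0.9, 0.8, 0.7, 0.6, 0.4, 0.3], [1, 1, 0, 1, 0, 0])
```

Multiplying the result by 100 gives the percentage form used in the text.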
The AUC is the probability that, when one positive sample and one negative sample are chosen at random, the current classifier scores the positive sample ahead of the negative sample. The AUC value generally lies between 0.5 and 1; the closer it is to 1, the more reliable the detection method and the better the model fits. The AUC value of the model is 81.63% on the training data set and 81.92% on the test data set, which shows that the model classifies well.
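The pairwise definition of AUC given above translates directly into code: count the positive/negative pairs in which the positive sample scores higher. Counting a tie as half a win is a common convention and is an assumption here, as are the names and the toy data. The quadratic loop is meant for illustration, not for large data sets.

```python
def auc(scores, labels):
    """Share of (positive, negative) pairs in which the positive sample scores higher."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(pos) * len(neg))

auc_value = auc([0.9, 0.8, 0.7, 0.6, 0.4, 0.3], [1, 1, 0, 1, 0, 0])
```

In the example, eight of the nine positive/negative pairs are ordered correctly.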
The regression model is converted into a standard scorecard, and the score of every variable after binning is output. The degree to which the different interval attributes of each field explain user sensitivity is scientifically evaluated, and the model has good explanatory and predictive power for new customers and new work orders. The sensitivity of each user can therefore be scored and identified, service resources can be allocated and scheduled in a targeted way, response measures can be taken promptly in an emergency, work order complaint pressure can be reduced, and service quality can ultimately be improved.
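The text does not give the scaling used for the standard scorecard. A widely used convention, assumed here, is score = offset − factor · ln(odds of bad), with factor = PDO / ln 2, so that the score falls by PDO points each time the odds of a complaint double. The base score of 600, the base odds of 1:50, the PDO of 20, the variable name and its WOE values are all hypothetical.

```python
import math

def build_scorecard(intercept, coefs, woe_tables,
                    base_score=600.0, base_odds=1 / 50, pdo=20.0):
    """Turn a logistic model on WOE features into additive points.

    The model is assumed to predict ln(p_bad / (1 - p_bad)); a higher score
    means lower complaint risk. base_score, base_odds (bad:good) and pdo are
    hypothetical scaling choices, not values from the text.
    """
    factor = pdo / math.log(2)
    offset = base_score + factor * math.log(base_odds)
    card = {"base_points": offset - factor * intercept}
    for name, coef in coefs.items():
        card[name] = {
            bin_label: -factor * coef * woe
            for bin_label, woe in woe_tables[name].items()
        }
    return card

def score(card, sample_bins):
    """sample_bins: variable name -> bin label for one work order."""
    return card["base_points"] + sum(card[n][b] for n, b in sample_bins.items())

card = build_scorecard(
    intercept=math.log(1 / 50),
    coefs={"response_hours": 0.8},  # hypothetical variable and coefficient
    woe_tables={"response_hours": {"<=24h": -0.5, ">24h": 0.6}},
)
low_risk = score(card, {"response_hours": "<=24h"})
high_risk = score(card, {"response_hours": ">24h"})
```

The per-bin entries of `card` correspond to the "scores of all the variables after binning" that the method outputs; rounding them to integers is a presentation choice left out of the sketch.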
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims (8)

1. A work order complaint risk pre-control method, characterized by comprising the following steps:
S1, inputting a training data set and a test data set, examining the data distribution, compiling statistics, and preprocessing the data;
S2, screening the variables, and deleting the variables that have no significant influence on the regression model or that are strongly autocorrelated;
S3, binning the variables with the chi-square method, applying WOE conversion to the variables, and judging the stability of the variables;
S4, establishing a regression model, and evaluating the accuracy of the regression model with the training data set and the test data set;
S5, converting the regression model into a standard scorecard, and outputting the score of every variable after binning.
2. The work order complaint risk pre-control method of claim 1, characterized in that: examining the data distribution, compiling statistics, and preprocessing the data in S1 includes:
compiling the missing proportion, the maximum value and the minimum value of the data, determining the length and type of each field, and converting fields that contain abnormal data into normal data.
3. The work order complaint risk pre-control method of claim 1, characterized in that: screening the variables in S2 and deleting the variables that have no significant influence on the regression model or that are strongly autocorrelated includes:
calculating the information value of each variable, screening the variables according to the information value and the autocorrelation among the variables, and, in combination with stepwise regression, deleting the variables that have no significant influence on the regression model or that are strongly autocorrelated.
4. The work order complaint risk pre-control method of claim 3, characterized in that: screening the variables according to the information value and the autocorrelation among the variables includes:
retaining the variables whose information value is greater than 0.01 and whose autocorrelation coefficient between variables is greater than 0.7.
5. The work order complaint risk pre-control method of claim 1, characterized in that: binning the variables with the chi-square method in S3 includes:
inspecting the binning result after binning, and adjusting the bins manually when the result is not satisfactory.
6. The work order complaint risk pre-control method of claim 5, characterized in that: applying WOE conversion to the variables in S3 includes:
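WOE conversion of the variables is performed using the following formula:
$$\mathrm{WOE}_i = \ln\left(\frac{\mathrm{Bad}_i / \mathrm{Bad}_T}{\mathrm{Good}_i / \mathrm{Good}_T}\right)$$
where $i$ denotes the $i$-th bin of a given feature, $\mathrm{Bad}_i$ is the number of bad labels in the $i$-th bin, $\mathrm{Bad}_T$ is the total number of bad labels, $\mathrm{Good}_i$ is the number of good labels in the $i$-th bin, and $\mathrm{Good}_T$ is the total number of good labels.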
7. The work order complaint risk pre-control method of claim 6, characterized in that: judging the stability of the variables in S3 includes:
calculating the PSI value of each feature column, and judging the stability of the variable from its PSI value.
8. The work order complaint risk pre-control method of claim 1, characterized in that: establishing the regression model in S4 and evaluating its accuracy with the training data set and the test data set includes:
establishing the regression model with logistic regression, making predictions on the training data set and the test data set, calculating the KS value and the AUC value for each of the two data sets, and evaluating the accuracy of the regression model.
CN202111457735.3A 2021-12-01 2021-12-01 Work order complaint risk pre-control method Pending CN114139960A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111457735.3A CN114139960A (en) 2021-12-01 2021-12-01 Work order complaint risk pre-control method

Publications (1)

Publication Number Publication Date
CN114139960A (en) 2022-03-04

Family

ID=80386615

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111457735.3A Pending CN114139960A (en) 2021-12-01 2021-12-01 Work order complaint risk pre-control method

Country Status (1)

Country Link
CN (1) CN114139960A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106600455A (en) * 2016-11-25 2017-04-26 国网河南省电力公司电力科学研究院 Electric charge sensitivity assessment method based on logistic regression
CN107392479A (en) * 2017-07-27 2017-11-24 国网河南省电力公司电力科学研究院 Implementation method of a power customer outage sensitivity scorecard based on a logistic regression model
CN109636591A (en) * 2018-12-28 2019-04-16 浙江工业大学 A kind of credit scoring card development approach based on machine learning
CN113435627A (en) * 2021-05-27 2021-09-24 国网冀北电力有限公司计量中心 Work order track information-based electric power customer complaint prediction method and device
CN113469536A (en) * 2021-07-06 2021-10-01 云南电网有限责任公司 Power supply service customer complaint risk grade identification method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
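崔成泉 (Cui Chengquan) et al.: "《大数据 大文化》" (Big Data, Big Culture), vol. 1, 31 October 2014, 云南大学出版社 (Yunnan University Press), pages: 122 - 126 *
王青天 (Wang Qingtian) et al.: "《Python大数据分析与机器学习商业案例实战》" (Python Big Data Analysis and Machine Learning: Business Cases in Practice), vol. 1, 30 June 2020, 机械工业出版社 (China Machine Press), pages: 136 - 140 *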

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination