CN109754281A - A supplier attrition prediction method

Info

Publication number: CN109754281A
Application number: CN201811397492.7A
Authority: CN
Inventors: 须峰; 张福斌; 宋安平; 施海鹰; 李传中
Original assignee: Construction Network Technology (shanghai) Co Ltd
Current assignee: Construction Network Technology (shanghai) Co Ltd
Priority date: 2018-11-22
Filing date: 2018-11-22
Publication date: 2019-05-14
Anticipated expiration: 2038-11-22
Also published as: CN109754281B

Abstract

The present invention relates to a supplier attrition prediction method. First, according to the requirements of the practical problem and in combination with the platform's own data, the features of supplier attrition are determined. Second, the imbalanced data set is sampled using the MBCDK-means undersampling method, converting the imbalanced data into a balanced data set. Then the genetic artificial neural network method is applied to the balanced data set for prediction. Finally, the prediction result is output. The invention can accurately predict supplier attrition.

Description

A supplier attrition prediction method

Technical field

The present invention relates to the technical field of supplier attrition prediction, and in particular to a supplier attrition prediction method.

Background art

Customer churn prediction is a major issue in customer relationship management. In today's increasingly competitive business environment, customers can easily switch between competitors. Some studies have shown that acquiring a new customer usually costs five to six times as much as retaining an existing one. Meanwhile, long-term customers are less sensitive to competitors' marketing activities and are more profitable. In addition, customer churn not only causes revenue loss but also erodes brand loyalty and affects company morale. Companies have therefore shifted their focus from acquiring new customers to retaining the existing customer base. Accurate churn prediction helps a company target the right customers for retention and is therefore widely regarded as a marketing priority.

Customer churn means that a customer cancels service with a company. Retaining existing customers plays an important role in increasing a company's overall revenue and in maintaining its position in a fiercely competitive market. Customer churn has many causes, and identifying them is quite complicated because they depend on the customer's personal views and on the company services the customer is using; nevertheless, identifying customer churn is necessary.

Summary of the invention

The technical problem to be solved by the present invention is to provide a supplier attrition prediction method that can accurately predict supplier attrition.

The technical solution adopted by the present invention to solve this technical problem is to provide a supplier attrition prediction method comprising the following steps:

(1) Collect data on the platform that reflects the supplier attrition features;

(2) Divide the collected imbalanced data set into an imbalanced training data set and an imbalanced test data set;

(3) Convert the imbalanced training data set into a balanced training data set using the MBCDK-means undersampling method;

(4) Establish a prediction model using the genetic artificial neural network method;

(5) Predict the imbalanced test data set using the prediction model, and output the prediction result.

The supplier attrition features in step (1) include: qualification certificates, company type, registered capital, completeness of registration information, number of tenders followed, recent service attitude, company qualification, service quality, product quality, delivery rate, credibility, number of bids won, number of bids submitted, number of price negotiations, number of logins, price reasonableness, and contract fulfillment rate.

A step of preprocessing the collected data is further included between step (1) and step (2), specifically comprising: integrating the data; cleaning the data, including removing noise and deleting inconsistent data; and transforming the data, including constructing new features and normalizing the data.
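As a non-limiting illustration, the cleaning and transformation steps might be sketched as follows. The field names (`login_count`, `bid_count`, `win_count`), the derived `win_rate` feature, the rule that negative counts are inconsistent, and the choice of min-max normalization are all assumptions made for this example; the method itself does not fix them.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[float]]


def clean(records: List[Record], features: List[str]) -> List[Record]:
    """Keep only records whose features are present and non-negative.

    Treating a negative count as "inconsistent" is an assumption of this sketch.
    """
    kept = []
    for rec in records:
        values = [rec.get(f) for f in features]
        if all(v is not None and v >= 0 for v in values):
            kept.append(dict(rec))  # copy so the raw data is left untouched
    return kept


def min_max_normalize(records: List[Record], features: List[str]) -> List[Record]:
    """Rescale each feature to [0, 1]; a constant feature maps to 0.0."""
    out = [dict(rec) for rec in records]
    for f in features:
        lo = min(rec[f] for rec in records)
        hi = max(rec[f] for rec in records)
        span = hi - lo
        for rec in out:
            rec[f] = (rec[f] - lo) / span if span else 0.0
    return out


# Hypothetical feature names and toy values, for illustration only.
features = ["login_count", "bid_count", "win_count"]
raw = [
    {"login_count": 10, "bid_count": 4, "win_count": 1},
    {"login_count": 50, "bid_count": 8, "win_count": 4},
    {"login_count": 30, "bid_count": None, "win_count": 2},  # missing value
    {"login_count": 20, "bid_count": 6, "win_count": -1},    # inconsistent value
]

cleaned = clean(raw, features)

# Construct a new feature (assumed here: share of submitted bids that were won).
for rec in cleaned:
    rec["win_rate"] = rec["win_count"] / rec["bid_count"] if rec["bid_count"] else 0.0

normalized = min_max_normalize(cleaned, features + ["win_rate"])
```

Normalizing after the new feature is constructed keeps every input to the later clustering and network steps on the same scale.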

Step (3) specifically includes the following sub-steps:

(31) Divide the imbalanced training data set into M majority-class samples and N minority-class samples;

(32) Initialize the number of clusters K;

(33) Cluster the M majority-class samples into K clusters using the K-means algorithm, and group the N minority-class samples into a single cluster that forms the minority-class subset;

(34) For the cluster center of the i-th majority-class cluster, calculate its distance to the cluster center of the minority class, where X_i denotes the cluster center of the i-th cluster and X_N denotes the cluster center of the minority class;

(35) Calculate the average distance from the majority-class cluster centers to the minority-class cluster center;

(36) Select samples from each majority-class cluster to form the majority-class subset, where the sample size for each cluster is determined from the above distances and from m_i, the number of samples in the i-th majority-class cluster;

(37) Combine the majority-class subset and the minority-class subset into the balanced training data set.
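For orientation, the two distance quantities in sub-steps (34) and (35) can be written out as below. The symbols d_i and \bar{d} are introduced here only for illustration, and the Euclidean norm is an assumption, since the text names the quantities without fixing the metric. The per-cluster sample-size expression of sub-step (36) is not reconstructed here.

```latex
% Assumed notation: d_i is the distance from the i-th majority-class cluster
% center X_i to the minority-class cluster center X_N (Euclidean norm assumed).
d_i = \lVert X_i - X_N \rVert_2 , \qquad i = 1, \dots, K

% Average distance over the K majority-class clusters.
\bar{d} = \frac{1}{K} \sum_{i=1}^{K} d_i
```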

Step (4) specifically includes the following sub-steps:

(41) Randomly generate n individuals of the first generation;

(42) Initialize n neural networks;

(43) Train the neural networks;

(44) Determine whether the trained neural networks reach the set target; if not, go to step (45); otherwise, go to step (46);

(45) Perform genetic operations, including replication, crossover, and mutation of chromosomes, to obtain n individuals of the new generation, and return to step (42);

(46) Select the optimal neural network;

(47) Obtain the genetic neural network prediction model from the optimal neural network.
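As a non-limiting illustration of how an individual in sub-steps (41) and (42) could correspond to a neural network, the sketch below treats a chromosome as a flat list holding all weights and biases and decodes it layer by layer. The flat encoding, the single hidden layer of 8 units, and the sigmoid activation are assumptions of this example; the input size of 17 simply matches the number of attrition features listed above.

```python
import math
import random
from typing import List, Sequence, Tuple

# One layer = (weights[out][in], biases[out])
Layer = Tuple[List[List[float]], List[float]]


def chromosome_length(sizes: Sequence[int]) -> int:
    """Number of genes needed to hold every weight and bias of the network."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


def decode(chromosome: Sequence[float], sizes: Sequence[int]) -> List[Layer]:
    """Slice a flat chromosome into per-layer weight matrices and bias vectors."""
    layers, pos = [], 0
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [list(chromosome[pos + r * n_in: pos + (r + 1) * n_in])
                   for r in range(n_out)]
        pos += n_in * n_out
        biases = list(chromosome[pos: pos + n_out])
        pos += n_out
        layers.append((weights, biases))
    return layers


def forward(layers: List[Layer], x: Sequence[float]) -> List[float]:
    """Feed-forward pass with sigmoid activations (an assumed choice)."""
    for weights, biases in layers:
        x = [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, x)) + b)))
             for row, b in zip(weights, biases)]
    return list(x)


sizes = [17, 8, 1]  # 17 input features; hidden size 8 is a hypothetical choice
rng = random.Random(0)
chromosome = [rng.uniform(-1.0, 1.0) for _ in range(chromosome_length(sizes))]
layers = decode(chromosome, sizes)
output = forward(layers, [0.5] * 17)  # attrition score in (0, 1) for one supplier
```

With this encoding, crossover and mutation can operate on plain lists of numbers, and each resulting chromosome can always be decoded back into a valid network.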

Beneficial effects

By adopting the above technical solution, the present invention has the following advantages and positive effects compared with the prior art. During sampling, the number of samples is chosen according to the number of samples distributed in each cluster and the distance from each majority-class cluster to the minority-class cluster center, which preserves the cluster distribution information of the original data while raising the sampling rate of boundary samples, helping to improve the final classification performance. The MBCDK-means undersampling method reduces the time and space complexity of the existing K-means undersampling algorithm. A genetic algorithm is introduced to optimize the weights and biases of the artificial neural network, yielding a genetic neural network model with better prediction performance than a plain artificial neural network.

Brief description of the drawings

Fig. 1 is a flow chart of the present invention;

Fig. 2 is a flow chart of converting the imbalanced training data set into a balanced training data set using the MBCDK-means undersampling method in the present invention;

Fig. 3 is a flow chart of establishing the prediction model using the genetic artificial neural network method in the present invention.

Specific embodiment

The present invention is further explained below with reference to specific embodiments. It should be understood that these embodiments merely illustrate the present invention and do not limit its scope. It should further be understood that, after reading the teachings of the present invention, those skilled in the art may make various changes or modifications to it, and such equivalents likewise fall within the scope defined by the claims appended to this application.

Embodiments of the present invention relate to a supplier attrition prediction method. First, according to the requirements of the practical problem and in combination with the platform's own data, the features of supplier attrition are determined. Second, the imbalanced data set is sampled using the MBCDK-means undersampling method, converting the imbalanced data into a balanced data set. Then the genetic artificial neural network method is applied to the balanced data set for prediction. Finally, the prediction result is output. As shown in Fig. 1, the specific steps are as follows:

Step A: According to the requirements of the practical problem and in combination with the platform's own data, determine the attrition features of suppliers. This embodiment takes construction suppliers as an example, and the determined supplier attrition features include: qualification certificates, company type, registered capital, completeness of registration information, number of tenders followed and recent service attitude (for example, over the most recent two months, four months, or half a year), company qualification, service quality, product quality, delivery rate, credibility, number of bids won, number of bids submitted, number of price negotiations, number of logins, price reasonableness, and contract fulfillment rate.

Step B: Collect data on the platform that reflects the supplier attrition features;

Step C: Preprocess the collected data, specifically including:

C1: Integrate the data;

C2: Clean the data, including removing noise and deleting inconsistent data;

C3: Transform the data, including constructing new features and normalizing the data.

Step D: Divide the imbalanced data set into an imbalanced training data set and an imbalanced test data set;
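As a non-limiting illustration of Step D, the sketch below splits sample indices so that both parts keep the original class imbalance. The per-class (stratified) split, the 30% test ratio, and the label convention (1 for a lost supplier, 0 otherwise) are assumptions of this example; the embodiment only requires that the data set be divided into a training part and a test part.

```python
import random
from typing import List, Sequence, Tuple


def stratified_split(labels: Sequence[int], test_ratio: float = 0.3,
                     seed: int = 0) -> Tuple[List[int], List[int]]:
    """Split sample indices per class so both parts keep the class proportions."""
    rng = random.Random(seed)
    train_idx: List[int] = []
    test_idx: List[int] = []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_ratio)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels: 90 retained suppliers (0) and 10 lost suppliers (1).
labels = [0] * 90 + [1] * 10
train_idx, test_idx = stratified_split(labels, test_ratio=0.3)
```

Only the training indices would then be passed to the undersampling step, so the test set remains imbalanced as the method requires.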

Step E: Convert the imbalanced training data set into a balanced training data set using the MBCDK-means undersampling method, as shown in Fig. 2, specifically including:

E1: Divide the training set into M majority-class samples and N minority-class samples;

E2: Initialize the number of clusters K;

E3: Cluster the majority-class samples into K clusters using the K-means algorithm;

E4: Group the minority-class samples into a single cluster that forms the minority-class subset;

E5: For the cluster center of the i-th majority-class cluster, calculate its distance to the cluster center of the minority class, where X_i denotes the cluster center of the i-th cluster and X_N denotes the cluster center of the minority class;

E6: Calculate the average distance from the majority-class cluster centers to the minority-class cluster center;

E7: For the i-th majority-class cluster, determine the number of samples to select from that cluster, where m_i denotes the number of samples in the i-th majority-class cluster;

E8: Sample from each majority-class cluster according to that number to form the majority-class subset;

E9: Combine the majority-class subset and the minority-class subset into the balanced training set.
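As a non-limiting illustration, steps E1 to E9 might be sketched as follows. The per-cluster quota used here, proportional to m_i multiplied by the ratio of the average distance to d_i, is only an assumed stand-in that favors large clusters and clusters near the minority class; it is not the sample-size expression of this embodiment. The Euclidean distance, the simple K-means loop, and the toy data are likewise assumptions.

```python
import math
import random
from typing import List, Sequence

Point = Sequence[float]


def centroid(points: List[Point]) -> List[float]:
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]


def kmeans(points: List[Point], k: int, iters: int = 20, seed: int = 0) -> List[List[Point]]:
    """Plain K-means; returns the non-empty clusters of the final assignment."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    clusters: List[List[Point]] = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        centers = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
    return [c for c in clusters if c]


def undersample(majority: List[Point], minority: List[Point], k: int,
                seed: int = 0) -> List[Point]:
    """Pick roughly len(minority) majority samples, cluster by cluster."""
    rng = random.Random(seed)
    clusters = kmeans(majority, k, seed=seed)                # E3
    x_n = centroid(minority)                                 # E4: minority center X_N
    d = [math.dist(centroid(c), x_n) for c in clusters]      # E5: distances d_i
    d_mean = sum(d) / len(d)                                 # E6: average distance
    # E7 (assumed rule): weight grows with cluster size m_i and shrinks with d_i.
    weights = [len(c) * d_mean / max(d_i, 1e-12) for c, d_i in zip(clusters, d)]
    total = sum(weights)
    subset: List[Point] = []
    for c, w in zip(clusters, weights):                      # E8: sample per cluster
        quota = min(len(c), max(1, round(len(minority) * w / total)))
        subset.extend(rng.sample(c, quota))
    return subset


# Toy data: two majority blobs and a small minority group near the first blob.
rng = random.Random(1)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(60)] + \
           [(rng.gauss(8, 1), rng.gauss(8, 1)) for _ in range(60)]
minority = [(rng.gauss(2, 0.5), rng.gauss(2, 0.5)) for _ in range(20)]

majority_subset = undersample(majority, minority, k=2)
balanced = majority_subset + minority                        # E9
```

Under the assumed rule, clusters whose centers lie closer to the minority class contribute more samples, which is one way to raise the sampling rate of boundary samples while still drawing from every cluster.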

Step F: Establish the prediction model using the genetic artificial neural network method;

F1: Randomly generate n individuals of the first generation;

F2: Initialize n neural networks;

F3: Train the neural networks;

F4: Determine whether the set target is reached; if not, go to step F5; otherwise, go to step F8;

F5: Chromosome replication;

F6: Chromosome crossover;

F7: Chromosome mutation, then go to step F2;

F8: Select the optimal neural network;

F9: Obtain the genetic neural network prediction model.

The model parameters in this embodiment are set as follows:

Parameter	Value
Number of iterations	1000
Learning rate	0.05
Target error	0.0001
Population size N	40
Evolutionary generations T	100
Crossover probability P_c	0.8
Mutation probability P_m	0.02
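As a non-limiting illustration, the genetic loop of Step F might be sketched with the population size, number of generations, crossover probability, mutation probability, and target error given above. Several simplifications are assumptions of this example: the tiny 2-3-1 network and toy data, tournament selection with one elite copy as the replication step, single-point crossover, Gaussian mutation, and direct evaluation of each individual's error in place of the gradient training of step F3 (so the learning rate and iteration count are not used here).

```python
import math
import random

POP_SIZE, GENERATIONS = 40, 100           # population size N, evolutionary generations T
P_CROSS, P_MUT = 0.8, 0.02                # crossover and mutation probabilities
TARGET_ERROR = 1e-4                       # target error
N_IN, N_HID = 2, 3                        # assumed toy topology: 2 inputs, 3 hidden units
GENES = N_IN * N_HID + N_HID + N_HID + 1  # every weight and bias of the network


def sigmoid(z):
    z = max(-60.0, min(60.0, z))          # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def predict(genes, x):
    """Decode the chromosome and run one forward pass."""
    hidden = []
    for h in range(N_HID):
        w = genes[h * N_IN:(h + 1) * N_IN]
        b = genes[N_IN * N_HID + h]
        hidden.append(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b))
    out_w = genes[N_IN * N_HID + N_HID: N_IN * N_HID + 2 * N_HID]
    return sigmoid(sum(wi * hi for wi, hi in zip(out_w, hidden)) + genes[-1])


def error(genes, data):
    """Mean squared error of one individual on the data."""
    return sum((predict(genes, x) - y) ** 2 for x, y in data) / len(data)


def evolve(data, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(GENES)] for _ in range(POP_SIZE)]
    history = []                                    # best error per generation
    for _ in range(GENERATIONS):
        pop.sort(key=lambda g: error(g, data))
        history.append(error(pop[0], data))
        if history[-1] <= TARGET_ERROR:             # set target reached
            break
        nxt = [pop[0][:]]                           # replication: keep the best individual
        while len(nxt) < POP_SIZE:
            a = min(rng.sample(pop, 3), key=lambda g: error(g, data))
            b = min(rng.sample(pop, 3), key=lambda g: error(g, data))
            child = a[:]
            if rng.random() < P_CROSS:              # single-point crossover
                cut = rng.randrange(1, GENES)
                child = a[:cut] + b[cut:]
            child = [g + rng.gauss(0, 0.5) if rng.random() < P_MUT else g
                     for g in child]                # per-gene mutation
            nxt.append(child)
        pop = nxt
    best = min(pop, key=lambda g: error(g, data))   # select the optimal network
    return best, history


# Toy data: (features, label) pairs, with label 1 standing for a lost supplier.
data = [((0.1, 0.2), 0), ((0.2, 0.1), 0), ((0.15, 0.3), 0),
        ((0.8, 0.9), 1), ((0.9, 0.7), 1), ((0.7, 0.8), 1)]
best, history = evolve(data)
```

Because one copy of the best individual is carried over unchanged, the best error recorded in `history` never increases from one generation to the next.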

Step G: Predict the test set and output the prediction result.

During sampling, the present invention chooses the number of samples according to the number of samples distributed in each cluster and the distance from each majority-class cluster to the minority-class cluster center, which preserves the cluster distribution information of the original data while raising the sampling rate of boundary samples, helping to improve the final classification performance. The MBCDK-means undersampling method reduces the time and space complexity of the existing K-means undersampling algorithm. A genetic algorithm is introduced to optimize the weights and biases of the artificial neural network, yielding a genetic neural network model with better prediction performance than a plain artificial neural network.

Claims

1. A supplier attrition prediction method, characterized by comprising the following steps:

2. The supplier attrition prediction method according to claim 1, characterized in that the supplier attrition features in step (1) include: qualification certificates, company type, registered capital, completeness of registration information, number of tenders followed, recent service attitude, company qualification, service quality, product quality, delivery rate, credibility, number of bids won, number of bids submitted, number of price negotiations, number of logins, price reasonableness, and contract fulfillment rate.

3. The supplier attrition prediction method according to claim 1, characterized in that a step of preprocessing the collected data is further included between step (1) and step (2), specifically comprising: integrating the data; cleaning the data, including removing noise and deleting inconsistent data; and transforming the data, including constructing new features and normalizing the data.

4. The supplier attrition prediction method according to claim 1, characterized in that step (3) specifically includes the following sub-steps:

(32) Initialize the number of clusters K;

(33) Cluster the M majority-class samples into K clusters using the K-means algorithm, and group the N minority-class samples into a single cluster that forms the minority-class subset;

5. The supplier attrition prediction method according to claim 1, characterized in that step (4) specifically includes the following sub-steps:

(41) Randomly generate n individuals of the first generation;

(42) Initialize n neural networks;

(43) Train the neural networks;

(44) Determine whether the trained neural networks reach the set target; if not, go to step (45); otherwise, go to step (46);

(46) Select the optimal neural network;